Predicting Stock Prices with LSTMs: One Mistake Everyone Makes (Episode 16)

  • Published: 16 May 2024
  • VIP DISCOUNT for Financial Engineering and Artificial Intelligence in Python: deeplearningcourses.com/c/ai-...
    VIP DISCOUNT for PyTorch: Deep Learning and Artificial Intelligence: deeplearningcourses.com/c/pyt...
    VOTE for my next course (note: if you already got this survey, you don't have to do it again):
    forms.gle/iEWwGjGsYGc4hLaA9
  • Science

Comments • 97

  • @MZak-js7oy
    @MZak-js7oy 1 year ago +18

    You pinpointed exactly what I was wondering about. As someone who worked in the financial markets for more than a decade and only recently learned ML, min-max scaling was a big question mark for me. First, you never know the maximum of a price, especially in a market (like gold) that keeps making higher highs. The minimum is not really known from the data set either, so unless you are a bank with 80-150 years of recorded data, your data set will never reflect the true lows or true highs. As a result, most ML models in YT tutorials simply panic and fail when the price makes new historical highs or lows (relative to the data set they were trained on, not actual historical highs or lows). Scaling and standardization are crucial, no doubt, but the MinMax technique is fundamentally wrong here and reflects ignorance of market dynamics and of the core principles of training an ML model.

    • @digitalnomad2196
      @digitalnomad2196 4 months ago +3

      Just normalize the data. Also, to his point: if anyone has an actual edge, why post it online? If he has an edge and makes millions, why YouTube and why courses, eh?

    • @sanchuro7140
      @sanchuro7140 4 months ago +1

      On top of your points, some tutorials even fit the min-max scaler on both train and test. This is so wrong, because it uses future data to fit the scaler. Most of the tutorials are really just trash.
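
A minimal sketch of the two points raised in this thread, assuming a hypothetical upward-trending price series and hand-written helpers (`fit_min_max` and `apply_min_max` are illustrative names, not taken from the video or from any library). It fits the min-max bounds on the training window only, applies them to later prices, and contrasts that with the leaky version fitted on train and test together:

```python
def fit_min_max(values):
    """Return (lo, hi) computed from the given values only."""
    return min(values), max(values)


def apply_min_max(values, lo, hi):
    """Scale values with previously fitted bounds; no clipping is applied."""
    span = hi - lo
    return [(v - lo) / span for v in values]


# Hypothetical upward-trending prices, split chronologically.
prices = [100.0, 102.0, 101.0, 105.0, 110.0, 108.0, 115.0, 121.0, 119.0, 130.0]
train, test = prices[:6], prices[6:]

# Bounds fitted on the training window only.
lo, hi = fit_min_max(train)
train_scaled = apply_min_max(train, lo, hi)
test_scaled = apply_min_max(test, lo, hi)

# Leaky variant: bounds fitted on train + test, so the future high is baked in.
leaky_lo, leaky_hi = fit_min_max(prices)
leaky_test_scaled = apply_min_max(test, leaky_lo, leaky_hi)

print(max(train_scaled))       # 1.0 by construction
print(max(test_scaled))        # 3.0: new highs fall far outside the fitted range
print(max(leaky_test_scaled))  # 1.0, but only because the scaler saw the future
```

With the bounds fitted on the training window alone, the test values land well above 1, which is the new-highs problem described in the parent comment. Fitting on all of the data hides that only by using information from the future.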

  • @adwaitdathanr4040
    @adwaitdathanr4040 2 years ago +17

    This video is a gem. I saw a lot of blogs and tutorials repeating the mistakes you had mentioned.

    • @azrulfyz1162
      @azrulfyz1162 2 years ago

      Could you please add a link to where I could find those topics? Much appreciated. Thanks

  • @masterquiroga
    @masterquiroga 2 years ago +7

    This video is so underrated. I happen to see the same errors. I would also advise using a power transform and then standardizing, instead of using a min-max scaler.

  • @plasmaflare5217
    @plasmaflare5217 5 months ago

    Thank you so much for explaining these concepts properly; it is clear that you have a lot of experience in this subject. I started learning machine learning techniques for analyzing economic data, but I could not figure out the best method for forecasting stock prices.

  • @kartikpodugu
    @kartikpodugu 2 years ago +1

    Awesome... I saw so many examples with this mistake, and I always felt that what they were doing had some flaw, but I was unable to work out why myself. Thanks for the clarification.

  • @kyloren2093
    @kyloren2093 1 year ago

    Hey, thanks for the great content,
    i am using R, do you think it's as good as python for this kind of analysis ?

  • @anoriginalnick
    @anoriginalnick 3 years ago +10

    I believe it's important to keep things in the same scale because the algorithms apply the same learning rate to all feature dimensions.

    • @ahmedhamza3939
      @ahmedhamza3939 4 months ago

      It's important for models whose weights are estimated using gradients, because you will most likely get low weights for features that have a large range and high weights for features that have a small range.
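
One way to put features on a common scale, sketched with the standard library only. The two columns are made-up values (daily returns and share volume), the statistics are taken from the training rows only, and `fit_standardizer` and `standardize` are hypothetical helper names:

```python
from statistics import mean, pstdev


def fit_standardizer(column):
    """Return the (mean, population std) of a training column."""
    return mean(column), pstdev(column)


def standardize(column, mu, sigma):
    """Apply (x - mu) / sigma using previously fitted statistics."""
    return [(x - mu) / sigma for x in column]


# Made-up training features on very different scales.
train_returns = [0.010, -0.020, 0.015, -0.005]
train_volume = [1_200_000.0, 900_000.0, 1_500_000.0, 1_000_000.0]

ret_mu, ret_sigma = fit_standardizer(train_returns)
vol_mu, vol_sigma = fit_standardizer(train_volume)

z_returns = standardize(train_returns, ret_mu, ret_sigma)
z_volume = standardize(train_volume, vol_mu, vol_sigma)

# After standardizing, both columns have mean close to 0 and std close to 1,
# so neither feature dominates purely because of its units.
print(round(pstdev(z_returns), 6), round(pstdev(z_volume), 6))
```

Later (validation or test) rows would be transformed with the same `ret_mu`, `ret_sigma`, `vol_mu`, and `vol_sigma`, never refitted.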

  • @anandtiwari1281
    @anandtiwari1281 3 years ago +1

    What happens if we use rolling returns instead of just returns?

  • @grabngoinfo
    @grabngoinfo 1 year ago +1

    LOVE this video! Cannot help laughing when watching the virus part, but it is so true! I am really glad that I didn’t use min max scaler in my time series tutorials. Thank you for your contribution to the machine learning community, sir!

  • @cenobit0815
    @cenobit0815 3 years ago +2

    A sequence length of 1 is fine if the LSTM is stateful (the hidden state from the previous period is used as input as well). If the LSTM is stateless, you need to pass the whole sequence (and a zero hidden state as input). So it basically depends on what kind of LSTM you are training (stateful or stateless). But LSTMs are still useless for price prediction, because they tend to output the last price of the input sequence. That's what I learned when playing around with LSTMs for stock price prediction.

    • @cenobit0815
      @cenobit0815 3 years ago

      pytorch.org/tutorials/beginner/nlp/sequence_models_tutorial.html (shows both approaches as well)

    • @LazyProgrammerOfficial
      @LazyProgrammerOfficial 3 years ago +1

      "Stateful" is just another way of saying that you are passing in a sequence... the hidden state h(t) is derived from the inputs x(1), ..., x(t).
      If you are holding the state from past samples then there's something seriously wrong.

    • @cenobit0815
      @cenobit0815 3 years ago

      @@LazyProgrammerOfficial Did you have a chance to evaluate the performance of transformers (multi-head attention) on stock price prediction? I am thinking of giving it another try :)

  • @linknero1
    @linknero1 2 years ago +1

    Thank you, I was going to do this as a final project in a course :V Thank you so much, I'll definitely go to the course

  • @gamingsaloon7731
    @gamingsaloon7731 2 years ago +2

    You're right, but we can use price and min-max scaling locally to find patterns. I usually apply it locally when sampling data, not on the whole data set.

    • @digitalnomad2196
      @digitalnomad2196 4 months ago

      ya exactly, there is a local min max if you define a timeframe
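
A rough sketch of the local scaling idea from this thread, assuming a made-up price series and a fixed window length; `scale_window` and `locally_scaled_windows` are hypothetical helper names. Each window is scaled with its own low and high instead of a global minimum and maximum:

```python
def scale_window(window):
    """Min-max scale one window using only that window's own low and high."""
    lo, hi = min(window), max(window)
    if hi == lo:  # flat window: avoid division by zero
        return [0.0] * len(window)
    return [(v - lo) / (hi - lo) for v in window]


def locally_scaled_windows(prices, length):
    """Slide a fixed-length window over prices and scale each one independently."""
    return [scale_window(prices[i:i + length])
            for i in range(len(prices) - length + 1)]


prices = [100.0, 104.0, 102.0, 110.0, 120.0, 115.0]
windows = locally_scaled_windows(prices, length=3)

for w in windows:
    print([round(v, 3) for v in w])
```

Every window spans exactly [0, 1] regardless of the price level, so the absolute level is discarded. Any target (for example the next price) would have to be scaled with the same window's bounds, and it can still fall outside [0, 1].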

  • @0xVantwoutMaarten
    @0xVantwoutMaarten 3 years ago

    It took me a while to grasp, but thank you a lot. Mistake number 5 should be all over the internet! Everybody: using a training window does not mean you are using a sequence; it is about the sequence of training windows!!!!!

  • @MansourAlAkeel
    @MansourAlAkeel 2 years ago +3

    I cannot find your other videos about the other mistakes.
    I agree about using the return value instead of price as input. However, this will result in an input range between -1 and 1. What activation function would you use then?

    • @pimpXBT
      @pimpXBT 3 months ago

      use a moving average to continually plot a time series graph. theres indicators, theres greeks to measure total risk. returns aint normal, they are lognormal, so you'll obv have skew. Point of the video is form sequences of multiple models that create your strategy, then feed it in so it evolves over time (ML is basically used for parameter optimization, so you can get the best timeframe for a strategy, or use a timeframe to figure out the best moving avg window, rsi levels bla bla, so sequences are supposed to be multiple dimensional, not a scalar)

    • @pimpXBT
      @pimpXBT 3 months ago

      ../and all of these are already derived from the underlying returns and standard deviations, so they are normalized to fit the mean/variance of the underlying position/portfolio. you can just add them to your expected profit at face value, so theres your expected returns.

    • @LazyProgrammerOfficial
      @LazyProgrammerOfficial 2 months ago

      Check my website for a link to all videos. Using returns would not limit the range. The range of returns is unlimited.

  • @linknero1
    @linknero1 2 years ago +1

    I have a question: is it possible to use that idea to find patterns in hours instead of days? I mean, there are some observable patterns, like: "some stocks gain or lose right before they close and begin the day up (and lose) or low and increase over the day". Is it possible?

  • @danieledicesare9447
    @danieledicesare9447 3 years ago +1

    Nice video. Keep up this series :)

  • @gastonvilches5851
    @gastonvilches5851 2 years ago

    Thank you very much for this video. I was starting to think I was wrong until I saw it. There are tons of mistakes out there, specifically on this topic.

  • @emmang2010
    @emmang2010 1 month ago

    Thank you very much. I recommend not saying that, when scaling, the idea is to have values be "small". People who take you literally will think you mean very small values (e.g. 1.2x10^-20). I would also introduce stationarity at your timestamp for "Stock returns instead of ..." since this is a step towards that.

  • @The_Mindful_Scholar
    @The_Mindful_Scholar 1 year ago

    I'm working on imports and exports data. I'm using Time Series Generator-LSTM. My training prediction has an R² of 0.99 while the test prediction has -0.39. What parameters do you suggest for better results on test predictions?

    • @metehan9185
      @metehan9185 5 months ago

      Ask chatgpt

    • @Pvtmovies4384
      @Pvtmovies4384 1 month ago

      did you find any solution ? facing the same issue

  • @hughdbrown
    @hughdbrown 3 years ago +2

    In this video you comment that using prices is wrong but using returns is correct. Does using logs of prices have the same problem? (I ask because logs of prices are commonly used in finance because they have the property that adding logs gives the return over a period of time.) Logs of prices have no min or max, so I imagine they are similarly wrong.
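
A small sketch of the quantities in this question, using made-up closing prices. It is only meant to show the relationship between them: log prices still follow the price level, while the first difference of log prices is the log return, and log returns add up over time.

```python
import math

prices = [100.0, 105.0, 99.75, 110.0]  # hypothetical closing prices

# Simple returns: r_t = p_t / p_{t-1} - 1
simple_returns = [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]

# Log prices still trend with the price level; their first difference is the log return.
log_prices = [math.log(p) for p in prices]
log_returns = [b - a for a, b in zip(log_prices, log_prices[1:])]

# Log returns are additive: their sum equals log(p_T / p_0).
total_log_return = sum(log_returns)
print(math.isclose(total_log_return, math.log(prices[-1] / prices[0])))  # True
```

On this reading, feeding raw log prices to a model would keep the same unbounded-level issue as raw prices, whereas differenced log prices are a form of returns.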

  • @mastermind2362
    @mastermind2362 1 year ago

    Where are the other videos? are they coming?

  • @VonDutchyy
    @VonDutchyy 2 years ago

    Really good breakdown, nice one!

  • @ariisichoix5795
    @ariisichoix5795 3 years ago +2

    I really like your video. I couldn't agree more about all of those video/code examples shared on YouTube as if they really work xD. Thank you for this video! I am currently creating a real AI trading bot using a deep RNN. I wanted to use LSTM cells and maybe GRU cells as well, but I ended up not getting good results during training. Hopefully your video will help me understand a little better why I am not able to get a better recall. (Yes, I am doing a classification prediction.)

    • @Yasinzaii
      @Yasinzaii 2 years ago

      any luck with your project ?

    • @rob9207
      @rob9207 1 year ago

      Hi Arii, please reach out to me if you're still working on this project. Would love to talk with you.

  • @TheDeatheater3
    @TheDeatheater3 3 months ago

    I am a little confused. If I standardize the data, then it is i.i.d. data and no longer sequential. In this case, does it make sense to use an LSTM at all?

  • @onceappuonatime
    @onceappuonatime 3 years ago +5

    Thanks for this video. I finally took action and bought the course on Udemy. I am broke so I usually find a way to get stuff for free so this was a big step for me. I have been trading for more than 2 years now and wanted to apply ML in ways different than what I have seen online. So, thank you for making this course!

    • @YaShaheed
      @YaShaheed 2 years ago +2

      How is trading going with deep learning?

    • @reedoken6143
      @reedoken6143 1 year ago

      @@YaShaheed it's a grift, unfortunately.

  • @alaincheong7275
    @alaincheong7275 2 years ago +4

    This is just a promotion video, if you think carefully.

  • @axe863
    @axe863 4 months ago

    Integer differencing is excessive and may significantly erode memory content. There exists some degree of tempered fractional differencing that has minimum information destruction with "good enough" stationarity.

    • @AbhishekML
      @AbhishekML 4 months ago

      Ha, I think you're trying to sound smart, but first differencing is standard in time series, whilst "tempered fractional differencing" shows not even 1 page of search results.

    • @axe863
      @axe863 4 months ago

      @AbhishekML Overstationarizing is one of the single greatest detriments to predictability, via reduction in a time series' memory content. Dr. Marcos López de Prado highlights the tradeoff by building models on the weakest degree of fractional differencing that rejects the null of nonstationarity (ADF statistic). The differencing-memory tradeoff is not universal (it doesn't hold for all processes).
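
A rough sketch of plain fixed-window fractional differencing, assuming the standard binomial-series weights for (1 - B)^d, where B is the lag operator. It is not the tempered variant mentioned earlier in this thread, it does not include the ADF test used to choose d, and `frac_diff_weights` and `frac_diff` are hypothetical helper names:

```python
def frac_diff_weights(d, size):
    """First `size` weights of (1 - B)^d: w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k."""
    weights = [1.0]
    for k in range(1, size):
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights


def frac_diff(series, d, window):
    """Fractionally difference a series with a fixed window of past values."""
    w = frac_diff_weights(d, window)
    out = []
    for t in range(window - 1, len(series)):
        # Weighted sum of the current value and the previous window - 1 values.
        out.append(sum(w[k] * series[t - k] for k in range(window)))
    return out


series = [1.0, 2.0, 4.0, 7.0, 11.0]  # made-up values

print(frac_diff_weights(1.0, 4))  # d = 1: only the first two weights are nonzero
print(frac_diff_weights(0.5, 4))  # d = 0.5: older values keep small weights
print(frac_diff(series, 1.0, 2))  # d = 1 with window 2 is ordinary first differencing
```

With d = 1 the weights collapse to ordinary first differencing. With 0 < d < 1, older observations keep small nonzero weights, which is the memory that integer differencing throws away.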

    • @axe863
      @axe863 4 months ago

      @AbhishekML Financial time series (especially fragile assets) exhibit semi-long-range dependency in the cmeans but especially in cvol, even when one accounts for spurious fd via structural breaks. Integer differencing destroys an excessive amount of predictability to ensure stationarity.

  • @anshanshtiwari8898
    @anshanshtiwari8898 2 years ago +1

    Are you planning to explain more about the other mistakes?

  • @priyanshukumawat4142
    @priyanshukumawat4142 3 years ago +1

    one of the best mentor I had ever seen !!!!!! RESPECT from INDIA

  • @aravindkolli
    @aravindkolli 4 months ago

    Is it the same with lagged prices as inputs?

  • @saatviksingh
    @saatviksingh 2 years ago +1

    Ooof please post the other videos soon

  • @fitybux4664
    @fitybux4664 1 year ago

    6:25 "Some people are using a sequence length of 1... Nor is it funny or entertaining"

  • @M1911Original
    @M1911Original 1 year ago

    Holy shit I just standardized the data on one of my LSTM models and I instantly got over 10x less loss

  • @spinLOL533
    @spinLOL533 3 years ago

    6:05 lmao

  • @russnagel1
    @russnagel1 3 years ago

    Why is this episode 16? Where is episode 15? Is this part of your paid for course?

    • @LazyProgrammerOfficial
      @LazyProgrammerOfficial 3 years ago

      These are not part of a course, these are part of YouTube. You can click on my YouTube channel to see all the videos I've uploaded, as usual on YouTube.

  • @muntedme203
    @muntedme203 11 months ago

    Stationarity with heteroskedasticity....LN rets is fine. How can you normalise with a window that extends beyond the lookback being used??? Lol

  • @jonfe
    @jonfe 3 months ago

    I'd discover a better way of normalize the data for stock prediction.

  • @51nibbler
    @51nibbler 1 year ago

    is not importent to keep it in same scale. but i made not a prediction of the next N steps of price. i made only buy sell or wait in CFD forex :)
    then when you understand you can normalize the input data i have 25200 ticks as input data AUDUSD. but the normalization i m not use a formel from statistic or internet i have my own formel to calculate normalize input data :)
    yes you have right. not copie a code.
    understand how it work and write it self. and test it. and test it.. and test it.. and when your later version are better you can made more version^^ when not start at begin and learn to understand how Q-learning work^^
    greeze from switzerland and yes my englisch is bullshit xD
    i made it since 1 year as a hobby.. first version was on 23% off all trades are win trades and atm 32% of all trades are win trades and when i had 34% the ai made win with 50pips TP 20pips SL :P

    • @51nibbler
      @51nibbler 1 year ago

      and NEVER EVER NEVER use the SAME input DATA for 2 times!! you not want that your KI only can trade only YOUR INPUT data look YT videos 99.9999% only train with the same INPUT DATA so long that the KI the input data KNOWS xD thats bulshit
      i use test data from AUDUSD different times USDCAD EURUSD AUDCAD etc etc etc
      4 years data and more... in 1 train step to see is this version a version with potenzial or crap but you NEVER know how LONG U must TRAIN to KNOW that it work YOU NEVER KNOW :P

    • @vinniehuish3987
      @vinniehuish3987 1 month ago

      @@51nibblerWtf are you saying you Indian.

  • @dineshkrishnasamy1628
    @dineshkrishnasamy1628 1 year ago

    Hi. How to get VIP materials please

  • @barrard
    @barrard 1 month ago

    Discount?

  • @125errorz
    @125errorz 2 years ago +1

    why arent priests talking about this?

  • @dzel774
    @dzel774 3 years ago +2

    I just discovered your channel and I’m interested in the VIP course.

    • @LazyProgrammerOfficial
      @LazyProgrammerOfficial 3 years ago +3

      Welcome! You can find links to the VIP versions of my courses via my website, lazyprogrammer.me

    • @dzel774
      @dzel774 3 years ago

      @@LazyProgrammerOfficial thank you

  • @jdaniele
    @jdaniele 3 years ago +1

    50 euros? I will wait for a 9.99 offer, thanks.... :)

    • @spinLOL533
      @spinLOL533 3 years ago

      not every course goes to $9.99 lols

    • @jdaniele
      @jdaniele 3 years ago

      @@spinLOL533 that's true, as much as, not every course will sell... :)

    • @datascienceprofessor
      @datascienceprofessor 2 years ago +1

      @@jdaniele
      $9.99 course: predict stock prices with LSTMs!
      $50 course: pointing out all the mistakes in the $9.99 course.
      I'd rather pay more; at least the instructor is honest ;)

    • @jdaniele
      @jdaniele 2 years ago

      @@datascienceprofessor
      Yes, maybe. So we need a $150 course pointing out "all" the mistakes of the $50 course. 😋
      And we'll need a $500 course pointing out "all" the mistakes of the $150 course.... OMG😮.. 😋😋
      If you go through the process, it's an asymptotic curve.
      Then, if we can afford it, the best is to buy a $10.000 course. hahahah😂
      Will it cover all the errors? Who knows....🤔
      So a $50 course could still have many errors, right?
      If, for example, a $9.99 course offers 85% of right information and a $50 course reaches the 92%, that 7% more (actually 8.2% more if compared to 85%), costs me 500% ($50/$10), a bit too much.
      So I should pay +400% to have just +8.2% more. Will it be worth it?
      Anyway, most of the discounted ($9.99 for just few days at year) courses on Udemy, are usually sold between $30 and $200.
      Following your reasoning, a $200 course should be better than a $50, right?
      So I think if we buy a $100-$200 course discounted to $9.99, it has the best value for that money, even if it is not perfect yet, for sure better than a FIXED $50 priced course!
      Fixed priced courses just pissed me off... sorry! 😅

    • @datascienceprofessor
      @datascienceprofessor 2 years ago +1

      @@jdaniele You're just reaching and making up fictitious examples. Lazy is well known for having actually studied this type of material and applying it day to day. The others are obviously just marketers trying to capitalize on trends like ML and crypto. If you can't tell the difference, then you're probably not the target audience for this kind of course.

  • @oberstvontoffel
    @oberstvontoffel 2 years ago

    lstm is old. use transformers

  • @calendr13
    @calendr13 1 year ago

    I am in the sector, The only useful video about stock price prediction !

  • @Rvl734
    @Rvl734 3 years ago

    Sir i need a project of stock price prediction lstm model (back propagation algorithm) and maa website or web app or using streamlit i will pay you reply to this comment

  • @kilocesar
    @kilocesar 3 months ago

    I'm creating my own library with GPT now; I don't have to rely on looking for scraps from other coders.

    • @LazyProgrammerOfficial
      @LazyProgrammerOfficial 3 months ago +1

      Unbeknownst to you, GPTs are trained using Github code and therefore make the exact same mistake. I covered examples in one of my courses.

    • @kilocesar
      @kilocesar 3 months ago

      @@LazyProgrammerOfficial I use it to implement the initial structure to save time, but I know what you mean.

  • @petemoss3160
    @petemoss3160 1 year ago

    heh heh heh

  • @kabokbl2412
    @kabokbl2412 2 years ago +1

    hmu when he makes the course free, i dont have money to buy it

  • @MasamuneX
    @MasamuneX 4 months ago

    tl;dr: only use min/max on bounded quantities, like the outputs of some indicators, not on the price itself. Don't use it on price action, because you "cap" it at a maximum value that could easily still be growing.