How to Spot and Avoid Overfitting and Underfitting in Algorithmic Trading

  • Published: Dec 3, 2024

Comments • 56

  • @makeyourich992
    @makeyourich992 5 months ago +1

    Ur just amazing 🤩

  • @ShaneZarechian
    @ShaneZarechian 4 months ago +1

    incredible video

  • @wayne7936
    @wayne7936 6 months ago +4

    I really like that you showed the variance between the different sl-tp backtests. You're doing an analysis using cross-validation techniques, which is usually a good thing. 👏👏

    • @CodeTradingCafe
      @CodeTradingCafe 6 months ago

      Glad you liked it, thank you for your support!

  • @itryforwhat
    @itryforwhat 3 months ago +1

    Hi, I have a question: in the plot, the vertical axis is the stop-loss coefficient, but what does it mean?

    • @CodeTradingCafe
      @CodeTradingCafe 3 months ago

      Hi, it's the coefficient we multiply by to get the stop-loss distance. Remember, we were optimizing two parameters simultaneously, and this coefficient is one of them.
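
A minimal sketch of how such a coefficient could turn into price levels. It assumes the stop distance is the coefficient times the ATR (a later reply in this thread mentions using the ATR to define SL and TP) and that the second optimized parameter is a TP/SL ratio. The function name, argument names, and sample numbers are illustrative and not taken from the video's code.

```python
def sl_tp_levels(entry, atr, sl_coef, tp_sl_ratio, is_long=True):
    """Return (stop_loss, take_profit) price levels for a single trade.

    Assumption: stop distance = sl_coef * ATR, and the take-profit
    distance is tp_sl_ratio times the stop distance.
    """
    sl_distance = sl_coef * atr
    tp_distance = tp_sl_ratio * sl_distance
    if is_long:
        return entry - sl_distance, entry + tp_distance
    # For a short trade the stop sits above the entry and the target below it.
    return entry + sl_distance, entry - tp_distance


# Long trade at 100 with ATR = 2, coefficient 1.5 and TP/SL ratio 2.
stop_loss, take_profit = sl_tp_levels(100.0, 2.0, 1.5, 2.0)
```

A larger coefficient widens the stop, so each row of the heatmap corresponds to a different stop distance for the same entry signals.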

  • @edp908
    @edp908 3 months ago +1

    Where's the Jupyter Notebook code for this please sir?

    • @CodeTradingCafe
      @CodeTradingCafe 3 months ago

      There is no specific strategy; I used code from previous videos.

  • @jaydy71
    @jaydy71 6 months ago +1

    This idea of not being afraid of overfitting is very interesting. What I have seen many times when I am running into overfitting issues is that my ML model performs really well in the future, but it only does so very briefly before everything absolutely tanks.
    So my approach for the last year or so has been to avoid this as much as possible, which on one hand made my trading bot much more consistent, but on the other hand misses a lot of opportunities (it doesn't trade as much as I'd like).
    Anyway, what I got from this video is that overfitting is not *necessarily* a big issue as long as you know where the dangers are. Those heatmaps illustrated that quite effectively to me. Did I understand that correctly?

    • @CodeTradingCafe
      @CodeTradingCafe 6 months ago

      Spot on, exactly what I wanted to show through the heatmaps. Thank you for sharing your thoughts.

    • @jaydy71
      @jaydy71 6 months ago +1

      @CodeTradingCafe Thank you, I've learned something :)

  • @Youtoobe0Guy0Awesome
    @Youtoobe0Guy0Awesome 7 months ago +1

    It depends on the market environment. It's good to overfit on conditions where liquidity is flowing in the Biz cycle.

    • @CodeTradingCafe
      @CodeTradingCafe 7 months ago +2

      It also works well on low-volatility markets, since patterns usually change slowly there.

  • @RayKol0114
    @RayKol0114 6 months ago

    First, thanks for making so many useful videos. I have a question. A stable strategy relies on most of its variables not being overfit, but if some of the variables are overfit, why does that mean it can't work stably over a long period? What is the concept behind this? You can just tell me the name of the concept and I can study it myself. Thank you!

    • @CodeTradingCafe
      @CodeTradingCafe 6 months ago

      Hi, the fitting doesn't work over a long period because the market changes and conditions shift with the news and the economy, so we need to update the parameters accordingly to keep up.
      Thank you for your support!

  • @5мажоров
    @5мажоров 6 months ago +1

    Is there a way to apply the same concept to parameter selection?
    Using a heatmap to set params so they are more stable across market conditions seems good to me. However, when the param grid has 3+ dimensions, we cannot plot it to find the best values visually. It would really help if you could talk about approaches for finding the best combination in this situation and how to make sure it is good. Thanks!

    • @CodeTradingCafe
      @CodeTradingCafe 6 months ago

      Hi, the easiest way when you optimize multiple parameters is to plot the results on multiple heatmaps. This way you have all the parameters displayed, and you can link the results of the different heatmaps to guess the best set of parameters. I hope this makes sense.

    • @5мажоров
      @5мажоров 6 months ago +1

      @CodeTradingCafe I tried to do this, but noticed that pairwise comparison can perform badly despite giving the best params in terms of "heat". I thought about solving it as equations, using the difference between neighbors as a metric, but didn't succeed :(

    • @CodeTradingCafe
      @CodeTradingCafe 6 months ago

      Well, you can always find the maximum value (returns) and get the full params set, but this will not show you the "smoothness" needed to visualize the fitting/overfitting.
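
One way to put a number on that "smoothness" when the grid has too many dimensions to plot is to score every parameter set by the average return of its immediate grid neighbourhood and prefer the best plateau over the best single cell. This is only a sketch of that idea, not the method used in the video. It assumes backtest results are stored in a dictionary keyed by tuples of grid indices; the function names and the toy numbers are hypothetical.

```python
from itertools import product


def neighbourhood_score(results, point):
    """Average return over `point` and its existing grid neighbours.

    Neighbours are all cells whose index differs by at most 1 on every axis.
    Cells outside the grid are skipped, so edge cells average fewer values.
    """
    values = []
    for offset in product((-1, 0, 1), repeat=len(point)):
        neighbour = tuple(p + o for p, o in zip(point, offset))
        if neighbour in results:
            values.append(results[neighbour])
    return sum(values) / len(values)


def most_stable_point(results):
    """Grid point whose neighbourhood has the highest average return."""
    return max(results, key=lambda point: neighbourhood_score(results, point))


# Toy 3-parameter grid: an isolated spike at index 4 next to a very poor cell.
toy_results = {(0, 0, 0): 4.0, (1, 0, 0): 6.0, (2, 0, 0): 6.0,
               (3, 0, 0): 0.0, (4, 0, 0): 7.0}
raw_best = max(toy_results, key=toy_results.get)
stable_best = most_stable_point(toy_results)
```

In the toy grid the raw maximum is the isolated spike, while the neighbourhood score picks a cell inside the flatter region. The same function works for any number of parameters because the neighbourhood is generated from the length of the index tuple.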

  • @Khari99
    @Khari99 7 months ago +10

    People misunderstand the concept of overfitting. All optimizations fit themselves to the data. What matters is that the optimizations work out of sample. So you should always overfit on in-sample data and test that on out-of-sample data to validate that performance doesn't break.

    • @CodeTradingCafe
      @CodeTradingCafe 7 months ago +2

      I partially agree. Just overfitting on sample data might almost guarantee that the model will not work on new data, since the model complexity/variance is too high, so we don't expect generalization. But yes, I get your point that we will always try to fit on training data first.

    • @Khari99
      @Khari99 7 months ago

      @CodeTradingCafe Yeah, it's a weird thing conceptually, where we end up optimizing on the entire dataset anyway. One could argue that you're just overfitting to the entire dataset when you do that, while other people who do that say that if a strategy works on the entire dataset across multiple years, the likelihood of it continuing to work is higher. I personally haven't been able to forward test this theory for years to know for myself. The only thing I've found to be a clear sign of when overfitting is bad is when it comes to variance, like you said. If a strategy can have wildly different results with one small parameter change, then it's likely something that won't work at all in forward testing. For my personal project I'm splitting half my dataset into IS and the other half into OOS data, and only allowing it to trade if OOS performance is within a certain threshold of the IS results.
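
The IS/OOS gate described above could be sketched as follows. The comment gives neither the performance metric nor the threshold, so the relative tolerance of 0.3, the requirement of a positive in-sample metric, and both function names are assumptions made for illustration.

```python
def split_in_out_of_sample(rows, in_sample_fraction=0.5):
    """Split chronologically ordered rows into (in_sample, out_of_sample)."""
    cut = int(len(rows) * in_sample_fraction)
    return rows[:cut], rows[cut:]


def oos_confirms_is(is_metric, oos_metric, tolerance=0.3):
    """True when the OOS metric is within `tolerance` (relative) of the IS metric.

    Assumption: a non-positive in-sample metric never qualifies for trading.
    """
    if is_metric <= 0:
        return False
    return abs(oos_metric - is_metric) / is_metric <= tolerance


# Ten bars split 50/50, keeping time order (no shuffling).
in_sample, out_of_sample = split_in_out_of_sample(list(range(10)))
allowed = oos_confirms_is(is_metric=10.0, oos_metric=8.0)
```

The split keeps the chronological order, so the out-of-sample half always lies after the in-sample half in time.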

    • @riccardoronco627
      @riccardoronco627 7 months ago

      the question is how much data you need, how much you split between in sample and out of sample, how many trades you need in each...

    • @Khari99
      @Khari99 7 months ago +1

      @riccardoronco627 I typically use 20 years, and I split the data 50/50. As for trades, that comes from the strategy itself, but I try not to go for a low number of trades, just for sampling purposes.

    • @GodX36999
      @GodX36999 7 months ago

      You may be wrong or right, because everything is random, so underfitting and overfitting are the same. I believe in using a big enough set of data, and I love overfitting.

  • @wanga10000
    @wanga10000 7 months ago +1

    I think fitting the parameters of an indicator is fine, just don't do it too much. Lots of strategies could work based only on a stop-and-reverse exit. There's no S&P ratio for fitting.

    • @CodeTradingCafe
      @CodeTradingCafe 7 months ago

      Yes, it is fine, but so far we haven't done it for any indicator; it was just for the TP and SL distances. I will fit/optimize an indicator in a video soon.

  • @MasamuneX
    @MasamuneX 7 months ago +1

    Hey, for your automated system: some things that really helped my system were good position sizing and automated position timeouts, so that after a losing trade it waits a bit and disregards any entry that happens too quickly, letting the market conditions change or reset. You can have exponential position timeouts. In my system, on the daily timeframe, I'm waiting 5 days after 1 loss, 14 days after 2 losses in a row, 90 days for 3 in a row, 180 for 5 in a row, and the full trading year, or suspending trading of that security for the rest of the trading year. Keep in mind I trade the top 2000 stocks by market cap, and the strategy win rate also affects the exponential position timeout, but it's an idea you could adapt. Or you could just trade with a smaller position size.
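
The timeout schedule in this comment can be sketched as a lookup on the current losing streak. Only the figures for 1, 2, 3 and 5 losses come from the comment. Everything else is an assumption: a streak with no listed value (for example 4) reuses the nearest lower entry, the wait is counted in calendar days, and the full-year suspension and the win-rate adjustment are not modelled.

```python
from datetime import date, timedelta

# Days to wait after N consecutive losses, as listed in the comment.
COOLDOWN_DAYS = {1: 5, 2: 14, 3: 90, 5: 180}


def cooldown_days(consecutive_losses):
    """Waiting period in days for the current losing streak."""
    applicable = [n for n in COOLDOWN_DAYS if n <= consecutive_losses]
    if not applicable:
        return 0  # no losing streak, no timeout
    # Assumption: streak lengths without an entry use the nearest lower one.
    return COOLDOWN_DAYS[max(applicable)]


def next_allowed_entry(last_loss_date, consecutive_losses):
    """Earliest date on which a new entry would be accepted again."""
    return last_loss_date + timedelta(days=cooldown_days(consecutive_losses))


# Second loss in a row on 1 March 2024: entries are ignored for 14 days.
resume_on = next_allowed_entry(date(2024, 3, 1), 2)
```

Any entry signal dated before `resume_on` would simply be discarded for that security.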

    • @CodeTradingCafe
      @CodeTradingCafe 7 months ago

      Helpful, thank you. I agree that dynamic position sizing is crucial, and I haven't shown this on the channel yet. Next video...

  • @kenzamejhed4163
    @kenzamejhed4163 7 months ago +2

    Sir, it's me again... Could you please do the following strategy:
    "If the (Close Price > Upper Bollinger Band) AND (Stoch RSI K line > Stoch RSI D line) we get a LONG signal.
    If the (Close Price < Upper Bollinger Band) AND (Stoch RSI K line < Stoch RSI D line) we get a SHORT signal."
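
The two rules can be transcribed literally into a small function. This sketch assumes the Bollinger Band and Stoch RSI values are already computed for the current bar. It keeps the upper band in both conditions, exactly as written in the request, and the function and argument names are hypothetical.

```python
def entry_signal(close, upper_band, stoch_k, stoch_d):
    """Return "LONG", "SHORT" or None for one bar, following the rules as written."""
    if close > upper_band and stoch_k > stoch_d:
        return "LONG"
    # The request compares the close with the UPPER band here as well.
    if close < upper_band and stoch_k < stoch_d:
        return "SHORT"
    return None


signal = entry_signal(close=105.0, upper_band=104.0, stoch_k=80.0, stoch_d=70.0)
```

As written, the short condition holds whenever the close is below the upper band and K is below D, which is most of the time, so short signals would be far more frequent than long ones.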

    • @CodeTradingCafe
      @CodeTradingCafe 7 months ago

      Hi, I can do it, but what about SL and TP?

    • @kenzamejhed4163
      @kenzamejhed4163 7 months ago

      @CodeTradingCafe Optimization

  • @tnthompson81
    @tnthompson81 6 months ago +1

    Great stuff. I'm almost finished with your Udemy course. I have a question for you. Have you had any success with trading algorithms for day trading crypto? Thanks!

    • @CodeTradingCafe
      @CodeTradingCafe 6 months ago

      Hi, nice! To be honest, I haven't focused on crypto since it started crashing a few years back, so my answer is no for crypto. However, don't get discouraged: I have this on my to-do list, because volume in crypto is centralized and you can leverage volume strategies in this case.

  • @apogounte8239
    @apogounte8239 6 months ago +1

    Hi,
    Excellent video once again! Just a comment, with no intention to criticize.
    You say that the indicators used as entry conditions are not tuned, BUT:
    1. the RSI uses a range of 10, instead of the classic 14,
    2. the Bollinger Bands use a range of 15 and a standard deviation of 1.5, instead of the default 20 and a std of 1 or 2,
    3. the ATR range is 7, instead of the classic 14.
    So I think that you tuned the above before finding the optimal TP/SL ratio range using your heatmap.
    Isn't that the case?

    • @CodeTradingCafe
      @CodeTradingCafe 6 months ago

      Hi, I actually changed them based on my observation/opinion/experience, but I didn't try different values and choose the best set, and I haven't run the optimization on these either. This is why I was confident in the indicator. I only tried changing the standard deviation of the Bollinger Bands between 1.5 and 2.5, just to see how it influences the number of trades.
      On this matter, however, I was intending to make a video where we optimize/fit the indicators; someone requested this.
      My reasoning was, for example: I need the ATR to define SL and TP, so I just want to consider, let's say, the last 30 min of data for this, and so I limit the ATR to 7. The RSI = 10 was probably carried over from other code (copying and pasting)... The values are somewhat approximate or random, except the std of the Bollinger Bands.
      I hope this clarifies things a bit. Don't worry about criticizing or commenting. Sometimes I brainstorm here in the comments, and lots of ideas come from viewers. Thank you.

  • @Rohit_behind_DecadelTrend
    @Rohit_behind_DecadelTrend 7 months ago +4

    Can you elaborate more on the last part, where you said we will run it for a week, then backtest again and change the criteria based on the new heatmap? Isn't this considered overfitting, since this way you will keep changing your values every week due to short-term trend changes? A great video indeed, thanks.

    • @CodeTradingCafe
      @CodeTradingCafe 7 months ago +3

      That's exactly what I meant by "overfitting might be good sometimes", and it's what we are doing in this example: we keep refitting every week to follow market conditions promptly.
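
The weekly refit can be pictured as a rolling schedule: optimize on the most recent window, trade the following week with those parameters, then slide forward and repeat. The sketch below only generates the index windows. The thread does not state the training window length, so `train_bars`, `bars_per_week`, and the function name are assumptions for illustration.

```python
def weekly_refit_windows(n_bars, train_bars, bars_per_week):
    """Yield (train_start, train_end, trade_end) index triples.

    Parameters are fitted on bars [train_start, train_end) and then
    traded on bars [train_end, trade_end), one week at a time.
    """
    train_end = train_bars
    while train_end < n_bars:
        trade_end = min(train_end + bars_per_week, n_bars)
        yield train_end - train_bars, train_end, trade_end
        train_end = trade_end  # slide forward by the week just traded


# Toy numbers: 11 bars of data, fit on 4 bars, trade 3 bars per "week".
schedule = list(weekly_refit_windows(n_bars=11, train_bars=4, bars_per_week=3))
```

Each traded week lies strictly after the bars used to fit its parameters, so the trading segments of such a schedule act as repeated out-of-sample checks.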

  • @Kentrilek
    @Kentrilek 7 months ago +2

    What if we change the code to only open positions during trading sessions, since the spread is smallest then?
    And how about your live trading bot? Is it still bringing losses?

    • @CodeTradingCafe
      @CodeTradingCafe 7 months ago

      Hi, I dealt with the spread a bit differently: I added a condition where trades are opened only if the spread is less than a threshold. As for the bot, I stopped it a month (or more) ago. It wasn't worth keeping it running as it is; it needs refining.
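
A spread condition of this kind can be a one-line check before an order is sent. This is a sketch only: the function name, the use of bid and ask quotes, and the sample threshold are assumptions, not the bot's actual code.

```python
def spread_allows_entry(bid, ask, max_spread):
    """True when the current spread (ask - bid) is below the threshold."""
    return (ask - bid) < max_spread


# Spread of about 2 pips on a 4-decimal quote, threshold of 3 pips.
can_trade = spread_allows_entry(bid=1.1000, ask=1.1002, max_spread=0.0003)
```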

  • @kevinli522
    @kevinli522 5 months ago +1

    The heatmaps you show here are not variance. Your results are likely extremely overfit to the sample you "draw". The stop loss and take profit ratios you have here all depend on paths that are similar to one another.

    • @CodeTradingCafe
      @CodeTradingCafe 5 months ago

      The heatmap itself shows the returns of each backtest for a pair of parameters, but you can see/estimate the variance by comparing neighboring returns, mainly by checking the difference/differential. In general, overfitting isn't as smooth: it shows sharp variations between results. Hence, in ML we compare high-variance, low-generalization models with low-variance, high-generalization ones.
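
Comparing neighboring returns can be done numerically as well as by eye. The sketch below takes a heatmap as a list of rows and reports the largest absolute difference between horizontally or vertically adjacent cells. The function name and the toy values are made up for illustration, and what counts as a "sharp" jump is left to the reader.

```python
def max_neighbour_jump(heatmap):
    """Largest absolute difference between adjacent cells of a 2D grid."""
    jumps = []
    rows, cols = len(heatmap), len(heatmap[0])
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # neighbour to the right
                jumps.append(abs(heatmap[r][c] - heatmap[r][c + 1]))
            if r + 1 < rows:  # neighbour below
                jumps.append(abs(heatmap[r][c] - heatmap[r + 1][c]))
    return max(jumps, default=0.0)


# Returns (%) per (stop-loss coefficient, TP/SL ratio) cell, toy values.
smooth = [[10, 11, 12], [11, 12, 13]]
spiky = [[10, 11, 12], [11, 40, 13]]
smooth_jump = max_neighbour_jump(smooth)
spiky_jump = max_neighbour_jump(spiky)
```

The smooth grid changes by at most 1 between neighbours, while the grid with the isolated 40 shows a jump of 29, which is the kind of sharp variation described above.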

  • @rverm1000
    @rverm1000 7 months ago +2

    I've gone completely off the beaten path (your path) and found a better way to find stocks at the right time.

    • @CodeTradingCafe
      @CodeTradingCafe 7 months ago

      That's excellent, I will be making more videos on stocks rather than Forex, this was requested by some contributors.

  • @tmyersf4
    @tmyersf4 7 months ago +1

    I would comment but YT keeps deleting them.

    • @CodeTradingCafe
      @CodeTradingCafe 7 months ago

      Sorry about this, yes if you leave links or emails in the comments they are flagged by YT.