The BIGGEST Backtest & Optimization on YouTube [OVER 500 stocks]
- Published: 8 Jun 2024
- In this video series we are going to backtest and optimize a given Trading Strategy on over 500 stocks with Python.
In this part we lay the foundation by pulling the data, doing the necessary data manipulation, and setting up the backtest.
Get the Notebook/Source code by becoming a Tier-3 Channel member:
/ @algovibes
Please be so kind as to support the channel by hitting like, subscribing, and leaving a comment. Thanks a lot!
How stock returns are calculated and cumulated:
• How To Calculate Stock...
Survivorship bias:
• Momentum Trading Strat...
Python for Finance playlist:
• Python for Finance
Interested in automated trading? Check out the Cryptobot playlist here:
• Cryptocurrency Bots / ...
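The returns calculation linked above boils down to compounding (1 + return) factors. A minimal sketch with made-up prices (the numbers are purely illustrative):

```python
import pandas as pd

# Illustrative daily closing prices (made-up numbers)
close = pd.Series([100.0, 102.0, 101.0, 104.0])

daily_returns = close.pct_change().dropna()   # simple daily returns
cumulative = (daily_returns + 1).prod() - 1   # compound the (1 + r) factors

# This equals the overall price change: 104 / 100 - 1
```

The product of the (1 + r) factors telescopes back to last price over first price, which is why compounding and the simple price ratio agree.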
Disclaimer: This video is not an investment advice and is for educational and entertainment purposes only!
00:00 - 01:04 Introduction / Disclaimer
01:04 - 02:17 Getting stock price data
02:17 - 07:24 Data handling (e.g. MultiIndex)
07:24 - 08:36 Moving average calculation
08:36 - 13:44 Backtest
13:44 - 15:55 Running it for all stocks
15:55 - 18:28 Analyzing the results
18:28 - 19:07 UP NEXT! Hit like & comment
#Python #Backtest #Optimization - Science
Thank you for sharing.
Looking forward to next one.
Thanks a lot mate!
Super! Thank you very much!
thanks bud
Thanks in advance for the series
As an aside, in the backtest function you could check whether you're still in a position after the loop, and do another calculation against the close price to get a mark-to-market value for open positions.
Welcome mate, thanks a lot for watching! Mark to Market open positions is quite a good idea in general, for this series I am probably ignoring them.
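For anyone who wants to try the mark-to-market idea: a minimal sketch that values a still-open position at the last close after the trade loop. The names `in_position`, `buy_price`, and `profits` are assumptions about the video's code, not taken from it.

```python
import pandas as pd

def backtest_with_mtm(df, profits, in_position, buy_price):
    """After the trade loop, value any still-open position at the last close."""
    if in_position:
        # Unrealized return of the open position, marked to the final close
        profits.append(df.Close.iloc[-1] / buy_price - 1)
    return (pd.Series(profits, dtype="float64") + 1).prod()

# Illustrative usage with made-up numbers: one closed trade at +2%,
# plus an open position bought at 100 that is now worth 110
df = pd.DataFrame({"Close": [100.0, 105.0, 110.0]})
gains = backtest_with_mtm(df, profits=[0.02], in_position=True, buy_price=100.0)
# 1.02 * 1.10 = 1.122
```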
Hi, and thank you for the very well-structured videos. Can you please create a video based on walk forward optimization approach explained in Kevin J. Davey's book?
Thanks a lot for the suggestion! Appreciate it
Thank you for sharing this important know-how here... what is the most successful algo trading strategy you have found so far? Thank you
Welcome buddy!
I covered your question in this one:
ruclips.net/video/Wem7PIhT95U/видео.html
Hi @Algovibes, when I try to run the backtest I get this warning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
gains = (pd.Series(profits) + 1).prod()
How would I fix this? Thanks
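That warning fires when `profits` is an empty list (i.e. no trades were triggered for that stock). Passing an explicit dtype silences it, for example:

```python
import pandas as pd

profits = []  # no trades were triggered for this stock

# Explicit dtype avoids the FutureWarning for an empty Series
gains = (pd.Series(profits, dtype="float64") + 1).prod()
# prod() of an empty Series is 1.0, i.e. neither gain nor loss
```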
@Algovibes - enjoying your videos as always. Looking forward to seeing where you're taking this series.
A word of caution: I followed along with your code and I get a massive amount of trades. You didn't seem to check for that. Also, there's an enormous amount of held positions (where the buy triggered, but not the sell yet) that would massively skew the final result. Would be nice to at least count these things somehow.
Thanks a lot Andre. The open positions would not be fed into the profits list so I see no problem with that.
@@Algovibes there are two problems with that:
1. What you pointed out is actually the biggest problem. You may end up with hundreds if not more open positions, many of which can be severely under water. Thus when you report a big profit in the end, it doesn't account for the open positions.
2. From a more practical point of view, I've counted over 3000 positions total with hundreds open in the end. So this would be highly impractical at best and require a huge amount of capital outlay up front.
I think it would be good to at least mention these things and when you run the profit calculations to point out the open positions and total number of trades in a given period.
When you're now going to try to optimize this, you'll probably optimize only on the final profit percentage. If you don't account for open positions, you'll be missing an important variable.
Let's say algorithm 1 has 10% profit but 1000 open positions, and algorithm 2 has 5% profit but 0 open positions. On paper algorithm 1 has the higher profit, yet may have 1000 positions under water. Also, when you'd want to calculate the actual profit value, you'd have to account for the capital locked up in the open positions.
@@int2str Hi Andre, generally not wrong, yet in this case I am not backtesting a strategy which trades the assets with an amount of X, splits the capital across different assets, and has a timeline of certain occurrences. I have a lot of videos on topics of that nature.
The optimization is about finding out which combination of 2 parameters is the best one for which asset, and whether it's robust when splitting the dataset. Anyhow, I consider marking open positions to market a good idea.
@@Algovibes yes, but the problem for this video mini series is exactly your definition of "best". And that applies both to tuning parameters and comparing training with actual.
Let's imagine you're tuning take-profit and stop-loss percentages. At some point you test a 100% take profit and a 100% stop loss, so a position is only sold when it doubles or gets delisted. This version of the pretend algorithm will likely be by far the "best": it will have massive profits with nearly no losses. But of course it will have huge amounts of open orders.
So you can probably see how ignoring open orders at minimum will skew whatever result you deem "best" and make comparisons unfair.
Glad you're considering it. I think at least indicating it as a simple number of open positions would be helpful for comparison/parameter tuning sake.
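One way to surface this in a report, as a hedged sketch: keep a simple trade log and count buys versus sells per ticker, so the difference shows up as open positions. The `('BUY'/'SELL', price)` log format is an assumption for illustration, not the video's data structure.

```python
def count_open_positions(trades):
    """Given a list of ('BUY'/'SELL', price) tuples, report trade statistics."""
    buys = sum(1 for side, _ in trades if side == "BUY")
    sells = sum(1 for side, _ in trades if side == "SELL")
    return {"total_trades": buys + sells, "open_positions": buys - sells}

# Illustrative trade log: two round trips plus one position still open
trades = [("BUY", 100), ("SELL", 105), ("BUY", 98), ("SELL", 99), ("BUY", 101)]
stats = count_open_positions(trades)
```

Reporting `open_positions` alongside the final profit makes parameter comparisons fairer, since a "winning" parameter set with many unclosed trades is flagged immediately.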
Very helpful but why are you looping through a Pandas dataframe rather than using a vectorized operation?
Thanks buddy. For using a vectorization approach check out this one:
ruclips.net/video/nAfnAuvRpqE/видео.html
Decided to keep it iterative for this one. No specific reason behind that.
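For reference, a minimal vectorized sketch of an SMA-crossover backtest without any row loop. The column names follow the usual yfinance layout, but the price series and window lengths here are illustrative assumptions:

```python
import numpy as np
import pandas as pd

# Illustrative random-walk price series instead of downloaded data
rng = np.random.default_rng(0)
close = pd.Series(100 * np.cumprod(1 + rng.normal(0, 0.01, 300)))

df = pd.DataFrame({"Close": close})
df["SMA_fast"] = df.Close.rolling(20).mean()
df["SMA_slow"] = df.Close.rolling(50).mean()

# In the market while the fast SMA is above the slow SMA;
# shift(1) so today's signal is applied to tomorrow's return
df["position"] = (df.SMA_fast > df.SMA_slow).astype(int).shift(1)
df["strategy_return"] = df.Close.pct_change() * df["position"]

cumulative = (df.strategy_return + 1).prod()
```

The `shift(1)` is the key detail: without it the signal would trade on the same bar it was computed from, which is look-ahead bias.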
❤ Thank you
Welcome mate! Thanks a lot for watching :-)
Nice video. What about this strategy for an upcoming video? Scrape the number of newspaper articles published on a specific topic. If, let's say, the amount is 20% higher than the average of the last 200 days, then check if the 50-day moving average is above the 100-day moving average. If so, then buy the stock.
Cool idea but unfortunately NLP topics didn't get any traction in the past :-( I might give it another chance tho! Thanks for your suggestion.
Thanks for the video. Can we have a video where we buy stocks that are relatively strong versus the SPX?
Welcome mate! I have a ton of videos covering Momentum strategies. Be very kindly invited to check out my Python for Finance playlist. Let me know if you need any support if you don't find something.
@@Algovibes ok 🙏
Wow, so nice a video! But it would be more useful, sir, if you shared the code with us. Request you to also share the .py code.
Thanks Navikaran!
Source code is available here:
ruclips.net/channel/UC87aeHqMrlR6ED0w2SVi5nwjoin
Looking forward to hearing from you!
Please share your code, thank you!
yes please add the code
Great news for you both! Source code is available:
ruclips.net/channel/UC87aeHqMrlR6ED0w2SVi5nwjoin
Looking forward to hearing from you.
I can't get anything to download
503 Failed downloads:
- MOS: No data found for this date range, symbol may be delisted
- POOL: No data found for this date range, symbol may be delisted
- CHTR: No data found for this date range, symbol may be delisted
- ZBH: No data found for this date range, symbol may be delisted
- LUV: No data found for this date range, symbol may be delisted
- MDLZ: No data found for this date range, symbol may be delisted
The same happened to me. I was working in PyCharm's Jupyter, and I saw in a forum that somebody suggested switching to Google Colab. I did so, and it worked. You should do the same.
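If only some tickers fail, you can usually keep going without switching environments by dropping the empty columns. A hedged sketch, assuming the yfinance multi-ticker layout where failed symbols come back as all-NaN columns (the DataFrame below is a stand-in for a real `yf.download(...)` result):

```python
import numpy as np
import pandas as pd

# Stand-in for a yf.download(...) close-price result where one ticker failed
df = pd.DataFrame({
    "AAPL": [150.0, 151.0, 152.0],
    "MOS":  [np.nan, np.nan, np.nan],   # failed download, all NaN
})

# Identify and drop columns that contain no data at all
failed = df.columns[df.isna().all()].tolist()
df = df.drop(columns=failed)
```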
What is this? I don't understand.
How can I help you to understand it better?
Had to comment: that backtest with iterrows makes me cringe now. Weren't you the one last summer complaining about how we shouldn't do that?
What? No I would never do that. My attitude is: Whatever gets the job done and more importantly code readability/understandability > unnecessary flexing.
Anyhow my next video will be basically tailor suited for you as I am transforming the code using Vectorization. Do not miss that!
No thanks!! With so many backtesting tools, modules & libraries, why would I do this?
Wasted 5 mins of my life. :(
Sad to read! However, it is definitely worth it to code something like this from scratch instead of using a library. I have a ton of videos on different backtesting libraries, so be invited to check them out.
Thanks for the content - I added some columns to compare all the results to just holding a long position.
sololong = []
for i in tickers:
    longdf = slice_df(i)
    print("result for " + i)
    # buy-and-hold return factor for this ticker
    sololong.append((longdf.Close.pct_change() + 1).prod())
sololongdf = pd.DataFrame({"longsolo": sololong}, index=tickers)
merged_df = sololongdf.merge(profits, left_index=True, right_index=True)