Advanced pair trading in Python: beta loading, optimal entry, and stop-losses
- Published: 6 Oct 2021
- How to implement the logic of pair trading, cointegration, and statistical arbitrage strategies in Python while accounting for trading fees, optimal entry, and stop-losses? And how to make your strategy truly market-neutral using beta loading or beta rotation? Today we are going to investigate that and build our own advanced trading algorithm in Python.
Don't forget to subscribe to NEDL and give this video a thumbs up for more videos in Python!
You can find the .ipynb for this video and some additional materials here: drive.google.com/drive/folders/1sP40IW0p0w5IETCgo464uhDFfdyR6rh7
Please consider supporting NEDL on Patreon: www.patreon.com/NEDLeducation
wow, a real algo trading channel that isn't just about RSI and indicators
As always, impressive content and superior execution Savva :) Love it!
Congratulations Savva, thanks for sharing! 👏👏👏
As always thanks so much on posting such high quality content w/ detailed walkthrough!
Just a minor suggestion on your coding style: consider decoupling functions/utilities for easier troubleshooting and future refactoring :)
Question: I noticed you use prices in this methodology. Would using log returns be more relevant to maintaining the relationship? I just implemented this strategy in digital assets and my results were fairly poor for the pair I chose, and I was wondering whether this could be the reason why.
You're great, I'll start sponsoring from my next paycheck!
Thank you so much :)
Is it possible to combine this with a Kalman filter? If so, can you show how? I've seen similar approaches online where it's applied to the z-score of the rolling hedged spread.
Savva, thanks for all your efforts! Your channel makes some advanced econometrics and statistics concepts easier to understand.
A couple of bugs in the above video:
fee = 0.0001 isn't realistic enough. You used 0.001 in previous videos. And if you use that value here, the results won't be ok anymore.
20:10 you mistyped the "signal" variable identifier. That's why strictly typed languages, which require variables to be declared first, still rule the world :)
Hi John, and thanks for the kind words! Glad you liked the video. As for your questions: 1) yes, this strategy is quite sensitive to trading costs; to do well with high fees you have to select very high optimal entry thresholds and monitor multiple pairs simultaneously for trading opportunities; 2) thanks for pointing this out, I will correct the typo. Funnily enough, it does not affect the logic of the code, as in this very step the signal has to be kept equal to its previous value :)
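To put rough numbers on the fee sensitivity discussed here, the sketch below applies the net-return formula quoted further down in this thread (`net = gross - fee*(abs(position0 - old_position0) + abs(position1 - old_position1))`) to one round trip. The helper names and the unit weights of +1 and -1 are assumptions for illustration, not taken from the notebook.

```python
def net_return(gross, fee, positions, old_positions):
    """Net return after proportional fees charged on the turnover of each leg."""
    turnover = sum(abs(new - old) for new, old in zip(positions, old_positions))
    return gross - fee * turnover


def round_trip_cost(fee, positions):
    """Fees for opening the given positions from flat and closing them again."""
    flat = [0.0] * len(positions)
    open_cost = -net_return(0.0, fee, positions, flat)
    close_cost = -net_return(0.0, fee, flat, positions)
    return open_cost + close_cost


# Hypothetical unit weights: long one leg, short the other.
cost_low = round_trip_cost(0.0001, [1.0, -1.0])   # fee used in this video
cost_high = round_trip_cost(0.001, [1.0, -1.0])   # fee used in previous videos
```

Under these assumptions a round trip costs four times the fee, so the spread has to converge by roughly 0.4% of portfolio value at `fee = 0.001` before a trade breaks even, against roughly 0.04% at `fee = 0.0001`.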
Love this, thank you so much, this is exactly what I was looking for :) Quick question: I was wondering if I could do a regression with multiple independent variables. For example, how would I see how the yield curve moving as a whole, or growth and inflation as a whole, relate to my dependent variables like equity, commodity and currency asset classes and their subcomponents and constituents? Thank you so much. I'm a huge fan of your work.
Hi Tanber, and glad you liked the video! As for your question, you could get the data on your macroeconomic aggregates at the highest frequency possible and then match these with returns of your asset classes.
Good
Great content Savva. Are you thinking about expanding the code to take into consideration more than 2 tickers? For example, analyzing more tickers (10/20?) and choosing the ones with the lowest KPSS?
Hi Nicola, and glad you are enjoying the channel! It is possible to do that by implementing a loop and recording KPSS statistics in a dataframe. I was considering doing that, but the tutorial was simply too bulky and long. Might come back to it at some point with a fresh mind :)
Which brokers allow trading through a Python library?
Hi NEDL. Could you explain a bit more how this works in real trading: how many lots go long/short for each stock? Many thanks!
Hi, and thanks for the question! It depends on portfolio value. The weights represent which proportion of it goes towards a particular position. For example, if $10,000 is being traded, and you have a weight of 1.2 for the first stock and -0.8 for the second stock, you open a long position worth $12,000 for the first and a short position worth $8,000 for the second.
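The arithmetic in this reply can be sketched as follows. The portfolio value and weights are the ones from the example; the function names and the share prices are hypothetical and only there to show how dollar exposures would translate into share counts.

```python
def dollar_positions(portfolio_value, weights):
    """Convert portfolio weights into signed dollar exposures (negative = short)."""
    return [portfolio_value * w for w in weights]


def share_counts(dollar_exposures, prices):
    """Convert dollar exposures into share counts, rounded to the nearest whole share."""
    return [int(round(d / p)) for d, p in zip(dollar_exposures, prices)]


exposures = dollar_positions(10_000, [1.2, -0.8])
# Hypothetical share prices, for illustration only.
shares = share_counts(exposures, [150.0, 40.0])
```

With these assumed prices the long leg is bought in whole shares of the first stock and the short leg is sold in whole shares of the second; in practice the rounding means the realised weights differ slightly from the target ones.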
Dear Savva, thank you for your videos. I have watched your last pair trading video and now this one, and I still don't understand how the sample at (t) works in the loop. You seem to use the same day's closing price both to calculate the model and to trade on that same day. Correct me if I am wrong, because I still don't understand how you decide whether the pair is overvalued or undervalued for the next trading day.
Hi, and thanks for the question. Here, the loop does look at the next trading day to calculate the return, but to form a prediction it only uses the prices that are available as of today.
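The timing described in this reply can be illustrated with a simplified walk-forward loop. This is a sketch on a single price series rather than the notebook's pair logic: `make_signal` is a hypothetical placeholder for the model-fitting step, and the point is only which prices each step is allowed to see.

```python
def backtest(prices, window, make_signal):
    """Decide at close t from prices[t-window+1 .. t], then earn the return from t to t+1."""
    returns = []
    for t in range(window - 1, len(prices) - 1):
        history = prices[t - window + 1 : t + 1]   # ends at today's close, inclusive
        position = make_signal(history)            # sees no price after t
        next_ret = prices[t + 1] / prices[t] - 1   # realised only after the decision
        returns.append(position * next_ret)
    return returns


# Hypothetical toy rule: go long if today's close is below the window average.
def below_average(history):
    return 1.0 if history[-1] < sum(history) / len(history) else -1.0


prices = [100.0, 102.0, 101.0, 99.0, 100.0, 103.0]
rets = backtest(prices, window=3, make_signal=below_average)
```

The next-day price appears only in `next_ret`, which is used to score the decision and never to make it.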
Why do you put function definitions inside a loop?
Hi Maciek, and thanks for the excellent question! This has to do with the use of scipy.optimize function, as it has to be coded inside the loop and the function has to use data that is both defined inside a loop and cannot be given as an argument of a defined function (since scipy.optimize requires all arguments to be variable). Hope this helps!
@NEDLeducation Of course it works as expected, but it would be more elegant if both functions were defined separately and called in the loop with all required inputs passed as parameters.
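For reference, `scipy.optimize.minimize` accepts an `args` tuple that is forwarded to the objective, so data created inside the loop can be passed in as parameters. The sketch below shows the same decoupling using only the standard library: `functools.partial` binds the per-iteration data, and a plain grid search stands in for the SciPy optimiser. The objective `negative_profit`, the trading rule inside it, and the sample data are all hypothetical and not the ones from the video.

```python
from functools import partial


def negative_profit(threshold, zscores, next_returns, fee):
    """Objective defined once, outside the loop: minus the net profit of a rule that
    shorts the spread when z > threshold and buys it when z < -threshold."""
    profit = 0.0
    position = 0.0
    for z, r in zip(zscores, next_returns):
        new_position = -1.0 if z > threshold else (1.0 if z < -threshold else 0.0)
        profit += new_position * r - fee * abs(new_position - position)
        position = new_position
    return -profit


def grid_minimize(objective, candidates):
    """Stand-in for an optimiser: return the candidate with the lowest objective value."""
    return min(candidates, key=objective)


# Hypothetical data that would be recomputed on each pass of the trading loop.
zscores = [0.8, 2.1, 1.2, -1.8, -0.3]
next_returns = [0.001, -0.010, -0.002, 0.008, 0.000]

objective = partial(negative_profit, zscores=zscores, next_returns=next_returns, fee=0.001)
best_threshold = grid_minimize(objective, [0.5, 1.0, 1.5, 2.0, 2.5])
```

Inside the loop only the `partial(...)` call and the optimiser call need to be repeated; the function definitions stay at module level where they can be tested on their own.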
NEDL I think there is an error in the code on beta loading when you're taking the return.
see below:
rets0 = np.array(raw_data[tickers[0]][t-window:t-1])/np.array(raw_data[tickers[0]][t-window+1:t]) - 1
If we agree that returns equal (p1 - p0)/p0 = p1/p0 - 1, then it seems that you're taking p0/p1 - 1 instead of p1/p0 - 1.
let t = 8, window = 7:
this code would yield the following:
np.array(raw_data[tickers[0]][8-7:8-1])/np.array(raw_data[tickers[0]][8-7+1:8]) - 1 = np.array(raw_data[tickers[0]][1:7])/np.array(raw_data[tickers[0]][2:8]) - 1
Here you can clearly see, if you print this out in JupyterLab, that you're actually calculating the returns as p0/p1 - 1. Did you mean to do this?
I'm now pretty sure you have to reorganize that one snippet in beta loading. I tested it on a couple of time series and the returns are being taken backwards. I can send you my .py file if you'd like.
You can even verify this by expanding your #calculating returns snippet just below: there you take the returns as (p1 - p0)/p0, but the other way round above, if that helps.
#calculating returns
gross = position0*(raw_data[pair1[0]][t+1]/raw_data[pair1[0]][t] - 1) + position1*(raw_data[pair1[1]][t+1]/raw_data[pair1[1]][t] - 1)
net = gross - fee*(abs(position0 - old_position0) + abs(position1 - old_position1))
market = raw_data['market'][t+1]/raw_data['market'][t] - 1
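The point raised in this comment can be checked on a toy series. The sketch below uses plain lists and hypothetical function names; with NumPy the same fix would amount to swapping the two slices, that is, dividing `raw_data[tickers[0]][t-window+1:t]` by `raw_data[tickers[0]][t-window:t-1]`.

```python
def window_returns(prices, t, window):
    """Simple returns p1/p0 - 1: later prices[t-window+1:t] over earlier prices[t-window:t-1]."""
    later = prices[t - window + 1 : t]
    earlier = prices[t - window : t - 1]
    return [p1 / p0 - 1 for p0, p1 in zip(earlier, later)]


def window_returns_as_posted(prices, t, window):
    """The ordering quoted in the comment: earlier / later - 1, i.e. p0/p1 - 1."""
    later = prices[t - window + 1 : t]
    earlier = prices[t - window : t - 1]
    return [p0 / p1 - 1 for p0, p1 in zip(earlier, later)]


# Hypothetical series rising 1% per step, so every true return is positive.
prices = [100.0 * 1.01 ** i for i in range(10)]
correct = window_returns(prices, t=8, window=7)
posted = window_returns_as_posted(prices, t=8, window=7)
```

On this rising series `correct` is positive at every step while `posted` is negative, which matches the commenter's observation that the quoted ordering takes the returns backwards.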