LEAST SQUARES MOVING AVERAGE - Machine Learning 😯 Technical Trading Indicator in Python

  • Published: 14 Jun 2024
  • Hi everyone,
    A very interesting technical indicator built with a linear regression (OLS) as an alternative to moving averages.
    I am going to show you how the regression is set up (you will also learn how to run a linear regression on the stock market, or whatever asset you are interested in). Then we run hundreds of OLS regressions to estimate the Least Squares Moving Average values, and finally we set up a simple trading strategy with the LSMA and backtest it.
    Someone asked me to look into this - thanks a lot for the suggestion!
    Get the Notebook/Source code by becoming a Tier-3 Channel member or get other perks like Discord to exchange with like-minded people here:
    / algovibes
    Want a more in-depth look at the LSMA indicator (maybe in combination with other technicals)? Like the video and let me know below!
    Interested in automated Cryptocurrency Trading?
    Check out the videos in the Cryptobot playlist here:
    • Cryptocurrency Bots / ...
    Disclaimer: This video is not investment advice and is for informational and educational purposes only.
    Check out my age-old video on setting up a linear regression here:
    • Linear regression in P...
    00:00 - 01:20 Introduction / Disclaimer
    01:20 - 02:06 Libraries / pulling stock prices
    02:06 - 08:52 Regression for the very first date
    08:52 - 15:38 Running ALL regressions
    15:38 - 17:25 Data Handling (merging etc.)
    17:25 - 21:20 Simple Trading Strategy using the indicator
    21:20 - 24:35 Results (profit, win rate, etc.)
    24:35 - 25:01 MORE ON LSMA?!
    #python #trading #machinelearning #regression
  • Science
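
The core step in the description, fitting an OLS line to each trailing window of prices and keeping the fitted value at the window's last bar, can be sketched in plain Python. This is a minimal sketch under assumptions, not the notebook's code: the function names `ols_endpoint` and `lsma`, the toy price list, and the window of 3 are all illustrative, and the closed-form slope and intercept stand in for the statsmodels fit mentioned in the comments.

```python
def ols_endpoint(ys):
    """Fit y = a + b*x by least squares on x = 0..n-1; return the fitted value at x = n-1."""
    n = len(ys)  # needs n >= 2, otherwise the slope is undefined
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n - 1)


def lsma(prices, window):
    """Return one LSMA value per bar; None until a full window is available."""
    out = [None] * len(prices)
    for end in range(window, len(prices) + 1):
        out[end - 1] = ols_endpoint(prices[end - window:end])
    return out


prices = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0]  # made-up closes
lsma_values = lsma(prices, window=3)
print(lsma_values)
```

Each bar gets its own regression, so a series of N bars with window w runs N - w + 1 fits, which is where the "hundreds of OLS regressions" come from.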

Comments • 64

  • @MegaLukyBoy 1 year ago +5

    The value of this video is serious

  • @davidcooper3871 1 year ago +3

    I used this method from the 90s onward in Excel and later in a C++ program, along with another concept you've covered, to predict buy and sell signals, with percent allocations for a retirement portfolio made up of mutual funds. Great to see this being modernized.

    • @Algovibes 1 year ago

      That's awesome :-) Thanks a lot for your comment!

  • @bryan-9742 1 year ago +4

    THIS IS GREAT FOUNDATIONAL WORK:
    OK, before the main thoughts: I'm not being a stats snob, but we should probably only be running an OLS on a returns vector and not on a price vector. Prices are not stationary (essentially, highly unstable in predictability). Furthermore, price series are often modelled as random-walk stochastic processes, in other words martingales, so whatever we predict in the time series via an OLS is naturally going to closely mimic holding the asset in a long position. The expectation of tomorrow's price X_t+1, conditional on today's price, is X_t ($9 today means an expected value of $9 tomorrow). The approach here essentially takes a 24-day period of this expectation and remaps it linearly into the next period, with no handling of the stationarity concerns, which is another rabbit hole. If you wanted to avoid this, you'd probably want to get the return series and then proceed with the prediction, if you in fact believed that there was a linear predictive pattern on a 24-day window.
    Main thoughts:
    AMAZING code work as usual. I feel this is the foundational piece that now lets us take information from other datasets and apply it to a return vector that we are trying to predict. This is a much more professional way than hoping some random technical indicator pattern will work, which I feel is what all these retail amateurs do with TradingView. This manner of coding allows us to conduct our analysis and see if there is any predictive information that we can test out (say, a BNB 3-day-window OLS return as X_t to predict Y_t+1 for BTC). Another idea is to take a z-standardized X matrix of, say, 3 different continuous variables and see whether the combined information in the X matrix is predictive of BTC returns at t+1. tardis.dev has BTC/Perp CSV files; I'm wondering if it would be interesting to you to see whether the funding rates are predictive, perhaps?
    Lastly, I have a cointegration strategy I built in functional programming, where I use a 21-day-window OLS return with some conditionals to assess whether the reversion between the two pairs is large enough to place a long/short trade. I guess you'd call that the signal. This video helps me understand how I can clean it up within the loops. I'm actually trying to think about how best to break up what you do in blocks 29 and 48.
    Also, I see you handle OOP and functional programming differently in block 48 when appending the buydates, buyvalues, selldates, sell values and calculating the holding period returns with iterrows. I will review this more closely, as I'd eventually like to turn some of my work into a class.
    Thank you again
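
The martingale argument in the first paragraph of this comment can be written compactly. This is a sketch in standard notation; the symbols $P_t$ (price), $\mathcal{F}_t$ (information available at time $t$), and $r_t$ (simple return) are assumptions introduced here, not the commenter's.

```latex
\mathbb{E}\left[P_{t+1} \mid \mathcal{F}_t\right] = P_t,
\qquad
r_t = \frac{P_t - P_{t-1}}{P_{t-1}}
```

Under that assumption the best forecast of tomorrow's price is today's price, which is why the comment suggests regressing on the return series $r_t$ rather than on $P_t$.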

  • @rajeevmenon1975 1 year ago +1

    Awesome buddy!! U r a star !! Undeniably

  • @chrisarets2231 1 year ago +1

    For me personally one of the most interesting videos so far!

    • @Algovibes 1 year ago

      Cool, thanks a lot for your comment mate :-)

  • @misterd7405 1 year ago +1

    Wow, very interesting, please do more with crypto and indicators!
    You're an absolute boss, very clear video!

    • @Algovibes 1 year ago

      Thanks a lot mate. Appreciate your kind words!

  • @jerrywang3225 1 year ago +1

    Great content as always.

    • @Algovibes 1 year ago

      Thanks a ton Jerry. Appreciate your comment!

  • 1 year ago +1

    Wow. Really something to think about today. I had to watch it 3 times to halfway understand the how, what, and why 😊

    • @Algovibes 1 year ago

      Hope it made sense in the end :-) Thanks for your comment!

  • @vaaronka 1 year ago +2

    I would be interested in using multiple features for the data (like fear and greed, maybe analyzing bullish/bearish tweets, etc. :) )

    • @Algovibes 1 year ago

      Cool suggestion. Thanks!

  • @bryan-9742 1 year ago +1

    fyi, Merging is nice when you don't want to delete any rows or say you have an industry for a bunch of stocks and you want to drag down that industry b/c you're gonna make a categorical variable. In this case it doesn't make any sense as concat is an easier way.
    That said here is how the merge works:
    lsma_df2 = lsma_df.reset_index().rename(columns = {'index': 'date'})
    df2 = df.reset_index().rename(columns = {'Date': 'date'})
    all_df_1 = df2.merge(lsma_df2, how = 'left', on = ['date'])

    • @Algovibes 1 year ago

      Cool input, I would prefer my way tho. Anyhow thanks a lot!

  • @nitugopinadh 1 year ago +1

    Wow.. this is great.. 👍

    • @Algovibes 1 year ago

      thanks buddy, happy you like it!

  • @philippecolin151 1 year ago +1

    You are amazing!

    • @Algovibes 1 year ago

      You are for leaving that comment. Thanks mate!

  • @PregmaSogma 1 year ago +1

    Great video as always. I have one suggestion: can you try using a Fast Fourier Transform (FFT) to model the price over, let's say, a 24-hour window as a sum of fundamental sine waves, and then use those sine waves to try to predict the next open price?
    I think this could be a very interesting idea to test in a future video.

    • @Algovibes 1 year ago +1

      Thanks a lot mate. Also thanks a ton for the suggestion!

  • @int2str 1 year ago +1

    Thanks for the inspiration once more. I implemented the OLS in my C++ backtesting library and ran it for the Russell 1000. I can confirm the 60% win rate, but the algorithm produces too many positions (30k+ positions in 1 year). Also, the average profit is -0.23%, so it is not profitable. So yeah, it would be nice to come up with a strategy that incorporates OLS but doesn't produce that many positions and preferably is profitable :D
    Positions 30507
    Average profit -0.23 %
    Longest duration 11
    Average duration 4.87
    Wins 18818 (61.7 %)
    Average win 2.82 %
    Largest win 93.37 % (COIN)
    Consecutive wins 85
    Losses 11689 (38.3 %)
    Average loss -5.14 %
    Largest loss -59.84 % (AMC)
    Consecutive losses 47

    • @Algovibes 1 year ago

      Thanks a ton Andre for sharing! The largest win tho - crazy. What I thought about was making it more strict using other indicators or even return windows (so e.g. LSMA cross - does the stock really rise over the last x minutes...).
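
The reported average profit follows from the win and loss figures @int2str posted above: the expected return per position is the win rate times the average win plus the loss rate times the average loss. A quick check, using only the numbers from that comment (the variable names are illustrative):

```python
# Figures copied from the backtest summary posted in the comment above.
positions = 30507
wins = 18818
losses = 11689
avg_win_pct = 2.82
avg_loss_pct = -5.14

win_rate = wins / positions
loss_rate = losses / positions

# Expected return per position, in percent.
expectancy_pct = win_rate * avg_win_pct + loss_rate * avg_loss_pct

print(f"win rate: {win_rate:.1%}")
print(f"expectancy per position: {expectancy_pct:.2f} %")
```

This reproduces the -0.23 % average: a 61.7 % win rate still loses money here because the average loss is roughly 1.8 times the average win.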

  • @flydr2 1 year ago +2

    Hi Algovibes, here's a challenge: "supply and demand trading strategy"
    This could prove challenging to code...

    • @Algovibes 1 year ago

      Thanks a lot for the suggestion Marc. Can you provide more details on what you would be interested in?

  • @guillermothestar84 1 year ago +1

    thank you very much for using ML to predict values. perhaps you could add the twitter sentiment analysis to consider a fundamental analysis

    • @Algovibes 1 year ago +1

      Thanks a lot for the suggestion!

  • @yanchobeats 1 year ago +2

    You are the man! That was awesome work, I would love to see you play around with crypto using the Binance API with the historical klines 🙏

    • @Algovibes 1 year ago

      Thanks mate! I will. Thanks a lot for the suggestion.

  • @kapildeshpande2778 1 year ago +1

    Amazing😃...can you post a video where we try and make it more sophisticated

    • @Algovibes 1 year ago

      Thanks a lot for the suggestion, will see what I can do!

  • @arturKrzyzak 1 year ago +3

    Would it be possible to have ML working with indicators like MACD, or Stochastics? What I mean, is to train the model to predict certain price behaviors in correlation with, for example divergences that are being given by these indicators?

    • @Alexander-pk1tu 1 year ago

      Don't invest too much effort into technical indicators unless you want to be food for Wall Street. TIs mean nothing by themselves. Ex-ante, in historical data with an ML algorithm you can find historical patterns that don't repeat, or that, even if they repeat, don't repeat frequently enough to make money after transaction costs.

  • @Alexander-pk1tu 1 year ago +1

    It would help if you backtested for more diverse assets and for a longer period. And instead of total profit report avg profit per trade. Even the most liquid companies in the US have a min of 15bps spread and commission.

    • @Algovibes 1 year ago

      Hi Alexander, thanks a lot for your comment.
      Basically: feel free to check out my more than 150 other videos. I show exactly how you can do that there. I just cannot cover everything in every video. Let me know if you don't find what you are looking for!
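
Alexander's point about reporting average profit per trade after costs can be illustrated as below. This is only a sketch: the trade returns are made-up numbers, and treating the 15 bps he mentions as a single round-trip cost per trade is an assumption, since the comment does not say whether it applies per side.

```python
# Hypothetical gross returns per trade, as fractions (0.012 = 1.2 %).
gross_returns = [0.012, -0.008, 0.004, 0.0005, -0.003]

# Assumption: 15 basis points charged once per round trip.
cost_per_trade = 15 / 10_000

net_returns = [r - cost_per_trade for r in gross_returns]

avg_gross = sum(gross_returns) / len(gross_returns)
avg_net = sum(net_returns) / len(net_returns)

print(f"average gross return per trade: {avg_gross:.4%}")
print(f"average net return per trade:   {avg_net:.4%}")
```

With these toy numbers a small positive gross edge per trade turns negative once costs are subtracted, which is the effect a total-profit figure can hide.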

  • @marcogelsomini7655 1 year ago +1

    Interesting technique, the sliding-window OLS. I am wondering what the advantages are compared with a simple moving average.

  • @tmyersf4 1 year ago +1

    Something also worth doing a video on is to check statistically the best time of day or day of week or month to buy or sell crypto. Like are some days better for buying than others, what time of day do we see the biggest or weakest candles etc.

    • @Algovibes 1 year ago

      Cool suggestion, thanks a lot!

  • @chigstardan7285 1 year ago +1

    Not really familiar with statsmodels, can I use scikit-learn?

    • @bryan-9742 1 year ago +1

      statsmodels is much clearer for OLS. With scikit-learn, a lot of stuff happens in the background. scikit-learn will be easier to work with for RF and AdaBoost algos that are designed for it, IMHO.

    • @Algovibes 1 year ago

      Yes, you can also use sklearn ofc!

    • @chigstardan7285 1 year ago

      @@Algovibes I have no idea on how to implement the 25 day window with scikit-learn 🥺. Ah well.
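
For the 25-day window question above, the loop itself does not depend on the regression library: slice the last 25 closes, fit, and predict at the final position. The sketch below is written against the `fit(X, y)` / `predict(X)` interface that scikit-learn's `LinearRegression` exposes, but to stay dependency-free it uses a small stand-in class, `TinyLinearRegression`, which is hypothetical and not part of any library. The names `rolling_lsma` and `make_model` are likewise illustrative.

```python
class TinyLinearRegression:
    """Hypothetical stand-in mimicking the fit/predict shape of a scikit-learn regressor."""

    def fit(self, X, y):
        xs = [row[0] for row in X]  # single feature: position in the window
        n = len(xs)
        x_mean = sum(xs) / n
        y_mean = sum(y) / n
        sxx = sum((x - x_mean) ** 2 for x in xs)
        sxy = sum((x - x_mean) * (v - y_mean) for x, v in zip(xs, y))
        self.coef_ = sxy / sxx
        self.intercept_ = y_mean - self.coef_ * x_mean
        return self

    def predict(self, X):
        return [self.intercept_ + self.coef_ * row[0] for row in X]


def rolling_lsma(closes, window=25, make_model=TinyLinearRegression):
    """Fit one regression per trailing window and keep the fitted value at its last bar."""
    X = [[i] for i in range(window)]  # 2-D feature matrix, one row per bar in the window
    values = []
    for end in range(window, len(closes) + 1):
        y = closes[end - window:end]
        model = make_model().fit(X, y)
        values.append(model.predict([X[-1]])[0])
    return values


closes = [50.0 + 0.5 * i for i in range(30)]  # made-up, perfectly linear closes
values = rolling_lsma(closes)
print(len(values), values[0], values[-1])
```

With scikit-learn installed, passing `make_model=LinearRegression` (imported from `sklearn.linear_model`) should work the same way, since its `fit` accepts a 2-D feature list and its `predict` returns an array that can be indexed with `[0]`.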

  • @yanchobeats 1 year ago +1

    Can you also feature in your next video how to enter only at candle open and not re-enter if we get taken out from sl or tp until a new candle opens if that makes sense 😀

    • @Algovibes 1 year ago

      Could you elaborate a bit on that? You mean like get a certain signal (e.g. LSMA) and then double check if the asset drops in the next tick and only then enter / don't enter?

    • @yanchobeats 1 year ago

      @@Algovibes Hey, thanks for the response! I'm trying some things out on TradingView, but I cannot figure out how to do this in Python with the Binance API (I get the data with get.historical.klines) 😀 Basically, in my TradingView script, when my condition to look for a buy is met (let's say the live price crossed the upper Bollinger band), the script enters on the next candle open rather than instantly at the live price when the cross happens. And it only enters once on this newly formed candle (when the open is higher than the now-closed previous candle). Even if the trade gets taken out because of stop-loss or take-profit, it does not re-enter. That's the tricky part I think a lot of people would really like to see: when I run my script in Python now, it enters non-stop, and it would be awesome to enter only once per candle open when the conditions are met. I figured this might be pretty cool to try and test as a strategy. It works pretty well in my TradingView script, because it forces the script to look for more quality (no fakeouts) with fewer trades (with one entry per candle, even if you get taken out you cannot re-enter that candle; you have to wait for the new one). Thank you once again!
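
The rule described above (act on a signal only at the next candle's open, and enter at most once per candle even after a stop-loss or take-profit exit) can be sketched as a small state machine. This is an illustration on made-up candles, not a Binance API example: the `Candle` fields, the `signal` flag (standing in for something like a Bollinger cross evaluated on the closed candle), and the fixed stop-loss and take-profit percentages are all assumptions. The extra filter mentioned in the comment, that the open be above the previous close, is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Candle:
    open: float
    high: float
    low: float
    close: float
    signal: bool  # condition evaluated once this candle has closed


def backtest_once_per_candle(candles, stop_loss=0.02, take_profit=0.04):
    """Enter at the open of the candle after a signal; never re-enter within the same candle."""
    trades = []  # (entry_index, entry_price, exit_index, exit_price)
    entry = None  # (index, price) while a position is open
    for i in range(1, len(candles)):
        candle, previous = candles[i], candles[i - 1]
        # The only entry point is the candle open, so an exit later in this
        # candle cannot trigger a second entry until the next candle opens.
        if entry is None and previous.signal:
            entry = (i, candle.open)
        if entry is not None:
            stop = entry[1] * (1 - stop_loss)
            target = entry[1] * (1 + take_profit)
            if candle.low <= stop:  # assumption: the stop is checked before the target
                trades.append((entry[0], entry[1], i, stop))
                entry = None
            elif candle.high >= target:
                trades.append((entry[0], entry[1], i, target))
                entry = None
    return trades


candles = [
    Candle(100, 101, 99, 100, signal=True),
    Candle(100, 101, 97, 98, signal=True),    # enters at the open, stopped out in the same candle
    Candle(98, 103, 98, 102, signal=False),   # previous candle signalled: one new entry at this open
    Candle(102, 104, 101, 103, signal=False),
]
trades = backtest_once_per_candle(candles)
print(trades)
```

In a live loop the same idea would amount to remembering the open time of the candle in which the last entry happened and refusing any new entry until the latest candle's open time differs.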

  • @bleacherz7503 1 year ago +1

    The problem with this approach is that least squares makes assumptions about the variance and the error terms. Might it make more sense to create a column of daily returns and look at autocorrelations?

    • @Algovibes 1 year ago +1

      Good thoughts, but the way shown is the "technically" correct way to calculate the LSMA. Is there a better way to use a regression for, e.g., prediction use cases (specifically considering the feature properties)? Yes!

    • @bleacherz7503 1 year ago

      @@Algovibes thanks again. Great, the best applied financial stats YouTube channel

  • @bryan-9742 1 year ago +1

    This is how I would have used the return to predict price. It kind of makes sense why this wouldn't work. I guess I could use an AR model, but the point here is the code, which shows how to use time series information. I used the first-differenced returns and, no surprise, there is no memory in this data for predicting the price. No memory in a first-differenced return series isn't surprising, I suppose.
    import pandas as pd
    import statsmodels.api as sm
    import yfinance as yf
    t_stats = []
    p_values = []
    coef_stderrs = []
    lsma_arr = []
    dates_arr = []
    f_stats = []
    # Let's make this somewhat stationary by working with returns.
    # Assumes the download returns a flat 'Adj Close' column.
    df = yf.download('AAPL', start='2020-01-01')
    df['return'] = df['Adj Close'].pct_change()
    df = df.iloc[1:]  # drop the first row, whose return is NaN
    # Slide a 25-row window over the data, running one OLS regression per window
    for ii in range(0, len(df) - 24):
        df_subset = df.iloc[ii:25 + ii]
        # Regress price on return within the window
        y = df_subset['Adj Close']
        X = df_subset['return']
        model = sm.OLS(y, sm.add_constant(X))
        results = model.fit()
        # Store t-stat, p-value, F-stat, and standard error of the slope coefficient
        t_stats.append(results.tvalues.iloc[1])
        p_values.append(results.pvalues.iloc[1])
        coef_stderrs.append(results.bse.iloc[1])
        f_stats.append(results.fvalue)
        # Fitted value (y_hat) at the last date of the window
        lsma_arr.append(results.predict()[-1])
        dates_arr.append(df_subset.index[-1])
    # Create dataframe to store results
    results_df = pd.DataFrame(
        {'y_pred': lsma_arr, 'F-Stat': f_stats, 'T_stat': t_stats,
         'p_values': p_values, 'coef_stderrs': coef_stderrs},
        index=dates_arr)
    # Concat with original dataframe
    new_df = pd.concat([results_df, df], axis=1)
    # Print results
    print(results_df)

  • @orielh9585 1 year ago +1

    It looks as if you are using the last value of each window to predict the last value of each window. Where is the part where you look ahead in time?

    • @Algovibes 1 year ago

      I am calculating the least squares moving average and don't run a prediction. Technically you can also do that using a very similar approach but this is not what's done here.

  • @komsan7142 1 year ago +1

    Do you have a discord?

    • @Algovibes 1 year ago +1

      Yes! :-)
      Looking forward to welcoming you:
      ruclips.net/channel/UC87aeHqMrlR6ED0w2SVi5nwjoin

  • @pastramiking 1 year ago

    This sounds like you are just trying to do local polynomial regression. I think that LOESS exists in Python just like it does in R.

    • @Algovibes 1 year ago

      No, I am not doing a local polynomial regression. Can you elaborate on why you think so? Thanks!

    • @pastramiking 1 year ago

      Local polynomial regression with degree 1 is just a local linear regression across a moving window. I think what you are doing is a great intuitive explanation of this type of nonparametric model but I just don't understand why you would do this when the loess function has been in the preinstalled stats package in R for decades. Also, if you just write out the Taylor expansion you can implement this in a much more computationally efficient way.
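
The efficiency point can be made concrete with the closed form of a windowed linear fit. The following is a standard OLS result, not necessarily the Taylor-expansion route the commenter has in mind; the notation (window length $n$, in-window positions $x_i = i$ for $i = 0, \dots, n-1$, prices $y_i$) is introduced here for illustration.

```latex
\bar{x} = \frac{n-1}{2}, \qquad
\bar{y} = \frac{1}{n}\sum_{i=0}^{n-1} y_i, \qquad
b = \frac{\sum_{i=0}^{n-1} (x_i - \bar{x})\, y_i}{\sum_{i=0}^{n-1} (x_i - \bar{x})^2}
  = \frac{12 \sum_{i=0}^{n-1} (i - \bar{x})\, y_i}{n\,(n^2 - 1)}

\mathrm{LSMA} = \bar{y} + b\,(x_{n-1} - \bar{x}) = \bar{y} + b\,\frac{n-1}{2}
```

Because the denominator depends only on $n$, each new window needs just the two sums $\sum y_i$ and $\sum i\,y_i$, which can be updated incrementally instead of refitting from scratch.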

  • @Kr3m3rsVids 1 year ago +1

    😂 You finally did it, no idea what is happening here. Too much statistics. Lots of Googling to do for me.
    Still: thanks for the great work, love your channel!

    • @Algovibes 1 year ago +1

      That wasn't my intention, sorry! Just look up an OLS regression again and it will make more sense.