ACF - Auto Correlation Function (TS E10)

  • Published: Sep 30, 2024

Comments • 21

  • @GaoyuanFanboy123
    @GaoyuanFanboy123 2 years ago +6

    thank you so much, every fucking article or video has only blasted me with mathematical formulas instead of just starting with a simple calculation by hand

    • @DimitriBianco
      @DimitriBianco 2 years ago +2

      You're welcome. It's because most of the people making those videos don't fully understand what they are teaching.

  • @thienquytrang4329
    @thienquytrang4329 8 months ago +1

    Thank you so much, but can you help me figure something out? At 1:48, how can I calculate lag(1) or lag(2)? I only know that it is the correlation between x1 and x2 (with lag(1)). How do we know the result is in 0-1? Is there an equation? (Sorry, my English is quite bad :( )

    • @DimitriBianco
      @DimitriBianco 8 months ago

      Each lag is the time series shifted back in time. So lag_1 is the original x shifted back once, and lag_2 is the original x shifted back twice.
      When you calculate the correlation, it is between the original x (time 0) and the lagged x. So in the example of lag_1, it is the correlation of x and x shifted back one time period.
      If you are programming this, you'll have to drop the extra values at the end of the lagged variables. For example, if you lag x by 1, then you'll have one extra value that won't align with a value of the original x.
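
A minimal sketch of that calculation in plain Python, assuming the lag correlation is the ordinary Pearson correlation computed on the aligned pairs. The helper names `pearson` and `lag_corr` and the sample data are made up for illustration. A Pearson correlation always lies between -1 and 1, so it can be negative as well. Library ACF routines often normalize with the full-sample mean and variance instead, so their numbers can differ slightly from this one.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)

def lag_corr(x, k):
    """Correlation between x at time t and x at time t-k."""
    current = x[k:]          # x_t for t = k .. n-1
    lagged = x[:len(x) - k]  # x_{t-k}: same series shifted back k steps
    # Both slices have length n-k, so the k values without a partner are dropped.
    return pearson(current, lagged)

x = [3.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0, 11.0]  # made-up data
for k in (1, 2):
    print(f"lag {k}: {lag_corr(x, k):.3f}")
```

The two slices are the "drop the extra values" step: each has length n-k, so every remaining x_t is paired with its own x_{t-k}.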

  • @juliangang2956
    @juliangang2956 2 years ago +1

    Thanks for showing. I think there is a mistake when you show conceptually how the ACF works (min 9:10) by drawing the yellow boxes around the relevant pairs of lags. For ACF(2) you just contrast the lag(1) again, don't you?

    • @DimitriBianco
      @DimitriBianco 2 years ago

      Correct. The yellow box should include the first column, and you are comparing the first column and the t-2 column for an ACF lag(2).

  • @alex_8704
    @alex_8704 4 years ago +1

    A very important topic. Thanks!

  • @nooraldinalfares9567
    @nooraldinalfares9567 2 years ago +1

    Thanks man, You're the best

  • @sentralorigin
    @sentralorigin 4 years ago

    hoping to learn how to remove indirect correlations in your next vid

  • @chengpeng7791
    @chengpeng7791 4 years ago

    Thanks for sharing. I wonder how practicable ARIMA model is cuz when I apply it on monthly return data, the prediction isn't that good.

    • @DimitriBianco
      @DimitriBianco 4 years ago +2

      I haven't built them on stock prices; however, almost every bank I've seen use them for banking does it wrong. This series will eventually build ARIMA-style models in the correct way. A fun fact is that RNNs are the neural network equivalent.

  • @diamondcutterandf598
    @diamondcutterandf598 4 years ago

    ACF(2) is the correlation of t-1 and t-2?

    • @DimitriBianco
      @DimitriBianco 4 years ago

      That's an excellent question which I planned to cover during the model building videos but I'll answer it here. I'm guessing you tried to run an ARIMA model in Python or R which automatically adds the other AR lags before the one you specified. While there is nothing mathematically wrong with doing this, theoretically in most cases it makes no sense. Think about a cyclical pattern for monthly data that is quarterly (every 3 months). Imagine a wave that peaks every third data point. For an AR term that was only at t-3 you would always pick up the same part of the wave. Every three periods you would hit the peak or every three periods you would hit the trough depending on where you start your measurement.
      So why do the free languages not have this option? I'm guessing a lack of experience from those building the packages. SAS is in my opinion the best time-series software available and automatically provides you with all the plots required to fit a time-series model correctly. SAS also allows you to use specific AR terms as well as the layered approach with all the AR terms before it. Proc ARIMA is really amazing!
      From a quant finance industry perspective, models that include all the previous lags will have insignificant p-values for the lags between your AR term and the earlier ones, indicating that those AR terms should not be used. Such models also fit horribly and have issues with serial correlation in the residuals (the residuals are not white noise), though the residuals may be normally distributed.
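
A small sketch of the "peak every third data point" idea, assuming a made-up, perfectly repeating wave and a hand-rolled one-regressor least-squares fit. The helper `ols_slope_intercept` and the data are hypothetical, and real data would be noisy. Only the t-3 term is used as a regressor; lags 1 and 2 are deliberately left out.

```python
def ols_slope_intercept(y, z):
    """Least-squares fit y = a + b*z with a single regressor z."""
    n = len(y)
    mean_y = sum(y) / n
    mean_z = sum(z) / n
    b = sum((zi - mean_z) * (yi - mean_y) for yi, zi in zip(y, z)) / sum(
        (zi - mean_z) ** 2 for zi in z
    )
    a = mean_y - b * mean_z
    return a, b

# Made-up "quarterly wave" in monthly data: a peak every third point.
x = [10.0, 2.0, 3.0] * 8

# Regress x_t on x_{t-3} only.
y = x[3:]   # x_t
z = x[:-3]  # x_{t-3}
intercept, slope = ols_slope_intercept(y, z)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

Because the toy wave repeats exactly, x_t always equals x_{t-3}, so the single t-3 term reproduces the series on its own, with a slope of 1 and an intercept of 0.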

    • @diamondcutterandf598
      @diamondcutterandf598 4 years ago

      Dimitri Bianco
      Thank you for the detailed explanation. But I'm just a beginner in this topic and what you mentioned was quite advanced 😂. Basically, I was just asking this to solidify my understanding of PACF.
      If ACF(2) gives me the correlation between t and t-2, how is there indirect correlation (presumably between t and t-1) “baked into” this ACF(2) value?
      I’m really struggling to understand this and any help would be appreciated! :)

    • @DimitriBianco
      @DimitriBianco 4 years ago

      I misspoke, thinking the AR (PACF) question was being asked. For the ACF, which is tied to the moving average (MA) term, you will use the errors with a coefficient. The error from t-1 will be used at time t, which will impact the model error at time t. That error at time t will impact the calculation for the prediction at time t+1. The error indicates that you are missing something in the model. If you plot the ACF you should see a slow decay from one lag to the next, which indicates there is information from the errors in past lags, although it slowly disappears. This is one sign that the errors have a "baked in" effect and that an MA term could fix it.

    • @DimitriBianco
      @DimitriBianco 4 years ago

      A simpler answer might be: since we know time series are serially correlated, there is information from t-2 impacting t-1, which is impacting t, which will continue to impact t+1. The ACF will graphically show you the correlation decaying each period. If that decay isn't present, there could be something different going on with your data.
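
To make the "t-2 impacts t-1, which impacts t" chain concrete, here is a rough simulation, assuming the series follows an AR(1) process x_t = phi * x_{t-1} + noise with phi = 0.8 and standard normal shocks. The process, the coefficient, and the `acf` helper are all assumptions chosen for illustration. In such a process the only direct link is one step back, yet the correlation at lag k is still about phi**k, because the effect is passed along the chain.

```python
import random

def acf(x, k):
    """Sample autocorrelation at lag k, using the full-sample mean and variance."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom

random.seed(0)
phi = 0.8  # assumed AR(1) coefficient
x = [0.0]
for _ in range(5000):
    # Each value depends directly only on the previous one.
    x.append(phi * x[-1] + random.gauss(0.0, 1.0))

acf_values = [acf(x, k) for k in range(1, 6)]
for k, r in enumerate(acf_values, start=1):
    print(f"lag {k}: sample {r:.3f}, theory {phi ** k:.3f}")
```

The printed sample values should shrink from one lag to the next, which is the decay pattern described above.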

    • @DimitriBianco
      @DimitriBianco 4 years ago

      The correlation calculation also includes the means and standard deviations, meaning that information is present in the calculation. It's why we have to use OLS as a way to control for the in-between lags in the PACF.
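
A sketch of the OLS idea, assuming the lag-2 partial autocorrelation is taken as the coefficient on x_{t-2} when x_t is regressed on both x_{t-1} and x_{t-2}. The simulated AR(1) data (phi = 0.8) and the helper names `pacf_lag2` and `corr_lag2` are hypothetical. The plain lag-2 correlation still carries the indirect effect through t-1, while the regression coefficient controls for it.

```python
import random
from math import sqrt

def pacf_lag2(x):
    """Coefficient on x_{t-2} when x_t is regressed on x_{t-1} and x_{t-2}."""
    y, z1, z2 = x[2:], x[1:-1], x[:-2]  # x_t, x_{t-1}, x_{t-2}
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(z1) / n, sum(z2) / n
    # Centered sums of squares and cross-products (centering absorbs the intercept).
    s11 = sum((a - m1) ** 2 for a in z1)
    s22 = sum((b - m2) ** 2 for b in z2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(z1, z2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(z1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(z2, y))
    # Solve the 2x2 normal equations for the x_{t-2} coefficient.
    return (s11 * s2y - s12 * s1y) / (s11 * s22 - s12 ** 2)

def corr_lag2(x):
    """Plain Pearson correlation between x_t and x_{t-2}."""
    a, b = x[2:], x[:-2]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / sqrt(sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b))

random.seed(1)
phi = 0.8  # assumed AR(1) coefficient
x = [0.0]
for _ in range(5000):
    x.append(phi * x[-1] + random.gauss(0.0, 1.0))

acf2 = corr_lag2(x)
pacf2 = pacf_lag2(x)
print(f"lag-2 correlation: {acf2:.3f}")
print(f"lag-2 partial (OLS coefficient): {pacf2:.3f}")
```

For this simulated series the plain lag-2 correlation should be sizeable, while the OLS coefficient on x_{t-2} should sit close to zero, since all of the lag-2 relationship runs through t-1.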

  • @skydotel3168
    @skydotel3168 4 years ago

    what's a better tool, Time Series or Monte Carlo Methods?

    • @DimitriBianco
      @DimitriBianco 4 years ago +1

      They are two different concepts, but Monte Carlo can be applied to time series. Monte Carlo is a method that uses random sampling to generate data. You have to specify a distribution to sample from, though. In hard sciences we often assume that, with enough samples, averages converge (the law of large numbers) and are approximately normally distributed (the central limit theorem).
      For time series we know theoretically that the data (which is a stochastic process) is generated from a distribution that is sampled across time. Stationarity testing checks that the distribution is stable. Time series often change distributions over time (non-stationary), and using Monte Carlo will then provide bad results. If the distribution is stable across time, we can use Monte Carlo to generate data for a time-series prediction.
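
As a rough illustration of Monte Carlo applied to a stationary series, the sketch below assumes an already-fitted AR(1) model with normal shocks. The coefficient `phi`, the shock scale `sigma`, the last observed value, and the function `simulate_paths` are all hypothetical. It draws many random future paths and summarizes where they end up.

```python
import random

def simulate_paths(x_last, phi, sigma, horizon, n_paths, seed=0):
    """Simulate AR(1) paths forward and return the value at the final step."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_paths):
        x = x_last
        for _ in range(horizon):
            # The same shock distribution is sampled at every step (stationarity).
            x = phi * x + rng.gauss(0.0, sigma)
        finals.append(x)
    return finals

phi, sigma, x_last, horizon = 0.8, 1.0, 2.0, 3  # assumed model and starting point
finals = simulate_paths(x_last, phi, sigma, horizon, n_paths=20000)

mc_mean = sum(finals) / len(finals)
analytic_mean = phi ** horizon * x_last  # known closed form for an AR(1)
ordered = sorted(finals)
low, high = ordered[int(0.05 * len(ordered))], ordered[int(0.95 * len(ordered))]

print(f"Monte Carlo mean: {mc_mean:.3f} (closed form {analytic_mean:.3f})")
print(f"5%-95% band: {low:.3f} to {high:.3f}")
```

If the shock distribution changed over time, sampling from one fixed distribution like this would misrepresent the future paths, which is the non-stationarity problem described above.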