Basic Statistics - Applied Time Series Analysis in Python and TensorFlow

  • Published: Sep 7, 2024
  • 👉Get the course at 87% off: www.udemy.com/...
    Link to the full notebook: github.com/mar...
    Link to the datasets: github.com/mar...
    Email me for a coupon if the one above expired: peixmarco@gmail.com
    --------------------------------------------------
    This might be review for some of you, but these concepts are the building blocks of time series analysis using the statistical approach. In this lesson, we will differentiate between descriptive and inferential statistics.
    Descriptive statistics are a set of values and coefficients that summarize a dataset. They provide information about central tendency and variability. For example, the mean, the median, the standard deviation, and the minimum and maximum values are all descriptive statistics.
    Visualizations are another important aspect of descriptive statistics. They allow us to quickly gain insights and steer the analysis in the right direction. Histograms and scatter plots are often used in time series analysis, and you will see how important they are once we start modelling.
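As a minimal sketch of the summary values listed above, the snippet below uses only Python's standard `statistics` module. The numbers in `values` are made up for illustration and do not come from the course datasets.

```python
import statistics

# Hypothetical observations, invented for this example only.
values = [12.0, 15.0, 11.0, 18.0, 15.0, 21.0, 14.0]

summary = {
    "mean": statistics.fmean(values),    # central tendency
    "median": statistics.median(values), # central tendency, robust to outliers
    "std": statistics.stdev(values),     # variability (sample std, n - 1)
    "min": min(values),
    "max": max(values),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The mean and median describe where the data is centred, while the standard deviation and the min/max range describe how spread out it is.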
    Inferential statistics are used to infer properties from a dataset. These properties will help us forecast the future. This is also the stage where we test different hypotheses.
    Hypothesis testing is a major component of inferential statistics. It allows us to determine whether the trend we observe is due to randomness or is statistically significant. To do so, we must define a hypothesis and a null hypothesis. The hypothesis is the trend we are trying to extract from the data, while the null hypothesis is its opposite: that no such trend exists.
    Then, we can run some tests to see whether there is statistical significance. In this case, the F-statistic is large, and the p-value for our parameters is less than 0.05. Therefore, we can reject the null hypothesis and conclude that the result is statistically significant.
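To make the F-statistic and p-value logic concrete, here is a rough sketch using only the standard library. It is not the test run in the video: the series is synthetic, the helper names `f_statistic` and `permutation_p_value` are hypothetical, and the p-value is estimated with a permutation test rather than read from the F distribution. The null hypothesis here is "there is no linear trend over time".

```python
import random
import statistics

def f_statistic(x, y):
    """F-statistic for a simple linear regression of y on x."""
    x_mean, y_mean = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    fitted = [intercept + slope * xi for xi in x]
    ss_model = sum((f - y_mean) ** 2 for f in fitted)           # explained
    ss_resid = sum((yi - f) ** 2 for yi, f in zip(y, fitted))   # unexplained
    # One slope parameter: 1 model degree of freedom, n - 2 for residuals.
    return ss_model / (ss_resid / (len(x) - 2))

def permutation_p_value(x, y, n_permutations=1000, seed=0):
    """Estimate how often shuffled data gives an F at least as large."""
    rng = random.Random(seed)
    observed = f_statistic(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # shuffling destroys any real trend
        if f_statistic(x, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_permutations + 1)

# Synthetic series (an assumption): upward trend plus Gaussian noise.
noise = random.Random(42)
t = list(range(40))
series = [0.5 * ti + noise.gauss(0, 2) for ti in t]

f_stat, p_value = permutation_p_value(t, series)
reject_null = p_value < 0.05
print(f"F = {f_stat:.1f}, p = {p_value:.4f}, reject null: {reject_null}")
```

A large F-statistic together with a p-value below 0.05 means that shuffled, trend-free versions of the data almost never look as structured as the real series, which is the basis for rejecting the null hypothesis.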
    Another way of evaluating our models is to study the residuals and analyze the QQ plot. QQ plot stands for quantile-quantile plot. It is a scatter plot of two sets of quantiles: one is a theoretical set, which comes from a normal distribution, and the other comes from the residuals of the model. The QQ plot is useful because if both sets come from the same distribution, or at least a similar one, the points fall on a straight line.
    Before moving on, let’s touch on the subject of residuals. A residual is simply the difference between the value predicted by our model and the actual value in the data. We want the residuals to be normally distributed. Why? Because that would suggest that the errors between predicted and actual values are due to randomness. It is therefore a way for us to validate that our model is indeed explaining the variability in the data.
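The following sketch shows how the two sets of quantiles behind a QQ plot can be built with the standard library alone. The predictions and actual values are synthetic, `qq_points` and `pearson_r` are hypothetical helper names, and the correlation between the two quantile sets is used here only as a stand-in for eyeballing how straight the plotted line would be.

```python
import random
import statistics

def qq_points(residuals):
    """Pair theoretical normal quantiles with the sorted residuals."""
    n = len(residuals)
    standard_normal = statistics.NormalDist()  # mean 0, standard deviation 1
    # Positions (i + 0.5) / n avoid the infinite 0% and 100% quantiles.
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return theoretical, sorted(residuals)

def pearson_r(a, b):
    """Pearson correlation; close to 1 means the QQ points are nearly straight."""
    a_mean, b_mean = statistics.fmean(a), statistics.fmean(b)
    cov = sum((ai - a_mean) * (bi - b_mean) for ai, bi in zip(a, b))
    var_a = sum((ai - a_mean) ** 2 for ai in a)
    var_b = sum((bi - b_mean) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

rng = random.Random(7)
predicted = [0.5 * t for t in range(300)]  # hypothetical model output

# Two hypothetical "actual" series: one with normal errors, one with skewed errors.
actual_good = [p + rng.gauss(0, 2) for p in predicted]
actual_skewed = [p + rng.expovariate(0.5) for p in predicted]

# Residual = actual - predicted (the sign convention is a choice made here).
residuals_good = [a - p for a, p in zip(actual_good, predicted)]
residuals_skewed = [a - p for a, p in zip(actual_skewed, predicted)]

r_good = pearson_r(*qq_points(residuals_good))
r_skewed = pearson_r(*qq_points(residuals_skewed))
print(f"normal residuals: r = {r_good:.3f}")
print(f"skewed residuals: r = {r_skewed:.3f}")
```

Plotting the `theoretical` values against the sorted residuals gives the QQ plot itself. Normally distributed residuals hug a straight line, while skewed residuals bend away from it at the tails.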

Comments • 4

  • @exstream_play9144
    2 months ago

    Wait, just realized you are such a small YouTuber. Thought you would have at least 200,000 subscribers with this quality video. Explaining everything in depth and very understandable with very helpful and educational videos!

  • @purecheese9012
    2 months ago

    Seriously this channel is amazing, you deserve so many more subscribers man!