How to Derive the Equation of the Normal Curve

  • Published: 22 Dec 2024

Comments • 100

  • @danielc.martin
    @danielc.martin 9 months ago +20

    You are not aware how much I appreciate this video

    • @BecauseMaths
      @BecauseMaths 9 months ago

      ❤️❤️❤️🙏🙏🙏

  • @gilbertmiya4199
    @gilbertmiya4199 8 months ago +2

    Excellent delivery. Well articulated.

  • @dysxleia
    @dysxleia 9 months ago +10

    I can't express how much I appreciate this video

  • @aleph0540
    @aleph0540 9 months ago +6

    Very well done. Rarely encounter something so fundamental so simply explained.

  • @EMC273
    @EMC273 9 months ago +5

    You literally opened my eyes. I searched through many explanations on the internet of how to derive this, and I finally get it.

    • @BecauseMaths
      @BecauseMaths 9 months ago

      ❤️❤️❤️

    • @SumriseHD
      @SumriseHD 6 months ago

      Sounds very aggressive

  • @shuewingtam6210
    @shuewingtam6210 9 months ago +3

    At 15:05 in the video, the factor should be 1/2. The (-1/2) appears only after integrating exp(-u).

  • @pocojoyo
    @pocojoyo 9 days ago +1

    Excellent!

  • @johnchristian5027
    @johnchristian5027 1 month ago +1

    Very thorough, nice video!

  • @mtaur4113
    @mtaur4113 9 months ago +4

    The differential equation is interesting, but to me, the defining feature is sum stability. Namely, the convolution of two normals is a normal. Stability is necessary for an attracting fixed point, and while not a proof of the CLT, it should definitely be a strong hypothesis by then.
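The sum-stability property mentioned in this comment can be checked numerically. The following is a minimal sketch, not a proof: it approximates the convolution of two normal densities with a midpoint Riemann sum and compares it with the normal density whose mean is the sum of the means and whose variance is the sum of the variances. The parameter values, grid limits, and function names are illustrative assumptions.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def convolve_at(z, mu1, s1, mu2, s2, lo=-30.0, hi=30.0, steps=6000):
    """Midpoint-rule approximation of (f1 * f2)(z) = integral of f1(t) f2(z - t) dt."""
    dt = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        t = lo + (i + 0.5) * dt
        total += normal_pdf(t, mu1, s1) * normal_pdf(z - t, mu2, s2)
    return total * dt


def max_convolution_error(mu1=1.0, s1=2.0, mu2=-0.5, s2=1.5):
    """Largest gap between the numerical convolution and N(mu1 + mu2, s1^2 + s2^2)."""
    mu = mu1 + mu2
    sigma = math.sqrt(s1 ** 2 + s2 ** 2)
    points = [mu + 0.5 * k for k in range(-8, 9)]
    return max(
        abs(convolve_at(z, mu1, s1, mu2, s2) - normal_pdf(z, mu, sigma))
        for z in points
    )


if __name__ == "__main__":
    print(max_convolution_error())
```

The printed error should be close to floating-point noise, which is consistent with the convolution of two normals being normal again.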

  • @KryDu-lv3jk
    @KryDu-lv3jk 6 months ago +1

    I really appreciate your effort

  • @KipIngram
    @KipIngram 9 months ago +2

    The natural emergence of the normal distribution from practically any sort of underlying random process is one of the most important parts of really understanding how the world works. The "shape" of the individual low-level process doesn't matter much at all - it's the way the aggregation of large numbers of them behaves that leads to the normal distribution.
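The aggregation effect described here can be illustrated with a small simulation. This is a sketch under stated assumptions: the low-level process is taken to be a uniform random number on [0, 1) (any distribution with finite variance would serve), 30 of them are added per sample, and the sums are standardized. None of these choices come from the video.

```python
import random
import statistics


def standardized_sums(n_terms=30, n_samples=20000, seed=0):
    """Sums of n_terms uniform(0, 1) draws, shifted and scaled to mean 0 and variance 1."""
    rng = random.Random(seed)
    mean = n_terms * 0.5                # mean of a sum of uniforms
    sd = (n_terms / 12.0) ** 0.5        # each uniform has variance 1/12
    return [
        (sum(rng.random() for _ in range(n_terms)) - mean) / sd
        for _ in range(n_samples)
    ]


def fraction_within(samples, k):
    """Share of samples lying within k standard deviations of zero."""
    return sum(1 for s in samples if abs(s) <= k) / len(samples)


if __name__ == "__main__":
    samples = standardized_sums()
    print("mean:", statistics.fmean(samples))
    print("std dev:", statistics.pstdev(samples))
    print("within 1 sd:", fraction_within(samples, 1))
    print("within 2 sd:", fraction_within(samples, 2))
```

For a normal distribution the last two fractions are roughly 0.68 and 0.95, and the simulated sums should land near those values even though a single uniform draw looks nothing like a bell curve.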

  • @alijoueizadeh2896
    @alijoueizadeh2896 9 months ago +2

    Thank you for your precious time.

  • @KichereTheDataScientist
    @KichereTheDataScientist 7 months ago +1

    God bless you

  • @هشامأبوسارة-ن7و
    @هشامأبوسارة-ن7و 10 months ago +3

    Very insightful derivation of the Normal distribution. Thanks.

  • @26jcha
    @26jcha 9 months ago +2

    may I ask why the integral of e^-u is not -e^-u? where did the negative sign go? Thank you for the great video!

    • @OOOOOIIOOOOOOIIO
      @OOOOOIIOOOOOOIIO 9 months ago +1

      I think there is actually another error which cancels out the one you are talking about. At 15:01, I don't think there should be any minus sign when factoring out the 1/2. Then, when he integrates e^-u, he also drops the minus sign there, so the two errors compensate exactly. I think he simply made a slip while recording the video. Don't hesitate to tell me if my explanation is wrong.

    • @BecauseMaths
      @BecauseMaths 9 months ago

      Thanks
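The sign question raised in this thread can be worked through explicitly. The sketch below assumes the radial integral in question has the form of the standard polar-coordinate step, with substitution u = r^2; the exact expression used in the video is not reproduced in these comments, so treat the integrand as an assumption.

```latex
\begin{aligned}
\int_0^\infty r\,e^{-r^2}\,dr
  &= \frac{1}{2}\int_0^\infty e^{-u}\,du
  \qquad (u = r^2,\ du = 2r\,dr) \\
  &= \frac{1}{2}\Bigl[-e^{-u}\Bigr]_0^\infty
   = \frac{1}{2}\bigl(0 - (-1)\bigr)
   = \frac{1}{2}.
\end{aligned}
```

If the prefactor is instead written as -1/2 and the antiderivative of e^{-u} is written as e^{-u} without its minus sign, the evaluation gives -1/2 (0 - 1) = 1/2 as well. That would be consistent with the suggestion above that two sign slips cancel and leave the final result correct.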

  • @beamshooter
    @beamshooter 9 months ago +5

    Wow, that was surprisingly way simpler than I imagined. Next do the Boltzmann distribution please!!

  • @abublahinocuckbloho4539
    @abublahinocuckbloho4539 9 months ago +40

    How would you know that this differential equation is satisfied by the normal distribution without already knowing the formula for the normal distribution? You are not actually deriving the normal distribution from a differential equation. You are using circular reasoning: you know in advance what the normal distribution is, find a differential equation it satisfies, and then use that equation to justify the formula you already knew. Deriving it means arriving at the formula for the normal distribution without knowing it in advance.

    • @korigamik
      @korigamik 9 months ago +6

      No

    • @dysxleia
      @dysxleia 9 months ago +31

      No, the differential equation comes from the desired qualitative definition. We want a function whose frequency falls off proportionally to its distance from the mean. Without knowing the function, it can be reasoned that it will look something like a bell curve, with positive probability density everywhere, but rapidly falling away from a center peak value.
      If you let y be the distribution, M the mean, and x be the data values, then the question translates to: "For what function y is it true that the rate of change of y at a given point is proportional to the distance we currently are from M?"
      Directly writing this as a differential equation, you get what is in the video. Solving it gets you the specific equation.

    • @copernicus633
      @copernicus633 9 months ago +11

      Silly. It is not circular reasoning to show that the solution to a particular differential equation is the normal distribution. In fact, it provides much insight. If someone “derives” Newton's laws from Lagrangian mechanics, it's not circular reasoning to note that the result is Newton's laws, which you knew in advance. One has then shown equivalent formulations, which are far from obvious.

    • @zaydmohammed6805
      @zaydmohammed6805 9 months ago +7

      The differential equation was set up to interpolate binomial and Poisson distributions as n tends to infinity. The main reason this was done was because for large n, the binomial and Poisson distributions have factorial terms which are computationally intensive, so the mathematicians of that time wanted to approximate these distributions by a continuous function since as n is large the rectangles get finer. So the first time they ran into the normal distribution was by interpolation of these discrete distributions

    • @BecauseMaths
      @BecauseMaths 9 months ago

      ❤️❤️❤️
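For readers following this thread, the steps from the differential equation to the density can be written out. This is a sketch that assumes the equation has the form suggested by the comments here, with the rate of change proportional to both the distance from the mean and the current height, and with a positive constant k. The identification k = 1/σ² in the last line uses the standard fact that the resulting density has variance 1/k.

```latex
\begin{aligned}
\frac{dy}{dx} &= -k\,(x-\mu)\,y, \qquad k > 0 \\
\int \frac{dy}{y} &= -k \int (x-\mu)\,dx
  \;\Longrightarrow\; \ln y = -\frac{k}{2}(x-\mu)^2 + C \\
y &= A\,e^{-k(x-\mu)^2/2}, \qquad A = e^{C} > 0 \\
1 &= \int_{-\infty}^{\infty} A\,e^{-k(x-\mu)^2/2}\,dx = A\sqrt{\frac{2\pi}{k}}
  \;\Longrightarrow\; A = \sqrt{\frac{k}{2\pi}} \\
k &= \frac{1}{\sigma^2}
  \;\Longrightarrow\;
  y = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}.
\end{aligned}
```

Nothing in these steps uses the final formula as an input. What the derivation does take for granted is the differential equation itself, which is the point under debate in this thread.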

  • @jimpim6454
    @jimpim6454 9 months ago +2

    At the beginning, when you do separation of variables and integrate, why does -k(x-mu) become (-k(x-mu)^2)/2? Isn't mu just a constant? So it should just be -k(x^2 -mu*x)/2 ... ?

    • @jimpim6454
      @jimpim6454 9 months ago

      Ah so you just add a constant term and absorb it into k???

    • @jimpim6454
      @jimpim6454 9 months ago

      No, actually I don't understand why. Can someone explain?

  • @salkabalani1482
    @salkabalani1482 9 months ago +1

    You are an excellent teacher

    • @BecauseMaths
      @BecauseMaths 9 months ago

      Thank you! 😃❤️❤️❤️

  • @dank.
    @dank. 9 months ago +2

    Nice! I loved the conversion to polar--it's much more succinct than methods of solving the Gaussian integral that I've seen before, though I think that says more about my mathematical experience than anything.
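The polar-coordinate result can be sanity-checked numerically. A minimal sketch, assuming the integrand e^(-x²/2) and a simple midpoint rule with hand-picked limits: the square of the one-dimensional integral should equal 2π times the radial integral of r·e^(-r²/2), which is 1.

```python
import math


def gaussian_integral(lo=-12.0, hi=12.0, steps=24000):
    """Midpoint-rule estimate of the integral of exp(-x^2 / 2) over the real line."""
    dx = (hi - lo) / steps
    return sum(math.exp(-0.5 * (lo + (i + 0.5) * dx) ** 2) for i in range(steps)) * dx


def radial_integral(hi=12.0, steps=24000):
    """Midpoint-rule estimate of the integral of r * exp(-r^2 / 2) from 0 to infinity."""
    dr = hi / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += r * math.exp(-0.5 * r * r)
    return total * dr


if __name__ == "__main__":
    direct = gaussian_integral()
    via_polar = math.sqrt(2 * math.pi * radial_integral())
    print(direct, via_polar, math.sqrt(2 * math.pi))
```

All three printed numbers should agree to several decimal places. The limits of ±12 are an assumption that works because the integrand is negligible beyond them.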

  • @alltronics1337
    @alltronics1337 9 months ago +1

    16:06 Why isn't it negative e to the negative u, instead of just e to the negative u? When you differentiate e^-x you don't get e^-x, you get -e^-x.

  • @cameronspalding9792
    @cameronspalding9792 9 months ago +2

    At 14:38, why is it -1/2? Surely it should be +1/2?

  • @Nibor999
    @Nibor999 9 months ago +3

    Why have you used delta to represent the standard deviation rather than sigma?

    • @BecauseMaths
      @BecauseMaths 9 months ago

      Thanks for noticing the Greek symbols. It was a symbol encoding issue.

  • @TrevorKafka
    @TrevorKafka 9 months ago +2

    This is a quite comprehensive video overall, but I do think it lacks a mention of /why/ that definition of normally distributed data appropriately describes the behavior of the balls-and-pegs simulation.

  • @lumina_
    @lumina_ 9 months ago +1

    wow, this was very easy to understand, thank you!

  • @theupson
    @theupson 9 months ago +1

    y' being proportional to y (edit: yx) is an interesting property of a normal distribution, but I don't think it's the a priori defining trait.
    Probably the way to do it is to show that the characteristic function for the sample average converges as n increases, and then compute the corresponding density. Of course it's far more usual to use MGFs, but the Fourier transform is directly invertible, so you don't need to "already know" the normal distribution.

    • @BecauseMaths
      @BecauseMaths 9 months ago

      Thanks for the insight❤️❤️❤️
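The characteristic-function route suggested above can be sketched as follows. The assumptions are standard ones that the comment does not spell out: X_1, ..., X_n are independent and identically distributed with mean 0 and variance 1, φ is their common characteristic function, and Z_n is the standardized sum.

```latex
\begin{aligned}
Z_n &= \frac{X_1 + \cdots + X_n}{\sqrt{n}}, \qquad
\varphi_{Z_n}(t) = \Bigl[\varphi\bigl(t/\sqrt{n}\bigr)\Bigr]^{n} \\
\varphi(s) &= 1 - \frac{s^2}{2} + o(s^2) \quad (s \to 0)
\;\Longrightarrow\;
\varphi_{Z_n}(t) = \Bigl[1 - \frac{t^2}{2n} + o\bigl(1/n\bigr)\Bigr]^{n}
\xrightarrow[n \to \infty]{} e^{-t^2/2} \\
f(x) &= \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itx}\,e^{-t^2/2}\,dt
      = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}.
\end{aligned}
```

The last line is Fourier inversion applied to the limiting characteristic function, so the bell-curve formula appears as an output rather than an input. Note that convergence of characteristic functions gives convergence in distribution, which is weaker than pointwise convergence of densities.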

  • @jamescao2008
    @jamescao2008 9 months ago

    Regarding the definition, how can the differential equation be applied to the curve without any justification?

  • @jimpim6454
    @jimpim6454 9 months ago +1

    I am not convinced by the step where you integrate at the separation of variables step and you say that (x-mu) should go to (1/2)(x-mu)^2. Applying the power rule would not give that result since mu is just a constant term, where did the mu squared come from out of thin air?

    • @lagnugg
      @lagnugg 7 months ago

      You can use the substitution t = x - μ to get this result, or just integrate directly to get k(x²/2 - μx) + C. Here C is an arbitrary constant, so we can write C = kμ²/2 + C1 for some other constant C1, and thus we get k(x² - 2xμ + μ²)/2 + C1 = k(x - μ)²/2 + C1.
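The point of this reply is that the two antiderivatives differ only by a constant, which the constant of integration absorbs. A minimal numerical check follows; the function names and the sample values of x and μ are illustrative assumptions.

```python
import math


def antiderivative_direct(x, mu):
    """Integrating (x - mu) term by term: x^2 / 2 - mu * x."""
    return x ** 2 / 2 - mu * x


def antiderivative_shifted(x, mu):
    """Integrating with the substitution t = x - mu: (x - mu)^2 / 2."""
    return (x - mu) ** 2 / 2


def gap(x, mu):
    """Difference between the two antiderivatives at x."""
    return antiderivative_shifted(x, mu) - antiderivative_direct(x, mu)


if __name__ == "__main__":
    mu = 3.0
    for x in (-2.0, 0.0, 1.5, 10.0):
        # The gap does not depend on x; it is always mu^2 / 2.
        print(x, gap(x, mu), math.isclose(gap(x, mu), mu ** 2 / 2))
```

Because the gap is the same for every x, both expressions have the same derivative, so either one is a valid result of the integration step.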

  • @dean532
    @dean532 9 months ago +1

    Just because of the 2π in the formula, people from all parts of the world can agree that somehow (ironically) the bell curve rests on the shoulders of a circle. Your approach is very informative, and a simple and succinct explanation like this can benefit neurodiversity in the mathematical realm of academia.

    • @BecauseMaths
      @BecauseMaths 9 months ago

      Thanks❤️❤️❤️🙏🙏🙏

  • @hopefullysoonaweldingengineer
    @hopefullysoonaweldingengineer 8 months ago +1

    Brilliant

  • @phyarth8082
    @phyarth8082 9 months ago +1

    The Galton board is based on the binomial theorem: the bin counts are the coefficients in the expansion of, in your case, (1+x)^13. It has a bell-curve shape, but the Gaussian normal distribution is an integral over the real number line. Dice, coin flips, and Galton board pegs, which can also be obtained from Pascal's triangle, involve integers and have nothing to do with the Gaussian normal distribution formula.
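How close the integer-valued binomial picture is to the continuous curve can be checked directly. The sketch below takes the 13 rows mentioned in this comment as given, assumes each peg sends the ball left or right with probability 1/2, and compares the binomial probabilities with the normal density that has the same mean and variance. The helper names are hypothetical.

```python
import math


def binomial_pmf(n, k, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def compare(n=13, p=0.5):
    """Rows of (k, binomial probability, matching normal density at k)."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return [(k, binomial_pmf(n, k, p), normal_pdf(k, mu, sigma)) for k in range(n + 1)]


if __name__ == "__main__":
    for k, b, g in compare():
        print(f"{k:2d}  binomial={b:.5f}  normal={g:.5f}  diff={abs(b - g):.5f}")
```

The two columns are distinct objects, one a discrete probability and the other a density, yet they track each other closely even at 13 rows. That closeness is the content of the classical normal approximation to the binomial.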

  • @vaibhavpandey9779
    @vaibhavpandey9779 9 months ago +2

    Beautiful explanation!

  • @ianrobinson8518
    @ianrobinson8518 9 months ago +1

    I like this derivation starting from the differential equation. The differential equation provides an interesting insight into the nature of the curve that I had never appreciated. However I don’t think I’ve ever seen it before as a way to define the normal distribution. Any derivation I’ve seen is a practical one usually starting from the binomial distribution and applying the central limit theorem. Thus the differential equation would be a derived result.
    Can you point to a source which defines the normal distribution in this way? What would motivate it?

    • @BecauseMaths
      @BecauseMaths 9 months ago

      Here is one link
      spoudai.unipi.gr/index.php/spoudai/article/download/853/932

    • @ritardstrength5169
      @ritardstrength5169 9 months ago +1

      I’m currently taking diff eqs, which I’ve been told is not normally a prerequisite for statistics. But it looks like in this case, being able to start the derivation with dy/dx and then separate variables is a huge time saver.

    • @ianrobinson8518
      @ianrobinson8518 9 months ago +1

      @ritardstrength5169 Perhaps, but my point is it’s not a definition of normality. It makes no connection to empirical results, namely the application of the central limit theorem to binomial experiments. It merely reverse engineers the derivative of the normal curve.

  • @Saahil-G3
    @Saahil-G3 9 months ago +1

    This was awesome!

  • @exodus8213
    @exodus8213 9 months ago +1

    Is this related to elliptic curves??

    • @BecauseMaths
      @BecauseMaths 9 months ago +1

      Thanks

    • @exodus8213
      @exodus8213 9 months ago

      @BecauseMaths I mean, is it related? It seems to be related to

  • @SurinderKumar-os5il
    @SurinderKumar-os5il 9 months ago +1

    Thanks

  • @albertopanocchi8861
    @albertopanocchi8861 9 months ago +1

    How did you get the left-hand side of the differential equation? How did you know it was of first degree? Did you just guess?

    • @BecauseMaths
      @BecauseMaths 9 months ago

      Thanks for the suggestion, I should have done that.

  • @TranquilSeaOfMath
    @TranquilSeaOfMath 9 months ago +1

    Informative.

  • @suka_sukaGaming
    @suka_sukaGaming 9 months ago +1

    Why are you using delta instead of sigma?

    • @BecauseMaths
      @BecauseMaths 9 months ago

      Should be sigma. Typing issue

  • @charlesvanderhoog7056
    @charlesvanderhoog7056 9 months ago +1

    You should explain how the formula for dy/dx comes about. It is straightforward, but a layman has no clue. You can talk, e.g., about the amount of change in the size of two consecutive surfaces underneath the curve. So you get the difference between (X1-m).(Y1-0) and (X2-m).(Y2-0). Any kid can understand that.

    • @BecauseMaths
      @BecauseMaths 9 months ago

      Thanks for the suggestions❤️

  • @h1a8
    @h1a8 9 months ago

    The true question is why a normal distribution (based off the differential equation definition you gave) is representative of many real world distributions (even the marble experiment you showed)?

    • @MrPixifan
      @MrPixifan 9 months ago +1

      The definition of a Gaussian distribution given in the video (values further away from the mean have less height in the graph) gives a way to interpret the marble example:
      There is only a single path that a marble can take to land at the far right or far left, but many paths will land it in the middle. Another way to say it is the probability of a marble landing in a certain spot is given by the number of paths the marble can take to land in that spot out of the total number of paths.

    • @BecauseMaths
      @BecauseMaths 9 months ago

      ❤️❤️🎁
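The path-counting argument in this thread can be made concrete. A minimal sketch, assuming an idealized board where each peg sends the marble left or right with equal probability and where the number of rows (13 here, a hypothetical value) is a free parameter: the number of paths into each bin is built row by row, exactly as in Pascal's triangle.

```python
def path_counts(rows):
    """Number of distinct left/right paths ending in each bin after `rows` rows of pegs."""
    counts = [1]  # one way to be at the top, before any peg
    for _ in range(rows):
        # A bin is reached from the bin up-left or the bin up-right.
        counts = [a + b for a, b in zip([0] + counts, counts + [0])]
    return counts


def bin_probabilities(rows):
    """Probability of landing in each bin when every path is equally likely."""
    counts = path_counts(rows)
    total = 2 ** rows
    return [c / total for c in counts]


if __name__ == "__main__":
    counts = path_counts(13)
    print(counts)
    print(bin_probabilities(13))
```

The outermost bins are reached by a single path each, while the two central bins are reached by 1716 paths each out of 8192, which is why marbles pile up in the middle.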

  • @EricPham-gr8pg
    @EricPham-gr8pg 9 months ago +1

    Need measure ultrasound force to see background biased

  • @esorse
    @esorse 9 months ago +1

    Accepting Hubble's finding that space is expanding and hence, our universe has a spatio-temporal origin and boundary, is apparently the conventional world-view in empirically focused natural and sub-space social science, excluding any contingent statement of form, "this may happen", since its temporal domain doesn't coincide with this, but you may be able to 'have your cake and eat it', through dedicated notation like < as the opposite of >, instead of concatenated non-number-numeral - < number-numeral say, because the law of non-contradiction : nothing is its opposite, is irrelevant in an instrumentalism consistent predicted world.

  • @rujon288
    @rujon288 8 months ago +1

    mindblown

  • @theodorostsilikis4025
    @theodorostsilikis4025 9 months ago +6

    just a small thing to mention. [δ=delta= "δέλτα"] and [σ=sigma ="σίγμα"].

  • @Hemu_ArjunSrivastava
    @Hemu_ArjunSrivastava 9 months ago +1

    Subscribed!😊

  • @emmanueldavid118
    @emmanueldavid118 9 months ago +3

    the symbol you are using for sigma is actually delta 😂

  • @pawelpap9
    @pawelpap9 9 months ago +1

    One does not “derive” normal equation. The presenter is misinformed.

    • @Hemu_ArjunSrivastava
      @Hemu_ArjunSrivastava 9 months ago +3

      You're misinformed. The statement "one does not derive the normal equation" is not accurate. The normal equation can indeed be derived, and it is a crucial aspect of linear regression analysis. The normal equation provides an analytical solution to the linear regression problem, specifically for finding the values of the parameters that minimize the cost function.
      Here's a brief overview of the derivation:
      Given a hypothesis function ( h_{\theta}(x) ) and a cost function ( J(\theta) ), the goal is to minimize ( J(\theta) ). In matrix notation, this problem can be represented as minimizing the function ( (X\theta - y)^T(X\theta - y) ), where ( X ) is the design matrix of input features, ( \theta ) is the parameter vector, and ( y ) is the vector of output values.
      To find the minimum, one takes the derivative of the cost function with respect to ( \theta ) and sets it to zero. This results in the normal equation: ( X^TX\theta = X^Ty ). If ( X^TX ) is invertible, we can solve for ( \theta ) by multiplying both sides by ( (X^TX)^{-1} ), yielding ( \theta = (X^TX)^{-1}X^Ty ), which is the solution that minimizes the cost function.
      This derivation is a standard procedure in machine learning for obtaining the least squares estimates of the regression coefficients.

    • @BecauseMaths
      @BecauseMaths 9 months ago +1

      ❤️❤️❤️
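The "normal equation" in this reply is the least-squares one from linear regression, which is a different object from the equation of the normal curve discussed in the video. For completeness, here is a minimal sketch of that equation for the simplest case, a straight-line fit with an intercept, where the 2x2 system can be solved by hand. The function name and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Solve X^T X theta = X^T y for a design matrix X with columns [1, x].

    Returns (intercept, slope).
    """
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # X^T X = [[n, sx], [sx, sxx]] and X^T y = [sy, sxy].
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("X^T X is not invertible (all x values are equal)")

    # Multiply by the explicit inverse of the 2x2 matrix X^T X.
    intercept = (sxx * sy - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope


if __name__ == "__main__":
    # Points lying exactly on y = 1 + 2x.
    print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))
```

For larger problems a linear-algebra library would normally be used instead of writing out the inverse, but the 2x2 case shows each term of the equation directly.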