Simple Linear Regression: The Least Squares Regression Line

  • Published: 9 Sep 2024
  • An introduction to the least squares regression line in simple linear regression.
    The pain-empathy data is estimated from a figure given in:
    Singer et al. (2004). Empathy for pain involves the affective but not sensory components of pain. Science, 303:1157–1162.
    The Janka hardness-density data is found in:
    Hand, D.J., Daly, F., Lunn, A.D., McConway, K., and Ostrowski, E., editors (1994). The Handbook of Small Data Sets. Chapman & Hall, London.
    Original source: Williams, E.J. (1959). Regression Analysis. John Wiley & Sons, New York. Page 43, Table 3.7.

Comments • 90

  • @s2ms10ik5
    @s2ms10ik5 3 years ago +32

    Why is finding clean, organized information about statistics so hard? Even the textbooks have confusing language, inconsistent notation, etc. Thank you so much for all your hard work. Your videos are an invaluable resource.

  • @Stalysfa
    @Stalysfa 9 years ago +26

    OH MY GOD! I FOUND HEAVEN ON RUclips!
    I was so scared of failing COMM 215, as I wasn't understanding anything in this damn course, and then I found your channel!
    You are a genius! You saved me, because you explain things so well! I am French and not used to technical English, so I wasn't following the lectures well. Thanks to you, I CAN NOW UNDERSTAND THESE DAMN CHAPTERS FOR THE FINALS! THANK YOU SIR! YOU ARE MY HERO!

  • @probono2876
    @probono2876 8 years ago +9

    Hi Dr Jeremy Balka,
    The entire JB Statistics video series is a truly outstanding work.
    Many thanks for making your work public so that people like me can benefit from it.
    Cheers

  • @catherinedumbledore
    @catherinedumbledore 7 years ago +3

    Thank you very much for taking the time to do this, it is very much appreciated. All the concepts are perfectly explained and generally done much better than my 2 hour long university lectures!

    • @jbstatistics
      @jbstatistics  7 years ago +2

      You are very welcome. I'm glad you found my video helpful!

  • @renjing
    @renjing 3 years ago +1

    I've decided to stick with this channel. Very closely tied to reality, logically explained, and useful. Wish I had found it sooner.

  • @puneetkumarsingh1484
    @puneetkumarsingh1484 9 months ago

    Thank you for saving my Probability and Statistics course grades!

  • @valeriereid2337
    @valeriereid2337 1 year ago

    Thank you so very much for making your lectures available. It is very helpful getting these excellent explanations at my own pace.

    • @jbstatistics
      @jbstatistics  1 year ago

      You're very welcome. I'm glad to be of help!

  • @danielladamian1596
    @danielladamian1596 11 months ago

    God bless you, Sir. You have the most easy-to-follow explanatory statistics channel on RUclips. Wish I could rate it more than 5 stars. Thank you so much ❤️

  • @Jean-cu8if
    @Jean-cu8if 7 years ago +25

    your voice is so cool

  • @ezquerzelaya
    @ezquerzelaya 8 years ago +5

    Love your channel. Thank you for the hard work!

    • @jbstatistics
      @jbstatistics  8 years ago

      +Jrnm Zqr You are very welcome. I'm glad I could be of help!

  • @bunmeng007
    @bunmeng007 9 years ago

    It's wonderful to have your materials in addition to my lectures. It's simple to understand and very helpful indeed. Thank you

    • @jbstatistics
      @jbstatistics  9 years ago

      You are very welcome! I'm glad you find my videos helpful.

  • @brodeurheaton
    @brodeurheaton 11 years ago +2

    You're an awesome Professor! There are not many people out there who would take the time to do this for their students. Thanks for making statistics easier to understand!

  • @jojokaleido
    @jojokaleido 8 years ago +1

    Thanks! It's so much easier to learn statistics with your help!

  • @duartediniz8255
    @duartediniz8255 9 years ago +1

    Thanks man, love the simplicity of your videos! Cheers

  • @pesterlis
    @pesterlis 6 years ago

    Thank you so much!! I didn't know the error was assumed to be normally distributed, so I was confused for the longest time.

  • @juji432
    @juji432 9 years ago +1

    Thank you for making these videos they were extremely helpful in my learning of the content.

    • @jbstatistics
      @jbstatistics  9 years ago

      +juji432 You are very welcome! All the best.

  • @saifa9456
    @saifa9456 10 years ago

    I was so confused about regression, but now it seems very simple.
    Thank you.

  • @CortezPro
    @CortezPro 7 years ago

    Didn't know about the last part, great explanation!

  • @TheSoundGrid.
    @TheSoundGrid. 9 years ago +1

    Thanks a lot, sir. The most informative video I've seen on all of YouTube. Please make a video on "The Likelihood Function" as well.

    • @jbstatistics
      @jbstatistics  9 years ago +1

      +Rupesh Wadibhasme Thanks for the compliment Rupesh! And thanks for the suggested topic. I do hope to get videos up on the likelihood function and maximum likelihood estimation, but time is a little short these days. All the best.

  • @cassini4052
    @cassini4052 4 years ago

    Great help for finals week

  • @waawaaweewaa2045
    @waawaaweewaa2045 11 years ago

    Outstanding, very clear.

  • @jbstatistics
    @jbstatistics  11 years ago +4

    Thanks! This project has just about killed me, but it seemed like a good idea at the time :)

  • @sarygirl4776
    @sarygirl4776 8 years ago +1

    you are a big help! oh my goodness! thank you so much!!! :)

  • @ABo-jr8pg
    @ABo-jr8pg 5 years ago +2

    _You don't yet know how to fit that line but I do_
    Thanks for making statistics kinda fun :)

  • @mlbbsea6446
    @mlbbsea6446 6 years ago

    Your videos are very helpful. Big thanks!

  • @donatorolo2779
    @donatorolo2779 7 years ago +2

    Could you explain why (Sx)^2, (Sy)^2, and Cor(x,y) are divided by n-1, and not just n? And by the way, your videos are the best explanation of this subject! Definitely a life saver. Keep up the good work =D

    • @jbstatistics
      @jbstatistics  7 years ago

      Thanks for the compliment! I have a video that discusses the one-sample case, "The sample variance: why divide by n-1?" It's available at ruclips.net/video/9ONRMymR2Eg/видео.html

    • @donatorolo2779
      @donatorolo2779 7 years ago

      Thank you very much for your reply. So kind! I'll watch it =D
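The n-1 point in the linked video can be checked exhaustively on a tiny example. A sketch (toy population invented here, not from the video): averaging the sample variance over every equally likely size-2 sample recovers the population variance only with the n-1 divisor.

```python
from itertools import product

pop = [0, 2, 4]
mu = sum(pop) / len(pop)
pop_var = sum((v - mu) ** 2 for v in pop) / len(pop)  # population variance: 8/3

def sample_var(sample, divisor):
    """Sum of squared deviations from the sample mean, over a chosen divisor."""
    m = sum(sample) / len(sample)
    return sum((v - m) ** 2 for v in sample) / divisor

# All 9 equally likely samples of size n = 2, drawn with replacement.
samples = list(product(pop, repeat=2))
avg_unbiased = sum(sample_var(s, 1) for s in samples) / len(samples)  # divisor n-1 = 1
avg_biased = sum(sample_var(s, 2) for s in samples) / len(samples)    # divisor n = 2

assert abs(avg_unbiased - pop_var) < 1e-9  # n-1 divisor: unbiased
assert avg_biased < pop_var                # n divisor: biased low
```

The with-replacement sampling keeps the sample space small enough to enumerate, so no simulation is needed.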

  • @blink11101
    @blink11101 9 years ago

    Great video, cheers!

  • @jbstatistics
    @jbstatistics  11 years ago

    Thanks!

  • @Cleisthenes2
    @Cleisthenes2 1 year ago +1

    solidarity for Canadians who call zero 'nought'

  • @brodeurheaton
    @brodeurheaton 11 years ago

    Doesn't everything seem that way at the time. Speaking of which, after all those hours of studying and review, guess who forgets to bring a calculator to the exam last night. This guy.

  • @jbstatistics
    @jbstatistics  11 years ago

    Merci!

  • @myworldAI
    @myworldAI 3 years ago

    thanks

  • @jbstatistics
    @jbstatistics  11 years ago

    You're welcome!

  • @pallavibhardwaj7465
    @pallavibhardwaj7465 11 years ago

    very nice...thanx.

  • @mushtaqahmad8329
    @mushtaqahmad8329 7 years ago

    Good videos. I learn a lot, and my concepts become clear easily.

  • @mennaehab2409
    @mennaehab2409 4 years ago

    your videos are amazing may ALLAH bless you.

  • @StephenDoty84
    @StephenDoty84 5 years ago

    7:13 And if two points determine a line: once you know the mean point (x̄, ȳ), you can just use the slope to determine the next value of y for a given change in x above x̄, so the y-intercept isn't needed to make a second point? Sometimes x = 0 isn't practical to assume either, e.g. when x is the price of an ounce of gold and y is the price of ten ounces of copper in a scatterplot.

  • @ahmedabdelmaaboud3460
    @ahmedabdelmaaboud3460 4 years ago

    Excellent explanation, but what is the interpretation of the model equation?

  • @ScilexGuitar
    @ScilexGuitar 7 years ago

    Hi, I'm wondering how to approximate the unknown b of a Rayleigh-distributed random variable using least squares, given some values that the random variable takes. Is it possible to give a short explanation of that?

  • @kamzzaa7265
    @kamzzaa7265 1 year ago

    4:36
    If we can solve for beta0 and beta1 using the equations beta0 = mean(y) - beta1*mean(x) and beta1 = cov(x,y)/var(x), why should we use OLS instead?

    • @jbstatistics
      @jbstatistics  1 year ago +1

      We're not solving for beta_0 and beta_1, as they are parameters whose true values are unknown. We are solving for the least squares estimators of beta_0 and beta_1. At 4:36 I'm referring to the sample covariance of X and Y, and the sample variance of X, and just giving another way of expressing the formula we just derived.
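The equivalence in the reply above is easy to verify numerically. A minimal sketch (hypothetical toy data, not the pain-empathy or Janka datasets from the video): the n-1 in the sample covariance cancels the n-1 in the sample variance of x, leaving the SPxy/SSxx form of the slope.

```python
# Hypothetical toy data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sum of cross products and sum of squares about the means.
sp_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_xx = sum((xi - x_bar) ** 2 for xi in x)

b1 = sp_xy / ss_xx       # least squares slope: SPxy / SSxx
b0 = y_bar - b1 * x_bar  # least squares intercept

# Equivalent form: sample covariance over sample variance of x.
cov_xy = sp_xy / (n - 1)
var_x = ss_xx / (n - 1)
assert abs(b1 - cov_xy / var_x) < 1e-12
```

Both forms are the same estimator; the covariance/variance version is just a compact way of writing it.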

  • @jiaxiwang6457
    @jiaxiwang6457 9 years ago

    Hi jb, just curious: why don't any of your videos have ads? I love it!!!!!

    • @jbstatistics
      @jbstatistics  9 years ago +12

      Lixi W I don't enable ads for a number of reasons. The main one is that I'm simply trying to help people learn statistics, and forcing people to watch 5 seconds of an ad before getting some help just feels wrong. And the amount of revenue would be pretty small (forcing people to watch a video ad 3 million times just so I can get $2k or so taxable dollars just doesn't add up to me).

    • @luisc212
      @luisc212 8 years ago +1

      S/O to @jbstatistics for not being a sellout!

    • @jbstatistics
      @jbstatistics  8 years ago

      +Luis C Thanks Luis!

  • @liftyshifty
    @liftyshifty 5 years ago +1

    Hi, I don't understand why B1 is SPxy/SSxx. Can you please explain?

  • @nyashanyakuchena786
    @nyashanyakuchena786 5 years ago

    Hi, I like your videos. I had a question. I know the values you list for b1 and b0 work when the errors follow N(0, var(x)). My question is: what would the least squares estimators for b0 and b1 be if the errors follow N(0, 2x)?

    • @jbstatistics
      @jbstatistics  5 years ago

      The least squares estimators are the least squares estimators -- they are the same formulas regardless of the distribution of the errors. The *properties* of the least squares estimators depend on what the distribution of the errors is. Are you asking what would happen if the variance in the epsilons increases with X? If there is increasing variance, and we ignore that, then the resulting least squares estimators (the usual formulas) will still be unbiased, but the reported standard errors will be smaller than they should be.

  • @d3thdrive
    @d3thdrive 4 years ago

    Beauty

  • @sebgrootus
    @sebgrootus 3 years ago

    I love my jbstatistics, my superhero

  • @vivianandlin
    @vivianandlin 8 years ago

    Hi, thank you for this video. One question: at 6:44, the video says the residuals must sum to zero for least squares regression. Why is that? The residuals are just minimized, so couldn't their sum be non-zero? Can you explain?

    • @randycragun
      @randycragun 6 years ago +1

      Suppose that the average of the residuals was 2 (the sum would be 2 times however many points there are). That means you could move the line up vertically by 2 and have a better fit to the data points. For a simple example, imagine two points: one with a residual of 4, and another with a residual of 0 (it is on the regression line). Then the sum of the residuals is 4, and the mean of the residuals is 2. But we can do better than this by moving the regression line up to go between these points (rather than directly through one of them). In that case, the residuals would become -2 and 2, respectively, and their sum would be 0. You can see this also by looking at the sum of the squares of the residuals. In this case, the sum of the squares of the residuals is 0^2+4^2 = 16. That is large compared to what we get if we move the line up by 2 so that it goes between the two points. Then the sum of the squares of the residuals is (-2)^2+2^2 = 8. This is really easier to illustrate by drawing points and lines, so I hope you try that yourself.
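The argument above can also be checked numerically: for the least squares line (with an intercept), the residuals always sum to zero. A small sketch with made-up data:

```python
# Hypothetical data points.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 2.9, 5.1, 6.0, 8.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept: b1 = SPxy / SSxx, b0 = y_bar - b1 * x_bar.
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
assert abs(sum(residuals)) < 1e-9  # residuals sum to (numerically) zero
```

If the residual sum were nonzero, shifting the line vertically by the mean residual would reduce the sum of squared residuals, exactly as in the two-point example above.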

  • @Delahunta
    @Delahunta 6 years ago

    Hi, I like your videos. I had a question. I know the values you list for b1 and b0 work when the errors follow N(0, var(x)). My question is: what would the least squares estimators for b0 and b1 be if the errors follow N(0, 2x)?

    • @jbstatistics
      @jbstatistics  6 years ago

      The least squares estimators are still the least squares estimators, regardless of whether the variance of y is constant or has some relationship with x. If we use our regular least squares estimators in a situation where the variance of y is non-constant, then the estimators are still unbiased but the standard errors will be off (and we thus may have misleading conclusions in our statistical inference procedures). If the assumptions of the model are all met, except for the fact that the variance of y is changing with x, then weighted regression will take care of that. In weighted regression, the notion is that points that have a high variance in the random variable y contain less information, and thus should receive less weight in the calculations. We typically weight by the inverse of the variance.

    • @Delahunta
      @Delahunta 6 years ago

      Okay thanks, so under the weighted transformation the estimator for B1 would be (x'wx)^(-1) x'wy where the w matrix has (1/2xi)^2 for its diagonals?
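Following the weighted-regression reply above, here is a minimal scalar sketch. The toy data are invented, and the Var(ε_i) = 2x_i assumption and weights w_i = 1/(2x_i) come from the question in this thread, not from the video. With an intercept in the model, weighted least squares replaces plain means with weighted means, and the *weighted* residuals then sum to zero:

```python
# Hypothetical data; assumed error variance Var(eps_i) = 2 * x_i.
x = [1.0, 2.0, 4.0, 8.0]
y = [1.2, 2.1, 4.3, 7.6]
w = [1.0 / (2.0 * xi) for xi in x]  # inverse-variance weights

sw = sum(w)
x_bar_w = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
y_bar_w = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y

# Weighted least squares slope and intercept.
b1 = (sum(wi * (xi - x_bar_w) * (yi - y_bar_w) for wi, xi, yi in zip(w, x, y))
      / sum(wi * (xi - x_bar_w) ** 2 for wi, xi in zip(w, x)))
b0 = y_bar_w - b1 * x_bar_w

# Setting the derivative with respect to b0 to zero forces the weighted
# residual sum to zero, the weighted analogue of the unweighted property.
wres = sum(wi * (yi - (b0 + b1 * xi)) for wi, xi, yi in zip(w, x, y))
assert abs(wres) < 1e-9
```

This is the scalar form of the matrix expression in the follow-up comment, with the weights on the diagonal of W.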

  • @cife94
    @cife94 11 years ago

    Can you do that in Excel?

  • @TreBlass
    @TreBlass 7 years ago

    What's the difference between *Random error component* and *Residuals*?

    • @jbstatistics
      @jbstatistics  7 years ago

      Epsilon represents the theoretical random error component (a random variable). The residuals are the differences between the observed and predicted values of Y.

    • @TreBlass
      @TreBlass 7 years ago

      So epsilon is basically a random variable that captures the disturbances (values) from the mean (the regression line), and a residual is an element of the random error component?
      In other words, a residual is a subset of the random error component?
      Also, a residual is one of the many disturbances from the regression line for a given X?
      Please correct me if I am wrong.
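The distinction in this thread can be illustrated with a small constructed example (the true line and error values here are hypothetical, not from the video): the epsilons are deviations from the *true* line, the residuals are deviations from the *fitted* line, and the two generally differ.

```python
# A known true line plus pretend "true errors" (normally unobservable).
true_b0, true_b1 = 1.0, 2.0
x = [1.0, 2.0, 3.0, 4.0]
eps = [0.5, -0.4, 0.3, -0.2]
y = [true_b0 + true_b1 * xi + e for xi, e in zip(x, eps)]

# Fit the least squares line from the data alone.
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Residuals sum to zero by construction of the fit ...
assert abs(sum(residuals)) < 1e-9
# ... but the true errors need not, so residuals are not the errors.
assert abs(residuals[0] - eps[0]) > 0.1
```

The residuals are observable estimates of the unobservable epsilons, not a subset of them.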

  • @64FireStorm
    @64FireStorm 5 years ago

    the sum of PRODUCTS

  • @acinomknip
    @acinomknip 5 years ago

    we don't know how to fit the line but I DO. LOL

  • @nostro1940
    @nostro1940 3 years ago

    My teacher writes b0 & b1 as â and b̂

  • @captainw6307
    @captainw6307 4 years ago

    I love your videos, with their concise knowledge structure and sexy voice >.

  • @BraveLittIeToaster
    @BraveLittIeToaster 7 years ago

    so does the computer just guess at random?

    • @jbstatistics
      @jbstatistics  7 years ago

      I don't know what you're asking. If you clarify, I might be able to answer. Cheers.

    • @BraveLittIeToaster
      @BraveLittIeToaster 7 years ago

      @5:00 what formula does the computer use to identify the slope/intercept of y?

    • @jbstatistics
      @jbstatistics  7 years ago +1

      The software calculates the sample slope and intercept using the formulas I discuss earlier in the video (at 4:09).

  • @doopydave
    @doopydave 10 years ago

    Hahaha, today I found out we go to the same university, Professor Balka xD

  • @09ak31
    @09ak31 7 years ago

    Dude sounds like Max Kellerman lol

  • @vikasindoria4788
    @vikasindoria4788 11 months ago

    tmkc

  • @minabotieso6944
    @minabotieso6944 3 years ago

    This was not your best

  • @jbstatistics
    @jbstatistics  11 years ago

    Thanks!

  • @jbstatistics
    @jbstatistics  11 years ago

    You're welcome!

  • @Malangsufi
    @Malangsufi 11 years ago

    Thanks