Linear Regression in R, Step-by-Step

  • Published: 24 Jul 2017
  • This video, which walks you through a simple regression in R, is a companion to the StatQuest on Linear Regression • Linear Regression, Cle...
    If you want to just copy and paste the R code, you can get it from the StatQuest GitHub site: github.com/StatQuest/linear_r...
    If you'd like to support StatQuest, please consider...
    Patreon: / statquest
    ...or...
    YouTube Membership: / @statquest
    ...buying my book, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
    statquest.org/statquest-store/
    ...or just donating to StatQuest!
    www.paypal.me/statquest
    Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
    / joshuastarmer
    #statquest #regression

Comments • 239

  • @statquest
    @statquest  2 года назад +4

    Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/

  • @rohitekka2674
    @rohitekka2674 3 года назад +38

    3 years later, still saving lost souls! Intros are amazing as usual :D

  • @1typhon162
    @1typhon162 4 года назад +50

    Came for the intro, Stayed for the infos

    • @statquest
      @statquest  4 года назад +2

      That is an awesome rhyme! I love it. :)

    • @mistawess
      @mistawess 3 года назад

      The Intro made me subscribe

  • @petemurphy7164
    @petemurphy7164 5 лет назад +35

    I know I have written a few comments on your videos,,, but ,,, Bam,,,, they are good.
    I am doing a uni degree in statistics and having trouble understanding the books and lectures, these videos are really helping me out - THANK YOU!!!!!!!

  • @weichengshi3104
    @weichengshi3104 5 лет назад +26

    What a clear and awesome explanation! Thank you.

  • @muhammedhadedy4570
    @muhammedhadedy4570 2 года назад +3

    I think this is the best explanation I've ever seen of linear regression output, and in just 5 minutes.
    I'm really impressed. Thanks a lot for your great videos. Please, keep up the great work.

  • @111kgaogelo
    @111kgaogelo 3 года назад +16

    I have been struggling to understand simple regression and other statistics concepts. Reading books and trying to search some other You tube channels I came across your video. I must say I stopped searching because your videos are so great and explain things in simple terms for all beginners and advanced students to understand. Thank you so much.

  • @fmgirl99
    @fmgirl99 4 года назад +6

    I love this video!! explained everything super clearly and quickly, thanks so much!

    • @statquest
      @statquest  4 года назад +2

      Thank you very much! :)

  • @Get-A-Load-Of-This
    @Get-A-Load-Of-This 6 years ago +3

    Another great video, Josh! Can't wait for the logistic and multiple linear/non-linear regression videos, hopefully with an explanation of the R code if needed.

    • @kakusniper
      @kakusniper 6 years ago

      I was reading about DESeq2 and edgeR, which deal with design matrices; that's why I jumped onto your videos about GLMs. I'll be waiting eagerly for that video. Thanks again.

  • @rohit_banga
    @rohit_banga 4 года назад +5

    Like StatQuest!! More like love StatQuest!! Thanks for all the work...

    • @statquest
      @statquest  4 года назад +1

      Hooray! Thank you very much! :)

  • @JoaoVitorBRgomes
    @JoaoVitorBRgomes 4 года назад +3

    Good, Josh, I am listening to your songs on your website. I liked specially Saturday and I Try My Best. Also we people from Brazil are learning a lot from you, you and some youtubers are like the neoEnlightment. God bless!

  • @foedeer
    @foedeer 3 года назад +1

    Thank you for explaining this in a simple way!

  • @vaibhavtyagi1379
    @vaibhavtyagi1379 5 лет назад +3

    Nice video again.., amazing Josh..., keep pushing your limits

  • @nicaisechristiane5369
    @nicaisechristiane5369 3 года назад +4

    You are precious!!! Thank for the brilliant explanation!!!

  • @epijournal8423
    @epijournal8423 2 years ago +1

    This is the best explanation of regression I have ever seen.
    Made me a subscriber on just one vid, Bravo!

  • @MR-ej7kv
    @MR-ej7kv 3 года назад +1

    OMG such an exciting video. Thank You and have a nice day.

  • @jimothy221
    @jimothy221 3 года назад +2

    I like StatQuest! Thanks for the video - very helpful and concise

  • @heenaparveen8803
    @heenaparveen8803 2 года назад +1

    Thank you so much.....wid help of your video m able to finish out my assignment. Trust me I ended up watching almost 10 videos...but only got to do with yours ..thank you once again.

  • @wrjazziel
    @wrjazziel 2 года назад +2

    😁You got me with the intro song! 😂... Now I'll pay attention

  • @redherring0077
    @redherring0077 2 года назад +1

    I started tuning into this channel after i listening to your awesome explanation of gradient descent. Now i am a regular. BAAMMMM!!

    • @statquest
      @statquest  2 года назад +1

      BAM!!!! :)

    • @redherring0077
      @redherring0077 2 года назад +1

      @@statquest haha. Merry Christmas in advance!!

  • @danielmwabila8064
    @danielmwabila8064 Год назад +1

    Thank you very much, that was an uniquely clear explanation!

  • @mohammadalidastgheib2688
    @mohammadalidastgheib2688 Год назад +1

    What a clear explanation; Thank you.

  • @theforester_
    @theforester_ 2 года назад +1

    Thanks mate. Really good explanation!

  • @classroom_vipin
    @classroom_vipin 2 года назад +1

    Wonderfully explained !

  • @gabrielamarques7220
    @gabrielamarques7220 2 года назад +1

    statquest I love you! I will only pass stats bc of you thank you so much guys

  • @annali9577
    @annali9577 3 года назад +1

    Ur genius, literally saved my life

  • @hosseinhajinejad5324
    @hosseinhajinejad5324 2 года назад +1

    You are amazing! thank you for doing this!

  • @ramyasriundavalli2644
    @ramyasriundavalli2644 2 года назад +1

    You saved my day.!!! Thank youuu..!!

  • @sirfsimran482
    @sirfsimran482 3 года назад +1

    Thanks for this amazing video Buddy :)

  • @OpheliaSHolmes
    @OpheliaSHolmes 2 года назад +1

    this channel is the garden of paradise

  • @KD-zc7rv
    @KD-zc7rv 2 года назад +1

    Do I *like* StatQuest? I love StatQuest. I'm getting all the knowledge for my final project from you and feeling like I am starting to understand statistics a bit. Thank you so much for saving my ass

  • @aiuslocutius9758
    @aiuslocutius9758 2 года назад +1

    Thank you! This helped a lot.

  • @JoshuaDHarvey
    @JoshuaDHarvey 3 года назад +1

    Great video, thank you!

  • @cajamarcawarrior
    @cajamarcawarrior 2 года назад +1

    Thank you! A great video!

    • @statquest
      @statquest  2 года назад +1

      Glad you liked it!

  • @mToreno1992
    @mToreno1992 6 лет назад +1

    I admit i'm only subscribed because of the awesome intros.... I'm not even taking my statistics course anymore?!

  • @vikasbhandi4244
    @vikasbhandi4244 5 лет назад +1

    Going good keep it up

  • @kempisabel9945
    @kempisabel9945 5 лет назад +1

    THANK YOU

  • @abdullahlanggeng2128
    @abdullahlanggeng2128 4 года назад +1

    Thanks for all of the videos. They really help me a lot understanding statistics. I was always trying to avoid statistics, but your videos make me stay to learn.
    Btw, can I request GLMM on R please?

    • @statquest
      @statquest  4 года назад

      Thanks! I'll keep GLMM in R in mind.

    • @navjotsingh2251
      @navjotsingh2251 2 года назад

      Statistics is an easy and enjoyable subject, you just need a good teacher to show you that. Stat quest is amazing at presenting this in an easy and enjoyable way. It’s hard to find good content, but when you do it changes everything.

  • @forestsunrise26
    @forestsunrise26 3 года назад +2

    I don't like StatQuest - I LOVE StatQuest, I really do! Thank you so much Josh!

  • @minahabibi1008
    @minahabibi1008 2 года назад +1

    you are incredible thank you

  • @manuelargos
    @manuelargos 2 years ago +1

    You are the best!!!

    • @statquest
      @statquest  2 years ago

      Thank you very much!!! :)

  • @Anonymous-ud8qz
    @Anonymous-ud8qz 2 года назад +1

    Thank you!

  • @annak448
    @annak448 3 года назад

    Very useful! :) Thank you! Maybe one day you could also do sth on Categorical regression, different types of coding and how to do it in R :)

    • @statquest
      @statquest  3 года назад

      Like this? ruclips.net/video/CqLGvwi-5Pc/видео.html and ruclips.net/video/Hrr2anyK_5s/видео.html

  • @luizbeniz
    @luizbeniz 6 месяцев назад +1

    amazing video! channel is also outstanding, makes learning stats a (little) less difficult haha I would suggest, if you have expertise, to do a series of simples videos in Stata (same topics as the ones you have specifically for R)… thanks!

    • @statquest
      @statquest  6 месяцев назад

      I'll keep that in mind.

  • @veereshmeti1
    @veereshmeti1 5 лет назад

    Thank you

  • @MaiSirry
    @MaiSirry 3 года назад +1

    I like StatQuest tuuuuuuu!!! la la la!

  • @cynthiacabales22
    @cynthiacabales22 4 года назад +6

    MY FAVORITE

  • @celsiusfahrenheit1176
    @celsiusfahrenheit1176 4 года назад +1

    I keep coming back here

  • @adelutzaification
    @adelutzaification 7 лет назад

    Hey Josh! Good job again. I think the addition of an applied/ how to companion video to the one presenting the theory looks really good. It is pretty much what I had in mind when I mentioned it to you. You really managed to keep the "how to" video very short but in the same time to add enough detail so that one can directly apply the theory to a dataset at hand. Me likes it ;)
    May I suggest an add-on video about doing multiple linear regression? As other topic, multivariate analysis with R companions. Maybe adding R companions to the topics you have already covered theoretically: PCA, heatmaps, clustering etc?
    Thanks a lot for your work. Maybe you could get compensated for it? Why not add the option to those who appreciate your work?

    • @adelutzaification
      @adelutzaification 7 лет назад

      I just realized that you actually have a website with more elaborate examples of R tutorials/applications that you don't advertise enough. If I were you, I would include discreetly the url on each of my slides. Like a header or footer...
      Re funding, from what I hear, youtube is not a moneymaker anymore due to changes in policies. I noticed that people with youtube channels have an associated website on which they accept donations or subscriptions via Paypal. Here's an example: freedomainradio.com/donate/
      That should be pretty easy to setup if you have a paypal account.
      Content producers also use Patreon. www.patreon.com/ Get yourself rich(er)!

    • @Get-A-Load-Of-This
      @Get-A-Load-Of-This 6 years ago

      I totally agree with Ed. Patreon is a good platform for contributions that a lot of YouTubers use. This channel is going to get bigger as videos like this keep coming. There is a lack of YouTube videos that explain statistical concepts with such clarity, and a lack of videos that link them to practical applications.

    • @adelutzaification
      @adelutzaification 6 лет назад

      Joshua Starmer
      Hey Josh,
      As another possible platform for your content and for more exposure (the academish kind), you may want to check Udemy. I took a bunch of courses from there and there were pretty useful. Not sure how much money u can make because the courses seem to be discounted quite a bit, but you'll definitely have a more "formal" audience and you could put in your resume that u taught a course online.

  • @smika710
    @smika710 2 года назад +1

    I love stats quest

  • @ambrishsingh8326
    @ambrishsingh8326 4 года назад

    Hi Josh, It would be great if you can develop/upload the videos for linear, logistic, and other regressions (which you already have explained) with STATA as well.
    And yes, all your videos are super BAM!! please keep them coming.

    • @statquest
      @statquest  4 years ago +1

      Compared to R and Python, STATA is very expensive. If you can convince them to give me a free copy, I'll consider making videos for it.

    • @ambrishsingh8326
      @ambrishsingh8326 4 года назад

      ​@@statquest I was of this impression that a lot of universities have STATA available for the student and staff. Although I can't convince them for free licenses 🙂, one thing is there that STATA is easy to learn for users with no/little experience in programming (like me), and videos on STATA will be quite useful and quick to learn. 🙏
      But again it should be at your convenience.

    • @statquest
      @statquest  4 года назад

      @@ambrishsingh8326 Unfortunately I no longer work at a university. I'm doing StatQuest full time, so I would have to pay for it.

    • @ambrishsingh8326
      @ambrishsingh8326 4 years ago +1

      @@statquest Frankly speaking, I have learned as much (if not more) from you as I have from any of my professors. Thanks a lot. 🙌

  • @karannchew2534
    @karannchew2534 3 years ago

    2:47 The p-values of the estimated parameter values (i.e. slope and intercept) are obtained from t-tests against the null hypothesis that the parameter is zero.

  • @MrMilesmile
    @MrMilesmile 2 года назад

    my master degree got saved by this video

  • @mrx2227
    @mrx2227 4 года назад +1

    very good explanation. there is not such a good video like this in german

  • @hankigoe829
    @hankigoe829 2 года назад

    1:11 example
    1:56 summary
    4:37 add abline

  • @fourhuang1148
    @fourhuang1148 4 года назад

    I LIKE STAT QUEST! IF YOU DO NOT LIKE STAT QUEST, YOU GET OUT HERE!

  • @SonSon-rq5dj
    @SonSon-rq5dj 2 года назад

    Hi Josh! Another great video as always! I just want to ask what if the P Value in my lm function shows p-value:

    • @statquest
      @statquest  2 года назад

      That means that your p-value is very close to 0.
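
When summary() reports a p-value only as being below a very small threshold, the underlying number can still be recovered from the F-statistic. A minimal sketch, assuming made-up data with a very strong linear trend (the names x, y, fit and p.value are illustrative, not from the video):

```r
# Made-up data with a very strong linear trend (an assumption for this sketch).
set.seed(42)
x <- 1:50
y <- 2 * x + rnorm(50, sd = 0.5)
fit <- lm(y ~ x)

# summary() stores the F-statistic together with its degrees of freedom.
f <- summary(fit)$fstatistic   # named vector: value, numdf, dendf

# The overall p-value is the upper tail of the F distribution.
p.value <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
print(p.value)
```

The printed summary rounds such values for display; pf() with lower.tail = FALSE returns the tail probability directly.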

  • @lubdu34
    @lubdu34 Месяц назад

    I am looking for the linear regression section where you discuss categorical or discrete predictors? (love your videos, btw!)

    • @statquest
      @statquest  1 month ago +1

      If you are interested in discrete predictors, then you are interested in logistic regression, not linear regression. To learn more about logistic regression see: ruclips.net/p/PLblh5JKOoLUKxzEP5HA2d-Li7IJkHfXSe

    • @lubdu34
      @lubdu34 Месяц назад +1

      @@statquest Thankyou! I'm looking now!

  • @litfox5951
    @litfox5951 2 years ago

    Hi Josh, massive fan of the content. I had a question: why are the R-Sq and Adj R-Sq different in this example, since we only have one independent variable? Appreciate your response!

    • @statquest
      @statquest  2 года назад

      Because the formula for the adjusted R^2 penalizes for any number of variables in the equation.

    • @litfox5951
      @litfox5951 2 years ago

      @@statquest But in this video, you only have one variable (weight) to predict size, so it shouldn't make any difference, right?

    • @statquest
      @statquest  2 года назад +1

      @@litfox5951 Just google the formula for adjusted R-squared and you'll see that any number of variables > 0 will affect the adjusted r-squared value.
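
The penalty discussed in this thread can be checked by hand. A minimal sketch, assuming a small mouse-style data set (the values are assumed for illustration and may not match the video exactly) and the standard formula adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1), where p counts predictors other than the intercept:

```r
# Assumed illustrative data: nine mice with a weight and a size each.
mouse.data <- data.frame(
  weight = c(0.9, 1.8, 2.4, 3.5, 3.9, 4.4, 5.1, 5.6, 6.3),
  size   = c(1.4, 2.6, 1.0, 3.7, 5.5, 3.2, 3.0, 4.9, 6.3))
mouse.regression <- lm(size ~ weight, data = mouse.data)
s <- summary(mouse.regression)

n <- nrow(mouse.data)  # number of observations
p <- 1                 # number of predictors, not counting the intercept

# Adjusted R^2 computed from the plain R^2.
adj.by.hand <- 1 - (1 - s$r.squared) * (n - 1) / (n - p - 1)

print(c(r.squared = s$r.squared,
        adjusted  = s$adj.r.squared,
        by.hand   = adj.by.hand))
```

Because (n - 1)/(n - p - 1) is greater than 1 whenever p > 0, the adjusted value is smaller than the plain R^2 even with a single predictor (unless the fit is perfect).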

  • @abhishekbhatia6092
    @abhishekbhatia6092 5 лет назад +1

    At 2:54 you say we test whether the estimates for the intercept and the slope are 0 or not. I think our null hypothesis for the slope is that it is equal to 0 i.e. no (linear) dependence on X. But the null hypothesis for intercept should be that it is equal to the mean of Y's; since we also assess the relative variance explained against our simplest model that regardless of the value of X, the Y's are equal to the sample mean of Y. Correct me if I am wrong.

    • @statquest
      @statquest  5 years ago +3

      While it is true that we are comparing the linear regression model to the model that consists of just the mean of the Y values (the size of the mice), that is not what's going on when we assess the significance of the individual parameters. When we assess the significance of the individual parameters in the model with both a slope and an intercept, rather than comparing the slope-intercept model to just an intercept model, we simply compare the model to one where the parameter is set to zero.
      For example, for the intercept, we are comparing the fit of the "full model", where we let both the intercept and the slope be optimized to the data, to a model where we force the intercept to be 0 and allow the slope to be optimized to the data, given the constraint that the line must go through the origin. When we test the significance of the value for the slope, we compare the full model to one where we force the slope to be 0 and only allow the intercept to be optimized to the data, given the constraint that the line must be horizontal.
      Thus, this second test, testing if the slope is zero, is the same as testing whether the full model is better than just using the mean Y value, and you'll notice that the p-value for the t-test for the slope parameter and the p-value for the F-test for comparing the models (at 4:24) are the same: both are 0.0126.
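
The model comparisons described in this reply can be sketched with anova(). This assumes illustrative mouse-style data (the values may not match the video exactly); the names full, no.slope and no.intercept are made up for this example:

```r
# Assumed illustrative data: nine mice with a weight and a size each.
mouse.data <- data.frame(
  weight = c(0.9, 1.8, 2.4, 3.5, 3.9, 4.4, 5.1, 5.6, 6.3),
  size   = c(1.4, 2.6, 1.0, 3.7, 5.5, 3.2, 3.0, 4.9, 6.3))

full         <- lm(size ~ weight, data = mouse.data)      # slope and intercept
no.slope     <- lm(size ~ 1, data = mouse.data)           # horizontal line at mean(size)
no.intercept <- lm(size ~ 0 + weight, data = mouse.data)  # line forced through the origin

# p-values from the t-tests in the coefficient table.
coefs <- coef(summary(full))
p.slope.t     <- coefs["weight", "Pr(>|t|)"]
p.intercept.t <- coefs["(Intercept)", "Pr(>|t|)"]

# p-values from F-tests that compare each reduced model with the full model.
p.slope.F     <- anova(no.slope, full)[2, "Pr(>F)"]
p.intercept.F <- anova(no.intercept, full)[2, "Pr(>F)"]

print(c(slope.t = p.slope.t, slope.F = p.slope.F,
        intercept.t = p.intercept.t, intercept.F = p.intercept.F))
```

Each t-test p-value should agree with the matching F-test, since dropping a single parameter gives an F-statistic equal to the square of that parameter's t-statistic.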

    • @abhishekbhatia6092
      @abhishekbhatia6092 5 лет назад +1

      ​@@statquest Beautifully explained. I GET IT! THANKS!

    • @statquest
      @statquest  5 лет назад

      @@abhishekbhatia6092 Hooray! :)

    • @daltakid
      @daltakid 5 years ago +1

      Thank you so much for asking this question, and @Josh for the detailed and clear answer! Do you have any videos on how the standard errors for the t-tests are estimated in this case, by any chance?

    • @statquest
      @statquest  5 лет назад

      @@daltakid I don't, and it would take a whole new StatQuest video to explain how it all works. In the mean time, I found this webpage to be very useful: www.chem.utoronto.ca/coursenotes/analsci/stats/ErrRegr.html

  • @jamesly__vi1di
    @jamesly__vi1di 2 года назад

    Hello Josh, thanks for the great videos! I'm a fan!
    I was wondering if you could explain
    1. The difference between Pr(>|t|) for intercept and weight and p-value at the bottom.
    2. What's F? You said "the square root of the denominator in the equation for F."
    Thank you!

    • @statquest
      @statquest  2 years ago +1

      1) The p-values for the intercept and weight compare the estimated parameter values to 0. The p-value at the bottom is the same as the p-value for weight (comparing a model where the parameter for weight = 0 to a model where the parameter for weight is the least squares estimate).
      2) For details on F (and a lot more insight into the answer to question #1), see: ruclips.net/video/nk2CQITm_eo/видео.html ruclips.net/video/zITIFTsivN8/видео.html and ruclips.net/video/hokALdIst8k/видео.html

    • @anastasiachery5874
      @anastasiachery5874 Год назад

      @@statquest Hello! Thanks for your awesome work :) In a regression with multiple explanatory variables, what p-value should we look at ? Does the R² p-value give us a sense of the impact of all the variables on the explained variable and the specific p-value tell us the specific impact of one explanatory variable ? Thank you !

    • @statquest
      @statquest  Год назад

      @@anastasiachery5874 The answers to your questions are in this video: ruclips.net/video/zITIFTsivN8/видео.html

  • @Rainpub
    @Rainpub 3 года назад +2

    I'm waiting for a heavy metal style intro

    • @statquest
      @statquest  3 года назад +1

      I think the closest I get is: ruclips.net/video/azXCzI57Yfc/видео.html

    • @Rainpub
      @Rainpub 3 года назад +1

      n i c e

  • @kvh9757
    @kvh9757 3 года назад

    Great video, what is a good reason not to standardize parameters estimates?

    • @statquest
      @statquest  3 года назад +1

      If we leave the parameters in their raw form they can be much easier to interpret in terms of changes in the underlying variables.
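
The difference in interpretation can be sketched by fitting the same model on raw and on standardized variables. This assumes illustrative mouse-style data (values may not match the video exactly); the helper z() is hypothetical and simply converts a vector to z-scores:

```r
# Assumed illustrative data: nine mice with a weight and a size each.
mouse.data <- data.frame(
  weight = c(0.9, 1.8, 2.4, 3.5, 3.9, 4.4, 5.1, 5.6, 6.3),
  size   = c(1.4, 2.6, 1.0, 3.7, 5.5, 3.2, 3.0, 4.9, 6.3))

# Hypothetical helper: centre a vector and divide by its standard deviation.
z <- function(v) (v - mean(v)) / sd(v)

raw <- lm(size ~ weight, data = mouse.data)
std <- lm(z(size) ~ z(weight), data = mouse.data)

raw.slope <- unname(coef(raw)["weight"])  # change in size per one unit of weight
std.slope <- unname(coef(std)[2])         # change in size (in SDs) per one SD of weight

print(c(raw = raw.slope, standardized = std.slope))
```

The raw slope reads directly in the data's own units, while the standardized slope in a one-predictor model is just the correlation between the two variables.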

  • @giosang1111
    @giosang1111 4 года назад

    Hi. I have 2 questions:
    - Should we used Multiple R-squared or Adjusted R-squared in our conclusion?
    - What does it mean 1 and 7 DF specifically? Is 1 the numerator and 7 is the denominator?
    Thanks!

    • @statquest
      @statquest  4 года назад +1

      I answer these questions in my video on linear regression: ruclips.net/video/nk2CQITm_eo/видео.html

  • @cjcstorage3105
    @cjcstorage3105 2 года назад +1

    you are God

  • @ehatipo4598
    @ehatipo4598 Год назад

    What kind of a god you’re sir!?

  • @tinacole1450
    @tinacole1450 Год назад +1

    For example, can you redo this example out of data, from say the ALL package which contains acute leukemia information. Maybe show genes against any pData type such as age, remission, biology (mol.biol)

  • @berkingurcan6097
    @berkingurcan6097 3 года назад +1

    OMG I HAVE FALLEN AGAIN THIS CHANNEL

  • @ygbr2997
    @ygbr2997 Год назад +1

    I am wondering how the std error is calculated for the slope (or the intercept), which i reckon is the standard deviation of the sampling distribution of the slope (or the intercept) by definition. It does not seem to appear in any of your other videos. BTW, I study CS but i have to do data science internship every summer, so I rewatch your video every year for 3 years😂

    • @statquest
      @statquest  Год назад

      That would require a whole new video and I'll keep that in mind.

  • @travisneuberger9752
    @travisneuberger9752 3 года назад

    why doesn't the intercept value found in the summary correlate with the y-intercept seen in the plot? I must be missing something. Thanks for the video!

    • @statquest
      @statquest  3 года назад

      The x-axis in the plot does not go all the way to 0, so that could be throwing you off. If the plot had an x-axis that went all the way to 0, then we would see the line y-axis intercept at 0.58.
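
The intercept reported by summary() is the fitted value at weight = 0, which can be checked without redrawing the plot. A minimal sketch, assuming illustrative mouse-style data (values may not match the video exactly):

```r
# Assumed illustrative data: nine mice with a weight and a size each.
mouse.data <- data.frame(
  weight = c(0.9, 1.8, 2.4, 3.5, 3.9, 4.4, 5.1, 5.6, 6.3),
  size   = c(1.4, 2.6, 1.0, 3.7, 5.5, 3.2, 3.0, 4.9, 6.3))
mouse.regression <- lm(size ~ weight, data = mouse.data)

# The estimated intercept ...
b0 <- unname(coef(mouse.regression)["(Intercept)"])

# ... is the prediction for a (hypothetical) mouse with weight 0.
at.zero <- unname(predict(mouse.regression, newdata = data.frame(weight = 0)))

# Where the line meets the left edge of a plot that starts at the smallest weight.
at.min.weight <- unname(predict(mouse.regression,
                                newdata = data.frame(weight = min(mouse.data$weight))))

print(c(intercept = b0, at.zero = at.zero, at.min.weight = at.min.weight))
```

With a positive slope, the value at the smallest observed weight is larger than the intercept, which is why the line appears to cross the left edge of the plot above the reported intercept.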

  • @yimingshao4240
    @yimingshao4240 2 года назад

    Thank you for the video, but for example like a regression formula yi= b0+ b1Xi+ the degree of deviation, so where can I find the degree of deviation in R, is that the same as standard errors?

    • @statquest
      @statquest  2 года назад

      Unfortunately I'm not familiar with the term "degree of deviation" so I can't tell you what it means.

  • @istifashaniaputri9340
    @istifashaniaputri9340 4 года назад

    is that same with partial least square (PLS)?

  • @kailashks901
    @kailashks901 2 года назад

    I have some small doubts. How are the degrees of freedom 7? and how do we assume the significance of p-value. For example, in hypothesis testing, we reject the null hypothesis if p-value < 0.05 (significance level) but on the other hand.. we want it to be less here in this example? Great video as always though :D

    • @statquest
      @statquest  2 года назад +1

      The degrees of freedom come from the equation for F. The number of observations - the number of parameters etc. As for your second question, I'm not sure I understand it.
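
The degrees of freedom can be read off a fitted model and compared with "observations minus parameters". A minimal sketch, assuming illustrative mouse-style data with nine observations (values may not match the video exactly):

```r
# Assumed illustrative data: nine mice with a weight and a size each.
mouse.data <- data.frame(
  weight = c(0.9, 1.8, 2.4, 3.5, 3.9, 4.4, 5.1, 5.6, 6.3),
  size   = c(1.4, 2.6, 1.0, 3.7, 5.5, 3.2, 3.0, 4.9, 6.3))
mouse.regression <- lm(size ~ weight, data = mouse.data)

n        <- nrow(mouse.data)               # number of observations
n.params <- length(coef(mouse.regression)) # intercept + slope

# Denominator degrees of freedom: observations minus estimated parameters.
dendf <- n - n.params
# Numerator degrees of freedom: parameters in the full model minus
# parameters in the intercept-only model.
numdf <- n.params - 1

f <- summary(mouse.regression)$fstatistic
print(c(numdf = numdf, dendf = dendf))
print(f[c("numdf", "dendf")])
```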

    • @kailashks901
      @kailashks901 2 года назад +1

      @@statquest Oh thank you so much for answering. I was confused about the 2nd question but I actually realised it so no issues. Thank you again.

  • @dhruvdesai3728
    @dhruvdesai3728 Год назад

    Good explanation. Just noted that Rsq and adj Rsq are different although there is only one independent variable. Isn’t it counter-intuitive?
    Also, can you explain the difference between Rsq (not the adj Rsq) and multiple Rsq ?

    • @statquest
      @statquest  Год назад +1

      I explain that there is no difference between R-squared and Multiple R-squared at 3:57. And yes, the equation for the adjusted R-squared penalizes for every parameter except for the intercept.

    • @dhruvdesai3728
      @dhruvdesai3728 Год назад

      @@statquest Thanks for prompt response. Since there is only 1 independent variable, shouldn't the R-sq and Adj R-sq be same?

    • @statquest
      @statquest  Год назад +1

      @@dhruvdesai3728 You would think, but that's not how adjusted R-squared is defined. The adjusted R-squared penalizes for every parameter except for the intercept.

    • @dhruvdesai3728
      @dhruvdesai3728 Год назад

      @@statquest Ok , got it now. Great videos by the way. Keep it up!

  • @nidhiyaduvanshi2324
    @nidhiyaduvanshi2324 2 года назад

    Which p-value we will consider for interpreting statistical significance?

    • @statquest
      @statquest  2 года назад

      The individual parameters have p-values (see: 3:04) and the entire model also has a p-value (see: 4:29)

  • @masonlovell2946
    @masonlovell2946 2 года назад +1

    nice intro

  • @chunzili6256
    @chunzili6256 1 year ago

    Hi Josh, I have a question about the distribution of the residuals. If we expect them to be symmetrically distributed, why should the mean and the max be the same distance from zero? I think the mean should be close to zero, which means that the data fit the line, but the max is the extreme value, so why should it be the same distance from zero as the mean? Thanks!

    • @statquest
      @statquest  Год назад

      What time point in the video, minutes and seconds, are you asking about?

    • @chunzili6256
      @chunzili6256 Год назад

      @@statquest at 2:07

    • @statquest
      @statquest  Год назад

      @@chunzili6256 When you say "mean", do you mean "min", which is short for "minimum"? If that is the case, then you want them symmetrically distributed because that implies that the straight line is a good model for the data. If they are not symmetrically distributed, then maybe a curved line would be better.
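
The residual summary discussed here can be reproduced directly from the fitted model. A minimal sketch, assuming illustrative mouse-style data (values may not match the video exactly):

```r
# Assumed illustrative data: nine mice with a weight and a size each.
mouse.data <- data.frame(
  weight = c(0.9, 1.8, 2.4, 3.5, 3.9, 4.4, 5.1, 5.6, 6.3),
  size   = c(1.4, 2.6, 1.0, 3.7, 5.5, 3.2, 3.0, 4.9, 6.3))
mouse.regression <- lm(size ~ weight, data = mouse.data)

res <- residuals(mouse.regression)

# Five-number summary: minimum, first quartile, median, third quartile, maximum.
res.quantiles <- quantile(res)
print(res.quantiles)

# Rough symmetry check: compare the distances of min and max from zero,
# and of the first and third quartiles from zero.
print(c(min = abs(min(res)), max = abs(max(res))))
print(c(q1 = abs(res.quantiles[["25%"]]), q3 = abs(res.quantiles[["75%"]])))
```

In a model with an intercept the residuals always average to zero, so the mean carries no information about symmetry; the minimum, maximum, quartiles and median are the values to compare.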

  • @michaelnguyen7081
    @michaelnguyen7081 4 года назад

    Hi Josh, I have a question.
    When i use simple linear regression, the P value of the coefficient of independent variable 1 is very low.
    But when i add 1 more independent variable, the P value of the coefficient of independent variable 1 is very high.
    Can you explain that problems, thank you!!

    • @statquest
      @statquest  4 года назад +1

      I discuss adding multiple independent variables and interpreting the p-values in the StatQuest, Multiple Regression In R: ruclips.net/video/hokALdIst8k/видео.html

  • @ajithkhan7314
    @ajithkhan7314 2 года назад

    Hi Josh, I have a doubt. How this model will be useful in day-to-day real life? I mean, How this model is useful in predicting the size of the given weight? Thanks

    • @statquest
      @statquest  2 года назад

      It's just an example of the method, not something that is actually used in day-to-day real life.

  • @jlnie1761
    @jlnie1761 3 года назад

    Hello! I'm wondering why the intercept equals 0.58 but when weight = 0 the line cut the y-axis at the point beyond 1? Is the figure just partly shown?

    • @statquest
      @statquest  3 года назад

      If you look at the x and y-axes in the figure, you see that it does not show the origin (when x or y = 0). The figure just shows the region where we have data.

    • @danielpouly4961
      @danielpouly4961 3 года назад

      I had the same remark, you can actually define the range of the axes when making the xy plot by passing 2 boundaries values vectors to the arguments xlim = and ylim = . plot(mouse.data$weight,mouse.data$size,xlim=c(0,8),ylim = c(0,8)).

  • @ahugescrub
    @ahugescrub 2 года назад

    Hi, just a question why is the intercept the y value 0.58, but when it is plotted on the graph it is just above 1

    • @statquest
      @statquest  2 года назад +1

      When you look at the x-axis on the graph, you'll see that it doesn't go all the way to 0, and this is why it appears like the y-axis intercept (on the graph) is > 1. However, if you replot the graph and include 0 on the x-axis, you'll see that the y-axis intercept is correct. Here's the code to include 0 on the x-axis...
      plot(mouse.data$weight, mouse.data$size, xlim=c(0,7), ylim=c(0,7))
      abline(mouse.regression, col="blue")

  • @Pirata251976
    @Pirata251976 5 years ago +1

    Dear Josh, is there any video on nonlinear regression on your channel?

    • @statquest
      @statquest  5 лет назад

      The closest thing I have to non-linear regression is fitting a Lowess curve: ruclips.net/video/Vf7oJ6z2LCc/видео.html

  • @BrunetteViking
    @BrunetteViking Год назад

    Could you also upload a video about linear regression in Python, sir?

  • @rokalinin
    @rokalinin 4 года назад +1

    Here is the code
    github.com/StatQuest/linear_regression_demo/blob/master/linear_regression_demo.R

    • @statquest
      @statquest  4 года назад

      That's right. I'm in the process of moving all the code to GitHub. I hope to be done by the end of the day.

  • @marcosj685
    @marcosj685 2 года назад +1

    I only come for the song

  • @hbmoller
    @hbmoller 4 года назад

    Do you have anything on interpreting OR and RRs from this output??

    • @statquest
      @statquest  4 года назад

      What do you mean by "OR and RR from this output?"

    • @hbmoller
      @hbmoller 4 years ago

      @@statquest thanks for replying! I meant odds ratio and risk ratio, but my tutor explained it to me! (I have to use the 'logit' and 'log' functions and then exp(), if that makes sense.) Complete stats/R newb here.

  • @tmsplltrs
    @tmsplltrs 3 года назад

    It's very confusing when you say 'we want to'. When you said 'we want the min value and the max value to have approximately the same distance from zero', I was assuming you meant that if they aren't the model is not reliable. But then you say 'we want the p-value to be

    • @statquest
      @statquest  3 years ago +1

      I'm sorry my phrasing is so confusing. For p-values, people generally, but not always, will only go through the trouble of collecting data if they suspect that their hypothesis is correct, so people often "want" to find significance, even if they are supposed to be impartial to the result. That being said, the state of the residuals does have a bearing on the significance, in that if the residuals are highly skewed, then it's a good sign that the results should not be trusted, regardless of the p-value.

  • @hugh.lawrence3003
    @hugh.lawrence3003 2 года назад

    Do you know where I can get a dataset that I can download for linear regression? I can't find any viable ones ANYWHERE. They're either longitudinal, split the CDV into categories, unable to be accessed, or no CDV in the whole set. I need to analyse a dataset for linear regression - with a good sample and like 10 variables

    • @statquest
      @statquest  2 года назад

      Check archive.ics.uci.edu/ml/index.php

  • @psiko_nini4362
    @psiko_nini4362 3 года назад

    hi there, if the regression line is straight horizontal, does it means that the data set is not suitable for the linear regression model?

    • @statquest
      @statquest  3 года назад

      Correct

    • @psiko_nini4362
      @psiko_nini4362 3 years ago +1

      @@statquest thank you so much! I'm new to R so I am quite confused about how to answer the statistics questions

    • @statquest
      @statquest  3 года назад

      @@psiko_nini4362 For more information on Regression, see: ruclips.net/video/nk2CQITm_eo/видео.html

    • @psiko_nini4362
      @psiko_nini4362 3 года назад +1

      @@statquest thank you!

  • @DSharma117
    @DSharma117 2 месяца назад +1

    Hello 👋

  • @munshir.c6161
    @munshir.c6161 2 years ago

    Hi, thank you.
    Could you please tell me how I can do regression through the origin?

    • @statquest
      @statquest  2 года назад

      Just set the intercept to 0.

    • @munshir.c6161
      @munshir.c6161 2 years ago

      @@statquest But how should I code this in R?

    • @statquest
      @statquest  2 years ago

      @@munshir.c6161 If you have two variables in your data, x1 and x2, then lm(y ~ 0 + x1 + x2, data)
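
To make the reply above concrete, here is a minimal sketch of fitting with and without an intercept. The data frame and its values are simulated purely for illustration (they are not from the video); only the `0 +` formula syntax comes from the reply.

```r
# Simulated data for illustration only (not from the video).
set.seed(42)
data <- data.frame(x1 = runif(50), x2 = runif(50))
data$y <- 2 * data$x1 - 3 * data$x2 + rnorm(50, sd = 0.1)

# Default fit: R adds an intercept automatically.
fit_with_intercept <- lm(y ~ x1 + x2, data = data)

# Through the origin: "0 +" (or, equivalently, "- 1") drops the intercept.
fit_through_origin <- lm(y ~ 0 + x1 + x2, data = data)

names(coef(fit_with_intercept))   # "(Intercept)" "x1" "x2"
names(coef(fit_through_origin))   # "x1" "x2"
```

One thing to keep in mind: when the intercept is removed, summary() computes R^2 against an uncentered baseline, so it is not directly comparable to the R^2 of the model with an intercept.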

    • @munshir.c6161
      @munshir.c6161 2 years ago

      @@statquest thank you :) let me try it

  • @mscit_08_omprakash40
    @mscit_08_omprakash40 3 years ago +1

    Error in model.frame.default(formula = p ~ t, data = pressure, drop.unused.levels = TRUE) :
    invalid type (NULL) for variable 't'
    help me

    • @statquest
      @statquest  3 years ago

      What time point, minute and seconds, in the video, are you asking about?

    • @mscit_08_omprakash40
      @mscit_08_omprakash40 3 years ago

      @@statquest That's my question to you: when I run it, I get this error. I'm just asking how to solve it.

    • @mscit_08_omprakash40
      @mscit_08_omprakash40 3 years ago

      @@statquest Why does this come up? Please tell me the reason.

    • @statquest
      @statquest  3 years ago +1

      @@mscit_08_omprakash40 To be clear, you are not asking about the video. Is that correct? Instead, you are asking about some code that you wrote on your own and you have an error?

    • @mscit_08_omprakash40
      @mscit_08_omprakash40 3 years ago

      @@statquest yes
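
The thread never shows the code behind this error, so the following is only a guess at one common cause: a name in the formula resolves to NULL instead of a column. The sketch assumes R's built-in `pressure` dataset (columns `temperature` and `pressure`); the variables `p` and `t` and the missing column `time` are hypothetical reconstructions, not the commenter's actual code.

```r
# Built-in dataset; its columns are "temperature" and "pressure".
names(pressure)

# Hypothetical reconstruction: extracting a column that does not exist
# returns NULL without any warning. (Note that naming a variable "t"
# also masks R's transpose function t().)
t <- pressure$time        # no such column, so t is NULL
p <- pressure$pressure

# lm() then fails because the variable 't' is NULL.
msg <- tryCatch(
  {
    lm(p ~ t, data = pressure)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg

# One possible fix: use the real column names in the formula.
fit <- lm(pressure ~ temperature, data = pressure)
coef(fit)
```

Checking names() on the data frame before writing the formula is usually the quickest way to spot this kind of mismatch.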

  • @ssg365
    @ssg365 2 years ago

    The abline() function is not working; it gives the error "plot.new has not been called yet". Can somebody help?

    • @statquest
      @statquest  2 years ago

      You need to draw a graph before you call abline().
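
A minimal sketch of the order of calls described in the reply above, using simulated data (the values are made up for illustration). The `pdf(NULL)` line is only there so the sketch also runs outside an interactive session.

```r
# Simulated data for illustration only.
set.seed(1)
x <- 1:20
y <- 3 + 0.5 * x + rnorm(20)

fit <- lm(y ~ x)

# Draw to a null PDF device so the sketch also runs non-interactively;
# in RStudio you can skip this line and the dev.off() call below.
pdf(NULL)

# Calling abline() at this point, before plot(), would fail with
# "plot.new has not been called yet".
plot(x, y)                 # 1. draw the graph first
abline(fit, col = "blue")  # 2. then add the fitted line to it

invisible(dev.off())
```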

  • @newday8074
    @newday8074 3 years ago

    I couldn't put the names in. I don't know where the problem is.

  • @fmetaller
    @fmetaller 6 years ago +1

    Please make more Python tutorials

    • @statquest
      @statquest  6 years ago

      Will do!

    • @fmetaller
      @fmetaller 6 years ago +1

    • @rajatbhosale8188
      @rajatbhosale8188 5 years ago

      Python is not that good, I feel.

    • @fmetaller
      @fmetaller 5 years ago

      RAJAT BHOSALE why do you think that?

    • @rajatbhosale8188
      @rajatbhosale8188 5 years ago

      Nah, I was just being sarcastic, bro... nothing to worry about. Python is good compared to R because coding in it is a little easier. Also, it is booming now.

  • @nguyennick3674
    @nguyennick3674 3 years ago

    How can I type "~" in RStudio? Can anyone help me?

    • @statquest
      @statquest  3 years ago +2

      On my keyboard it's a key in the upper left hand corner.

  • @itsasecret8841
    @itsasecret8841 3 years ago

    What if you have a HUGE amount of data?

    • @statquest
      @statquest  3 years ago

      If you have a HUGE amount of data, then that's awesome!

  • @bin4ry_d3struct0r
    @bin4ry_d3struct0r 1 year ago

    TIL we do not care about the p-value of the intercept.

  • @yeahyeah54
    @yeahyeah54 2 years ago

    How can you write the tilde?

    • @statquest
      @statquest  2 years ago

      It depends on your keyboard. For me, it's the key right below the "esc" key in the upper left hand corner.

    • @yeahyeah54
      @yeahyeah54 2 years ago

      @@statquest Thank you. I have another crucial question: where can I find some data to analyze?
      I would like to put this knowledge into practice.

    • @statquest
      @statquest  2 years ago

      @@yeahyeah54 Here's a place with all kinds of data: archive.ics.uci.edu/ml/index.php

    • @yeahyeah54
      @yeahyeah54 2 years ago +1

      @@statquest Another question: can I run a regression between time and another variable?
      For example, if I fit a linear regression between quarters and log(revenue) of Alphabet, I get a p-value of 2.2e-16 and an R^2 of 0.97. Does this mean anything?

    • @statquest
      @statquest  2 years ago

      @@yeahyeah54 Yes. If the slope is positive, it means that as time passes, Alphabet makes more money.
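
As a rough sketch of what such a fit looks like, here is a regression of log(revenue) on quarter number. The revenue figures are simulated with an assumed growth of about 5% per quarter; they are not Alphabet's real numbers, and the variable names are made up for illustration.

```r
# Simulated quarterly revenue growing about 5% per quarter (not real data).
set.seed(7)
quarter <- 1:40
revenue <- 100 * 1.05^quarter * exp(rnorm(40, sd = 0.05))

fit <- lm(log(revenue) ~ quarter)

slope <- coef(fit)[["quarter"]]
slope            # change in log(revenue) per quarter
exp(slope) - 1   # implied proportional growth per quarter

summary(fit)$r.squared
```

Because the response is on the log scale, a positive slope corresponds to a roughly constant percentage growth per quarter rather than a constant amount. A very high R^2 is common when a steadily trending series is regressed on time, so it is worth looking at the residuals as well.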

  • @minahilshafiq8374
    @minahilshafiq8374 1 year ago

    What the hell....!!! Paid🤨

    • @statquest
      @statquest  1 year ago

      I am really sorry! I don't know what is going on. I've contacted RUclips and have not heard anything back. This is breaking my heart because I never wanted this to happen, but somehow it is. I am sorry and doing everything I can to fix this.