Multiple Regression, Clearly Explained!!!

  • Published: 29 Sep 2024

Comments • 208

  • @statquest
    @statquest  4 года назад +36

    Correction:
    2:49 I left off some of the parentheses for the equation for F. The numerator should be: (SS(mean) - SS(fit))/(pfit - pmean)
    Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
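    For reference, putting this correction together with the denominator discussed further down in the comments, the full F statistic for comparing the fit to the mean is:
        F = [ (SS(mean) - SS(fit)) / (pfit - pmean) ] / [ SS(fit) / (n - pfit) ]
    where n is the number of data points.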

    • @25478648vicky
      @25478648vicky 3 года назад +1

      Hi Josh, I have a project on multiple linear regression. I have to come up with my own question using all the data I can find. Can you please help me?

    • @falaksingla6242
      @falaksingla6242 2 года назад

      Hi Josh,
      Love your content. It has helped me learn a lot & grow. You are doing awesome work. Please continue to do so.
      I wanted to support you, but unfortunately your PayPal link seems to be broken. Please update it.

  • @daphalvarez
    @daphalvarez 3 года назад +70

    Let me tell you this: I was crying in the fetal position 4 hours ago, until I saw this video. So THANK YOU for your brilliant channel! 🙏🏻👐🏻

    • @statquest
      @statquest  3 года назад +2

      Glad it was helpful! :)

    • @dhgcrack3r111
      @dhgcrack3r111 2 года назад +1

      I’m pretty close

    • @dhgcrack3r111
      @dhgcrack3r111 2 года назад +1

      But I think this is what I need. Maybe not all of it, but a lot.

  • @dannysnee4945
    @dannysnee4945 5 лет назад

    Useful video but that intro almost killed me with cringe

    • @statquest
      @statquest  5 лет назад +4

      Death by cringe is a terrible way to go. In the future, just skip the first 20 seconds or so and you’ll be spared.

  • @chelsypearo983
    @chelsypearo983 5 лет назад +164

    that intrO soNG SLAPS

  • @bens3884
    @bens3884 2 года назад +19

    First, you are an awesome, caring and friendly teacher and your way of teaching is very effective! Love your songs too! Second, would you consider covering historical uses of multiple linear regression, statistics and linear algebra? For instance, I read in my linear algebra textbook about how a Russian mathematician used linear algebra to try to help the Soviets' logistical war effort during WW2, and also how an American economist used linear algebra during the late 1940s. These are just two examples from linear algebra, but I would also hope there are interesting historical examples using multiple linear regression and statistics too. With your teaching style it would be very informative and inspire more people to get into math at a deeper level. Thanks either way, your efforts are much appreciated! :)

    • @statquest
      @statquest  2 года назад +9

      Wow! I'll keep that in mind. However, the current plan is to make a series of videos on neural networks.

  • @速战速决-v6q
    @速战速决-v6q 3 года назад +53

    I almost gave up on statistics until I saw your videos. Now I can confidently say I understand most concepts here.

  • @Privacy-LOST
    @Privacy-LOST 5 лет назад +12

    BAAAAAM. Comparing the simple & multiple regression efficiency through the difference of R². As simple as that. Love it. Thanks for making stats fun

  • @kautsarfadlyfirdaus1879
    @kautsarfadlyfirdaus1879 4 года назад +15

    Thank you for this well-explained video, it truly helped me understand multiple regression. BAM!

  • @yaminmiah2708
    @yaminmiah2708 Год назад +2

    It's telling me to pay to watch this video, but a week and a half ago I didn't have to pay. I'm confused, what's happened?

    • @statquest
      @statquest  Год назад +1

      I am really sorry! I don't know what is going on. I've contacted YouTube and have not heard anything back. This is breaking my heart because I never wanted this to happen, but somehow it is. I am sorry and doing everything I can to fix this.

  • @ellaluzpicavet
    @ellaluzpicavet 4 года назад +18

    Oh this was so helpful! If only there were a video about the assumptions of multiple regression and interaction effects; stats is slowly killing me

  • @shrinivas1086
    @shrinivas1086 Год назад +1

    Hi Josh, one request from me as a follower of yours...
    I searched so many videos to get clarification about "Multiple Regression" with "categorical data as an independent variable" (like gender), but wasn't satisfied with their explanations; either they explain using R or their data doesn't contain categorical data.
    Could you come up with one video that explains the concept numerically (like most of your videos do), not just in theory?
    I also watched your "ANOVA test" video for clarification, but you haven't shown further how to implement it in regression.
    Could you make one video that goes from data with different categories to implementation in regression?
    Please 🙏

    • @statquest
      @statquest  Год назад +1

      See: ruclips.net/video/CqLGvwi-5Pc/видео.html

  • @snackbob100
    @snackbob100 4 года назад +4

    Up there with Khan Academy and 3Blue1Brown. Best machine learning and stats videos on YouTube. Would love it if there were exam-style question and answer examples, like on Khan Academy, but these are still seriously good. Thank you very much for your videos

    • @statquest
      @statquest  4 года назад

      Thank you very much!!! :)

  • @SergeySenigov
    @SergeySenigov 10 месяцев назад +1

    Hi Josh! Is there a theoretical possibility for the "multiple regression" model to be worse than "simple" model, i.e. adjusted "multiple" R2 less than "simple" R2 ? Or "multiple model" p-value is greater than "simple" p-value?

    • @statquest
      @statquest  10 месяцев назад

      Good question. I believe the answer is "yes". So I would always be careful about what variables I added to a model.

  • @gr4707
    @gr4707 4 года назад +4

    Mixed effects model, please? Thank you!

  • @benbernanke4037
    @benbernanke4037 4 года назад +6

    This visualisation of multiple regression with the green 3D line that you are using is really nicely done; these kinds of visual aids really help us newbie students with the intuition of these concepts the first time we see them. I am using the Wooldridge textbook on econometrics and this video is better done in about a fifth of the time. Just thought you would like to know that your videos are greatly appreciated :)

    • @benbernanke4037
      @benbernanke4037 4 года назад +2

      oh I almost forgot, BAM

    • @statquest
      @statquest  4 года назад +2

      Thanks! I'm glad you like my videos. :)

  • @PunmasterSTP
    @PunmasterSTP 6 месяцев назад +1

    I'm not sure I ever thought too much about comparing a simpler model to a more complex one like that! By the way, how are things going in Chapel Hill these days?

    • @statquest
      @statquest  6 месяцев назад +1

      Great! Weather is perfect today.

    • @PunmasterSTP
      @PunmasterSTP 6 месяцев назад

      @@statquest I'm glad to hear that! Are you currently teaching classes with stats and/or machine learning?

    • @statquest
      @statquest  6 месяцев назад +1

      @@PunmasterSTP I've never taught a class before and stopped working at UNC 4 years ago to do youtube full time.

    • @PunmasterSTP
      @PunmasterSTP 6 месяцев назад +1

      @@statquest That sounds cool, and I imagine you reach way more people on YouTube!

  • @aayushjariwala6256
    @aayushjariwala6256 2 года назад +1

    what is the value of n? (3:31) (number of data points?)

    • @statquest
      @statquest  2 года назад

      It is the number of data points.

  • @MeatballQQ
    @MeatballQQ 4 года назад +4

    Hi Josh,
    Thank you for all of your videos so far. It's truly a great help.
    I would like to know if you have made any videos on multiple regression with CATEGORICAL independent variables? I am a bit confused about how to interpret the coefficients of such a regression. Or maybe I could just give you a simple example and you could give me a brief explanation.
    Thank you very much again.
    Quynh.

    • @statquest
      @statquest  4 года назад +3

      The next video in this series, Part 2, shows multiple regression with a categorical independent variable. A specific case of this is ANOVA, and that's what I demonstrate: ruclips.net/video/NF5_btOaCig/видео.html Once you understand that, you should check out the Part 3, Design Matrices: ruclips.net/video/CqLGvwi-5Pc/видео.html , and Part 4, Design Matrix examples in R: ruclips.net/video/Hrr2anyK_5s/видео.html

  • @reytns1
    @reytns1 6 лет назад +3

    Cool. Could you clearly explain how stepwise regression works? I see it in several bioinformatics software packages, and also ridge regression and the lasso? Thanks

    • @reytns1
      @reytns1 6 лет назад

      Since you're talking about R2, I was also wondering: what about AIC (Akaike information criterion) and BIC (Bayesian information criterion)? Are they related?

  • @georgewang7770
    @georgewang7770 5 лет назад +3

    Hi. Awesome video.
    Can you do one on multivariate analysis, ANOVA, covariates please? I had a hard time in uni understanding them. Thanks

  • @Kaaaaaaaam
    @Kaaaaaaaam 6 лет назад +3

    This is great. Can you expand on this by explaining why and how Multiple Regression, Hierarchical, and Step-wise Regression are different? I more or less know when to use one over the other, but your videos are really helping visualize and understand these principles.

  • @ΚτηνιατρικήΜονάδαΥγείας

    How do you formally present the results of a multiple regression model analysis in reports? I have read that you report the R2 value, the coefficients and the ANOVA results for the model, but can you elaborate or give an example of a model with both numeric and factor independent variables?

    • @statquest
      @statquest  3 года назад

      The answer to this varies, depending on the field. Some want more data than others. I would report the R^2, coefficients and their p-values. For details on how to do this, see: ruclips.net/video/hokALdIst8k/видео.html

  • @shashankbhosagi
    @shashankbhosagi 2 года назад +1

    Oh my god!!!!
    Statistics seems too easy now after watching your videos 😲😲
    I searched the whole internet for videos that explain concepts in a simple and fun way......
    Finally the YouTube algorithm, after seeing my struggle 🤣, suggested your video
    Thanks a lot
    Now I'll watch the whole StatQuest playlist 🤪😎😎😎🤪
    (I only need this part for my exams!!! 😁)
    Once again, thank you 🙌🔥🙌🙌🙌🙌😝

    • @statquest
      @statquest  2 года назад +1

      Hooray! I'm glad the video was helpful! :)

  • @nathannguyen2041
    @nathannguyen2041 4 года назад +2

    Do you plan on covering polynomial regression?

    • @statquest
      @statquest  4 года назад +3

      In my mind, Polynomial Regression is just a special case of Multiple Regression. For Polynomial Regression we square or cube (or whatever) the values of the independent variables prior to putting them into the Multiple Regression model.
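      A minimal R sketch of that idea, using made-up data (the variable names here are just for illustration):
        set.seed(1)
        x <- runif(50, 0, 10)                      # made-up single predictor
        y <- 2 + 1.5 * x - 0.3 * x^2 + rnorm(50)   # made-up curved relationship
        poly.fit <- lm(y ~ x + I(x^2))             # "polynomial" regression = multiple regression on x and x^2
        summary(poly.fit)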

  • @suri_youtu2463
    @suri_youtu2463 5 лет назад +1

    How do we compare the Simple Regression fit to the Multiple Regression fit (F-statistic) using software? The comparison to the mean is done automatically by, say, Excel or R, but how would we do this comparison for Simple Regression vs Multiple Regression?

    • @statquest
      @statquest  5 лет назад +1

      In R, if you have two models, "simple" and "fancy", you can compare them with anova(simple, fancy).
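      A minimal sketch of that comparison in R, assuming made-up mouse data (the data frame and column names are hypothetical):
        set.seed(1)
        mouse.data <- data.frame(weight = rnorm(50, 25, 3), tail.length = rnorm(50, 7, 1))
        mouse.data$body.length <- 0.5 * mouse.data$weight + 0.8 * mouse.data$tail.length + rnorm(50)
        simple <- lm(body.length ~ weight, data = mouse.data)                 # simple regression
        fancy  <- lm(body.length ~ weight + tail.length, data = mouse.data)   # multiple regression
        anova(simple, fancy)   # F test: does adding tail.length significantly improve the fit?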

  • @TheForRealTacoKing
    @TheForRealTacoKing 3 года назад +1

    Bam! Peanut Butter and JAAAAAAAM!

  • @user-me9nj6dj8t
    @user-me9nj6dj8t 4 года назад +3

    3:17 Does the "n" in the denominator mean the number of data points? And why does the SS(fit) in the denominator have to be divided by that value? Thanks

    • @statquest
      @statquest  4 года назад +10

      Yes, "n" is the number of observations in your dataset. And if you look at the equation, if 'n' is very large, meaning you have a lot of data, then the denominator will be very small, resulting in a large value for "F". This large value for "F", in turn, will correspond to a relatively small p-value (the larger the value for F, the smaller the corresponding p-value). Thus, the more data you have, the smaller the p-value and the more confidence you have in the "fit" being significant and not just the result of random chance. Does that make sense?

    • @manojharshavardhan2385
      @manojharshavardhan2385 4 года назад

      @@statquest yes thanks for the explanation

    • @statquest
      @statquest  3 года назад +1

      @Sam Dillard Among other things, this compensates for the fact that adding more variables (and parameters) to a model will improve the fit, even if those variables are not very helpful. In other words, if we add a lot of random variables to a model, some of them, by random chance, will correlate with the dependent variable (the thing we are predicting) and improve the fit. Thus, one of the reasons we subtract p-fit from 'n' is to compensate for this.

    • @statquest
      @statquest  3 года назад +1

      @Sam Dillard You're on the right track. If 'n' is huge relative to p-fit, then it will not make a big difference - and this is desired because it means we have tons of data relative to the number of parameters and thus, the data will dominate the result (and the best way to avoid overfitting your model is to have tons of data). However, for smaller 'n' ('n' closer to p-fit), we don't really have tons of data anymore and we need to worry about overfitting, and that's when subtracting p-fit from 'n' will make a difference.

    • @iamkrishn
      @iamkrishn 3 года назад

      This thread is as crucial as the video. Thanks for the doubt and thanks Josh for answering every single question! You're an awesome guy! :)

  • @kwtan3814
    @kwtan3814 4 года назад +1

    Can you kindly clarify that, by SS(Simple) you are referring to the SS(Fit) of the simple model, and SS(multiple) is the SS(Fit) of the multiple-regression model?

    • @statquest
      @statquest  4 года назад +1

      That is correct and is explained at 4:11

  • @TheAugustinePark
    @TheAugustinePark 4 года назад +1

    At around 4:20 of the video, you introduce this new F equation where we eliminate SS(mean) from the equation to compare the simple and multiple regressions directly to each other. At 4:50 of the video, you mention the difference in R^2 values between the simple and multiple regressions. The equation for R^2 = (SS(mean)-SS(fit))/SS(mean). Do we also have a new R^2 equation when comparing the simple and multiple regressions directly to each other? Or are we calculating 2 separate R^2 values and comparing them to each other? Thank you!

    • @statquest
      @statquest  4 года назад +2

      We are calculating 2 separate R^2 values and comparing them.
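      For example, in R each model's R^2 comes straight out of summary(); a quick sketch with simulated data (all names here are made up):
        set.seed(1)
        d <- data.frame(weight = rnorm(40), tail = rnorm(40))
        d$length <- d$weight + 0.5 * d$tail + rnorm(40)
        r2.simple   <- summary(lm(length ~ weight, data = d))$r.squared
        r2.multiple <- summary(lm(length ~ weight + tail, data = d))$r.squared
        r2.multiple - r2.simple   # the difference in R^2 discussed in the video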

  • @bashiransari6258
    @bashiransari6258 10 месяцев назад +1

    bammmmm !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

  • @blythe.11
    @blythe.11 2 года назад +1

    most instructors and resources always separate simple linear regression from multiple linear regression; not gonna lie, that just makes everything more complicated!! so I super appreciate how you put all of these concepts in the simplest way to digest, and they are really not a big deal LOL

  • @HitAndMissLab
    @HitAndMissLab 2 года назад

    stupid jokes and tone-deaf singing. I'm sure your girlfriend is cringing each time you start singing, or for that matter, start telling jokes 😞. Or if you don't have GF you might start looking for improvement in these areas.
    but great explanations.

  • @harishnagpal21
    @harishnagpal21 6 лет назад +8

    Your voice is superb and mesmerizing......

  • @MrDamrad
    @MrDamrad 10 месяцев назад

    I feel so sad when that BAAM!!! doesn't work for me...
    P.S. Just tired, can't concentrate. But usually it works just as it takes!

    • @statquest
      @statquest  10 месяцев назад

      Ok! I hope you can get some rest.

  • @saidisha6199
    @saidisha6199 8 месяцев назад

    If the difference in R2 values between simple and multiple regression is large and the p-value is small, why is adding tail length worth the trouble? Sorry, I didn't understand this part clearly.

    • @statquest
      @statquest  8 месяцев назад

      What time point, minutes and seconds, are you asking about?

  • @sku5088
    @sku5088 7 лет назад +2

    Can you do multivariate regression next? (multiple outcome variables)

  • @Alchemist10241
    @Alchemist10241 Год назад +1

    ooh yeah Statquest

  • @alecvan7143
    @alecvan7143 4 года назад +3

    BAM!

  • @tvvt005
    @tvvt005 Месяц назад

    3:36 if we have more than 3 variables, how exactly would we fit a higher dimension on a graph? Is graphing relevant here?

    • @statquest
      @statquest  Месяц назад

      When you have more variables, you can no longer draw the data and the graph, but the math still works out the same.

  • @yellow7023
    @yellow7023 2 года назад +1

    Nice intro! HAHA!

  • @akshayalawani2258
    @akshayalawani2258 2 года назад +1

    This channel is 🥰

  • @recisuser
    @recisuser 5 лет назад +1

    Dr. Starmer, is there a term for the F value between simple and multiple regressions (the thing you said helps one determine whether collecting data for additional variables is meaningful) that distinguishes it from just plain F between linear or multiple regressions and the mean regression? I determined it for my data and would like to know how to refer to it.

    • @statquest
      @statquest  5 лет назад +1

      To be honest, I'm absolutely horrible with terminology. I have no idea what it's called. Maybe "the nested F-statistic"? I'm not sure.

  • @873035035
    @873035035 Год назад +1

    that was awesome!

  • @jayatinaik8689
    @jayatinaik8689 2 года назад +1

    Hello Josh, thank you for the wonderful video series. I was wondering if the formula for F for simple vs multiple is correct. I believe the denominator should have SS(simple) instead. Please confirm.

    • @statquest
      @statquest  2 года назад +1

      The video is correct. The denominator has SS(multiple).
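      In that notation, the F statistic for comparing the simple and multiple regressions has the form:
          F = [ (SS(simple) - SS(multiple)) / (p(multiple) - p(simple)) ] / [ SS(multiple) / (n - p(multiple)) ]
      which is why SS(multiple) sits in the denominator.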

  • @luckyentropy
    @luckyentropy 5 лет назад +1

    What's the 'n' of the 'n-p(multiple)' in the denominator?

  • @tamirrozenfeld3572
    @tamirrozenfeld3572 2 года назад +1

    Thank You !

  • @couragee1
    @couragee1 2 года назад +1

    thank you!

  • @adammess
    @adammess 2 года назад +1

    I love you

  • @lin1450
    @lin1450 3 года назад +1

    I know you covered the formula for F already in the linear regression video, but would you mind doing one more just on this formula? Sadly it's still not intuitive for me :/
    PS: Thanks for all your videos! They are a great help!

    • @statquest
      @statquest  3 года назад +1

      I'll keep that in mind.

    • @anirbanaws143
      @anirbanaws143 3 года назад

      @@statquest Same with me Joshua. Thank you in advance. :)

  • @dinajose7co
    @dinajose7co 2 года назад +1

    Thanks

    • @statquest
      @statquest  2 года назад +1

      HOORAY! Thank you so much for supporting StatQuest! It means a lot to me that you care enough to contribute. BAM! :)

  • @diegobuenovillafane869
    @diegobuenovillafane869 5 лет назад +1

    Excelent video!!!! Can you make one about beta regressions?

  • @xXMaDGaMeR
    @xXMaDGaMeR 2 года назад +1

    legend

  • @sudhanshusingh7924
    @sudhanshusingh7924 3 года назад +1

    Bam!!

  • @wanderingseeker2932
    @wanderingseeker2932 2 года назад +1

    Bam!

  • @qianhuijin6971
    @qianhuijin6971 2 года назад +1

    Hey Josh! Great videos!! THANKS!!

  • @lelamakharadze9411
    @lelamakharadze9411 3 года назад +1

    Hey Josh ! thanks for one more clearly explained concept ! Just one question: maybe it's too obvious but I don't really get how comparing multiple to the simple regression would help me avoid collecting data for the extra variable if I have already collected it for testing the model. Or is it possible that we measure some number of samples for tail length for rough estimation and decide if it's worth further doing so for a bigger sample size?

    • @statquest
      @statquest  3 года назад +2

      We can use the full dataset for the original models and testing of models. Once we decide which model and variables we want to use, future predictions can be made by only collecting the limited set of variables that we need.

  • @varunsagar7522
    @varunsagar7522 2 года назад

    Suppose the relationship between three variables Y, X1, and X2 is estimated using the multiple linear regression as Y = 2.2 X1 + 3.1 X2 + 4.1. Given this information, which of the following statements is true?
    Answer choices
    Select only one option
    For every unit change in X1, Y changes by 2.2 units
    For every unit change in X2, Y changes by 3.1 units keeping X1 constant
    For every unit change in X1, Y changes by 3.1 units
    For every unit change in X2, Y changes by 3.1 units

  • @nicholasheimpel5998
    @nicholasheimpel5998 3 года назад +1

    Please do a statquest on dummy variables!

    • @statquest
      @statquest  3 года назад +1

      The next two videos in the series may answer your questions about dummy variables: ruclips.net/video/NF5_btOaCig/видео.html and ruclips.net/video/CqLGvwi-5Pc/видео.html

  • @Kig_Ama
    @Kig_Ama 2 года назад

    Can you make some examples with Python instead of R, since Python is becoming more and more popular?

    • @statquest
      @statquest  2 года назад +1

      Yes! That is the plan.

    • @Kig_Ama
      @Kig_Ama 2 года назад +1

      @@statquest Great!

  • @spp626
    @spp626 Год назад

    Hello Josh, I have a doubt about hypothesis testing. Since we have to comment on the population on the basis of a sample only, we can only reject H0 if we get a significant difference. But if we don't, it doesn't mean that there is no significant difference in the variables of the population under consideration. Then why do people (including books) say we accept H0 when a difference is not found in the sample???

    • @statquest
      @statquest  Год назад +1

      You should never "accept the null" because it could be that we just don't have enough data to properly reject it.

    • @spp626
      @spp626 Год назад

      @@statquest Yes exactly, that's why we must say "failed to reject the null", right? Thank you so much Josh. Thanks a lot!!!

    • @statquest
      @statquest  Год назад

      @@spp626 Correct.

  • @danielnagy6360
    @danielnagy6360 4 года назад

    but WHY? No one ever explains the geometrical meaning of the equations... :(

  • @feroci-tay5708
    @feroci-tay5708 3 года назад +1

    This was explained pretty well. Thanks.

  • @sugarmoon795
    @sugarmoon795 4 года назад +1

    Josh! Building a model to evaluate PV system performance and came across your videos. Love it!

  • @ericamoveit
    @ericamoveit Год назад +1

    cutest yt channel intro eveeerrr

  • @brunomartel4639
    @brunomartel4639 4 года назад

    Pro tip: If you like each video you watch, you will not re-watch videos.

  • @Emma-tk9fr
    @Emma-tk9fr 5 лет назад +1

    Amazingly helpful video, could you do one on GEE's?

  • @shripujasiddamsetty
    @shripujasiddamsetty 4 года назад

    I saved the video. But didn't watch it. Got it as a 15-mark answer.

  • @silentsuicide4544
    @silentsuicide4544 2 года назад

    For the F value of multiple regression, in the numerator we have SS(simple), and the thing that got me thinking is: for which feature is it supposed to be calculated? The path taken in the video is clear: we first collected points with weight and length and then considered adding a new feature to our model. But what if I got the data from someone with two features (weight and tail length) and one target (length), how do I calculate the F value now? I am able to compute two values of SS(simple), one for the pair (weight, length) and one for the pair (tail length, length); should I calculate both F values? What if I have N features and one target? Now I can calculate N SS(simple) values and thus N F values. And what exactly does "simple" mean when I have, for example, N features and my linear model has N+1 parameters (including the intercept)? Does "simple" mean two parameters, or (N+1) - 1?

    • @statquest
      @statquest  2 года назад +1

      In theory, simple can refer to any "simpler" model, or any model that contains a subset of the features found in the full model. So it really depends on what you are interested in testing. You don't have to do every single test possible, just the ones you are interested in. The standard tests are... 1) Compare the fancy model to only using the mean y-axis value to make predictions 2) comparing the fancy model to all the models that are missing one of the features. For details, see: ruclips.net/video/hokALdIst8k/видео.html
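      A small R sketch of those two standard tests, using simulated data (all names here are made up):
        set.seed(1)
        dat <- data.frame(x1 = rnorm(60), x2 = rnorm(60), x3 = rnorm(60))
        dat$y <- dat$x1 + 0.5 * dat$x2 + rnorm(60)
        full <- lm(y ~ x1 + x2 + x3, data = dat)
        anova(lm(y ~ 1, data = dat), full)   # 1) full model vs. just using the mean of y
        drop1(full, test = "F")              # 2) full model vs. each model missing one feature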

  • @libertychiduke6608
    @libertychiduke6608 Год назад +1

    Bamm!!

  • @infinitewarr1or699
    @infinitewarr1or699 4 года назад +2

    stat quest!!! yay!!!

  • @SunSan1989
    @SunSan1989 Год назад

    Dear Josh, Thank you very much for your video, which benefited me a lot. Could you do a tutorial on polynomial regression? If multiple linear regression speaks of direct effects from different explanatory variables to explain SSM, can polynomial regression be understood as considering not only direct effects but also cross-effects? Looking forward to your reply

    • @statquest
      @statquest  Год назад

      Polynomial regression is just like multiple regression. en.wikipedia.org/wiki/Polynomial_regression

  • @_N0_0ne
    @_N0_0ne 2 года назад +1

    Thank you kindly ✍️

  • @soumyopattnaik6787
    @soumyopattnaik6787 4 года назад

    How can we do a multicollinearity check when you have ordinal variables in the data?

  • @Han-ve8uh
    @Han-ve8uh 3 года назад

    At 4:50, there's the statement on analysing the difference in R-sq, and the p-value, to determine if adding a new feature is worth the trouble. I'm wondering how this interpretation can be done properly. If there are 3 predictors x1, x2, x3, does the order of adding them affect the R-sq and p-value? Like if added in the order x1,x2,x3 vs x1,x3 vs x2,x3, these are 3 different scenarios where x3 is added, which should give 3 different incremental R-sq and p-values? How do we properly interpret whether x3 is helpful for predicting y if the set of currently included predictors affects the result of adding a new predictor? Or more generally, why do people use stepwise regression for feature selection when it is only locally optimal and a different order of including variables affects the results of interest?

    • @statquest
      @statquest  3 года назад +1

      One possible solution is to use Lasso Regression or Elastic Net Regression to select which variables go in the final model. For details, see Ridge: ruclips.net/video/Q81RR3yKn30/видео.html Lasso: ruclips.net/video/NGf0voTMlcs/видео.html Ridge vs Lasso: ruclips.net/video/Xm2C_gTAl8c/видео.html and Elastic-Net: ruclips.net/video/1dKRdX9bfIo/видео.html
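      A small sketch of that variable-selection idea with the glmnet R package (simulated data; the names are made up):
        library(glmnet)
        set.seed(1)
        X <- matrix(rnorm(100 * 5), ncol = 5)     # 5 candidate predictors
        y <- X[, 1] + 0.5 * X[, 2] + rnorm(100)   # only the first two actually matter
        cv.lasso <- cv.glmnet(X, y, alpha = 1)    # alpha = 1 gives the Lasso penalty
        coef(cv.lasso, s = "lambda.min")          # predictors with non-zero coefficients survive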

  • @velvetguitar95
    @velvetguitar95 2 года назад

    What does the fit mean and where does it come from?

    • @statquest
      @statquest  2 года назад

      This is described in this video: ruclips.net/video/nk2CQITm_eo/видео.html

  • @ArnabJoardar
    @ArnabJoardar 3 года назад

    Hi Josh,
    At the 4:52 mark, when you say that it is worth the trouble of including the extra parameter into your model, wasn't that already done when I created the multi-regression best-fit equation? Seems kind of a moot point, wouldn't you agree? All it is saying is that: 'replace your old simple regression model with this new multi-regression model'.

    • @statquest
      @statquest  3 года назад +2

      However, you should only do that if the increase in R^2 is large enough. For example, if we have a big fancy model that has a bunch of variables, but gathering the data is very expensive compared to a simple model, then, if the simple model works almost as well as the fancy model, we might opt to just continue using the simple model (and no longer bother spending all of the money to continue to collect the extra data).

  • @meghanareddy6035
    @meghanareddy6035 5 лет назад +1

    Nice video

  • @hang1445
    @hang1445 3 года назад

    But that means we still need to obtain some data on tail length beforehand in order to calculate SS(multiple).
    So would the following be more precise: to know whether it is worth spending *more* time or effort to collect *more* data?

  • @dvijeniya
    @dvijeniya 5 лет назад +1

    thanks for the super easy explanation. I appreciate your support.

  • @christopheranderson1968
    @christopheranderson1968 3 года назад

    Didn't understand any of this, to be perfectly honest.

    • @statquest
      @statquest  3 года назад

      Did you watch the first part first? (this is the 2nd part) ruclips.net/video/nk2CQITm_eo/видео.html

    • @christopheranderson1968
      @christopheranderson1968 3 года назад +1

      @@statquest Thanks for the link.

  • @jay-kh6om
    @jay-kh6om 4 года назад

    How would you find the SS(mean) value when there are multiple variables? Are we supposed to add the SS(mean) values for each independent variable together?

    • @statquest
      @statquest  3 года назад +1

      In 2-D, the line y = mean(y-axis value) is a horizontal line at the mean of the y-axis values, and we do not need to specify the x-axis values. Likewise, we can specify a plane (or hyper-plane) with y = mean(y-axis value) without specifying the other variables.

  • @huesOfEverything
    @huesOfEverything 3 года назад

    how would R^2 indicate if collecting data on a new feature is going to help if we have not already collected data on the feature to calculate R^2?

    • @statquest
      @statquest  3 года назад

      This could just be a pilot study, done before a larger, more expensive study. In this case, the results from the pilot study could inform how we carry out the larger study.

  • @Jcroffe
    @Jcroffe 3 года назад +1

    funny intro!!! lol

  • @sumayyakamal8857
    @sumayyakamal8857 3 года назад

    Hey! Why do we have a single intercept in Linear Regression for multiple features?

    • @statquest
      @statquest  3 года назад +1

      Because there is a single thing we are trying to predict.

  • @ve2848
    @ve2848 2 года назад

    Do you know how to find the intercept b?

    • @statquest
      @statquest  2 года назад

      There are two ways to solve this problem. There is actually an analytical solution, meaning there's an equation that you can plug your data in and out comes a solution. However, I prefer using Gradient Descent, because it can be used in a much wider variety of situations. To learn more about Gradient Descent, see: ruclips.net/video/sDv4f4s2SB8/видео.html
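      A tiny R sketch of the analytical route, with made-up numbers:
        x <- c(1, 2, 3, 4, 5)
        y <- c(2.1, 3.9, 6.2, 8.1, 9.8)
        coef(lm(y ~ x))                    # R's built-in least-squares fit: intercept b and slope
        X <- cbind(1, x)                   # design matrix; the column of 1s corresponds to the intercept
        solve(t(X) %*% X) %*% t(X) %*% y   # normal equations (X'X)^-1 X'y give the same estimates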

  • @akashprabhakar6353
    @akashprabhakar6353 4 года назад

    Thanks for these awesome videos... I have a small doubt... Why is SS(mean) found for the tail length and not mouse weight?

    • @statquest
      @statquest  4 года назад

      Because we are trying to predict length, not weight.

  • @denisneznanow2571
    @denisneznanow2571 3 года назад +1

    Great one

  • @vickykumawat766
    @vickykumawat766 2 года назад

    I really like your videos.
    Can you make a full series on time series analysis?

    • @statquest
      @statquest  2 года назад

      I hope to do that one day.

  • @lifeisarace4968
    @lifeisarace4968 5 лет назад

    Can someone explain what R2 and P-value mean? I have seen the videos but can't figure out how they are different

    • @statquest
      @statquest  5 лет назад +1

      To understand what R^2 and P-values are, you should start with simple linear regression: ruclips.net/video/nk2CQITm_eo/видео.html

  • @leonardostocchero1092
    @leonardostocchero1092 Год назад

    I didn't need to see the video to subscribe, you got me with that intro!

  • @seanm2818
    @seanm2818 3 года назад

    Despite the annoying intro, this video was very helpful.

    • @statquest
      @statquest  3 года назад +1

      You can just skip the first 30 seconds of all my videos.

  • @chnlyi
    @chnlyi 4 года назад

    Shouldn’t F be (SS(mean) - SS(fit))/(pfit - pmean) on top?

    • @statquest
      @statquest  4 года назад

      Yes, I was sloppy with the parentheses.

  • @abhineetram2197
    @abhineetram2197 3 года назад

    Hi Josh! What do you call the process of comparing the R^2 of the simple to multi? And how is it different than simply comparing cross validated R^2 of simple vs multi?

    • @abhineetram2197
      @abhineetram2197 3 года назад

      Oh, I see you answered my first question already, but my second question still stands!

    • @statquest
      @statquest  3 года назад

      Presumably you just average the values across the different folds, but that is just a guess.

  • @kittytangsze
    @kittytangsze 3 года назад

    nice video. will you talk about non-linear models?

    • @statquest
      @statquest  3 года назад +1

      I mention them in this video: ruclips.net/video/Vf7oJ6z2LCc/видео.html and in most of my machine learning videos. Tree based methods, support vector machines and Neural Networks are all non-linear. Here's a list of all of my videos: statquest.org/video-index/

  • @devendermudgal3575
    @devendermudgal3575 4 года назад

    Sir, please provide a lecture on multivariate analysis.

    • @statquest
      @statquest  4 года назад

      I'll keep that in mind.

  • @morytep
    @morytep 3 года назад

    can you please do a series on multivariate data analysis

    • @statquest
      @statquest  3 года назад

      Can you give me ideas for topics other than this one?

  • @begris
    @begris 2 года назад

    what is r^2?

    • @statquest
      @statquest  2 года назад

      To learn more about R^2, see: ruclips.net/video/2AQKmw14mHM/видео.html and ruclips.net/video/nk2CQITm_eo/видео.html

  • @gibriljallow1780
    @gibriljallow1780 4 года назад

    Lol, thought you were singing 😁

  • @bendraven76
    @bendraven76 4 года назад

    This video DID NOT clearly explain multiple regression. You keep referring to another video that I didn't see so you lost me when you kept referring to it. Next time, just come up with something else. Keep it simple. You need to learn pedagogy because I'm sure there were others who were lost. SMH...

    • @statquest
      @statquest  4 года назад

      Here are the links to the other videos:
      ruclips.net/video/PaFPbb66DxQ/видео.html
      ruclips.net/video/nk2CQITm_eo/видео.html