Pearson's Correlation, Clearly Explained!!!

  • Published: Dec 21, 2024

Comments • 621

  • @statquest
    @statquest  5 years ago +104

    NOTE: Although I do not mention it by name in the video, this StatQuest covers Pearson's Correlation Coefficient. Unfortunately, this did not occur to me until after I posted the video, otherwise I would have mentioned it at least 20 times...so maybe it's better the way it turned out. ;)
    Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/

    • @sunilkumarsamji8507
      @sunilkumarsamji8507 4 years ago +2

      Hi Josh, thanks a lot for the wonderful work. It helps learners a lot. My query: at 9:08, it is mentioned that p = 2.2 * 10^-16 means a low probability that a randomly selected point has a similarly strong relationship. Does that mean the hypothesis or prediction (the line through the data points) cannot generalize to a new, randomly selected data point? Is that what a low p means? At the same time, a low p means a high confidence level in the trend. Does that high confidence level imply that a randomly selected point will have a similarly strong relationship? Please let me know if I am missing some point.

    • @statquest
      @statquest  4 years ago +29

      @@sunilkumarsamji8507 No. The p-value tells us the probability that random noise could create the relationship we observed, or a stronger relationship. When you have a small p-value, that means the probability that the relationship we observed is due to noise is small. This means we can have confidence that new observations will behave similarly to what we have seen before, rather than completely randomly. Does that make sense?
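One way to make "the probability that random noise could create the relationship we observed, or a stronger relationship" concrete is a permutation test. This is only a sketch of that idea, not necessarily the method behind the p-values shown in the video: it shuffles the y-values to destroy any real pairing with x, then counts how often the shuffled data produce a correlation at least as strong as the observed one. The helper names and the data are hypothetical.

```python
import random


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Fraction of shuffles whose |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real pairing between x and y
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    return hits / n_shuffles


# Hypothetical data: a strong upward trend with a small alternating wobble.
xs = list(range(1, 21))
ys = [2 * x + 3 * (-1) ** x for x in xs]
print(pearson_r(xs, ys), permutation_p_value(xs, ys))
```

A small result means shuffled (pure noise) pairings almost never look as strongly related as the real data did.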

    • @inderjeetsinghintech
      @inderjeetsinghintech 4 years ago +1

      @@statquest yes, that is the reason we keep the threshold to only 5% or 0.05.

    • @lorisbach9905
      @lorisbach9905 3 years ago

      @@statquest Thanks for your amazing videos. I am watching them all to try to catch up in statistics for my master's degree in geology.
      In this video, I am unsure how you calculated the p-values. Can you please explain a little?

    • @statquest
      @statquest  3 years ago +4

      @@lorisbach9905 Unfortunately, I don't have a video that explains the p-values for Pearson's correlation coefficient in detail. However, I do have a video that explains the p-value for R-squared, which is very, very closely related (and is actually much more useful) here: ruclips.net/video/nk2CQITm_eo/видео.html

  • @SumitOli007
    @SumitOli007 3 years ago +154

    I am crying rn, Statistics was the one thing that scared me in high school, never studied it in engineering & after watching tons of videos & losing hope. I finally found your channel.
    I am finally understanding bits and bytes of statistics & I owe everything to this beautiful pedagogy
    Infinite BAM

    • @statquest
      @statquest  3 years ago +7

      Hooray! I'm glad my videos are helpful. :)

  • @alberttamazyan
    @alberttamazyan 5 years ago +41

    I am so thankful to you!!! I tried learning statistics multiple times in my life and never succeeded with any source. I discovered your stat quests about a week ago and I already feel so comfortable with many concepts in statistics! Huge thanks.

    • @statquest
      @statquest  5 years ago +1

      That's awesome! I'm glad the videos are helpful. :)

  • @bigorange6328
    @bigorange6328 5 years ago +210

    My days spent on statistics before knowing statquest were so wasted

  • @bartgacrama2295
    @bartgacrama2295 5 years ago +156

    You are a genius in pedagogy.

    • @statquest
      @statquest  5 years ago +5

      Thank you! :)

    • @anitapallenberg80
      @anitapallenberg80 5 years ago +5

      100 % agree! I love StatQuest with Josh Starmer!! ♥

    • @vikranttyagiRN
      @vikranttyagiRN 4 years ago +1

      @@anitapallenberg80 Me too. A lot.

    • @Cat99
      @Cat99 3 years ago

      Simple, easy to understand.

  • @mathavraj9662
    @mathavraj9662 4 years ago +9

    As soon as I started the video, the differences between r-square, covariance and correlation were lingering in my mind. Glad you cleared them all!!

    • @statquest
      @statquest  4 years ago +1

      Glad it was helpful!

  • @TheCJD89
    @TheCJD89 5 years ago +11

    I just watched the Covariance and Correlation videos back to back. Very well put together and really easy to follow

  • @allthingsconsdrble
    @allthingsconsdrble 11 months ago +4

    Very much appreciate the crawl, walk, run approach with emphasis on conceptual understanding

  • @CamilaMachadodeAraújo
    @CamilaMachadodeAraújo 1 year ago +1

    Ohhh man!!! I'm instantly falling in love with this channel, definitely the best sense of humor to learn machine learning.

  • @Alchemist10241
    @Alchemist10241 1 year ago +1

    I've added this channel's videos to my Anki cards, and every time I review them I get even deeper insights. Well done, StatQuest

  • @arnabchanda2609
    @arnabchanda2609 4 years ago +3

    All My Life I have been looking out for you, glad that I found you... BAM!!!

  • @BatkhuuByambajav
    @BatkhuuByambajav 3 years ago +2

    Bam Bam BAM...
    Eventually, I've fallen in love with your BAMs :)
    Addictive BAMs and gorgeously simple videos!
    Thanks a lot!

  • @blakebodycote1024
    @blakebodycote1024 1 year ago +2

    I have yet to get into most of these concepts in my statistics major, but I am so thankful to have these bite-sized informational videos with lots of visual explanations to explain each concept so I can start practicing and studying machine learning early. Thank you so much for every single video you put out. Truly a blessing.

  • @2002budokan
    @2002budokan 4 years ago +1

    I find the best and non-boring stats explanations in this channel.

  • @chocolatemodelsofficial5859
    @chocolatemodelsofficial5859 1 month ago +1

    This is why when drawing trend lines on stock charts they say you need at least 3 points/touches and not 2. Very helpful video!

  • @zumiao234
    @zumiao234 4 years ago +4

    Thank you! You actually help me to understand many basic concepts in a clear and easy-acceptable way, you are so smart and kind-hearted.

    • @statquest
      @statquest  4 years ago

      Thank you very much! :)

  • @drditup
    @drditup 4 years ago +4

    You just explained this better than I have ever heard it explained. I'm a PhD student (who for some reason wasn't given a decent stats course through his master's degree in robotics engineering. Needless to say, statistics are good for science)

  • @גיאחנן-ע4ט
    @גיאחנן-ע4ט 5 months ago +1

    I just studied for my exam, which is in two days, using your videos! You are awesome, man, keep going! Thank you!

    • @statquest
      @statquest  5 months ago +1

      Best of luck!

  • @mostinho7
    @mostinho7 2 years ago +4

    Thanks!
    Great summary at 9:00
    Correlation strength has nothing to do with slope; it depends on how closely the points fall along the line. You can have a correlation of 1 with a large slope or a small slope as long as the points lie on a line.
    14:00 equation cov(x,y) in previous video
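The point about slope in the notes above can be checked numerically. The sketch below uses made-up data and a hypothetical `pearson_r` helper: two perfectly straight lines with very different slopes both give a correlation of 1, while points that only loosely follow an upward trend give a smaller value.

```python
def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


xs = [1, 2, 3, 4, 5]
shallow = [0.1 * x for x in xs]   # straight line, small slope
steep = [10 * x for x in xs]      # straight line, large slope
scattered = [2, 1, 4, 3, 6]       # upward trend, but not on one line

r_shallow = pearson_r(xs, shallow)
r_steep = pearson_r(xs, steep)
r_scattered = pearson_r(xs, scattered)
print(r_shallow, r_steep, r_scattered)
```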

  • @bubbleworld4172
    @bubbleworld4172 5 years ago +43

    Appreciative bäm from Germany.

    • @statquest
      @statquest  5 years ago +16

      That's awesome! I'm glad Bam has an umlaut in German. ;) That makes it twice as cool. TWICE BÄM!

  • @wernhervonbraun4222
    @wernhervonbraun4222 3 years ago +2

    I'm very grateful for all of your videos. I want to support you, but I am a student in a third-world country. Once I am capable enough, I'll surely contribute to this great project! Thank you

    • @statquest
      @statquest  3 years ago

      Thank you very much! BAM! :)

  • @shubhamjain2423
    @shubhamjain2423 4 years ago +1

    Dear Josh, This video made my endless nights trying to grasp on this topic.

  • @reach2puneeths
    @reach2puneeths 5 years ago +3

    These are the best videos, which explain the concepts in a simple way. Thanks for making these videos.
    Please upload AI and deep learning videos.

  • @amizan23
    @amizan23 4 years ago +1

    Your videos are way better than most of the paid courses.

  • @shreyastudies4693
    @shreyastudies4693 1 year ago +1

    You have a knack for teaching... this was an amazing video, thank you!!

  • @terryliu3635
    @terryliu3635 4 years ago +2

    The best intro on correlation, thank you!

  • @vamanieperumal5262
    @vamanieperumal5262 4 years ago +1

    I can't believe how all your videos are so perfect!

  • @vedprakash-bw2ms
    @vedprakash-bw2ms 5 years ago +25

    Stat quest is the best ..

  • @pasqualegiorio3651
    @pasqualegiorio3651 4 years ago +1

    Josh you are just a genius of Stat explanations, thank you.

    • @statquest
      @statquest  4 years ago

      Thank you very much! :)

  • @YeekyYeeky
    @YeekyYeeky 5 years ago +2

    your video makes it really easy to understand(even my english is not really strong , I can still understand almost all of them) , thank you from Thailand

    • @statquest
      @statquest  5 years ago

      Hooray! I'm glad you like my videos. :)

  • @jairoalves8083
    @jairoalves8083 5 years ago +57

    how to obtain the p-value from this data?

    • @amanmanveen
      @amanmanveen 4 years ago +1

      @@minhtoto1542 Had the same question. Found this video helpful: ruclips.net/video/8Aw45HN5lnA/видео.html

    • @eugenefrancisco8279
      @eugenefrancisco8279 3 years ago

      You might be referring to a t-test for slope. You would need to calculate a sample regression line using the data and then obtain a p value by performing a test on the data with some null hypothesis.
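For reference, one standard way to turn a Pearson correlation into a p-value is the t statistic below. This is an assumption about method; the thread does not say which calculation the video used. Here n is the number of paired observations and r is the sample correlation, and the data are assumed to be roughly bivariate normal. Under the null hypothesis of no linear relationship, the statistic follows a Student's t distribution with n - 2 degrees of freedom, and it gives the same result as the t-test for the slope in simple linear regression mentioned above.

```latex
t = r \sqrt{\frac{n - 2}{1 - r^{2}}},
\qquad
t \sim t_{n-2} \quad \text{under } H_0 : \rho = 0
```

The two-sided p-value is the probability, under that t distribution, of a value at least as extreme as the observed |t|.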

  • @calciumfree9626
    @calciumfree9626 5 years ago +2

    Big thanks! I couldn't get any intuition from my school lecture, and it's lucky for me to find this video a day before my exam for this!

    • @statquest
      @statquest  5 years ago +1

      Good luck on your exam! :)

  • @gtrstreet
    @gtrstreet 4 years ago +1

    Very well explained. I like that you give lots of examples and answer many of the possible questions in advance. Thanks a lot!

    • @statquest
      @statquest  4 years ago

      Thank you very much! :)

  • @Some_random_guy_16
    @Some_random_guy_16 4 years ago +2

    You are doing a great job enabling us to learn many super tough concepts relatively easily, and free of cost too... thanks

    • @statquest
      @statquest  4 years ago +1

      Thank you very much! :)

  • @OlegGolubev_yolo
    @OlegGolubev_yolo 1 year ago +1

    How good you are at this. I tried really hard to understand what this is when I was in university, but failed, because there was no explanation of why we need it. Only the words that it is "how x is related to y"... I figured out what it actually is only 7 years later... Thanks a lot, man

  • @LOVEONLYLOVEable
    @LOVEONLYLOVEable 5 years ago +1

    If anyone finds a better teacher than this guy on YouTube, do let me know 😎😎

  • @수삼블-q9n
    @수삼블-q9n 4 years ago +2

    Bam....I started to think that statistics can be fun....Huge thanks from Korea

  • @paveldvorak2014
    @paveldvorak2014 5 years ago +4

    @Josh, it is great you actually put the text on the screen, I cannot play sound but I can still follow closely what you are saying. Great videos, I hope you will later dive into more advanced topics in time series analysis (unit roots, ARIMA, GARCH, etc). Pls keep it up!

    • @statquest
      @statquest  5 years ago

      I'm glad you like my style. :)

  • @taladiv3415
    @taladiv3415 3 years ago +1

    Guys like this help make the study world a better place!

  • @kylebecker5083
    @kylebecker5083 3 years ago +2

    I've learned so much from this channel. Thanks, Josh.

    • @statquest
      @statquest  3 years ago +2

      Awesome, thank you!

  • @kaigordon2900
    @kaigordon2900 3 years ago +2

    Extremely helpful and clear with good examples and explanation! Wonderful, thank you!
    BAM!!!

  • @sherrynsherryn5071
    @sherrynsherryn5071 5 years ago +3

    Thank you from Indonesia, I love your videos!

  • @harithagayathri7185
    @harithagayathri7185 5 years ago +1

    Josh, you explain in such a way that even layman can understand easily.
    A big shout out to all the hard work you put in for making these videos.👏👏

    • @statquest
      @statquest  5 years ago

      Thank you very much!!! :)

  • @hassanrevel
    @hassanrevel 3 years ago +1

    Josh you're super great man. I really enjoy listening you.

  • @sashacollum7583
    @sashacollum7583 3 years ago +1

    Soooo thankful to have found this video. Why did it seem so hard to understand before?!

  • @thesagecc
    @thesagecc 3 years ago +2

    As a graduate level I-O Psychology student.... thank you... I watched the summary first and then went back to watch the entire video

  • @prateeknagaich13
    @prateeknagaich13 3 years ago +1

    Hi, great video. Can you please provide additional guidance on the following:
    a. How do you quantitatively determine the P-value for a correlation?
    b. What's the difference, both formulaically and conceptually between R2, Correlation, and Beta/coefficient in a regression?

    • @statquest
      @statquest  3 years ago

      For details on p-values and linear regression, see: ruclips.net/video/nk2CQITm_eo/видео.html

  • @draviaartistwithbat5756
    @draviaartistwithbat5756 5 years ago +11

    Thanks for the video.
    And please make next video series on hypothesis testing (z test, t test, anova, chi square)

    • @markobe08
      @markobe08 5 years ago

      That is right!!!

    • @statquest
      @statquest  5 years ago +7

      If you want to have a super deep understanding on t-tests and ANOVA, you should check out my StatQuest videos on Linear Models: ruclips.net/p/PLblh5JKOoLUIzaEkCLIUxQFjPIlapw8nU

    • @draviaartistwithbat5756
      @draviaartistwithbat5756 5 years ago

      Sure I will check it and let you know if anything else is needed. Thank you very much. You are doing great man keep up the good work.

  • @sammiechung2711
    @sammiechung2711 4 years ago +1

    Thanks for your detailed and clear explanation. It saves me much of the time I would spend reading books that are hard to understand.

    • @statquest
      @statquest  4 years ago

      Thanks! I'm glad the video is helpful.

  • @mohammednadeem662
    @mohammednadeem662 3 years ago +1

    StatQuest is really amazing for learning and understanding things very easily

  • @ckarcher4504
    @ckarcher4504 4 years ago +5

    Thank you for your amazing video!
    Could you explain how to calculate the p-value in this video (such as 12:30). I have watched your p-value, but still do not know how to use it in this video's examples' calculation. 🙏🙏🙏

    • @statquest
      @statquest  4 years ago +4

      Unfortunately I can't explain it in a comment. Hopefully one day I'll make a video.

    • @ckarcher4504
      @ckarcher4504 4 years ago +1

      @@statquest Great😊😇🤓 I look forward to it😍😍. thank you very much!🙏🙏

  • @石政泰
    @石政泰 1 year ago +1

    Hi, Josh. Nice to meet you! I am Tai from Taipei, Taiwan. Regarding what you mentioned at 7:42, can we say that the probability of a random dot falling on a random line is equal to the proportion of the 2-D plane taken up by the line, which is the area of the line / the area of the plane = 0/1? Since we are interested in the probability of a random dot falling on a random line, it's the same as asking for the chance of the dot being on the line / the chance of the dot being on the whole plane. As a line is 1-D and the plane is 2-D, the proportion is 0. Hence, the probability of a random dot falling on a random line is equal to 0.

    • @statquest
      @statquest  1 year ago +1

      That might be a way to look at it. I've never thought of it that way.

    • @石政泰
      @石政泰 1 year ago +1

      @@statquest Thank you :)

  • @delliscool4924
    @delliscool4924 9 months ago +1

    thank you , you are distinguished brilliant mind and great teacher for many

    • @statquest
      @statquest  9 months ago

      Wow, thank you!

  • @jihowoo8227
    @jihowoo8227 2 years ago +1

    Thank you very much. You saved my day with (silly) songs and also my day, even my course :))))

  • @xruan6582
    @xruan6582 4 years ago

    Great course. May I point out that at (17:38) it is better to say "correlation quantifies the strength of linear relationships"

  • @qiliu4100
    @qiliu4100 4 years ago

    I am familiar with the concepts you talk about.
    But I am a fan of your songs, so I am here to listen to the music.

  • @qinqinkong8330
    @qinqinkong8330 3 years ago +1

    Triple Bam!! Thanks for the great lecture, although I think the p-value depends not only on the amount of data we have, but also on the strength of the relationship. For example, given the same amount of data, the chance of generating a stronger relationship from random points is smaller for a higher correlation than for a lower correlation.

    • @statquest
      @statquest  3 years ago

      Yes, that's sometimes true, but not always (for example, if your sample size = 2), so I decided to focus on the things that are always true in my video, and that is Correlation is determined by the strength of the relationship and p-values are determined by sample size. In other words, if the sample size is too small you will never have a small p-value, and if the sample size is huge, then it doesn't matter what the correlation is, the p-value will probably be significant. For example, if we have any 2 data points, we can draw a line through them, and correlation = 1, however, the p-value = 1. In contrast, if we have enough data, it doesn't matter how close the correlation is to 0, we can still have a significant p-value.
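The two extremes described in the reply above can be sketched with a permutation test. This is a stand-in chosen for illustration, not necessarily the method used in the video, and the helper names and data are hypothetical. With only two points, every ordering of the y-values gives a perfect fit, so the p-value is 1 even though the correlation is as strong as it can be. With 400 points, even a weak correlation of about 0.15 is hard for shuffled data to match.

```python
import itertools
import random


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def exact_permutation_p_value(xs, ys):
    """Exact p-value: try every ordering of ys (only feasible for tiny samples)."""
    observed = abs(pearson_r(xs, ys))
    orderings = list(itertools.permutations(ys))
    hits = sum(abs(pearson_r(xs, p)) >= observed - 1e-12 for p in orderings)
    return hits / len(orderings)


def sampled_permutation_p_value(xs, ys, n_shuffles=1000, seed=0):
    """Approximate p-value from random shuffles of ys (for larger samples)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    return hits / n_shuffles


# Two points: a line always fits perfectly, so the fit carries no evidence.
two_x, two_y = [1, 2], [5, 3]
r_two = pearson_r(two_x, two_y)
p_two = exact_permutation_p_value(two_x, two_y)

# 400 points: a slight upward trend buried under a large alternating wobble.
big_x = list(range(400))
big_y = [x + 750 * (-1) ** x for x in big_x]
r_big = pearson_r(big_x, big_y)
p_big = sampled_permutation_p_value(big_x, big_y)

print(r_two, p_two, r_big, p_big)
```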

    • @qinqinkong8330
      @qinqinkong8330 3 years ago +1

      @@statquest You reply my comments! Bam!!!!

    • @statquest
      @statquest  1 year ago

      @@yangyu5525 Corrected!

  • @tuankietly6076
    @tuankietly6076 5 years ago +2

    your video is so great and easy to understand!

  • @caspase888
    @caspase888 5 years ago +1

    Waiting for your videos is a cause worth waiting for 👍👍👍

  • @李广鸣
    @李广鸣 3 years ago +1

    It solves my confusion. Thanks a lot.

  • @MahdiSafarpour
    @MahdiSafarpour 5 years ago +1

    Dear Josh Starmer,
    I am thankful to you for your wonderful videos.
    May I know why the numerator of the correlation formula is always lower than the denominator?

    • @statquest
      @statquest  5 years ago +2

      That would take a whole StatQuest to explain. We'd have to go through the Cauchy-Schwarz inequality. However, it's on the to-do list. One day I will do it.
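As a pointer for the Cauchy-Schwarz route mentioned above, here is the inequality applied to the centered data, written as a sketch rather than a full derivation. The symbols are assumptions for illustration: the sample means are \bar{x} and \bar{y}, and the sample standard deviations are s_x and s_y. Dividing both sides of the first inequality by (n - 1)^2 and taking square roots gives the bound on the covariance, which in turn bounds the correlation.

```latex
\left( \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \right)^{2}
\le
\left( \sum_{i=1}^{n} (x_i - \bar{x})^{2} \right)
\left( \sum_{i=1}^{n} (y_i - \bar{y})^{2} \right)
\quad\Longrightarrow\quad
\left| \operatorname{cov}(x, y) \right| \le s_x \, s_y
\quad\Longrightarrow\quad
-1 \le r = \frac{\operatorname{cov}(x, y)}{s_x \, s_y} \le 1
```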

    • @MahdiSafarpour
      @MahdiSafarpour 5 years ago +1

      @@statquest It was a great clue. I start reading about Cauchy-Schwarz inequality and look forward to watching your lecture in future.

  • @tatianajaramillo1148
    @tatianajaramillo1148 4 years ago +1

    Thank you for your time to explain and make this video!!!

    • @statquest
      @statquest  4 years ago

      Thank you very much! I really appreciate your feedback.

  • @SevenRavens007
    @SevenRavens007 5 years ago

    Still getting this clear in my mind... At 13:11 you say that adding data (and a decreased p-value) increases our confidence in our guess. I think this may be misleading because it suggests that smaller p-values mean more accurate guesses. I would rather say that a smaller p-value means more confidence that we are accurately seeing the QUALITY of the guesses we can make (not the guess itself, which is indicated by the correlation value). So with a weak correlation, a smaller p-value means I am more certain that there is a weak relationship and that my guess will be poor.
    I hope that makes sense. Thanks for a great series

    • @statquest
      @statquest  5 years ago

      What I was trying to say was in the picture on the left, we can't be sure if adding more data would give us a totally different correlation value, so we have low confidence in it. In the picture on the right, we have enough data to be confident that the correlation value will not change much with additional data.

    • @yangyu5525
      @yangyu5525 10 months ago

      @@statquest Dear professor, at 12:57, with respect to the picture on the left, you said that increasing the sample size does not increase the correlation. I have a different opinion about that statement. If I start with two dots, then no doubt the correlation of the straight line is equal to 1 and the p-value = 1. Then, if I randomly add some dots to the graph, the correlation value will change, and so will the p-value. Thus, the p-value just tells us whether there is a trend or not; it doesn't tell you how large the difference is or how close the trend you find is to the actual one. Alternatively, the accuracy of the trend or model you find depends not only on the number of dots but also on the development of technology, right?

  • @joserobertopacheco298
    @joserobertopacheco298 1 year ago +1

    The ultimate clear explanation

  • @TheEbbemonster
    @TheEbbemonster 5 years ago +10

    Very good - I would have liked to see a p-value calculation also :)

    • @SaintSaint
      @SaintSaint 3 years ago +1

      ruclips.net/video/vemZtEM63GY/видео.html ruclips.net/video/5Z9OIYA8He8/видео.html Both answer this.... but I agree... a quick explanation of p values would be the only extra credit that I felt was missing from this video. Much the way he did variance recap at the beginning.

  • @kiranisrani7245
    @kiranisrani7245 2 years ago

    Hi. Your explanation was perfectly fine.
    I have a doubt at 16:20, shouldn't it be "That means that there is 3% chance that random data could produce a weak relationship, or weaker".
    or
    "That means that there is 97% chance that random data could produce a strong relationship, or stronger".
    Because smaller the p value, stronger the correlation.

    • @statquest
      @statquest  2 years ago

      The video is correct. p-values are kind of tricky, and to learn more about how to interpret them, you can check out this video: ruclips.net/video/vemZtEM63GY/видео.html
      Also, a small p-value doesn't mean a strong correlation. We could have a weak correlation, like 0.1, and still have a small p-value.

  • @Bharathkumar-gv4ft
    @Bharathkumar-gv4ft 3 years ago +1

    p-value superbly explained!

  • @AnayFereshetyan
    @AnayFereshetyan 2 years ago +1

    This was so incredibly helpful, thank you!

  • @arismenachekanian1804
    @arismenachekanian1804 1 year ago +1

    Thank you for making this great video!

  • @Neuroszima
    @Neuroszima 10 days ago

    Hey, nice video!
    On Wikipedia there is also a "non-Pearson" correlation that aims to center data points around the origin and calculates correlation using covariance in the form of the dot product with respect to the vector norm of the data points.

    • @statquest
      @statquest  10 days ago

      Thanks for the info!

  • @abhishek-shrm
    @abhishek-shrm 4 years ago +1

    Best video ever seen on correlation👍😁

    • @statquest
      @statquest  4 years ago +1

      Thank you very much! :)

    • @abhishek-shrm
      @abhishek-shrm 4 years ago +1

      @@statquest Welcome and thank you for making these videos😁

  • @gin36147
    @gin36147 2 years ago

    I watched it as background music, so I'm not sure if this is already addressed: I think it might be worth mentioning that here "relationship" refers to "linear relationship". Otherwise, e.g., data generated by y=x^2 on (-1,1) will get 0 correlation but obviously have a relationship. "Relationship" sounds like it corresponds more to "(in)dependence".

    • @statquest
      @statquest  2 years ago

      Throughout the entire video I mention that we are using a straight line to define the relationship.
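The y = x^2 example from the comment above is easy to check. In this sketch (hypothetical helper name, five x-values chosen to be symmetric around 0), y is completely determined by x, yet the straight-line correlation is 0 because the left and right halves of the curve cancel each other out.

```python
def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [x ** 2 for x in xs]  # y depends entirely on x, but not along a line
r_parabola = pearson_r(xs, ys)
print(r_parabola)
```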

  • @felipebraga7753
    @felipebraga7753 3 years ago

    Man, your video is super clear without ceasing to be rigorous! Thanks a lot for the work!

  • @evemarealle621
    @evemarealle621 2 years ago

    Hi Josh. Thanks for the great video.
    I have a question.
    1) Why does the correlation have a bound of -1 to 1 when you divide covariance with the product of the two standard deviations? Is the product of the standard deviations the maximum covariance the two random variables can have? If so, how do you show that
    2) And, how does the correlation of 1 tell you that the points lie on the straight line?

    • @statquest
      @statquest  2 years ago +1

      Unfortunately, showing how the limits of correlation are -1 and 1 isn't super easy. However, you're on the right track. When all of the points are on the same line, then the absolute value of the covariance = the product of the standard deviations.
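The equality case described in the reply above can be worked out directly. As a sketch, assume every point lies exactly on a line y_i = a + b x_i with b not equal to 0 (a, b, s_x and s_y are symbols introduced here for illustration). Then each y deviation is b times the matching x deviation, so the covariance and the product of the standard deviations differ only by the sign of b.

```latex
y_i - \bar{y} = b \, (x_i - \bar{x})
\quad\Longrightarrow\quad
\operatorname{cov}(x, y) = b \, s_x^{2},
\qquad
s_y = |b| \, s_x
\quad\Longrightarrow\quad
r = \frac{b \, s_x^{2}}{s_x \cdot |b| \, s_x} = \frac{b}{|b|} = \pm 1
```

The size of the slope cancels out; only its sign survives.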

  • @vaibhavpandey7398
    @vaibhavpandey7398 2 years ago

    Uncle josh, ur only one who answers my query of why can't squiggly line be made. Thanku

    • @statquest
      @statquest  2 years ago

      It can be, but it's not as easy (however, modern neural networks can fit a squiggly line to just about anything. For details, see: ruclips.net/video/zxagGtF9MeU/видео.html ). When we use squiggly lines, we use R^2 instead of Pearson's Correlation because Pearson's correlation is explicitly defined for straight lines.

    • @vaibhavpandey7398
      @vaibhavpandey7398 2 years ago

      @@statquest ok thanku.. It's entirely new for me

  • @sumitkumar-el3kc
    @sumitkumar-el3kc 4 years ago

    I love how you teach us like we're bunch of 7-8 year's old kids.

    • @statquest
      @statquest  4 years ago +1

      I just teach the way I teach myself.

  • @jingjingli101
    @jingjingli101 1 year ago +1

    Great video! Can you also explain the difference between Spearman and Pearson correlation? Thanks a million!

  • @robinkohrs8097
    @robinkohrs8097 4 years ago +1

    That's simply amazing education...!! Just one question: What is "much" data? Doesn't it always depend on the context?

    • @statquest
      @statquest  4 years ago +1

      Yes - it depends on how much variation there is in your data. If there is not much variation, then you don't need many observations. If there is a lot of variation, then you need a lot of observations.

    • @robinkohrs8097
      @robinkohrs8097 4 years ago +1

      @@statquest thank you so much !!

  • @Clarkephix
    @Clarkephix 4 years ago +1

    First time hearing a female voice on your channel, and it's hilarious. Anyway, thanks for all of your videos, it helps me survive throughout my statistic course

  • @aapje180
    @aapje180 5 months ago +1

    Thanks

    • @statquest
      @statquest  5 months ago

      TRIPLE BAM!!! Thank you so much for supporting StatQuest!!! :)

  • @persephonez7067
    @persephonez7067 5 years ago +1

    This is much better than the class in uni..

  • @sattanathasiva8080
    @sattanathasiva8080 3 years ago

    Hi, one doubt: in practice, as you mentioned, the smallest p-value will have a high correlation, I agree.
    However, I'm confused if I go by the theoretical explanation of the p-value.
    As per my understanding from your p-value video, the p-value is the sum of the probabilities of
    1. the chosen rare event occurring
    2. a similar rare event occurring
    3. any other rare event occurring
    If this is the case, shouldn't the p-value for a correlated value be high, because the p-value tells us the probability of this event occurring, i.e. the event of a higher correlation?
    So it would tell us that there is a high probability that such a correlation will occur.
    I'm new to stats, so please bear with me if my understanding is wrong.

    • @statquest
      @statquest  3 years ago

      It's not true that a small p-value = high correlation. As illustrated in this video, high correlation is simply a function of how well a line fits the data, and a line fits any 2 random data points perfectly, and thus, will have the highest correlation, even though the points are random, and thus, will have a p-value of 1. To learn more about p-values, see: ruclips.net/video/vemZtEM63GY/видео.html and ruclips.net/video/nk2CQITm_eo/видео.html

  • @PRODKAZ-fy8cx
    @PRODKAZ-fy8cx 1 year ago +1

    Thank you so much! This was so helpful.

  • @JupiterChamsae991102
    @JupiterChamsae991102 4 years ago

    Awesome video again! But just a question about 15:07 - 15:13, regarding "When the data all fall on a straight line with a positive or negative slope, then the covariance and the product of the square roots of the variance terms are the same and the division gives us 1 or -1, depending on the slope". I don't think I fully get it intuitively. How could we know that the absolute values of the numerator and the denominator are the same without calculation?

    • @statquest
      @statquest  4 years ago +1

      Unfortunately the mathematics that show why correlation is limited to a maximum value of 1 and a minimum value of -1 are quite complicated, which is why I glossed over it in the video.

    • @JupiterChamsae991102
      @JupiterChamsae991102 4 years ago

      @@statquest Thank you so much for your instant reply! Then without calculation, is there a possible way to just understand it intuitively?

    • @statquest
      @statquest  4 years ago +1

      @@JupiterChamsae991102 I did the best I could with this video.

    • @JupiterChamsae991102
      @JupiterChamsae991102 4 years ago +1

      @@statquest Ok~ Thank you so much as always ❤️

  • @Rustincohle88
    @Rustincohle88 5 months ago +1

    BAM !!
    You are legend 😭👏

  • @maverick_0325
    @maverick_0325 5 years ago +1

    Hi Josh, as always, thank you for your great videos! Would you consider making a video to explain the relationship between correlation and R-squared? I've watched all the videos about these two terminologies, but still can not figure out the relationship.

    • @statquest
      @statquest  5 years ago +2

      Presumably you want something other than, "take your correlation value, r, and square it, and that's r-squared". More like, "how does the square of this equation get transformed into this other equation"?

    • @julieyananzhu1134
      @julieyananzhu1134 3 years ago

      @@statquest So how? I'm also curious about how/why correlation value^2 = r-squared. The equations are so different. I'd appreciate it if you could kindly explain that! Thank you, Josh!

    • @statquest
      @statquest  3 years ago

      @@julieyananzhu1134 I'd have to make a whole video to go through that derivation. Maybe one day I will! :)
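While the full derivation is left for a future video, the equality asked about in this thread can at least be checked numerically. The sketch below uses made-up data and hypothetical variable names, and it applies only to a straight line fit by least squares with an intercept. It computes R-squared the regression way, as 1 minus the residual sum of squares over the total sum of squares, and compares it with the square of Pearson's r.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 6]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

# Pearson's correlation coefficient.
r = sxy / (sxx * syy) ** 0.5

# Least-squares straight line through the same data.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# R-squared: the share of the variation around the mean that the line explains.
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
ss_tot = syy
r_squared = 1 - ss_res / ss_tot

print(r ** 2, r_squared)
```

The reason the two agree in this setting is that the residual sum of squares works out to syy - sxy^2 / sxx, so 1 - ss_res / ss_tot simplifies to sxy^2 / (sxx * syy), which is r squared.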

  • @taotaotan5671
    @taotaotan5671 4 years ago +1

    BAM.. Get addicted to your video

  • @harshthechampful
    @harshthechampful 4 years ago

    at 8:47 why doesn't smaller p value lead to lesser confidence in predicting a random new point?

    • @statquest
      @statquest  4 years ago

      The p-value is the probability that random noise could generate a relationship as strong or stronger than what you observed. A small p-value suggests that it is unlikely that random noise created the data that you observed. Thus, this gives us more confidence that our model is correct. Does that make sense?

  • @siddharthmishra8233
    @siddharthmishra8233 2 years ago +1

    always enjoys your song josh!

  • @jaleshjalesh5712
    @jaleshjalesh5712 5 years ago +11

    can u please tell how did you calculate p- value?

    • @ALEX-mo1zo
      @ALEX-mo1zo 4 years ago

      He did a video on P-Value

  • @robertdavis2855
    @robertdavis2855 10 months ago +1

    I didn't think that Machine Learning and humor were correlated but here we are...BAM!

  • @yunbai2536
    @yunbai2536 3 years ago

    Hi Josh,
    Could you explain how you get the p-value? I have split this into the 2 sub-questions below.
    - What is the input used to compute the p-value?
    - Which probability density function are you using to calculate p? Are you using Pearson's chi-square?

    • @statquest
      @statquest  3 years ago

      There are a lot of ways to calculate p-values for Pearson's correlation coefficient. For details, see: en.wikipedia.org/wiki/Pearson_correlation_coefficient

    • @yunbai2536
      @yunbai2536 3 years ago +1

      @@statquest thanks!

  • @maheshsonawane8737
    @maheshsonawane8737 11 months ago +1

    Thanks Josh!!!!!!!!!!!!!! Helps lot.

    • @statquest
      @statquest  11 months ago +1

      Thank you! :)

    • @maheshsonawane8737
      @maheshsonawane8737 11 months ago +1

      @@statquest I can't believe u replied. I am pursuing MS Data Science. Your work really give me better understanding. I will pay ur tuition fee when I get job. ✌🤟👆👍😎

  • @1miffy1
    @1miffy1 1 year ago +1

    When Phoebe decides to sing stats... xD
    Love the videos... lifesavers to sinking ships in the sea of numbers

    • @statquest
      @statquest  1 year ago

      Check out: ruclips.net/video/D0efHEJsfHo/видео.html

    • @1miffy1
      @1miffy1 1 year ago +1

      Omg xD best!

  • @haneulkim4902
    @haneulkim4902 4 years ago

    Thanks for your video! So covariance is just used to calculate correlations? What is the reason for creating the term covariance if it is just being used as a stepping stone for calculating correlation?

    • @statquest
      @statquest  4 years ago +1

      It's used in other contexts as well (like in PCA or in longitudinal analysis). It's a useful intermediate step in a lot of ways, so it's good to give it its own name.

    • @haneulkim4902
      @haneulkim4902 4 years ago

      @@statquest Thanks, but why not simply use correlation as the stepping stone for other calculations, since it tells us whether the slope is +, -, or neutral (like covariance) as well as the slope and closeness to the line?

  • @evrenkutlu2339
    @evrenkutlu2339 5 years ago +1

    Thank you very much for every video, you are awesome.
    16:26 I think there is a wording mistake; instead, that means that there is a 3% chance that random data could "not" produce a similarly strong relationship or stronger, am I right?

      @statquest  5 years ago +3
      @statquest  5 лет назад +3

      The wording in the video is correct. For more details on p-values, check out the 'Quest: ruclips.net/video/5Z9OIYA8He8/видео.html

      @Fionelko 4 years ago
      @Fionelko 4 года назад

      you can find the answer for your question in @ali alqaraan comment below.

  • @asdfafafdasfasdfs
    @asdfafafdasfasdfs 1 year ago +1

    Just to confirm, this correlation coefficient is the R that we have to square to get R squared?

  • @sophiacho5149
    @sophiacho5149 2 years ago

    11:23 you mean the closer the correlation values get to 1 or -1, right?

    • @statquest
      @statquest  2 years ago

      Correlation values of 1 and -1 represent situations when the data is all on the same straight line. However, when the data is not all on a straight line, then the correlation values get closer to 0.

  • @_kingston
    @_kingston 7 months ago

    You mention that R^2 as a more intuitive / useful method for understanding goodness of fit than correlation, but doesn't R^2 require the assumption that the model is linear (so it cant be used for logistic regression and other non-linear models)? Does correlation have this same requirement too?

    • @statquest
      @statquest  7 months ago

      Yes, they both share that requirement.

  • @santoshbala9690
    @santoshbala9690 4 years ago

    Hi Josh.. Very well explained... Thank you
    Please do a video on ACF & PACF (Auto Correlation & Partial Auto Correlation)