The second derivative, in this case, will always tell you the same thing - that you are at a maximum. So you can use it to formally prove that you are correct, but it's not needed.
Hey Josh, thanks for the clear explanation, right from the basics. However, I have a question. Why does the graph show a declining trend? How would you interpret it in plain English? Could you please help here?
Josh, good day! I have a question at 0:41. Why exactly is the exponential distribution said to model the time between events? On what grounds? Can we use other distributions for modelling the time between events? For example, chi-squared with df=1 has a very similar probability distribution. Or the uniform distribution?
We could collect actual data, the time between events, like people watching this specific video, and plot a histogram of that data and look at its shape.
Hi Josh, shouldn't it be L(x1|Lambda), L(x2|Lambda), etc., as we are trying to maximize the likelihood of observing these points if the value was lambda?
Hey Josh, thanks for this set of videos, which save us from not understanding statistical inference, and for the songs as well. Love it. I have a question, if you'd like to answer: I look at the histogram or density plot, and I am not sure which of two distributions is a better fit for my data. To pick the distribution that fits best, may I compute the MLE for each of them, compare the values of the log-likelihood at the estimates given the data, and then conclude that the distribution with the larger maximized log-likelihood is the one I should pick? Or what would you do? Nothing too technical please, you are good at that.
@@statquest Josh, this will be like a replica of many other messages... God bless you for your advice. I am a firm believer that the knowledge that is shared is the one that helps the world to be a better place. You are doing that via this youtube channel.
Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
I plotted data and it looks like exponential distribution but it is not time series data. I wonder why?
@@Usernotknown21 Data can be exponentially distributed if it uses space between events instead of time (for example, the number of miles between gas stations). Also, according to the wikipedia article, there are a handful of non-time/space applications as well: en.wikipedia.org/wiki/Exponential_distribution
@@statquest That makes sense, because the data are hospital safety events and processes. For example, if a nurse doesn't wash his hands, that can lead to an increase in negative safety events in the future. I tried to run Pearson and other correlations to build a correlation matrix, but Pearson assumes the data is normally distributed. Any advice on running a correlation matrix on exponentially distributed data in R?
@@Usernotknown21 Well, the good news is that Pearson's correlation does not assume normality (see stats.stackexchange.com/questions/3730/pearsons-or-spearmans-correlation-with-non-normal-data )... So you can go ahead and use it. BAM!
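For anyone who wants to try this, here is a minimal sketch in Python (not R, which the question asked about) that computes Pearson's correlation on simulated exponential data, with a rank-based Spearman correlation next to it for comparison. The simulated data, the seed, and the helper names `pearson`, `ranks`, and `spearman` are all assumptions made up for illustration.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson's correlation: covariance scaled by both standard deviations."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """Rank from 1..n. Ties are not averaged, which is fine for continuous data."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = float(rank)
    return result


def spearman(xs, ys):
    """Spearman's correlation: Pearson's correlation applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: x is exponential, y is x plus independent exponential noise.
rng = random.Random(0)
x = [rng.expovariate(1.0) for _ in range(1000)]
y = [xi + rng.expovariate(1.0) for xi in x]

print("Pearson: ", pearson(x, y))
print("Spearman:", spearman(x, y))
```

Neither calculation needs the data to be normal. Spearman is the usual alternative when the relationship is monotonic but not linear, or when a few very large values dominate.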
Don't know, if someone already commented on that, but: I'm just wondering why you multiply both sides by λ, where you could also multiply by 1/n. In that case, you would get λ = 1/x̄ and you could have made a reference to a video of yours explaining x̄ :D I'm just messing around, don't take that seriously :)
As an engineering student at a top UK university, I can safely say that you have been more helpful/useful than my information engineering lecturer. Two hours of confusion over likelihood when I could've just watched a few of your videos. Being smart enough to work at a top university definitely does not correlate with being a good teacher. Thank you anyhow!
Wow, thank you!
“being smart enough to work at a top university does not correlate with being a good teacher”.....i feel you sir,i feel you ❤️
Oh my gosh.... which university do you study at?
I never understood why people have to go to uni to attend lectures when they could instead be given excellent books to study instead and examined on them.
can someone pls tell me what this guy ate growing up?!! how can someone be this good of a teacher???? spent 22 years (school+undergrad+grad) in an academic career and this is the first time i see "Statistics" differently.
Hooray!!! I ate a lot of carrots. :)
@@statquest Double Bam!
Josh is clearly passionate about Statistics and teaching. I wish I had these learning resources in high school. At least my kids are benefiting from these videos now.
This channel saved me from failing a class and now I just watch the videos for fun. You totally changed how I perceive educational videos. Came here because I had to, staying because I want to. Much love from UNC
Thank you and go heels! :)
It just took 9.39 min to explain what I never understood for past 2 weeks. Thank you a ton or a million times. My exams are nearby and now my head is clear to move forward with other distributions as well. My lecturer skipped likelihood and started explaining Max likelihood est. I pray for your good health 🙏 always.
Thank you and good luck with your exam! :)
Oh my God. I'm halfway through and I can't BELIEVE this was THE concept I could NOT understand at university, THANK YOU SO DAMN MUCH!!!
Happy to help!
How can I thank you for this? I mean, how hard it was for my university professor to make me understand this. I don't know about me, but the students who study under you are blessed. Thanks a lot Sir, thanks a lot.
Double BAM!!!
Thank you very much! :)
This channel is pure gold! Thanks for all your efforts sir, you are helping us students a lot! All the best
Thanks! :)
I wish you were my probability and stats professor. Concise yet crystal clear explanation. I take a deep bow. 🙇
Thank you! :)
"It's a statistical distribution that models the time between events"
So that's why in phylogenetics we use an exponential distribution to describe the time between occurring and disappearance of lineages on a phylogenetic tree! Thank you a lot, man!
Bam! :)
You are wrong. That is a double bam, BAM!
How did I not know about this channel until now? This is a gold mine, I just found!!!!
Bam!
I am writing from Brazil. Josh Starmer is the magician of the stats explanations!
Thank you!
Amazing. I studied this first many years ago, and now am in need of reviewing all the concepts for career progression. Never understood so many of these concepts until now. Thank you.
Hooray! I'm glad the videos are helping you! :)
Finally someone who explains stuff in a way that my brain prefers it. Thank you!
Glad you liked it!
Good stuff and deeply helpful. You're among the best teachers I've had, even if it's via online video. I hope you can do some videos on Poisson regression one day.
Thank you! Poisson Regression is definitely on the to-do list - so it's just a matter of time before I get to that one.
This is by far the best explanation of MLE, I've ever come across! Awesome!
Thank you!
Thanks a lot! I searched for exponential distribution everywhere. Everything was so difficult. You are a star. Great. How easily you are explaining. I have become a fan of your teaching style.
Thank you very much! :)
Amaziiiiing! You make me actually understand what is happening in all this calculations, such a blessing....SUCH A BLESSING, THANK YOU !
Thanks! :)
Sir, the way you explain things is simply superb. Even though I knew most part of this concept already, but only after watching this video, I have now got the intuition behind this simple logic. Thanks a lot.
Thank you! :)
how can someone be so good at explaining stuff? I might end up watching all the videos XD
Thanks! :)
it has clearly explained the meaning of MLE, which was foggy in my brain. Thanks a lot
Hooray!! I'm glad the video was helpful. :)
I'm here just to say thanks! I've seen a lot of material on this, and only you were straight to the point and as simple as possible. This was really helpful. Subscribed!
Thank you very much!
Even after taking lectures from a top-rated Google engineer, I couldn't get these topics clear. Thank you.
Happy to help!
Thank you so much for this! Likelihood has definitely been one of the more elusive things for me to understand in all of my upper division statistics classes! Especially why the likelihood was a product of the functions with the parameters. I wish I had watched this earlier!!! Especially when deriving likelihoods for likelihood ratio tests and for Max likelihood estimates
I'm so glad to hear that the video was helpful! :)
This is where I hit the wall with my limited understanding of math. I'll be back after I learn some! Thanks for the awesome videos!
Good luck!
The first time i watched this video i did not understand concepts, now i totally understand how to use maximum likelihood for estimating parameters and using derivative for finding the optimal value
BAM! :)
Love your videos! They always present concepts with a clarity and simplicity that makes them much easier to grasp.
Thank you! :)
Triple BAM!!! and even quadruple BAM!!!! Another brilliant, crystal-clear explanation. Thanks so much, Josh.
My pleasure!
Excellent video. I feel that the KEY concept for understanding this lecture is the slide at 3:28 with the equation L(lambda|x) = y; the rest is mostly arithmetic.
Yep.
Dude, you have no idea how helpful these videos are. I'm pretty sure my professor has no "theory of mind," and I never understand anything he says, and the book we're using is miserable also. Your videos are literally how I'm learning everything for this class.
I'm glad my videos are helpful! :)
I understood the exponential distribution very well , thanks to this video.
Thank you for this extremely detailed and crisp explanation
Glad it was helpful!
Wow , Thank you sir ....just Thank you please don't ever stop making these amazing videos.....
You got it!
I've never seen any explanation of likelihood make so much sense
Thank you! :)
The illustrated guide is great. If you don't have it, strongly consider buying it.
Couldn't help but notice that the MLE is the reciprocal of the mean. Why is that? Coincidence?
Thank you! As for your question, I'm not exactly sure why.
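One way to see the connection, as a sketch using the standard result for the mean of an exponential distribution (the symbols $\bar{x}$ and $\hat{\lambda}$ are notation added here, not from the video):

```latex
% The mean of an exponential distribution with rate \lambda:
\[
  E[X] = \int_0^\infty x \,\lambda e^{-\lambda x}\, dx = \frac{1}{\lambda}
  \quad\Longrightarrow\quad
  \lambda = \frac{1}{E[X]}
\]
% The maximum likelihood estimate has the same form,
% with the sample mean \bar{x} in place of E[X]:
\[
  \hat{\lambda} = \frac{n}{x_1 + x_2 + \dots + x_n} = \frac{1}{\bar{x}}
\]
```

So the rate is the reciprocal of the true mean, and the estimate is the reciprocal of the sample mean.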
I love your work! You clearly explain concepts which sooo many professors with A+ publications cannot communicate half as well (and who often leave students confused after the first 20 min of their lectures).
A small suggestion for these "estimating the parameters of a distribution"-videos.
You could mention somewhere in these vids that -in applied statistics - it is really important to pick a distribution that is suitable to represent the "data generating process" i.e. encourage people to first figure out (e.g. by plotting a histogram), what distribution might be useful to describe the data (or more specifically how the data were created). Often students blindly assume a certain distribution and wonder why e.g. regression parameter estimates appear wonky.
Thanks! I'll keep that in mind.
I want to thank you for your amazing work! I would love to see a StatQuest about Markov chains or autoencoders
sir your songs are as enjoyable as your teaching!!! you are awesome sir!! NAMASKARAM🙏🙏🙏
Thank you very much! :)
Once I get a job, I'll be more than happy to donate! Great video!
Bam! Thank you! :)
Thanks, I reinforced my understanding of maximum likelihood by watching this video.
Glad it helped!
The intro is especially fitting, since one of the most common examples of modelling the time between events in a lot of books is the time between rainy days.
Nice
You really are a great teacher, nice job Josh!
Thanks! :)
I did a Master's Degree in Finance, with a High Distinction result, at a very reputable university in Australia. And yeah, I still had no clue what the hell the exponential distribution was doing in risk modelling. Thanks for this video :')
BAM! :)
I'm really happy about what you shared. You have a very interesting way of explaining that makes math easy, so I hope you can share more exercises with their solutions, and also differential equations. We really want more. Thanks in advance, Mr statquest
I'm glad you like the videos so much. I'm working on a few other maximum likelihood examples, then I'll do some basic statistics and then back to Machine Learning.... Although differential equations would be fun, too.
Wish I had found you sooner. This could not be better explained, thank you!
The ONLY thing i like about this channel is EVERYTHING
Hooray!!! You're the best! :)
Thank goodness for this I am no longer entirely confused
bam! :)
your video is sooooooooooo easy to understand!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Thanks!
Just brilliant, a total career saver. Bless you.
bam! :)
2nd video on the MLE and I can safely say you're my new best friend. I'm going to tell all my classmates about you. It's amazing how you have simplified this confusing concept for me. Thank you so much.
P.S. I like the little intro songs too 😂 So cute! I hope they're different on every video.
Thanks!
Mr. Josh. Thank you for the videos. It brings my confidence back and makes me more determined.
Any chance u would be able to make a video on order statistics.. i understand the pdf but not totally clear.. again thank you so much for the videos...:)
StatQuest makes me so happy….
Hooray!!! :)
Second vid from this channel that I have watched, liked and subbed. Million thanks Josh
Hooray!!! I'm glad you like the videos and have subscribed!
RUclips ranks channels based on subscriptions so, by subscribing, you're helping me a lot! :)
Sir also make videos on negative binomial,beta distribution,gamma distribution
and also try to complete every topic in the syllabus of bsc and msc statistics
Your way of teaching is excellent
I am following you now a days
I am eagerly waiting for your more statistical videos
Please reply
Over time I hope to cover as many statistical topics as I can. I have a lot of fun making these videos and will make as many as I can. :)
I love your video so much! I found fun in learning difficult stuff, like math! You should record more songs like those at the intro. So far my favorite one is "smelly cat"
Thank you! :)
Nice explanation! Would you also explain the bias and consistency of the estimator?
I'll keep that in mind.
@@statquest Thanks! Awesome and lucid explanation still!
Hello Josh, forever grateful for your videos!! Thank you very very much! I have 1 thought.. since you are talking about Max. Likelihood Estimation - I think it's important to add the second derivative step too- that it should be negative. Would love to hear from you. Lots and lots of love :) :)
That's exactly right. When you don't know if your function has a single minimum point or a maximum point, then you should check the second derivative to make sure you have what you expect.
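For the exponential case, the second derivative check can be written out as follows. This is a sketch that uses $\ell(\lambda)$ as shorthand for the log-likelihood, which is notation introduced here:

```latex
\[
  \ell(\lambda) = \log L(\lambda \mid x_1, \dots, x_n)
                = n \log \lambda - \lambda \sum_{i=1}^{n} x_i
\]
\[
  \frac{d\ell}{d\lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} x_i ,
  \qquad
  \frac{d^2\ell}{d\lambda^2} = -\frac{n}{\lambda^2} < 0
  \quad \text{for all } \lambda > 0
\]
```

Because the second derivative is negative everywhere, the single point where the slope is 0 has to be a maximum.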
Hi everyone!
First of all: Josh, thanks a lot for your awesome videos!!!!!!!!! I really love them!!!!! Everything makes sense when it's explained by you.
Second: I have a question, hope someone can help me (:
Why L(λ | X1 and X2) = L(λ | X1) L(λ | X2)
instead of L(λ | X1 and X2) = L(λ | X1) + L(λ | X2)
why we multiply each likelihood instead of adding them? hehe, hope it isn't a very dumb question
It's from probability theory. When we "AND" independent events, we multiply. When we "OR" events that can't both happen, we add. Intuitively, when we want both things to happen, multiplying them together will generally give us a smaller likelihood, but adding them will give us a larger likelihood. Thus, when either thing can happen, we add.
@@statquest oh! now it makes sense, thanks a lot Josh!!!! :D
Great lesson! About exponential density... its maximum point is (0,lambda) that means 0 is the most likely data (of course no). Im getting confused...
The idea is that we want to find the value for lambda that gives the data their maximum likelihood. The density is indeed highest at 0, so if the data were all 0, the likelihood would be lambda^n and bigger values for lambda would always be better. However, usually the data are spread out some, and this makes the problem a little more interesting. A large value for lambda makes the larger data values very unlikely, so depending on how the data are spread out, different values for lambda will result in different likelihoods. We want to find the value for lambda that maximizes the likelihood.
At 6:50, a pair of brackets is missing from the differentiation of the log. It should be " = d/dlambda (log(..) + log[....]) ". You are welcome.
Yes, I was a little sloppy with the notation.
Referring to 7:17
Although the results will be the same! If we set λ^n · e^(-λ(X1 + X2 + ... + Xn)) equal to a variable 'Y', then, recalling from my mathematics class, I think that dY/dλ should be equal to Y · (n/λ - (X1 + X2 + ... + Xn)). Since 'Y' can't be zero when we are maximising it, the result will be the same when we set the derivative equal to zero.
Please correct me if I am wrong!
Again...thanks so much for another amazing statquest!!!!
I can't follow your logic.
@@statquest Can I email you the photograph of the steps that I used?
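For reference, here is a sketch of the product-rule calculation that this thread seems to be describing, differentiating the likelihood directly instead of its log. The shorthand $S$ for the sum of the data is an assumption added here:

```latex
\[
  Y = \lambda^{n} e^{-\lambda S},
  \qquad S = x_1 + x_2 + \dots + x_n
\]
\[
  \frac{dY}{d\lambda}
    = n \lambda^{n-1} e^{-\lambda S} - S \lambda^{n} e^{-\lambda S}
    = Y \left( \frac{n}{\lambda} - S \right)
\]
% Y > 0 for every \lambda > 0, so the derivative is 0 only when
\[
  \frac{n}{\lambda} - S = 0
  \quad\Longrightarrow\quad
  \lambda = \frac{n}{S}
\]
```

This is the same value for lambda that comes from setting the derivative of the log-likelihood to 0.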
just the explanation I was looking for, thank you sir!
Glad it helped!
Thanks for this! It seems that although likelihood is not probability (except for discrete distributions, where they're equal, right?), it is treated like probability because L(x1,x2)=L(x1)*L(x2) 🤔 There's still a bit of confusion left, but the videos have been helping me!!
If you want to learn more about the differences between probability and likelihood, see: ruclips.net/video/pYxNSUDSFH4/видео.html
@@statquest thank you! I already did before this video :) the comments there have been helpful too!
A highly intuitive video! thanks for all your efforts..just a simple question
the derivative(slope) is also zero at minima of the function right?
Yes it is. So when you don't know the shape of the function, then you should check to make sure you are at a maximum. In this case we know the shape of the function.
@@statquest Thanks sir! You make our lives and assignments easier. :-)
Double BAM!!! I love this video, great explanation!
Thanks!
your videos are great. Can you make a series for time series modeling as well if possible!
That's the plan!
Question: why do we simply multiply L(lambda|x1) and L(lambda|x2) to get L(lambda|x1, x2)? What's the underlying assumption here?
The underlying assumption is that x1 and x2 are independent of each other.
For independent events A, B: Pr(A & B) = Pr(A)Pr(B). For example, coin flips are independent of each other. If you flip the coin twice, the probability you get tails both times is simply multiplying .5*.5.
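The same idea can be sketched in Python for the exponential distribution: with independent measurements, the joint likelihood is the product of the individual likelihoods, and taking the log turns that product into a sum. The function names and the two data values are made up for illustration.

```python
import math


def exp_pdf(x, lam):
    """Exponential density: lambda * e^(-lambda * x) for x >= 0."""
    return lam * math.exp(-lam * x)


def likelihood(lam, data):
    """Joint likelihood of independent measurements: multiply the densities."""
    return math.prod(exp_pdf(x, lam) for x in data)


def log_likelihood(lam, data):
    """Log of the product above: n * log(lambda) - lambda * sum(data)."""
    return len(data) * math.log(lam) - lam * sum(data)


data = [0.5, 1.5]  # hypothetical times between events
lam = 2.0

print(likelihood(lam, data))
print(exp_pdf(0.5, lam) * exp_pdf(1.5, lam))
print(math.log(likelihood(lam, data)), log_likelihood(lam, data))
```

The first two printed values should agree, and so should the last two, up to floating point rounding.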
Great video! Could you do a whole series on time series analysis/ forecasting?
I'll keep that in mind.
Thank you so much for such an amazing video!
Glad it was helpful!
Thank you so much Josh for all these videos!! Just one question: It got clear how to find the best estimates for the exponential distribution, but how to decide in general if a normal distribution, an exponential distribution or any other distribution fits best the given data?
This is a good question, and one that people ask about a lot. There are a bunch of ways to figure this out. 1) If you have enough data, draw a histogram. Its shape can suggest the distribution. 2) Experience with other datasets that are similar. 3) If you understand the process that generates the data, often this will tell you what distribution to use. 4) Did I mention experience? That's probably the most important one.
@@statquest thanks so much josh, helped to understand the process a lot!!
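Alongside a histogram, one rough numeric check that sometimes comes up is to fit each candidate distribution by maximum likelihood and compare the maximized log-likelihoods. The sketch below does that for an exponential and a normal distribution on simulated data. Everything in it (the data, the seed, the function names) is an assumption for illustration, and it is not a method from the video.

```python
import math
import random
import statistics


def exp_loglik(data):
    """Log-likelihood of the data at the exponential MLE, lambda = n / sum(data)."""
    lam = len(data) / sum(data)
    return len(data) * math.log(lam) - lam * sum(data)


def normal_loglik(data):
    """Log-likelihood of the data at the normal MLEs (mean, variance dividing by n)."""
    n = len(data)
    mu = statistics.fmean(data)
    var = sum((x - mu) ** 2 for x in data) / n
    # At the MLEs the sum of squared terms simplifies to n / 2.
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


# Hypothetical data simulated from an exponential distribution with lambda = 2.
rng = random.Random(1)
sample = [rng.expovariate(2.0) for _ in range(500)]

print("exponential:", exp_loglik(sample))
print("normal:     ", normal_loglik(sample))
```

A larger value means the fitted distribution makes the observed data more likely. The comparison is only a rough guide, since the normal distribution has two parameters and the exponential has one, and the exponential only makes sense for data that cannot be negative.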
Hi!
First, thank you!
Second, What to do if we do not know the distribution of the data?
Thanks a lot for your video, it's awesome..!!!!
I have some doubts, and I would be thankful if you could clarify them.
- In the maximum likelihood approach, we find the best estimate of the parameter, assuming a distribution, by observing the data points. So in that case, we are estimating the parameter after observing the data.
So, is it not the same as the posterior probability, where we update the prior after observing the data?
- Also, what does it mean when it is said that in this approach, the error bars for the estimate are obtained by considering the distribution of possible data sets?
Thanks !!
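This method, maximum likelihood, is different from calculating posterior probabilities. Maximum likelihood only finds the parameter value that makes the observed data most likely; it does not start from a prior or update one.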
They are talking about the standard error. I have a video that explains the standard error: ruclips.net/video/A82brFpdr9g/видео.html
You should do video on Bayes :) Great videos
What is the channel that is the Combined Maximum Learning Estimator for Statistics and Machine Learning?
StatQuest
Triple BAM!!!
I love it! Thank you again! :)
Another great explanation. Thank you!
Hooray! :)
Not critical but @7:10 there should be parentheses after that d/dλ.
By the way, great video!
Thanks!
Hi, this video is very useful. Can you make a video to explain "Maximum Likelihood for the Gamma Distribution"?
I'm very interested in it!
I'll keep that in mind.
Holy crap, I didn't realize the critical points of f and log f are the same until now. Subbed!
Question: you got lambda from a first-order condition, but maximization also needs a second-order condition. Why doesn't this lambda give, for example, the minimum likelihood?
You can do the second derivative and prove that the result is optimal. I've done it, and it works. However, I didn't think it was necessary to add to this video.
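For readers who want to see the second-derivative check mentioned above, here is a short sketch. It assumes n independent observations x_1, ..., x_n and the exponential log-likelihood; the notation is chosen here for illustration and may differ from the video's.

```latex
\begin{align*}
\ln L(\lambda \mid x_1, \dots, x_n) &= n \ln \lambda - \lambda \sum_{i=1}^{n} x_i \\
\frac{d}{d\lambda} \ln L &= \frac{n}{\lambda} - \sum_{i=1}^{n} x_i = 0
  \quad\Longrightarrow\quad \hat{\lambda} = \frac{n}{\sum_{i=1}^{n} x_i} \\
\frac{d^2}{d\lambda^2} \ln L &= -\frac{n}{\lambda^2} < 0 \quad \text{for all } \lambda > 0
\end{align*}
```

Because the second derivative is negative for every positive lambda, the log-likelihood is concave, so the single point where the slope is zero has to be a maximum rather than a minimum.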
Thank you so much. It was very useful
Glad it was helpful!
I guess it is not a coincidence that lambda is the inverse of the arithmetic mean?
Thanks a lot for your videos!
That's right. It's not a coincidence. It's the Maximum Likelihood estimate.
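As a quick numerical check of the "lambda is 1 over the mean" result, here is a small sketch in Python. The data and function names are invented for illustration. It compares the closed-form estimate n / Σx with a brute-force search over a grid of candidate lambdas.

```python
import math

def log_likelihood(lam, xs):
    """Exponential log-likelihood: n * ln(lam) - lam * sum(xs)."""
    return len(xs) * math.log(lam) - lam * sum(xs)

def mle_lambda(xs):
    """Closed-form maximum likelihood estimate: n / sum(xs) = 1 / mean(xs)."""
    return len(xs) / sum(xs)

# Hypothetical times between events.
xs = [2.0, 2.5, 1.5]

lam_hat = mle_lambda(xs)

# Brute force: evaluate the log-likelihood on a grid from 0.001 to 3.000
# and keep the candidate with the highest value.
grid = [i / 1000 for i in range(1, 3001)]
best = max(grid, key=lambda lam: log_likelihood(lam, xs))

print("closed form:", lam_hat)
print("grid search:", best)
print("1 / mean:   ", 1 / (sum(xs) / len(xs)))
```

The grid search is only there as a sanity check; in practice the closed form is all you need for the exponential distribution.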
This is just great explanation :)
Glad it was helpful!
You are amazing!! Thank You Sir!
Thanks! :)
Big BAM! Thank You. Namasthe.
Thanks! :)
Very fun to watch. Every time I feel like woah.
bam!
Could you make the slides available as well? Thank you!!
Wow... very helpful and easy to understand compared to a textbook.
So well explained! Is it not required to check that the second derivative is negative?
The second derivative, in this case, will always tell you the same thing - that you are at a maximum. So you can use it to formally prove that you are correct, but it's not needed.
Hey Josh, thanks for the clear explanation, right from the basics. However, I have a question. Why does the graph show a declining trend? How would you interpret it in a plain English sentence? Could you please help here?
It just means that most events happen in a short amount of time. Fewer events take a long time before they occur.
thank you so much you bloody amazing legend
Love this channel ❤
Glad you enjoy it!
Josh, good day!
I have a question at 0:41. Why exactly is the exponential distribution said to model the time between events? On what grounds?
Can we use other distributions for modelling the time between events?
For example, chi-squared with df=1 has a very similar probability distribution. Or the uniform distribution?
We could collect actual data, the time between events, like people watching this specific video, and plot a histogram of that data and look at its shape.
Love the BAM! XD
Thanks! :)
Please do a video on the ReLU function and its advantage over the sigmoid function.
I'll probably cover that when I do neural networks.
You're the best.
Thanks!
I love you(r channel) so much!
Hooray!!! :)
Hi Josh, can you make your next video about "How To Estimate the 2-Parameter Weibull Distribution with Maximum Likelihood Estimation"? Thank you.
I'll keep that in mind.
Hi Josh, shouldn't it be L(x1|lambda), L(x2|lambda), etc., since we are trying to maximize the likelihood of observing these points if the parameter value were lambda?
You'll probably see it both ways.
josh starmer's the man
:)
Hey Josh, thanks for this set of videos, which save us from not understanding statistical inference, and for your songs as well. Love it. I have a question, if you'd like to answer it. When I look at the histogram or density plot, I am not sure which of two distributions is the better fit for my data. To pick the distribution that fits best, may I compute the MLE for each of them, compare the values of the log-likelihood at those estimates given the data, and then conclude that the distribution with the larger maximized log-likelihood is the one I should pick? Or what would you do? Nothing too technical, please; you are good at that.
See: www.itl.nist.gov/div898/handbook/apr/section2/apr233.htm
@@statquest Josh, this will be like a repeat of many other messages... God bless you for your advice. I am a firm believer that knowledge that is shared is what helps make the world a better place. You are doing that via this YouTube channel.
@@xondiego Thank you very much! :)
Kindly make a video on the Weibull distribution.