Rasmus, you should really start making videos again. Seriously, the way you teach is, by far, one of the best I've seen on YouTube (and I have watched plenty of videos here, since I am a self-taught student of economics).
I am amazed at how simple tricky and deep concepts seem when presented by you. Thank you for this series on Bayesian Statistics.
I can't agree more with what you said, so I second!
Really like your 3 part introduction on Bayesian modelling! Clearly structured, focussed and entertaining - thank you!
Nothing to add. You summed up my feelings perfectly.
Great introduction of Bayesian thinking. Much clearer than textbooks.
Notes for my future revision.
Why Bayesian Data Analysis?
0:29 How easy it is to change a Bayesian model while the computation stays the same.
0:32 You have great flexibility when building Bayesian models, and can focus on that rather than on computational (algorithmic) issues.
There are often computational (processing) issues in fitting a Bayesian model.
But since there is a clean separation between specifying and fitting a model in a Bayesian framework, you often don't have to focus too much on how your model is computed when you construct it. That means you can focus on which assumptions are reasonable and what information you should use, rather than on the algorithm, when doing the actual modelling. There are many tools to help with fitting Bayesian models (Stan, PyMC), so just specifying the model might be enough.
Thank you Rasmus!
A very clear and accessible introduction with lots of opportunities to actually apply what we learned. Thank you for all your work on this.
Answer to the question "Why accept both A & B at the same time": I think I got this one :). For both A & B you need a normalizer, often called the evidence or the marginal likelihood, P(data).
So it must be proportional and the same for A & B; that's why, when you want to compare A and B, you must accept both. It is the evidence for the joint distribution of A & B.
This was so clearly and amusingly demonstrated. Great video!
At 17:57, I think profit B should be rateB * 1000 - (300 for salmon + 30 for brochure).
"I think he is smoking tobacco, but i don't know" hahahaha
All three parts are super helpful. Thanks a lot!
¡Breathtaking! ¡Bravo! No words, just thank you for sharing.
I would be happy to delve further into the interpretations of ML algorithms from a Bayesian perspective, which you talk about at around 20 minutes into the video. I get the linear regression example, but I'm curious to learn more.
Thank you for the excellent introduction series! Shouldn’t the profit calculation at 17:58 also take into consideration the cost of the campaigns for the non-respondents? So it would be profitA = (rateA*(1000-30))+(1-rateA)*-30. And then profitB = (rateB*(1000-300))+(1-rateB)*-300
I too was a little confused about the results, then I realized Rasmus calculated the average profit per person.
Think of it this way: let's say you use campaign A for n people and your signup rate is r. Then:
1. your total_profit is:
total_profit = n*r*1000 - n*30
2. your average_profit is:
average_profit = total_profit / n = (n*r*1000 - n*30) / n = r*1000 - 30
Using Rasmus's notation, your average_profit = rateA*1000 - 30 😊 His numbers are right. I hope this helps.
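For anyone who wants to check this numerically, here is a minimal Python sketch (the video uses R; the function names and the example rate of 0.375 are my own choices for illustration). It compares the "signups vs. non-signups" formula from the question with the simplified per-person formula, and with total profit divided by n.

```python
def profit_per_person_split(rate, revenue, cost):
    """Expected profit per person, split into signups and non-signups."""
    return rate * (revenue - cost) + (1 - rate) * (-cost)


def profit_per_person(rate, revenue, cost):
    """The same quantity after simplifying: rate * revenue - cost."""
    return rate * revenue - cost


def total_profit(n, rate, revenue, cost):
    """Total profit when the campaign is sent to n people."""
    return n * rate * revenue - n * cost


# Illustrative values only: a signup rate of 0.375 and campaign A's cost of 30.
example_rate = 0.375
split = profit_per_person_split(example_rate, 1000, 30)
simple = profit_per_person(example_rate, 1000, 30)
average = total_profit(16, example_rate, 1000, 30) / 16
print(split, simple, average)
```

All three numbers agree, because rate*(1000 - 30) + (1 - rate)*(-30) expands to rate*1000 - 30: everyone costs 30, whether they sign up or not.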
Great explanations, thanks! I just didn't quite understand why, in the A/B testing, you should accept only results that match "both" datasets at the same time. Is it simply so that you can apply arithmetic to the resulting generated data (like you did for getting the difference between A and B)?
Thanks!
7:50 I am confused by rate_diff. The simulated subscription rates for Method A and B are not related in any way other than the order in which they are generated (unlike, say, repeated-measures pre- and post-test scores). Is it meaningful to calculate difference scores between two numbers that are not related? Can we just look at confidence intervals?
@rasmusab, there is a small mistake in your tutorial at 4:40. The second rate is not incorporated into the model, so 64 =/= 72 (which was from the previous draw).
Very clear and helpful! Best resource I've seen so far.
Thanks for posting these videos, man. I'm an economics student and I am very interested in this kind of thing.
17:34 Method B: Don't you send out a brochure (30 kr) with the salmon (300 kr), unless it is already included (fish 270 kr, brochure 30 kr)? Don't forget to include a QR code. :)
17:30 should be
profitB = rateB x 1000 - (300 + 30)
Excellent introduction! However, I can't stop thinking of those 16 unsuspecting Danes with "fresh" fish waiting for them to come home :D
Do you make the slides available? Such great explanations. Love your presentations. Thank you so much 🙏
Why do we need to discard samples when only one draw matches the real data (A:4, B:10)? Why not just throw away the sample for the non-matching method or even just sample them separately? It isn't a bivariate distribution so each method's draws are independent of the other method's.
Please, could you provide a method for calculating the alpha & beta parameters of a beta distribution given a particular probability? I noticed that in your example, the probability of success ranges between 5% and 15%, and you generated a distribution with params Beta(3,28). So the question is: how did you arrive at alpha = 3 and beta = 28, respectively? Thanks!
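One common way to get there (I don't know whether this is how the values in the video were chosen, they may simply have been found by trial and error) is to pick a mean and a standard deviation for the rate and solve for the shape parameters, i.e. the method of moments. Below is a minimal Python sketch; the function names are my own, and the round trip uses Beta(3, 25), the shape quoted in other comments here.

```python
def beta_mean_sd(alpha, beta):
    """Mean and standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total ** 2 * (total + 1))
    return mean, variance ** 0.5


def beta_params_from_mean_sd(mean, sd):
    """Shape parameters of the beta distribution with the given mean and sd."""
    if not 0 < mean < 1:
        raise ValueError("mean must be strictly between 0 and 1")
    variance = sd ** 2
    if not 0 < variance < mean * (1 - mean):
        raise ValueError("sd is too large (or not positive) for this mean")
    # concentration equals alpha + beta
    concentration = mean * (1 - mean) / variance - 1
    return mean * concentration, (1 - mean) * concentration


# Round trip: start from Beta(3, 25), compute its moments, recover the shapes.
mean, sd = beta_mean_sd(3, 25)
alpha, beta = beta_params_from_mean_sd(mean, sd)
print(mean, sd, alpha, beta)
```

In practice you would choose the mean and sd so that most of the probability mass covers the range you believe in (for example roughly 5% to 15%), plot the result, and adjust until it looks right.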
I would also like to see explicit code for how one creates beta distributions in R. I'm currently trying with dbeta(), but I can't get the same distribution you get for alpha = 3, beta = 25.
@@nickjames1066 Use rbeta, just like runif and rbinom. E.g. hist(rbeta(n = 10000, shape1 = 3, shape2 = 25), col='darkgreen')
Great video!
For the decision analysis, why do you need Bayesian analysis? You could just use the maximum likelihood estimate with the cost equation and see that the brochure alone is better (0.38*1000 - 30 > 0.63*1000 - 300). If you're making a decision based on this, it doesn't seem beneficial to do the Bayesian analysis.
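One thing the point estimate cannot tell you is how sure you should be about that ranking. Here is a rough Python sketch of what the posterior adds (the video uses R). Everything specific is an assumption on my part: I read the 0.38 and 0.63 above as 6 of 16 and 10 of 16 signups, and I use a uniform prior, for which the posterior of a rate is Beta(1 + signups, 1 + non-signups). With an informative prior the numbers would differ.

```python
import random


def posterior_draws(signups, n_people, n_draws, rng):
    """Draws from the posterior of a signup rate under a uniform Beta(1, 1) prior."""
    return [rng.betavariate(1 + signups, 1 + n_people - signups)
            for _ in range(n_draws)]


rng = random.Random(1)
n_draws = 20000

# Assumed data: 6 of 16 signups for A and 10 of 16 for B.
rate_a = posterior_draws(6, 16, n_draws, rng)
rate_b = posterior_draws(10, 16, n_draws, rng)

# Profit per person, using the cost equation from the comment above.
profit_a = [r * 1000 - 30 for r in rate_a]
profit_b = [r * 1000 - 300 for r in rate_b]

mean_profit_a = sum(profit_a) / n_draws
mean_profit_b = sum(profit_b) / n_draws

# Share of posterior draws in which A is the more profitable method.
prob_a_better = sum(pa > pb for pa, pb in zip(profit_a, profit_b)) / n_draws

print(mean_profit_a, mean_profit_b, prob_a_better)
```

Under these assumptions the expected profits also favour A, but prob_a_better comes out well short of 1, so the data leave real doubt about which method is better. That is the extra information you would use to decide whether to collect more data before committing.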
Great videos!!! Thank you very much. I have a question, though, that troubles me. I didn't understand why, when calculating the posteriors for methods A and B, we have to consider the one together with the other. Why do we need to keep the probability values only when both model responses agree with the observed responses? Couldn't we produce the posteriors by running the models independently?
Yes, for this specific model, that is correct! If you can figure out which parameters are independent of each other, then you can run those parts independently. In general, however, you would have to run it all at the same time. :)
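To see this in practice, here is a small Python sketch of the rejection approach run both ways (the video uses R). The data, 6 of 16 signups for A and 10 of 16 for B, and the uniform priors are my assumptions, and the function names are made up for illustration.

```python
import random


def simulate_signups(rate, n_people, rng):
    """Simulate how many of n_people sign up at the given rate."""
    return sum(rng.random() < rate for _ in range(n_people))


def joint_rejection(obs_a, obs_b, n_people, n_draws, rng):
    """Keep a pair of rates only if BOTH simulated datasets match the data."""
    kept_a, kept_b = [], []
    for _ in range(n_draws):
        rate_a, rate_b = rng.random(), rng.random()  # uniform priors
        sim_a = simulate_signups(rate_a, n_people, rng)
        sim_b = simulate_signups(rate_b, n_people, rng)
        if sim_a == obs_a and sim_b == obs_b:
            kept_a.append(rate_a)
            kept_b.append(rate_b)
    return kept_a, kept_b


def separate_rejection(obs, n_people, n_draws, rng):
    """Keep a rate if its own simulated dataset matches the data."""
    kept = []
    for _ in range(n_draws):
        rate = rng.random()  # uniform prior
        if simulate_signups(rate, n_people, rng) == obs:
            kept.append(rate)
    return kept


def mean(values):
    return sum(values) / len(values)


rng = random.Random(7)
joint_a, joint_b = joint_rejection(6, 10, 16, 60000, rng)
sep_a = separate_rejection(6, 16, 20000, rng)
sep_b = separate_rejection(10, 16, 20000, rng)

print(len(joint_a), mean(joint_a), mean(joint_b))
print(len(sep_a), len(sep_b), mean(sep_a), mean(sep_b))
```

Both versions target the same posteriors here because nothing links the two rates. The joint version just throws away far more draws, while the separate version leaves you with two arrays of different lengths that you have to pair up yourself before computing a difference.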
You need the normalizer P(data) to be the same for both A & B in order to compare them, so you need to accept the parameters for both at the same time.
Finally I have an idea of how to apply Bayesian analysis to optimization.
Awesome video!
But I think the cost of the brochure is missing in method B; only the cost of the salmon has been considered, so profitB would be (1000rateB-330) instead of (1000rateB-300).
Anyway, the whole idea is perfectly explained.
Cheers!
Ah! So when you pay for shipping the salmon, due to the postage system in Scandinavia, you can slip in a brochure at no extra cost in postage :)
@@rasmusab Do they also give you the paper and print it for free? :p - Great bayes series btw, thank you!
Finally Bayesian is making sense
Great introduction, thank you!
One question: at t=268, why don't you keep the 72% rate2 for model B, instead of throwing both tries away?
Very good explanation.
Shouldn't profitB = rateB*1000-300-30?
Thanks Rasmus, damn good!
Is there an R tutorial for this video?
It is very interesting!
How do I import data from other software into WinBUGS for analysis?
Dude, you're a hero!
Can we use the posterior of one round of computation as the informative prior of the next round of improved estimate? When it has no relation to reality such as an expert opinion, and still has its origin from a wild guess of uniform distribution, would it be a better estimate?
I came with the same question!!
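For the beta prior with binomial data discussed in this thread, reusing the posterior as the next prior is straightforward, because the posterior is again a beta distribution: you add the signups to the first shape parameter and the non-signups to the second. A minimal Python sketch (the first batch of 6 of 16 is my reading of other comments, and the second batch is entirely made up for illustration):

```python
def update_beta(prior, signups, non_signups):
    """Posterior shape parameters after observing binomial data with a beta prior."""
    alpha, beta = prior
    return (alpha + signups, beta + non_signups)


uniform_prior = (1, 1)

# Round 1: assumed data, 6 signups and 10 non-signups.
after_round_1 = update_beta(uniform_prior, 6, 10)

# Round 2: a hypothetical new batch, 4 signups and 12 non-signups,
# using the round 1 posterior as the prior.
after_round_2 = update_beta(after_round_1, 4, 12)

# Analysing both batches at once from the original prior.
all_at_once = update_beta(uniform_prior, 6 + 4, 10 + 12)

print(after_round_1, after_round_2, all_at_once)
```

Updating in two rounds gives the same posterior as using all the data at once. The estimate improves because the second round brings new data, so feeding the same data in twice would only make the posterior look more certain than it should.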
Excellent content!
Thanks for the great videos! Learnt a lot
Congrats for the videos. They are amazing. I really like the way you explain. It's brilliant. Also, your voice is a good fit for videos.
When you calculate the profit for method B, shouldn't you take out 300 for the salmon and 30 still for the mail since you send both?
Ah, but the postage system works in mysterious ways. If you're already sending a salmon, then you don't need to pay postage for the brochure, which was the main cost of the 30 kr. :)
really like the practical example
Congratulations! Perfect video and brilliant explanation! I wonder, what's wrong with your BayesianAid project? I missed it on cran.r-project.org?! In fact I use it and even have made a little fix for RNG.
when will part three be posted?? these videos are good!
Sorry for the delay, here it is: ruclips.net/video/Ie-6H_r7I5A/видео.html :)
Hi, thanks for the great talk. May I ask a question: do I need to follow the order of rateA and rateB when computing the diff, or can I randomly draw values from A and B to compute the diff?
Generally order matters, but in this specific case it doesn't as the two rates are completely unrelated. :)
@@rasmusab A great video, thank you. I do have a question though ;) If my arrays of rates A and B are of different sizes, how can I calculate the rate_diff distribution? Or have I made a mistake, and they should be of equal size?
@@JohnDraper1993:
Nice lecture!
I actually came across the same problem - do you have a solution for that?
Thanks so much for your teachings - will check out your new course for sure!!!
@@rasmusab
Nice lecture!
I actually came across the same problem as John - do you have a solution for that?
Thanks so much for your teachings - will check out your new course for sure!!!
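In case it helps with the different-sizes problem: you typically end up with unequal arrays when A and B are filtered separately. Since the two rates are unrelated in this specific model (see the reply above about order not mattering), one workable approach is to pair up draws at random and trim to the shorter length. A minimal Python sketch, with made-up example arrays and a hypothetical function name:

```python
import random


def rate_diff_unequal(rate_a, rate_b, rng):
    """Differences between randomly paired draws, trimmed to the shorter array.

    Only appropriate when the draws for A and B are independent of each other.
    """
    n = min(len(rate_a), len(rate_b))
    paired_a = rng.sample(rate_a, n)  # n draws without replacement, random order
    paired_b = rng.sample(rate_b, n)
    return [a - b for a, b in zip(paired_a, paired_b)]


rng = random.Random(3)

# Made-up posterior draws of different lengths, standing in for your arrays.
rate_a = [rng.betavariate(7, 11) for _ in range(1200)]
rate_b = [rng.betavariate(11, 7) for _ in range(900)]

rate_diff = rate_diff_unequal(rate_a, rate_b, rng)
share_a_higher = sum(d > 0 for d in rate_diff) / len(rate_diff)
print(len(rate_diff), share_a_higher)
```

If you instead keep a pair only when both simulated datasets match, as in the video, the two arrays come out the same length automatically and no pairing step is needed.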
Please, please make a video on Gaussian processes. None of the videos on YouTube give intuition like yours do.
very informative. Many Thanks
Great lecture
Simply great!
Thanks for the upload, your explanations are great. Just FYI, during the 13th minute, your x axis label on the "Informative" histogram might be a mistake? It caused a little confusion for me.
Yep! Definitely a mistake, nicely spotted. Both axes should read "Posterior on the rate of signup". :)
Nice work!
Please answer a question: how do I generate informative rates using beta(3,25) for n draws?
Hi tukmyjob. I'm sorry, but I'm not sure I understand the question... :)
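If the question is simply how to draw n candidate rates from a Beta(3, 25) prior, then in R the rbeta call shown in another reply does it. For comparison, here is a Python sketch using only the standard library; the function name and the seed are my own choices.

```python
import random


def informative_rates(n_draws, alpha=3.0, beta=25.0, seed=None):
    """Draw n_draws candidate signup rates from a Beta(alpha, beta) prior."""
    rng = random.Random(seed)
    return [rng.betavariate(alpha, beta) for _ in range(n_draws)]


rates = informative_rates(10000, seed=42)
print(min(rates), max(rates), sum(rates) / len(rates))
```

Each draw would then be fed into the simulation step in place of a draw from the uniform prior.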
If you go to time point 11:20, the bottom distribution shows a discrete graph of the continuous Beta distribution. The values are randomly generated from this distribution. The fastest way to generate random values from a specific distribution is to use uniform random numbers and plug them into the inverse of the cumulative distribution function. What does the inverse of the Beta distribution look like? Alternative methods also exist. This question assumes that a pre-supplied package of functions is not being used.
Have you had any experience with polynomials (see Peter Fleischmann, 1978), a method which was enhanced by Todd Headrick (2002), for generating random values from non-Gaussian distributions? Sadly, this is something that I just recently became acquainted with. Thank you for your videos!!!
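On the inverse: the beta CDF has no closed-form inverse in general, so it is normally evaluated numerically (R's qbeta does this for you). Below is a Python sketch of inverse-CDF sampling without any statistics package. It is restricted to positive integer shape parameters, where the beta CDF equals a binomial tail sum, and it inverts the CDF by bisection. The function names are made up, and I am not claiming this is the fastest method.

```python
import math
import random


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x, for positive integers a and b.

    Uses the identity: P(Beta(a, b) <= x) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(math.comb(n, j) * x ** j * (1 - x) ** (n - j)
               for j in range(a, n + 1))


def beta_quantile_int(u, a, b, tol=1e-10):
    """Invert the CDF by bisection: find x with beta_cdf_int(x, a, b) = u."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf_int(mid, a, b) < u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def sample_beta_inverse_cdf(n_draws, a, b, rng):
    """Inverse transform sampling: push uniform draws through the quantile function."""
    return [beta_quantile_int(rng.random(), a, b) for _ in range(n_draws)]


rng = random.Random(11)
draws = sample_beta_inverse_cdf(1000, 3, 25, rng)
print(sum(draws) / len(draws))
```

For non-integer shape parameters you would need a numerical routine for the regularized incomplete beta function in place of the binomial sum.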
Shouldn't we deduct 330 from expected profit of B since we are sending both salmon and a pamphlet?
Ah nvm, saw your answer below :) Great video by the way.
Unable to find part 3. Could you help me with it?
ruclips.net/video/Ie-6H_r7I5A/видео.html
really really good
Thanks
Thanks!
9:20 my man
"How much we should trust our CEO? I think he is smoking tobacco, but I don't know"
Hail grandpa with the pipe
do you realize that your Nordic accent is very sexy?