That golf putting model is just about the coolest thing ever.
I know this is a rather old video, but it is still highly relevant and useful. At 47:02 I don't think a standard EV calculation really does that situation justice. With high-payout/low-loss situations like that, I think it is better to weight the payouts by their utility. For example, losing $10 may have basically no subjective utility loss compared to the subjective utility gained from having $100k. Let's say that to me having $100k has 20k times as much utility as losing $10 does. When you switch from an EV calculation based on the dollar win/loss to one based on the subjective utility of the payouts, there is a drastic increase in the EV (although it is still negative in this case). e.g.:
win = 100000 dollars
lose = 10 dollars
(win * 5.4e-06) - ((1 - 5.4e-06) * lose) = ~ -$9.46
win_util = 20000 utility points, or "utils"
lose_util = 1 util
(win_util * 5.4e-06) - ((1 - 5.4e-06) * lose_util) = ~ -0.89 utils
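A quick Python check of the same arithmetic (a sketch only; the 5.4e-06 win probability is simply the figure used above):

p_win = 5.4e-06
win, lose = 100_000, 10                          # dollars
win_util, lose_util = 20_000, 1                  # subjective "utils"
ev_dollars = win * p_win - (1 - p_win) * lose
ev_utils = win_util * p_win - (1 - p_win) * lose_util
print(round(ev_dollars, 2), round(ev_utils, 2))  # -> -9.46 -0.89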
This is a simple example and we could certainly argue about the specific subjective utility values, but I think it shows that the normal EV calculation doesn't really do the situation justice once you compare the utility of the win against the utility of the loss. One could also flip this around and talk about the subjective utility of the losing outcomes versus the winning ones: say the bet were overall +EV in dollars, but the subjective value of winning so rarely was outweighed by the subjective cost of losing so often.
I got this concept from game theory. There are plenty of examples, especially in poker, where doing something that is -EV right now can lead to a +EV situation later on. Poker players call that implied EV; an example would be calling with some sort of draw when the current raw pot odds don't justify it, because you know that when you do make your hand the profits will make up for the marginal loss now. So, for example, let's say I have an idea for a product or service that would earn $50k a year off a $100k investment. Using a fairly standard 10x income multiple for valuation, I could say the subjective utility of winning that $100k is really worth about 50k utility points (a ~$500k valuation at the 1 util per $10 scale above) versus the 10k utility points implied by an even weighting. This specific situation would still be -EV, though.
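The same check for that implied-utility version (same assumed 5.4e-06 win probability; the 50_000 utils is just my reading of the valuation argument above):

p_win = 5.4e-06
implied_win_util = 50_000                        # ~$500k valuation at 1 util per $10
print(round(implied_win_util * p_win - (1 - p_win) * 1, 2))  # -> -0.73, still negative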
All of that leads me down the path of seriously doubting most of rational economics.
And even just going with a constant valuation of money, you get _multiple entries_. So, sure, the probability of one entry winning is whatever he got, but you can submit multiple entries whose combination meaningfully raises the chance of a win, especially when you consider that you can make about 10k submissions.
It made me a little sad to see that reductive analysis (and with all the floating) when I was excited to see a really interesting decision problem at hand :/
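For a rough sense of how much multiple entries would help, assuming roughly independent entries at the ~5.4e-06 per-entry win probability discussed above (both numbers are taken from the thread, not recomputed):

p = 5.4e-06
n = 10_000                                       # roughly the number of submissions allowed
print(round(1 - (1 - p) ** n, 3))                # -> ~0.053, about a 5% chance of at least one win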
Thanks a lot for making this code available for download. That was really helpful for getting started in Stan.
where is the code??
Wow! Brilliant - this really helped me a lot. Thank you.
Also really liked the golf putt example.
Just beginning to learn about Bayesian analysis ... thanks for the great video, and thanks to everyone for the links in the comments ...
Question: Is it correct to say that, in the World Cup example 1, the only parameters estimated by Stan are: b (real), sigma_a >= 0, and sigma_y >= 0?
In other words, Stan figures out (simultaneously/jointly):
(1) the best b and sigma_a for the equation a = b*prior_scores + sigma_a*[eta_a ~ N(0,1)]
(2) the best sigma_y so that student_t(df = 7, a[team_1] - a[team_2], sigma_y) best predicts the sqrt-adjusted score difference, score(team_1) - score(team_2)
It seems kind of weird to me that, after we figure out the formula for a, it kind of boils down to just one parameter, sigma_y.
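For what it's worth, here is a rough numpy sketch of how I read the generative structure being described; this is a paraphrase of the model as presented, not the actual Stan code, and all data values are placeholders:

import numpy as np

rng = np.random.default_rng(0)
n_teams, n_games = 32, 64
prior_scores = rng.normal(size=n_teams)          # placeholder standardized prior rankings
team_1 = rng.integers(0, n_teams, size=n_games)
team_2 = rng.integers(0, n_teams, size=n_games)

# quantities Stan samples from the posterior (drawn once here just to show the structure);
# note that with this non-centered form, the vector eta_a is also a sampled parameter,
# not only b, sigma_a, and sigma_y
b, sigma_a, sigma_y = 0.5, 0.3, 1.0
eta_a = rng.normal(size=n_teams)                 # eta_a ~ N(0, 1)

a = b * prior_scores + sigma_a * eta_a           # latent team abilities
# sqrt-adjusted score differences modeled as Student-t with 7 degrees of freedom
y_diff = a[team_1] - a[team_2] + sigma_y * rng.standard_t(df=7, size=n_games)

So, as I read it, Stan jointly samples b, sigma_a, sigma_y, and the eta_a vector from the posterior, rather than fitting them one at a time.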
Gelman is quite nice to listen to. His RL voice sounds different from his blog voice somehow
Wondered why we have the '-1' in "2 * Phi(asin((R - r) / x) / sigma) - 1" in the golf example model. (Min 51:38)
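In case it helps, the -1 presumably comes from turning a two-sided condition into a probability: if the angle error is normal with sd sigma and the putt succeeds when the error is within ±t (with t the asin(...) threshold), then P(-t < error < t) = Phi(t/sigma) - Phi(-t/sigma) = 2*Phi(t/sigma) - 1. A quick numeric check, with arbitrary values of t and sigma:

from scipy.stats import norm
t, sigma = 0.05, 0.03                            # arbitrary example threshold angle and sd (radians)
two_sided = norm.cdf(t / sigma) - norm.cdf(-t / sigma)
shortcut = 2 * norm.cdf(t / sigma) - 1
print(round(two_sided, 6), round(shortcut, 6))   # identical values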
Thanks Dr. Gelman
Prof. Gelman: At 19:00 you talk about checking how the model fits the data; are there any tools in Stan to avoid overfitting?
There are no "tools", but this may help: andrewgelman.com/2017/07/15/what-is-overfitting-exactly/
Thanks for the great presentation and explanations on real models.
This made me laugh: "working with live posterior"
Thanks for this. Any link to the slides?
www.dropbox.com/s/sfi0pcf7hais91r/Gelman_Stan_talk.pdf?dl=0
This talk (the one matching these slides) was given at Aalto University, October 2016.
At 16:27 you talk about checking predictive posterior distributions for games against their actual results to check if they are within their respective 95% CIs. Are these games training data or unseen data?
He didn't say they were held-out samples, so they were probably in the training data. Ideally you shouldn't do that, because it can hide overfitting. However, Bayesian methods are much less prone to overfitting. In his example he found a completely different kind of problem.
@@Houshalter thanks
@@Houshalter I've heard people argue that Bayesian methods don't overfit, but the modelers can still make incorrect assumptions about the priors and distributions, which can lead to situations that look a lot like overfitting in the classic sense of the word. For example, say we naively look at some time-series data and think we have a solid basic understanding of the distributions of the processes that generated it. We fit a model and it seems to do well on the training set but fails pretty badly on unseen data. There are many reasons this could happen, and most of them come down to the training sample not containing enough data to really estimate the distributions and their parameters, to picking the wrong distributions/priors for those processes, or to the data-generating processes varying over time.
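A toy illustration of that last failure mode (the generating process drifting over time), using a plain maximum-likelihood normal fit as a stand-in for any misspecified model; all numbers here are made up for illustration:

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
t = np.arange(2000)
y = 0.002 * t + rng.normal(0, 1, size=2000)      # mean drifts over time, but we pretend it's a fixed normal

train, test = y[:1000], y[1000:]
mu, sd = train.mean(), train.std()               # fit on the first half only

print(norm.logpdf(train, mu, sd).mean())         # in-sample fit looks reasonable
print(norm.logpdf(test, mu, sd).mean())          # noticeably worse on the later, drifted data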
@@crypticnomad Bayesian models can absolutely suffer from a bad model, but that's a different problem than normal overfitting. And a validation test would not necessarily show any difference from the training set.
"Soccer games are random, it all depends how good the acting is"
What was the bug he fixed? I want to know how he solved the problem.
There was a factor of 2 missing. Full story is here: statmodeling.stat.columbia.edu/2014/07/15/stan-world-cup-update/
Lmfao this guy's so funny