This is all nice, but tell me which horse is going to win the first race at Aqueduct tomorrow.
Five years old but still an interesting talk. I thought about adding some thoughts on a few points, but I realise there are so many different ways to predict races that my thoughts probably won't make a difference. Bill Benter's story (and that of his associates) is pretty far out considering the technology of the time when he was doing his thing with horse racing. Maybe he is still betting alongside his academic pursuits, I don't know. Personally I bet on horses every day using my own spreadsheet formulas, but the latest thing is that, with the help of a data science guy, we are developing a deep supervised-learning ANN using the parameters I know work best for the data sets. It's working somewhat, but time will tell how well that goes. As for ROI, accuracy of prediction and staking methods, I believe that when someone is getting tangible results over a period of time they probably won't be telling others how it's done. Even if they do, I remember something Bill Benter said which really resonated with me: many people don't want to roll up their sleeves and do the hard work. The countless hours I have put into writing formulas or making data sets, lol, I don't even want to think about it. Anyway, it's all fun :)
this was way before its time. Thanks for the great upload!!
The problem we, all the gamblers, face in the end is how the race is going to develop, knowing that the betting companies know prior to the start 1) the number of bets placed on a particular horse and 2) the value of these bets.
Basically everything can be controlled (manipulated) for the benefit of the betting companies, otherwise these companies would cease to exist. If I misspelled something then please pardon my French, but English is not my first language.
Mr. Silverman, thanks for your talk. I'm wondering if you can suggest possible sources for a novice without a statistical background who wants to bet on horses using statistically proven methods. Thank you.
Noah,
Got two questions for you;
1) Does using conditional probability give you any advantage over just using the probability and ranking the horses grouped by race?
2) Are you able to provide any detail on the features that you've used?
I'm looking at doing something similar for my MSc dissertation.
thanks,
Matt
Hey Noah, excellent talk. How did you get the .3 to .4 correlation between the odds and rank outcomes? Is that a number that you computed or something that comes from the academic literature? If you could provide a reference I'd be very grateful. Thanks.
Empirical correlation from dataset. If you want a formal "academic literature" reference, see my paper published on the topic.
Hi Noah, thanks for the great talk. I was wondering how you came up with 186 variables?! And how many of these did LASSO manage to get rid of?
Thanks James,
The 186 variables were deduced from reading a ton of literature on the subject, speaking to experts, and a lot of trial and error. The penalty I used was an L2 (strictly speaking ridge rather than LASSO, which is L1), so some coefficients were pushed to small values, but none to 0.
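For readers wondering why an L2 penalty shrinks coefficients without removing them, here is a minimal sketch. It assumes the simplest textbook setting (a single coefficient with unpenalized estimate `b`, as with an orthonormal design), not the model from the talk; the function names and the example numbers are hypothetical.

```python
def ridge_shrink(b, lam):
    """Minimizer of 0.5*(b - beta)**2 + 0.5*lam*beta**2 (L2 penalty).

    The estimate is scaled toward zero but never reaches it unless b == 0.
    """
    return b / (1.0 + lam)


def lasso_shrink(b, lam):
    """Minimizer of 0.5*(b - beta)**2 + lam*abs(beta) (L1 penalty).

    This is soft-thresholding: anything with |b| <= lam becomes exactly 0.
    """
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


# Hypothetical unpenalized coefficients for three factors.
ols = [2.0, 0.3, -0.1]
lam = 0.5

ridge = [ridge_shrink(b, lam) for b in ols]  # all shrunk, none exactly zero
lasso = [lasso_shrink(b, lam) for b in ols]  # the two small ones are dropped

print("ridge:", ridge)
print("lasso:", lasso)
```

Under these assumptions the L1 penalty is the one that performs variable selection, while the L2 penalty keeps every factor with a smaller weight, which matches the behaviour described in the reply.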
You can break down handicapping factors in a variety of ways.
For example: average prize money won by the jockey, the horse's success from an outside barrier, or the jockey/trainer strike rate for the last 12 months. A lot can be made redundant, but one of the more successful high-profile horse players, Alan Woods, used something like 130+ factors.
By the way, Benter had hired journalists so they could get him some insider info.
Judging from the video description the betting public underestimates the winning chance of a horse in 2 out of 10 races in Hong Kong, enough to overcome the track takeout over the long term. Is that correct? What betting strategy was simulated? Flat betting the bare minimum or a fixed proportion of the bankroll? This is important to know because horse racing typically doesn't encourage the implementation of a Kelly Strategy with a large bankroll relative to the size of the parimutuel pool.
For that academic study, I used a fairly standard Kelly strategy. In "real life", it would be something more complex to manage risk
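As a rough illustration of what a "fairly standard Kelly strategy" usually means for a single win bet, here is a minimal sketch. It assumes fixed decimal odds and a known win probability, and it ignores the parimutuel effect raised in the question (a large bet lowers its own odds). The `scale` parameter for fractional Kelly is one common risk-management tweak, included as an assumption rather than as what was used in the study.

```python
def kelly_fraction(p, decimal_odds, scale=1.0):
    """Fraction of bankroll to stake on a win bet.

    p            -- estimated probability that the horse wins
    decimal_odds -- total payout per unit staked (stake included)
    scale        -- 1.0 for full Kelly, e.g. 0.5 for half Kelly (assumption)
    """
    b = decimal_odds - 1.0           # net winnings per unit staked
    if b <= 0.0:
        return 0.0                   # no upside, never bet
    f = (b * p - (1.0 - p)) / b      # classic Kelly formula
    return max(0.0, f) * scale       # no bet when the edge is negative


# Hypothetical example: model says 25% at decimal odds of 5.0.
full = kelly_fraction(0.25, 5.0)
half = kelly_fraction(0.25, 5.0, scale=0.5)
no_edge = kelly_fraction(0.15, 5.0)

print(full, half, no_edge)
```

The stake is positive only when `p * decimal_odds > 1`, that is, when the model's probability exceeds the probability implied by the odds.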
Thanks for a great overview of your modeling. Are you using any open source libraries to do your conditional logistic regression and the LASSO optimization? Did you write this in C++ for the Mac? Thanks for any information you can provide on your algorithms.
This project was all custom code. Some in R and some in C++
Noah Silverman, how can I get a copy of your study and apply it to U.S. horse racing?
+Joe Beasley Data Science Ltd offers consulting services for the gaming markets.
+Noah Silverman what's their website address?
+Joe Beasley www.datascience.io
New website: www.helios.ai
Super work. I am Chinese and very interested in Hong Kong racing research. How can I learn this and use your data for it?
I need more information about econometric methods and sports betting, please.
Dear Dr. N. Silverman, can you please help me find the parameters for the Benter correction in the Harville formula? How do I get the maximum likelihood estimator from a sample of past data?
Dear Dr. Noah Silverman,
Thanks for uploading such an informative video; with my level of knowledge it is a little hard to understand. I want to know how the parameters for the Benter correction in the Harville formula can be obtained.
Thanks in advance.
+VISHU JITH You'll have to find that one on your own.
+Noah Silverman Thanks for your immediate response. Sorry, that was a typographical mistake. I mean how the parameters for the Benter correction can be obtained.
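For readers with the same question, here is one way the estimation is commonly set up, offered as a sketch under stated assumptions rather than as the method from the talk. The Harville formula gives P(j second | i first) = p_j / (1 - p_i); the Benter-style correction, as it is usually described, raises the win probabilities to a power gamma and renormalizes over the remaining horses. Gamma is then chosen to maximize the log-likelihood of the observed runners-up in past races. The function names, the grid search, and the toy races below are all hypothetical.

```python
import math


def second_place_prob(win_probs, winner, runner_up, gamma):
    """P(runner_up is 2nd | winner is 1st) with discount exponent gamma.

    gamma = 1.0 reproduces the plain Harville formula.
    """
    denom = sum(p ** gamma for k, p in enumerate(win_probs) if k != winner)
    return win_probs[runner_up] ** gamma / denom


def log_likelihood(races, gamma):
    """Sum of log-probabilities of the observed runners-up."""
    return sum(
        math.log(second_place_prob(probs, first, second, gamma))
        for probs, first, second in races
    )


def fit_gamma(races, lo=0.2, hi=1.5, steps=131):
    """Crude maximum likelihood: evaluate a grid and keep the best gamma."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda g: log_likelihood(races, g))


# Toy data, purely illustrative: (win probabilities, winner index, runner-up index).
races = [
    ([0.50, 0.30, 0.20], 0, 1),
    ([0.40, 0.40, 0.20], 1, 2),
    ([0.60, 0.25, 0.15], 0, 2),
    ([0.45, 0.35, 0.20], 2, 0),
]

gamma_hat = fit_gamma(races)
print("fitted gamma on toy data:", gamma_hat)
```

With real data you would use thousands of past races, and a second exponent for third place can be fitted the same way, conditioning on the first two finishers.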
Hello Noah, I have been doing similar things with a Benter-style two-step regularised conditional regression on Australian races and have read your dissertation thoroughly. My question is, does using the frailty/strength term from the odds have an effect similar to using the Kelly criterion? You are weighting horses that your model favours more than the public (odds) with a greater final probability? Are you then placing a uniform bet across all races? Wouldn't that be the same as finding a win probability that is un-weighted by the odds and using a Kelly bet to modify your stake to maximize your winnings?
The two are not mutually exclusive. You can use weights in training AND Kelly for betting. They're separate things.
Noah Silverman Thanks for replying. I suppose you can do both, and I guess they both do a similar thing. Interesting to see how a Kelly strategy works for your already weighted system, could be more robust due to the regression but also more non-linear as similar information is being used twice. Thanks again!
Thank you so much
Since then, have you played with LSTM or convolutional networks on this project or something similar? Any better results?
I have not. The challenge with any ANN is setting up the conditional probability (the probabilities for horses in a race must sum to 1.0)
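One common way to handle the constraint mentioned here is to let the network output one raw score per horse and apply a softmax within each race, so the probabilities in a race sum to 1 by construction. The sketch below assumes that approach; it is not something tried in the talk, and `race_softmax`, `winner_nll`, and the example scores are hypothetical names and values.

```python
import math
from collections import defaultdict


def race_softmax(scores, race_ids):
    """Turn per-horse scores into probabilities that sum to 1 within each race."""
    by_race = defaultdict(list)
    for idx, rid in enumerate(race_ids):
        by_race[rid].append(idx)

    probs = [0.0] * len(scores)
    for idxs in by_race.values():
        top = max(scores[i] for i in idxs)           # subtract max for stability
        exps = {i: math.exp(scores[i] - top) for i in idxs}
        total = sum(exps.values())
        for i in idxs:
            probs[i] = exps[i] / total
    return probs


def winner_nll(probs, winner_indices):
    """Negative log-likelihood of the observed winners (the training loss)."""
    return -sum(math.log(probs[i]) for i in winner_indices)


# Hypothetical model outputs for two races (three and two runners).
scores = [1.0, 0.0, -1.0, 0.5, 0.5]
race_ids = ["R1", "R1", "R1", "R2", "R2"]

probs = race_softmax(scores, race_ids)
loss = winner_nll(probs, winner_indices=[0, 3])
print(probs, loss)
```

With a linear score this is exactly the conditional logit model; with a network producing the scores, the same loss can be minimized by gradient descent in any deep learning framework.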
Is this only optimal in Hong Kong, or could this be used at Fonner Park in Nebraska?
What would happen if a quantum computer gave you an optimized result in a fraction of a second and ruined the whole industry?
Nice fantasy, but things don't work that way. Just because a machine is "quantum" doesn't mean it has infinite insight into any phenomenon in the world.
But if you have a model, then all that is left is to get an optimized answer, which I think a quantum machine like the D-Wave at Google could do, couldn't it?
at 4:05 if I remember my 8th-grade math correctly does ∝ mean that there is a constant in the formula or am I an idiot?
Unless someone has inside info about a race, there is no reliable way of predicting the outcome of a thoroughbred horse race. There are too many variables, not the least of which is the horse itself, whose temperament and condition at post time is known only to the horse, and the horse is keeping that a secret. The fact that even the most successful jockeys win only a small fraction of their races is proof that, presuming that the races are legitimate, the outcome is not a sure bet. Recently, in a maiden claiming race, the 75 to 1 longshot won by two lengths while the 6 to 5 favorite came in eighth. Predicting races is entertaining, but don't expect the horses to cooperate. They have other concerns that have nothing to do with money.
I respectfully disagree (of course)
Tell Bill Benter that there's no reliable way of predicting the outcome of a horse race lol
"Predicting races is entertaining, but don't expect the horses to cooperate. They have other concerns that have nothing to do with money." LOL FOFL My coffee nearly came out of my nose. GOOD ONE
You would have to use Kelly and the law of large numbers to mitigate uncertainty and bad luck. Is that what you would do Noah?
The reason people can win money in Hong Kong is that the pool is a pari-mutuel pool with many unsophisticated punters, so there is room for a difference between the true probability and the odds.
Noah, was the quoted ROI calculated off closing prices?
Daniel Wishart I don't actually remember. This talk was from several years ago, and things have advanced significantly beyond the work presented.
Soo.... Have you made your billions yet?
Are you saying you would combine the public's implied odds (strength) with your coefficients? You're using public odds as a coefficient?
A lot of racing models use the public odds as *one* of several factors. There is information in there.
And, to clarify: we have "factors" in the model, and then use machine learning techniques to estimate the coefficients (weights of the factors). So the public odds is a "factor", not a coefficient.
So this differs from Benter slightly, as he suggested running a second logit model combining the public estimate and your fundamental estimate?
There are many ways to do this.
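As one illustration of the two-step idea raised in the question, here is a minimal sketch of a second-stage combination: the fundamental model's probability and the public's implied probability enter a logit in log form, and the result is renormalized within the race. This is only one of the "many ways"; the weights `alpha` and `beta` are hypothetical values that would normally be estimated by maximum likelihood on races not used to fit the fundamental model, and the odds normalization assumes the takeout is spread proportionally across horses.

```python
import math


def implied_probs(decimal_odds):
    """Public win probabilities from decimal odds, normalized to sum to 1."""
    raw = [1.0 / o for o in decimal_odds]
    total = sum(raw)                 # exceeds 1 because of the track takeout
    return [r / total for r in raw]


def combine(fundamental, public, alpha, beta):
    """Second-stage logit: p_i proportional to fundamental_i**alpha * public_i**beta."""
    logits = [
        alpha * math.log(f) + beta * math.log(p)
        for f, p in zip(fundamental, public)
    ]
    top = max(logits)                # subtract max for numerical stability
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-horse race.
fundamental = [0.50, 0.30, 0.20]     # output of the factor model
public = implied_probs([2.5, 3.0, 6.0])

combined = combine(fundamental, public, alpha=0.6, beta=0.7)
print(combined)
```

Setting `alpha=0, beta=1` returns the public estimate unchanged and `alpha=1, beta=0` returns the fundamental estimate, so the fitted weights indicate how much each source contributes.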
Noah Silverman, last question: do you think your model's R-squared outperforming the public model's R-squared is a good indicator of potential success (along with OOS testing for ROI)?