As an Italian, listening to you speak Italian was a wonderful surprise. Bless your mind and bless your simple method.
Absolutely amazing (and the videos are so formative), but it sounds to me like he has a French cadence when speaking Italian, doesn't he?
I love it when a professor of the caliber of Taleb explains basics. The insights he gives and the gotchas he warns of are never taught in class. His roots as a trader give viewers a totally different point of view from that of math professors at uni (though I am eternally grateful to them).
Definitely!👌
Every time Taleb laughs under his breath I feel like a schmuck because I know he's laughing at something so obvious it doesn't need to be said, and yet I have no clue what he's laughing at 💀
Agree with your frustration, but I don't think you should disparage yourself.
Taleb has demonstrated true (and hitherto poorly appreciated) perspectives on reality to people like me who lack an understanding of the inner workings of his reasoning.
I am trying to assemble from these MOOCs the elements I should understand in order to appreciate his intuitions.
you got company :)
So relatable 😅
I don’t have that problem ... I get it every time he chuckles ... not ... 😀 ... but indeed, it does give a few good intuitions still which I plan to apply happily at the bar (if ever ... cfr COVID) to friends who think they are great investors 😀
It's amazing that we can access such high quality lessons on YouTube. Thank you Nassim!
Yes, thank you.
Man, hearing the greatest speaking my mother tongue gave me chills! Great Taleb, you're number 1, my friend!
I read your other books Fooled, Black Swan, Antifragile, Skin in the Game. Started Consequences of Fat Tails but put it down because the math was beyond me, but never saw the part on Mutual Information. First I'm hearing of this concept. Been a web coder most of my life, and still a slave to my corporate overlords, but would like to get into quant finance. Thank you for the lesson maestro.
Thank you, this is good. I've stumbled into a few 'natural' data sets of a 'random' nature that seemed to have patterns, but I finally understand that you can't fit the world to a straight line. This isn't a one-size-fits-all model, yet it is 'abused everywhere'.
Never subscribed to a channel faster in my life.
Thank you so much! I was a safety engineer; you changed my career path and cured my cancer 🙂. I was struggling to find a scientific basis for risk assessment... but I abandoned that and switched to business intelligence! These insights that science needs in order to handle complexity are priceless. From Iran! 👌🙏
The incipit in Italian totally made my day. Thanks, my friend!
I see Professor Taleb is basically pitching information theoretic measures like mutual information. Let me suggest students read Cover and Thomas’ book on information theory, which was a textbook for a class I took years ago. The math is not too hard and the power you obtain is very great. At least everybody should learn about the basics in the first couple chapters, and when I say “should” I mean at a higher priority than learning calculus.
FRIENDS!
The maestro speaks Italian very well.
Appreciate you.
I'm reading his book "Fooled by Randomness" at work everyday, this book has changed how I see people, the world, life, everything.
make bank from it yet?
"Today we talk about correlation", okay, I love you. Compliments on your Italian!
Is the use of Italian here a comment on correlation in collective individualism?
As always, great insights from Taleb. Statistics is critical in making sense of genetic GWAS.
Hearing you speak Italian makes me very happy, thank you Taleb
Thank you for explaining, I didn't understand why IQ was so problematic from a correlation standpoint.
Beautiful video, and very well done on the Italian
The greatest. Professor, you are great. The best.
Your Italian is very very good!
Nassim, great videos - super insightful, different and short.
On your critique of correlation and standard deviation: you state that these measures do not work for non-linear processes and distributions with fat tails. You clearly state that one of the key assumptions is that samples are drawn from the SAME distribution.
What if the fat tail comes from a DIFFERENT distribution? One might, by mistake, use the same object but draw observations from 2 different distributions.
Example: measuring the length of the same astronaut's jumps on Earth (possibly thousands of observations during the training phase) and on the Moon (a very rare, small sample). Are these the same distribution? Or the behaviour of Lehman's P&L and BS during “normal” times and during the shock of the GFC?
Should we think in terms of the probability of regime change - a transition/jump to a different distribution - rather than a single distribution with a fat tail?
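The regime question above can be explored numerically. The following is only a sketch under made-up assumptions (a 2% shock probability and a shock regime with 5x the calm volatility, both hypothetical): it shows that drawing from two thin-tailed Gaussian regimes produces a sample that looks fat-tailed when treated as a single distribution. It does not settle which description is the better model.

```python
import numpy as np


def excess_kurtosis(x):
    """Sample excess kurtosis (0 for a Gaussian)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**4) - 3.0)


rng = np.random.default_rng(0)
n = 200_000
p_shock = 0.02  # hypothetical probability of being in the shock regime

calm = rng.normal(0.0, 1.0, n)    # "normal times" regime
shock = rng.normal(0.0, 5.0, n)   # hypothetical crisis regime, 5x the volatility
in_shock = rng.random(n) < p_shock
mixed = np.where(in_shock, shock, calm)  # what an observer pooling both regimes sees

k_calm = excess_kurtosis(calm)
k_mixed = excess_kurtosis(mixed)
print(f"excess kurtosis, calm regime only: {k_calm:.2f}")
print(f"excess kurtosis, pooled regimes:   {k_mixed:.2f}")
```

Each regime on its own is Gaussian, yet the pooled sample has a large excess kurtosis, so the two framings (one fat-tailed distribution, or switching between regimes) can look alike in the data.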
First time to see correlation in the straight line equation 👍🏽👍🏽
Any guesses for what color sweater we’ll get next video?
I’d like a light gray to hide any chalk marks ☺️
Thank you thank you thank you Prof Taleb - Your books changed the way I saw the world
Worse is when people do all the correlation in soft sciences.
Eternal information.
Thanks for sharing! Excellent Italian, congratulations!
Sir can you do a video on philosophy of Bayesian statistics?
Mamma mia! Great Taleb
Good Italian accent! Thank you for your videos, Mr Taleb
Love the videos keep them coming!
Just finished final exams and final school projects! Back to relearn the statistics lessons. It's been so long!
Mr Taleb, but you speak Italian!! What a surprise!
A beautiful language.
@nntalebproba And yours are beautiful lessons!
Thank you for sharing.
Regarding the problem you describe with correlation, is it affecting the result of paired t-test and which method should we use instead if so?
Thank you if you can reply.
Best wishes.
Gold..
Thank you for your explanation.
It would be really cool if you explained more about how to compute the mutual information. Especially for the case that you have only data pairs (x_i, y_i), and may not know what the underlying distributions should be. In your paper you show this for continuous random variables (equation 4, which assumes that the underlying distributions f(x,y), f(x,.) and f(y,.) are known). I got somehow lost in trying to understand the details and how to apply this. Maybe you could cover this a little more ? Perhaps with an example where the correlation is high but the mutual information is small (and be it just to make fun of some social scientists :-)
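One common way to estimate mutual information from data pairs alone is a histogram ("plug-in") estimator. The sketch below is not the method from the paper mentioned above; the function name `mutual_information_hist` and the bin count are illustrative assumptions, and the estimate is known to depend on the binning and to be biased upward in small samples. The example shows the reverse of the case asked about: a relationship with near-zero correlation but clearly positive mutual information.

```python
import numpy as np


def mutual_information_hist(x, y, bins=20):
    """Plug-in estimate of I(X;Y) in nats from paired samples via a 2D histogram."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = counts / counts.sum()              # empirical joint probabilities
    px = pxy.sum(axis=1, keepdims=True)      # marginal of X, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)      # marginal of Y, shape (1, bins)
    mask = pxy > 0                           # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))


rng = np.random.default_rng(1)
n = 100_000
x = rng.normal(size=n)
y_indep = rng.normal(size=n)                 # independent of x
y_dep = x**2 + 0.1 * rng.normal(size=n)      # strongly dependent, but not linearly

corr_dep = float(np.corrcoef(x, y_dep)[0, 1])
mi_indep = mutual_information_hist(x, y_indep)
mi_dep = mutual_information_hist(x, y_dep)
print(f"correlation(x, x^2 + noise): {corr_dep:.3f}")
print(f"MI estimate, independent pair: {mi_indep:.3f} nats")
print(f"MI estimate, dependent pair:   {mi_dep:.3f} nats")
```

For real data one would want to check how the estimate moves with `bins` and sample size before reading anything into it.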
One thing that irks me is how some people interpret correlations in the social sciences. They have a large sample and get a correlation between two constructs of something like 0.15, buuuut it is statistically significant. So they speak willy-nilly about how A affects B or vice versa, and what the implications are in real life, etc. An r of 0.15 amounts to about 2% of variance explained.
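The arithmetic behind that figure, plus the corresponding mutual information under the added assumption that the two variables are jointly Gaussian (an assumption made here for illustration, not something stated in the comment):

```latex
r^2 = 0.15^2 = 0.0225 \approx 2.25\%
\qquad\text{and}\qquad
I(X;Y) = -\tfrac{1}{2}\ln\!\left(1-\rho^2\right)
       = -\tfrac{1}{2}\ln(0.9775) \approx 0.0114 \text{ nats}.
```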
"This is not meant to be none technical"... I'm fked haha.
Great Taleb!
An interesting intuitive take I had for when you have correlation (that relates to the difference between 2.5 and 0.5): it's better to look at it like resolution on an image. It's very obvious that something that looks like a blob at, say, 0.5 is not going to look half as much like a blob at 2.5.
Excellent explanation, thank you!
So in nonlinear cases, one would use nonlinear regression to examine association. Of course, if you apply a linear model (or simple correlation) to a nonlinear relationship you get weird results. You need to match the model's complexity to that of the data, and the regression will work OK.
Great prof!
So that's what a statistician's Superman cape looks like...
Ah well, perfect... translated live!
Thank you Nassim!
Thank you kindly ✍️
Really nice!
Great, what about using Spearman rank correlation in finance? is it informative?
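For readers unfamiliar with it, Spearman's coefficient is the Pearson correlation computed on ranks, so it detects any monotone relationship rather than only a linear one. The sketch below (helper name `spearman_no_ties` is hypothetical and assumes no tied values) only illustrates that definition on a deterministic toy series; it does not answer whether the measure is informative for fat-tailed financial data.

```python
import numpy as np


def spearman_no_ties(x, y):
    """Spearman rank correlation, assuming no tied values in x or y."""
    rank_x = np.argsort(np.argsort(x))  # rank of each element, 0..n-1
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])


x = np.linspace(0.0, 10.0, 200)
y = np.exp(x)  # monotone in x, but far from linear

pearson = float(np.corrcoef(x, y)[0, 1])
spearman = spearman_no_ties(x, y)
print(f"Pearson:  {pearson:.3f}")
print(f"Spearman: {spearman:.3f}")
```

Spearman comes out at 1 because the ordering is preserved exactly, while Pearson is well below 1 because the relationship is not a straight line.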
Thanku godfather 🙏😀
Thanks Nassim. But don’t stocks have linear movement? Options are nonlinear.
We use log returns, no?
@nntalebproba thanks. Let me watch again. Maybe I am confused
Linear means y=a x + noise, as the connection between two returns is a constant, not a function
@jamesmarsh4047
@nntalebproba thanks Nassim mucho appreciated
Wonderful. thanks for doing these.
5:10 *raises hand* why are the other ones (kendall and spearman I guess?) worse?
Are you speaking Italian? Wow... thanks for everything
Please Prof, could you do something related to fat tails in trading?
Thanks so much !
I'm somehow not even surprised that NNT can speak Italian for no reason because he is such a polymath & polyglot lol
Great content. Thank you for your time and knowledge. And thanks for sharing.
The sounds made by the chalk and duster on the board are painful for me. I don’t know why.
love it
keep it real
I have no idea what he is talking about.. But I'm still subscribing
Thank you 🙇
How come you speak Italian?
Possible error at 4:27
Y= a + bx
Assuming that’s a regression on data, x will always be correlated with y at nearly 100% if the data is not centered
Incorrect, sorry.
N N Taleb's Probability Moocs
Ah sorry, was thinking about correlation btwn a&b, not x&y
I think what you are teaching feels very important. I don’t think I’m smart enough to use this as you might hope. Do you have any suggestions how we should apply this. Should we just buy your books? Will I understand them? Do I need more statistics study?
Math is just an aggregation of ideas. Traditionally students are bucketed into "smart" and "stupid" if they don't understand something. In reality it is on the teacher to calibrate for the level they are teaching, and on the student to identify when there is a gap in their knowledge. Learning anything is just a function of patience and dedication. Smaller class sizes also allow a teacher to identify students who need additional structure to make the existing material a natural extension, though this rarely happens. It is easier to call someone stupid and not put in the effort. In your case, it would be wise first to consider yourself entirely capable of understanding, and therefore smart. Second, identify which words or phrases you lack context for, and then research them. Like Lego blocks. "Smart"/intelligent people know that they really know nothing. Life is continuous learning, always coming back to the so-called basics with a new angle. I assure you "experts" will watch this video and find new insights. Enjoy the ride!
Thanks Nassim :-)
Nice presentation it is. Nonetheless, it contains conceptual errors, which I would like to point out. In a scatter plot of Y vs. X, the points (Xi, Yi) are scattered about the regression line of Y_hat vs. X. The slope of this line is NOT correlation; the slope is the rate at which Y_hat changes relative to X. The correlation represents how close the points are to the line. In the extreme, if they all lie on the line, correlation would be 1.0 or -1.0. Intuitive, qualitative statements must be built upon and verified through quantitative reasoning - the risks of not doing this are numerous and copious. Note that correlation = Cov(X,Y)/(SD(X)*SD(Y)) and slope = correlation * SD(Y)/SD(X).
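The link between the regression slope and the correlation can be checked numerically. This is a minimal sketch on simulated data (the intercept, slope and noise level are arbitrary choices made for illustration): the ordinary least-squares slope equals correlation * SD(Y)/SD(X), and it reduces to the correlation itself once both variables are standardized.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000
x = rng.normal(10.0, 2.0, n)
y = 2.0 + 3.0 * x + rng.normal(0.0, 4.0, n)  # arbitrary linear model plus noise

r = float(np.corrcoef(x, y)[0, 1])

# Least-squares slope on the raw data
slope_raw = float(np.polyfit(x, y, 1)[0])

# Least-squares slope after standardizing both variables (mean 0, SD 1)
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()
slope_std = float(np.polyfit(zx, zy, 1)[0])

print(f"correlation r:           {r:.4f}")
print(f"raw slope:               {slope_raw:.4f}")
print(f"r * SD(Y) / SD(X):       {r * y.std() / x.std():.4f}")
print(f"slope on standardized:   {slope_std:.4f}")
```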
"The slope of this line is NOT correlation" --> It is, because he normalized both Y and X, resulting in the same standard deviation. Hence, the slope is the correlation in this case.
Haha, fantastic in Italian
I don't understand how you get that the correlation coefficient at 2:57 is equal to 0. For it to be zero, it has to be the case that E(XY) - E(X)E(Y) = 0, so E(XY) = E(X)E(Y). How do you know that E(XY) = E(X)E(Y) without assuming that X and Y are independent? You haven't been given any information about the joint distribution of X and Y. In fact, wouldn't the joint density not exist?
Thank you
Italian continually improving!
Next time.
Very dangerous! Congratulations
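A standard example of zero correlation without independence is sketched below. It is an assumption that this resembles the case shown at 2:57; the derivation only needs X to be symmetric about zero with a finite third moment, and Y to be a deterministic function of X.

```latex
\text{Let } Y = X^2 \text{ with } X \text{ symmetric about } 0 .
\text{ Then } \mathbb{E}[X] = 0 \text{ and } \mathbb{E}[X^3] = 0, \text{ so}
\qquad
\operatorname{Cov}(X,Y) = \mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y]
                        = \mathbb{E}[X^3] - 0 \cdot \mathbb{E}[X^2] = 0 .
```

Here Y is fully determined by X, so the two are as dependent as can be. The pair has no joint density on the plane because all the mass sits on the curve y = x^2, but the covariance is still well defined, since it is an expectation of a function of X alone.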
Does anyone know which language is he using to draw the graphs?
Mathematica
@noelmckinney897 Thanks friend, have a nice day!
just subscribed.
"this b would be the correlation". Non ci siamo, saresti bocciato all'esame 🙂
wow
Hi, friend!
OH MY GOD, HE CAN SPEAK ITALIAN????? ❤️❤️❤️❤️❤️❤️❤️
Too bad, I don't understand English very well.
I like the part that visually there's no difference between a rho of 0.5 and 0.25! Quite simply correlation doesn't imply causality! Thanks Taleb it was invigorating 😁👍
Nothing, nothing to do with causality.
"Correlation doesn't imply causality" is so constantly misused I'd almost say it shouldn't be taught anymore.
On the flip side:
If correlation is = 1, is it possible for it to be nonlinear?
No, because a Pearson correlation of 1 (or -1) means that a linear regression gives a perfect fit to the data. As soon as the correlation moves far from +1 or -1, though, the pattern can be completely weird.
What is the thing he always wears over his shoulders like he just came out of the sauna or something lol
Or you could’ve just shown Anscombe’s quartet, shown that correlation looks like dot products for a reason, and called it a day. Not sure why you don’t like sub-additivity. Don’t you want that for coherent risk measures? Also I’m guessing mutual information is related to sufficiency? Not that I know much
en.wikipedia.org/wiki/Anscombe%27s_quartet
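For reference, the quartet mentioned above is easy to reproduce. The sketch below uses the eleven-point datasets from Anscombe's 1973 paper and computes the Pearson correlation of each; the four scatter plots look nothing alike, yet the coefficients agree to about two decimal places.

```python
import numpy as np

# Anscombe's quartet: sets I-III share the same x values
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# Pearson correlation for each of the four datasets
correlations = {
    name: float(np.corrcoef(x, y)[0, 1]) for name, (x, y) in quartet.items()
}
for name, r in correlations.items():
    print(f"dataset {name}: r = {r:.3f}")
```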
World’s smartest Pseudoscience killer🤌🏽