Pareto, Power Laws, and Fat Tails-what they don’t teach you in STAT 101

Shaw Talebi

4.5K views

Published: Sep 10, 2024

Comments • 32

@ShawhinTalebi 10 months ago ⁺¹
**Corrections**
Slide 5: Due to a mistake in my code, the plot shows the PDF of a Pareto with α = -0.84 instead of α = 1.16 😅 (GitHub repo and blog have been corrected)
Slide 6: The Power Law PDF should have exponent -(α+1), not -(α-1). Accordingly, the α values in the legend are off by 2, i.e., the curves correspond to α = -0.84, 0, 1
Slide 18: fuhgetaboutit (α
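The corrected exponent in Slide 6 can be checked numerically. The sketch below assumes the standard Pareto parameterization with shape α and scale `x_min`. The function names and the `x_min` default are illustrative and are not taken from the video's GitHub repo. It compares the PDF with exponent -(α+1) against the slope of the CDF.

```python
def pareto_cdf(x, alpha, x_min=1.0):
    """P(X <= x) for a Pareto distribution with shape alpha and scale x_min."""
    if x < x_min:
        return 0.0
    return 1.0 - (x_min / x) ** alpha


def pareto_pdf(x, alpha, x_min=1.0):
    """Density of the same distribution; note the exponent -(alpha + 1)."""
    if x < x_min:
        return 0.0
    return alpha * x_min ** alpha / x ** (alpha + 1)


# The PDF should match the slope of the CDF (central finite difference).
alpha, x, h = 1.16, 3.0, 1e-6
slope = (pareto_cdf(x + h, alpha) - pareto_cdf(x - h, alpha)) / (2 * h)
print(pareto_pdf(x, alpha), slope)
```

The two printed numbers should agree to several decimal places. Using -(α-1) in `pareto_pdf` instead would break that agreement.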
@dogie9548 4 months ago ⁺⁷
This was probably one of the best explanations I could've viewed on stats as someone with mediocre understanding of the subject.
@ShawhinTalebi 4 months ago ⁺¹
Glad it was clear :)
@ronallan8680 27 days ago
This is some really good stuff! Thanks!
@AnimeshSharma1977 10 months ago ⁺²
Nice to hear this idea in such a short video. It took Taleb volumes, and most readers like us never got that this is what he was talking about 😜
@ShawhinTalebi 10 months ago ⁺¹
😂😂 The volumes are usually a prerequisite for a more concise description.
Most of this video is me parroting things I learned from Taleb.
@AnimeshSharma1977 9 months ago
Absolutely @ShawhinTalebi, hardly anyone reads the original works of even paradigm-shifters; all we have is regurgitated versions, we just Kant do it 🤪
@QQ-xx7mo 5 months ago ⁺¹
15:25 years later I understand why my model wouldn't fit / regress to the data, thank you 🤣🤣
@ShawhinTalebi 4 months ago
We have @nntalebproba to thank XD
@RajivSambasivan 7 months ago
Nice. Power laws are modelled very well by graphs, as you are probably aware as a physicist. There is a lot of pioneering work in this area by Mark Newman and Lada Adamic. From a practical standpoint, when you perform learning over graphs that correspond to fat tails, you have some nodes that are very highly connected while most others have small degrees. The crux of learning over graphs is that the prediction or model for each node is only influenced by its neighbors. For high-degree nodes, you sample a fraction of the neighbors to predict, but for regular nodes with few neighbors, you consider a 1- or 2-hop neighborhood. There are reasonably good graph ML libraries now: PyTorch Geometric, for example, or DGL. An algorithm called GraphSAGE can be a good candidate. Two really good, practical papers to read on this subject are "Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction" by Weiss/Provost and "Thresholding for Making Classifiers Cost-sensitive" by Sheng/Ling. Cheers, enjoyed this. @ShawhinTalebi
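The neighbor-sampling idea in this comment can be sketched in plain Python. This is only an illustration of the general approach (cap the neighborhood of hubs, keep the full neighborhood of low-degree nodes). It is not the API of PyTorch Geometric or DGL, and the function name, the toy graph, and the cap of 10 are all assumptions made for the example.

```python
import random


def sample_neighbors(adjacency, node, max_neighbors, rng):
    """Return at most max_neighbors neighbors of node, sampled without replacement."""
    neighbors = adjacency.get(node, [])
    if len(neighbors) <= max_neighbors:
        return list(neighbors)  # low-degree node: keep the full neighborhood
    return rng.sample(neighbors, max_neighbors)  # hub: keep a random subset


# Toy graph with one hub (node 0) and many low-degree nodes.
adjacency = {0: list(range(1, 101))}
for leaf in range(1, 101):
    adjacency[leaf] = [0]
adjacency[1].append(2)
adjacency[2].append(1)

rng = random.Random(0)
hub_sample = sample_neighbors(adjacency, 0, max_neighbors=10, rng=rng)
leaf_sample = sample_neighbors(adjacency, 1, max_neighbors=10, rng=rng)
print(len(hub_sample), leaf_sample)
```

Capping the sample size keeps the cost per node bounded even when a few hubs have very large degree, which is the situation a fat-tailed degree distribution produces.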
@ShawhinTalebi 7 months ago ⁺¹
Thanks for the recs Rajiv! Newman's "Networks" textbook was actually the first time I learned about power law distributions. I added those to my reading list :)
@spiderjerusalem 6 months ago
Amazing video
@wallstreetjones 4 months ago
Just so I’m clear: does regression not work well in any situation, or just not for power laws?
@ShawhinTalebi 3 months ago ⁺¹
Regression doesn't work for Power Laws. There are many situations where it works great, especially if the data are normally distributed.
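One way to build intuition for this answer: least-squares estimates are built from sample means and variances, and a Pareto with α = 1.16 has infinite variance, so a single observation can dominate the sums. The sketch below is an illustration under assumed parameters (Gaussian mean 100 and standard deviation 15, sample size 10,000, fixed seed), not the analysis from the video. It compares how much of the total the largest observation contributes in each case.

```python
import random


def max_share(samples):
    """Fraction of the total contributed by the single largest observation."""
    return max(samples) / sum(samples)


rng = random.Random(42)
n = 10_000

# Thin-tailed: Gaussian with illustrative mean 100 and standard deviation 15.
gaussian = [rng.gauss(100, 15) for _ in range(n)]
# Fat-tailed: Pareto with shape alpha = 1.16 (finite mean, infinite variance).
pareto = [rng.paretovariate(1.16) for _ in range(n)]

print(f"Gaussian max share: {max_share(gaussian):.5f}")
print(f"Pareto max share:   {max_share(pareto):.5f}")
```

In the Gaussian sample no single point matters much. In the Pareto sample the largest point carries a far larger fraction of the sum, so statistics based on sums change noticeably when that one point is added or removed.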
@testme2026 8 months ago
great video, thank you, sound very low though!
@ShawhinTalebi 8 months ago
Thanks! Sorry about the audio 😬
@stevencoutinho7138 8 months ago
At 4:22, what exactly is x and what is P(X=x)? In the Gaussian example I would expect x to be wealth that increases from left to right and y to be the number of observations. Yet with that in mind, shouldn't the Pareto curve be such that x=w from low to high, with a lot of observations at the bottom and only very few at the high end of wealth? How is it that it is typically explained that the 20% at the bottom end of the x-axis have 80% of everything?
@ShawhinTalebi 8 months ago
Good question. Supposing wealth follows the given Pareto distribution, X could be net worth and P(X=x) is the probability (more precisely, the probability density) that a random selection from the population will have net worth equal to x. The long tail here reflects the intuition that there are many more people with net worths around $100,000 than $100,000,000, since as x increases, P(X=x) decreases.
The 80-20 rule comes in when you add up all the wealth of the top 20%: you would get about 80% of the total wealth in the population.
Hope that clears things up, happy to expand on any point.
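The 80-20 figure can be reproduced from the shape parameter alone. For a Pareto distribution with α > 1, the share of the total held by the top fraction p is p^(1 - 1/α). This is a standard closed-form result, used here without derivation. The function name below is illustrative.

```python
def top_share(p, alpha):
    """Share of the total held by the top fraction p under a Pareto(alpha), alpha > 1."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if alpha <= 1:
        raise ValueError("the mean is infinite for alpha <= 1, so shares are undefined")
    return p ** (1 - 1 / alpha)


alpha = 1.16
for p in (0.2, 0.01):
    print(f"top {p:.0%} hold {top_share(p, alpha):.1%} of the total")
```

With α = 1.16 the top 20% hold roughly 80% of the total, which is why that value of α is associated with the 80-20 rule.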
@stevencoutinho7138 8 months ago
@ShawhinTalebi thanks for your reply Shawhin. Adding up would mean an integral between certain values of w. So P(w)dw between w_min and some cutoff w*. Wouldn't this then say that 80% of all people (high P(w)s) have only 20% of all wealth (corresponding to the small bandwidth of w's on the x-axis from left to right), and not vice versa, which is what I see everywhere?
btw shouldn't the alpha be negative? If it was positive the curve would be upward sloping. Sorry for all the questions. I am writing a book called Influencers and Followers. I believe that the way neurons fire when we choose (using a power law) also influences wealth distribution at the society level (also a power law with alpha -1.16).
@ShawhinTalebi 7 months ago
That's right, 80% would have only 20% of wealth.
Good question. This is a matter of convention. Here, I define a power law as ~x^-(α+1), so the negative sign is baked into the definition. Sounds like an interesting book!
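The integral discussed in this exchange can be written out explicitly. The sketch below assumes wealth follows a Pareto distribution with shape α > 1 and minimum w_min, using the PDF exponent -(α+1) from the pinned correction. F(w*) is the fraction of people with wealth at most w*, and L(w*) is the fraction of total wealth they hold. The symbols F and L are introduced here for illustration.

```latex
\begin{aligned}
f(w) &= \frac{\alpha\, w_{\min}^{\alpha}}{w^{\alpha+1}}, \qquad w \ge w_{\min},\\
F(w^*) &= \int_{w_{\min}}^{w^*} f(w)\,dw = 1 - \left(\frac{w_{\min}}{w^*}\right)^{\alpha},\\
L(w^*) &= \frac{\int_{w_{\min}}^{w^*} w\, f(w)\,dw}{\int_{w_{\min}}^{\infty} w\, f(w)\,dw}
        = 1 - \left(\frac{w_{\min}}{w^*}\right)^{\alpha-1}
        = 1 - \bigl(1 - F(w^*)\bigr)^{1 - 1/\alpha}.
\end{aligned}
```

With α = 1.16 and F = 0.8 this gives L ≈ 0.20, so the bottom 80% of people hold about 20% of the wealth and the top 20% hold the remaining 80%.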
@jayo3074 9 months ago
How can we financially support you? Do you have Patreon?
@ShawhinTalebi 9 months ago
Thank you for your generosity :)
I currently accept caffeinated beverages here: www.buymeacoffee.com/shawhint
@chrstfer2452 9 months ago
Addicted to this channel.
@ShawhinTalebi 9 months ago ⁺¹
I hope it's one of those good addictions 😅
@chrstfer2452 9 months ago
@ShawhinTalebi already watched probably 40% of your videos, including almost all of the long ones except the causality playlist, which I'm saving for a time I can sit and really watch.
@ShawhinTalebi 9 months ago ⁺¹
Thanks for watching. Feel free to reach out with any questions or suggestions for future content :)
@chrstfer2452 9 months ago
@ShawhinTalebi off the top of my head: anything on the intersection of time series/signal analysis and stats/machine learning, and also more info on what exactly you do as a contract data scientist. Maybe some info on if/how fresh undergrads could get into that or find a mentor in that space (without breaking the bank; I wish I could do what you did and hire out time to talk to people, but it's not in the cards just yet).
I especially like your nice mix of formal treatment and personability, so many channels that treat things more rigorously are impossible to listen to for a whole lecture without backgrounding the audio and missing most of it. I also really like how you mention your specific sources vocally and actually put them all in the description.
@ShawhinTalebi 9 months ago ⁺¹
@chrstfer2452 Thanks, this is super helpful! I've been putting together ideas on a time series forecasting/classification series. Any suggestions are appreciated.
For breaking into freelance, I have an article on the subject which I'll make into a video based on your feedback: medium.com/the-data-entrepreneurs/how-to-start-freelancing-in-data-science-150551f25fda
We also recently put out videos about this on The Data Entrepreneurs channel: www.youtube.com/@TheDataEntrepreneurs/videos
If you ever want to chat about data science/entrepreneurship, feel free to set up some office hours: calendly.com/shawhintalebi/office-hours
@user-ng9uk1vp8u 8 months ago
21:00 Give me that meme
@ShawhinTalebi 8 months ago ⁺¹
😂😂😂
I shared it here: www.linkedin.com/posts/shawhintalebi_statistics-8020rule-fattails-activity-7132748486512447488-waTm?
@user-ng9uk1vp8u 7 months ago
@ShawhinTalebi Love it!
