I love your channel’s intersection of higher level math with stats and data science! Feels like no one does it quite like you
Thanks 😊
Yeah, and he teaches very well, too! When I want to understand a specific section of advanced math... most channels oversimplify the higher-level material until it becomes unusable, or they explain it the same way as those textbooks where it's just waaaay above most people's heads.
Dude, I'm so grateful that this channel exists!!
Thanks! Grateful to you for watching
I find teaching the intuition behind data science, and math in general, to be much more important than people might think.
Thanks! I think so too
This is probably the best use of 8.5 minutes I'll see all day. Love the insights, concise and organized delivery, and relatable examples.
Great summary of the "main" distributions. Thanks.
Amazing content that links stats and real world data. Greatly appreciate your work and clear examples!
Glad it was helpful!
Love this channel too! I love discussions about intuitions… it’s so easy to get lost in statistical jargon and it’s refreshing to step back and put things into perspective.
Thanks!
Excellent work. The casual discussion is great for explaining the concepts to newbies in data science, or even to the old dogs who want to learn new tricks. The most knowledgeable presenters are the ones who can explain something to a 5-year-old. I'm also glad you have some content that formalizes these concepts as well. Always very helpful and thought-provoking.
Thanks for the thoughtful words!
I absolutely love this channel
One of the best videos on data science; it helps us understand data better.
This is the most informative video on the intuition behind distribution interpretation I've ever watched!
For the "pointy" distribution, I've just thought of them as Gaussians with low variances.
Thanks!
Awesome stuff, very useful! Thanks ❤
Thanks for watching!
Another fantastic video. Nice job.
Appreciate it!
Just one realization of that pointy distribution from my work: it happened to a variable that is regulated by a not-so-powerful regulator. In my case, wind velocity in a tunnel (so signed and 1D) that is being regulated by a not-so-powerful fan.
Thank you! I really like how you explain everything in a simple way.
I like your teaching style.
Glad to hear that
Practicality and 'rule of thumb'... You excel at that sort of stuff.
👌🏽
Thanks!
Brilliant description of distributions
Thanks!
Tweedie distribution baby! Can be seen in some regression datasets where the government / local authority restricts max salary/price or whatever (California housing).
Love that last one. I use QQ plots more, makes more sense to me, but I've def seen these. Thanks for providing well explained content on a higher level than many do.
Thanks for the input and thanks for watching!
This video is really useful! Please keep talking about the intuition behind data distributions! It’s really hard to find such explanations in regular books or other formal sources.
Thanks! Will do
Excellent
Thank u for your work brother!🙏
Thanks for watching!
the power of this video 🔥
The power of my viewers 🔥
I think the pointy distribution is modelled as a Cauchy distribution, and the skewed distribution is what you'd call a Pareto distribution or an exponential distribution.
A Cauchy distribution looks appropriate in this case. There are other "pointy" distributions to keep in mind if a Cauchy does not fit well, such as the Laplace distribution.
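One rough way to check the "low-variance Gaussian vs. genuinely pointy" question raised in the comments above is to fit both shapes and compare log-likelihoods. This is only a sketch on simulated data (the sample and the names `normal_loglik` and `laplace_loglik` are made up for illustration), and it uses the Laplace rather than the Cauchy because the Laplace fit has a simple closed form.

```python
import math
import random
import statistics

def normal_loglik(xs):
    """Log-likelihood at the Gaussian MLE (sample mean, population std)."""
    n = len(xs)
    sigma = statistics.pstdev(xs)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - 0.5 * n

def laplace_loglik(xs):
    """Log-likelihood at the Laplace MLE (median, mean absolute deviation)."""
    n = len(xs)
    loc = statistics.median(xs)
    scale = sum(abs(x - loc) for x in xs) / n
    return -n * math.log(2 * scale) - n

rng = random.Random(0)
# Assumption: synthetic data. A Laplace(0, 1) draw is the difference
# of two independent Exp(1) draws.
data = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(5000)]

ll_normal = normal_loglik(data)
ll_laplace = laplace_loglik(data)
print(f"normal: {ll_normal:.1f}  laplace: {ll_laplace:.1f}")
```

On pointy data the Laplace fit should score higher. A narrow Gaussian shrinks the whole curve, but it cannot produce a sharp peak and heavier tails at the same time.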
I definitely turn noisy data into sensible data by making bins. This is especially true with frequency per day. At the daily level, picking out a trend is difficult, but grouping to several months, or even several years, really helps create some worthwhile numbers.
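A minimal sketch of that binning idea, assuming made-up daily counts with a slow trend buried in noise (the helper `bin_by_month` and all numbers are hypothetical):

```python
import random
from collections import defaultdict
from datetime import date, timedelta

def bin_by_month(daily):
    """Sum daily counts into (year, month) bins, in calendar order."""
    monthly = defaultdict(int)
    for day, count in daily.items():
        monthly[(day.year, day.month)] += count
    return dict(sorted(monthly.items()))

rng = random.Random(0)
start = date(2022, 1, 1)
daily = {}
for i in range(730):  # two years of days: 2022 and 2023
    day = start + timedelta(days=i)
    # Assumption: slow upward trend plus large day-to-day noise.
    daily[day] = max(0, round(10 + 0.01 * i + rng.gauss(0, 5)))

monthly = bin_by_month(daily)
for (year, month), total in monthly.items():
    print(year, month, total)
```

Months differ in length, so dividing each total by the number of days in its bin may give a fairer comparison than the raw sums.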
This was super helpful! Thanks for sharing!
Thanks for watching!
I'm excited that you teach the message “what does it tell me” and explain it with real-life examples 🎉.
Thanks!
@ritvikmath Under which of these types would you put the distribution of income in the US, with its fat tail and big peak at the upper end?
Great video, it would be EXTREMELY helpful to me as a perpetually aspiring data scientist if you could show how you might go about fitting a distribution to your data and using it in a simulation exercise. (I have an idea of how I might go about doing this, but I'm acutely aware that others might have better insights!)
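One possible way to go about the fit-then-simulate exercise asked for above, sketched with the standard library only. Everything here is an assumption for illustration: the "observed" data is synthetic, the lognormal is just one plausible choice for right-skewed positive data, and the 30-draw total is an arbitrary quantity to simulate.

```python
import math
import random
import statistics

rng = random.Random(42)
# Stand-in for real observations: right-skewed, strictly positive values.
observed = [rng.lognormvariate(1.0, 0.5) for _ in range(5000)]

# Fit: the lognormal MLE is the mean and std of the logged data.
logs = [math.log(x) for x in observed]
mu_hat = statistics.fmean(logs)
sigma_hat = statistics.pstdev(logs)

# Simulate: total of 30 draws from the fitted model, repeated 2000 times.
totals = sorted(
    sum(rng.lognormvariate(mu_hat, sigma_hat) for _ in range(30))
    for _ in range(2000)
)
p95_total = totals[int(0.95 * len(totals))]
print(f"mu={mu_hat:.3f} sigma={sigma_hat:.3f} 95th pct of total={p95_total:.1f}")
```

With real data, it would be worth checking the fit (for example with a QQ plot) before trusting anything the simulation says about the tail.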
I don't think most data scientists have the additional time to delve into geometry, but geometry is very much about the "shape" of mathematical objects.
You deserve to have more than 171k subscribers compared to those other commercial channels which have 20M. Unfair life. They just film themselves buying stuff and have millions of subscribers, while you convey useful things!! Just keep on doing what you’re doing!
would be nice if you'd expand on how to analyze these dists
Love the content. I started studying data science and your videos helped me a lot. A small suggestion/request: for each concept/video that you are covering, can you also share some of the resources that you followed? Thanks
Pretty nice. Add another one on how to model those distributions.
Thanks!
Isn't the pointy one the plot of distances away from a SLR line with L1 cost? I can't precisely remember the name of the curve, but it's not the curve shown.
Can you make a video on data engineering vs machine learning engineering vs data scientist vs data analyst? Great vid btw!
Thanks for the suggestion!
interesting ideas, but would be more helpful if you had a list of action items w/ each distribution.
Great suggestion
1:44 I don't agree that a max GPA of 4 is a physical limitation in any usual sense of physics. If it is, then by what physical principle? Conservation of angular momentum?
But I appreciate the video overall. These are definitely common cases in data science. The video is both informative and practical.
Lately I have been thinking about counterfactual inference when there is an unknown upper bound on a facility's capacity. The bound will not change when intervening on the rates, but how the shape of the distribution will change with respect to the boundary and the expectation of the intervention distribution is non-obvious to me. From the modelling side I could derive a truncated distribution. Or I could derive the distribution of MAX(X, c) where c is a parameter or hyperparameter, although in NUTS/Gibbs/MH sampling I find that such bounds are sampled poorly (i.e. lots of divergences) when they're treated as a parameter. Or you can have a mixture distribution that transitions from "away-from-boundary behaviour" to "near-to-boundary behaviour".
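The first two modelling options in the comment above behave quite differently near the boundary, which a small simulation can make concrete. This sketch assumes a standard normal X and a known upper bound c = 1, and writes the capped variable as min(X, c) since the bound is an upper one. The function names are hypothetical.

```python
import random
import statistics

def sample_truncated(rng, mu, sigma, c):
    """Rejection-sample a normal conditioned on X <= c (truncation)."""
    while True:
        x = rng.gauss(mu, sigma)
        if x <= c:
            return x

def sample_censored(rng, mu, sigma, c):
    """Draw a normal and cap it at c (censoring): mass piles up at c."""
    return min(rng.gauss(mu, sigma), c)

rng = random.Random(0)
mu, sigma, c, n = 0.0, 1.0, 1.0, 20000  # assumed values for illustration
truncated = [sample_truncated(rng, mu, sigma, c) for _ in range(n)]
censored = [sample_censored(rng, mu, sigma, c) for _ in range(n)]

frac_at_bound_truncated = sum(x == c for x in truncated) / n
frac_at_bound_censored = sum(x == c for x in censored) / n
print("mean truncated:", statistics.fmean(truncated))
print("mean censored: ", statistics.fmean(censored))
print("share exactly at c:", frac_at_bound_truncated, frac_at_bound_censored)
```

Truncation renormalizes the density below c, while censoring keeps the original density and adds a spike at c, so the two give different means and different histograms near the boundary.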
Has Bard/ChatGPT impacted your work in any way? How did you land up in DS?
Hey thanks for the questions! We will be covering those topics very soon in future videos
👍
👍
always great to watch your videos! Is there a way to contact you directly @ritwikmath?