This resource is quite helpful. This resource is ideal for those who possess prior knowledge of the language and need a condensed and intensive overview of the subject matter. I really like the conciseness and directness of this statement.
Man, this is so useful. This is perfect if you know the language and just need a crash course on stuff. I really like how to the point this is
Hi Mark, thanks for the video and the playlist. I don't know if you mention it after 18:25, but it is a lot easier to use a for loop instead of copying and pasting the code into each cell. For example, this is what I wrote:
for i in df.columns:
    print(f'{i}: {df[i].dtype}')
Thank you so much for doing this video. My school taught us how to do EDA and I was struggling to keep up. This helped me a lot.
Hi Mark,
Amazing video, I was confused about the part of skewness and normality, and then I stumbled upon your video, which cleared all my doubts.
Thank you
I like how real this tutorial is
Mark, you should start using a `for` loop instead of writing each line with `print()`. For example, try this code:
---
for col in df.columns:
    print(f"{col}={df[col].nunique()}")
---
and you will get the same output in just two lines.
Definitely 👍
I use these videos in a book for students who are coding for the first time. At this point, I haven’t gotten to loops yet. But then, a bit later, I use your exact code to make the point that automation saves them a lot of time.
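For anyone who wants to go one step past the loops suggested above, here is a minimal sketch that collects the same per-column facts into one table. The sample DataFrame and its column names are made up for illustration; they are not the dataset from the video.

```python
import pandas as pd

# Hypothetical sample data; the video's dataset is not reproduced here.
df = pd.DataFrame({
    "age": [25, 32, 47, 32],
    "city": ["Provo", "Orem", "Provo", "Lehi"],
    "income": [40000.0, 52000.0, None, 52000.0],
})

# One row per column: dtype, number of distinct non-null values, missing count.
summary = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "nunique": df.nunique(),
    "missing": df.isna().sum(),
})
print(summary)
```

`df.dtypes`, `df.nunique()` and `df.isna().sum()` each return one value per column, so no explicit loop is needed. The loop versions are still a fine way to learn what is happening.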
Hi Mark,
thanks for sharing. Very helpful to find a good approach to start analysis!
Great content! Can you also talk about how to interpret these results? What can we do with all the concepts you discussed?
That’s a good question, but the answer is a bit long for the comments. Basically, those results give you an idea about the cleanliness and distributions of the features. For example, if the skewness is too high, then you know that you’ll either need to transform the feature or choose a modeling algorithm that doesn’t depend on linear assumptions. If categorical features have a large number of unique values, then you’ll need to check to make sure that every value is adequately represented or you’ll need to do some grouping.
I talk about a lot of these issues in later videos when I get to the modeling phase.
Thanks!
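A rough sketch of the two checks described in the reply above, assuming made-up data: a right-skewed numeric feature and a categorical feature with a couple of rare levels. The column names, the log transform, and the 1% rarity threshold are illustrative assumptions, not the approach used in the later videos.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a right-skewed numeric feature and a categorical
# feature in which two levels are very rare.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "price": rng.lognormal(mean=3, sigma=1, size=1000),
    "color": ["red"] * 600 + ["blue"] * 390 + ["teal"] * 6 + ["plum"] * 4,
})

# Skewness far from 0 points to a long tail.
raw_skew = df["price"].skew()

# log1p is one common transform for non-negative, right-skewed values.
df["price_log"] = np.log1p(df["price"])
log_skew = df["price_log"].skew()
print(f"skew before: {raw_skew:.2f}, after log1p: {log_skew:.2f}")

# Share of rows per category; levels under 1% are treated as rare here.
shares = df["color"].value_counts(normalize=True)
rare = shares[shares < 0.01].index

# Keep common levels as they are and fold the rare ones into "other".
df["color_grouped"] = df["color"].where(~df["color"].isin(rare), "other")
print(df["color_grouped"].value_counts())
```

Whether to transform, group, or switch to a different model depends on the data and the modeling goal, so treat the thresholds as starting points.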
very clean programming! thank you.
It was good and cleared many of my doubts.
Thank you bro, this is so helpful, but I have a question. I do not understand the univariate part: in this data set we have both numerical and categorical features, so it does not seem so!
Thank you again from Saudi Arabia
Hey, thanks for the comment and question! Can you tell me a bit more about what you’re asking? Are you wondering about what to look for between numeric versus categorical features?
The kurtosis of a normal distribution is 3, but you say that a kurtosis near ±1 is what we should look for?
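One likely source of the mismatch in the question above is that there are two conventions. pandas' `Series.kurt()` reports Fisher's (excess) kurtosis, where a normal distribution scores about 0, while the textbook value of 3 is Pearson's definition. The sketch below uses a simulated normal sample (an assumption for illustration, not the video's data) to show the two side by side.

```python
import numpy as np
import pandas as pd

# Simulated standard-normal sample, seeded so the run is repeatable.
rng = np.random.default_rng(42)
sample = pd.Series(rng.normal(size=100_000))

# pandas uses Fisher's definition: a normal distribution gives roughly 0.
excess = sample.kurt()

# Adding 3 converts to Pearson's definition: a normal gives roughly 3.
pearson = excess + 3

print(f"excess (pandas): {excess:.3f}, Pearson: {pearson:.3f}")
```

Under that reading, a rule of thumb such as "within about ±1" refers to excess kurtosis, which lines up with "within about 1 of 3" in the Pearson convention.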
thanks my dude
Hi Mark, nice stuff for beginners. Can you please provide the notebook(s) used?
I know I'm incredibly late on this, but I've added links in the description
@MarkKeith 😂😂
Where's the document from? It seems helpful. Could you please provide that?
I know I'm incredibly late on this, but I've added links in the description
Hi Mark, could you please provide the document showing how you handled categorical and numerical variables?
I know I'm incredibly late on this, but I've added links in the description
What about multivariate analysis? How do we calculate that in pandas?
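As a starting point for the question above, here is a small sketch of three pandas calls that look at more than one feature at a time: a correlation matrix for numeric pairs, a grouped mean for numeric versus categorical, and a cross-tabulation for categorical pairs. The housing-style data and column names are invented for illustration.

```python
import pandas as pd

# Hypothetical data with two numeric and two categorical features.
df = pd.DataFrame({
    "sqft": [800, 1200, 1500, 2000, 2400, 3000],
    "price": [150, 210, 260, 330, 400, 480],
    "type": ["condo", "condo", "house", "house", "house", "condo"],
    "garage": ["no", "no", "yes", "yes", "yes", "no"],
})

# Numeric vs numeric: pairwise Pearson correlations.
corr = df.select_dtypes(include="number").corr()

# Numeric vs categorical: mean of a numeric feature within each group.
by_type = df.groupby("type")["price"].mean()

# Categorical vs categorical: counts for each combination of levels.
table = pd.crosstab(df["type"], df["garage"])

print(corr, by_type, table, sep="\n\n")
```

These are summaries only. Statistical tests or models would be a separate step.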
Can you provide the document
I know I'm incredibly late on this, but I've added links in the description
awesome
I need subtitles in pt-BR :(