Hey Everyone! Let me know if you like this video in the comments below! Thanks!
you are the best
Great!!
The best video on YouTube for learning K-Means Clustering!
I love them! :-)
Very well explained end to end (clustering). From theory to the Dashboards and insights. Thank you!
Wow! I love how you explain every line of code so clearly. You are a great teacher. Thank you so much.
Glad it was helpful!
3 minutes in I was so excited and confident in my ability to learn this. What a boss teacher!
Thank you for this great video! I really wish you could make a follow-up in-depth video about various clustering algorithms, going into detail about ways to deal with a large number of dimensions (PCA often makes data uninterpretable), categorical features, and how to correctly select dimensions to project your clusters on.
The best Data science channel. You really are good at this man!
Again, thank you for a very concise lesson! I appreciate the time you put into everything that goes into your lessons! A lot of good takeaways!
Thanks Scott! Glad you liked it!
Excellent tutorial Yiannis - attention to detail and explaining step by step in a brief and concise manner! Keep your videos coming and all the very best! You deserve it!!!
Glad it was helpful!
Thank you so much for this example. I like how concise you are with your explanations. Definitely top 3 data science channels for me now! Subscribed!
Hi Yiannis, thank you so much for this series. It is great. I love listening to your tutorials. They are very practical and informative. :-)
This Kmeans series is great. Have subscribed for future learnings. Thanks, and please make more content about data science.
Great content Yiannis! Keep them coming!
Thanks Tom!
holy shit! your channel's really underrated! i hope you blow up
Very good video, and good that you add supplementary videos from great YouTubers like StatQuest. It shows how much you really want us to know things. Can't wait for future videos!
Glad you enjoy it!
Your content is amazing, sir: end to end, covering both the theory and the visualization concepts.
Thanks man! This is what the community requested! So I deliver :)
@wise guy Nope, I am not an "institution" so I cannot offer that. That's why the course is really cheap; exactly as much as I need for the monthly payment to keep the page running. The course is geared towards learning how to handle data: clean / transform / join / visualize and generate insights. It's all about the skills! :)
This is great session!
Thank you for this video; the explanations are really easy to understand and your code is so clear. Hope you will make more content like this!
awesome explanation
You made this look easy. Thank you
Hi! Please, when transforming string data to numerical, does the method or strategy used have an impact on the algorithm's results? For example, if I decide to replace months with 1, 2, 3, ..., 12, versus using the get_dummies function?
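Yes, the encoding choice can change k-means results, because k-means works on distances. A minimal sketch of the difference (the data here is made up for illustration): mapping months to 1..12 makes December twelve times "farther" from January than February is, while one-hot dummies treat every pair of months as equally distant.

```python
import pandas as pd

df = pd.DataFrame({"month": ["Jan", "Feb", "Dec"]})

# Ordinal encoding: imposes an order and a magnitude on the categories.
order = {"Jan": 1, "Feb": 2, "Dec": 12}
ordinal = df["month"].map(order)

# One-hot encoding: one 0/1 column per month, no artificial ordering.
dummies = pd.get_dummies(df["month"], prefix="month")

print(ordinal.tolist())        # the ordinal values 1, 2, 12
print(sorted(dummies.columns)) # one dummy column per distinct month
```

Ordinal encoding can make sense for months if you believe the cyclical/ordered structure matters; dummies are the safer default for nominal categories.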
Great work! I love your way of explaining everything and then give the reasoning as well! Can you please explain how to plot the clusters of k-mean? Thanks a lot in advance! :)
I explain plotting in part 2 of this series. Check it out!
@@YiannisPi Yes I checked that out! Thank you once again! :)
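For anyone looking for a quick answer before watching part 2, plotting fitted k-means clusters can be sketched roughly like this (a minimal example on synthetic data, not the exact code from the video):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2D data with three obvious groups
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)

# Color each point by its assigned cluster, mark centroids with an X
plt.scatter(X[:, 0], X[:, 1], c=km.labels_, cmap="viridis", s=15)
plt.scatter(*km.cluster_centers_.T, c="red", marker="x", s=100)
plt.savefig("clusters.png")
```

With more than two features you would first project down (e.g. with PCA) and plot the first two components instead.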
Excellent Video! You are a life saver.
Good job! Can you help me get a code example of topological data analysis and clustering using persistent homology? Thanks.
Thanks a lot Yiannis. You applied pd.get_dummies because all your features were strings. What if you have a mix of numerical and string features in the original data set? Example: country, client, product, sales, market share? And can I run into a bug if my "product" feature contains 100,000+ distinct values? Thanks so much!!!
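On the mixed-types part of the question: pd.get_dummies on a whole DataFrame leaves numeric columns untouched and expands only the categorical ones, so mixed data works out of the box. A toy sketch (the column names here are invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["DE", "FR", "DE"],       # categorical: gets expanded
    "sales":   [100.0, 250.0, 80.0],     # numeric: passed through unchanged
})

encoded = pd.get_dummies(df, columns=["country"])
print(list(encoded.columns))  # numeric column first, then the dummies
```

On the 100,000+ distinct values: it won't raise a bug, but it will produce 100,000+ dummy columns, which tends to exhaust memory and dilute the distances k-means relies on. Grouping rare categories or using frequency encoding are common workarounds for high-cardinality features.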
Hello, please, I have a question: can I get a link to the dataset's origin?
Make a video on "customer segmentation and clustering in retail using machine learning" with a real retail dataset.
Amazing video. Thumbs up
Big thanks
Thanks for sharing.
Good Job Sir
Thanks Amine!
thank you this is really helpful !
Glad it was helpful!
Can we have your GitHub so we can walk through your code?
Sure! It's in the description!
Just wanted to ask: you haven't done scaling. Why?
Hey Nabi. Scaling comes into play when you have big scales and want to normalize the data within a particular range; sometimes it helps the speed of calculations and the outcome. In our case, we only have zeros and ones, so there is no need for scaling. Hope this helps!
@@YiannisPi Yes, thanks for the answer. One more scenario: if you have only binary data, it will work without scaling, but what if you have a data set with both numerical and binary columns? Do we need scaling only for the numerical columns, or for both types? One more question attached to this: what variance should a variable have to be a good candidate for clustering?
@@MrNabiwishes If you have both, then you only (might need to) scale the numeric columns. But it depends; your numbers may not need scaling in the first place. Not sure what you mean by the second question :)
@@YiannisPi How do you select variables that would be significant for clustering, depending on their data type? What strategy do you use for variable selection before clustering is applied?
@@MrNabiwishes Watch the whole K-means series to see how I use PCA before clustering! That is one of many techniques!
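The scaling advice in this thread (scale only the numeric columns, leave the 0/1 dummies alone) can be sketched with scikit-learn's ColumnTransformer. Toy data with invented column names:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "sales":      [100.0, 200.0, 300.0],  # numeric -> standardize
    "country_DE": [1, 0, 0],              # binary dummy -> leave as-is
    "country_FR": [0, 1, 1],
})

# Standardize only "sales"; pass the dummy columns through untouched.
ct = ColumnTransformer(
    [("num", StandardScaler(), ["sales"])],
    remainder="passthrough",
)
X = ct.fit_transform(df)
print(X[:, 0])  # scaled sales column: zero mean, unit variance
```

The transformed matrix can then be fed straight into KMeans, with all columns on comparable scales.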
👏👏👏👏👏👏👏👏👏