This is by far the best explanation I've come across. So simple to understand. Thank you, Prof. You just earned a follower!!
I am really grateful for your detailed explanation! I am self-studying machine learning this summer holiday, and I am at this point now. I was so confused before watching your video. Now I finally understand this point. Thank you so much!
Hi Professor, thank you so much for this video! Clear and concise; you have no idea how much I needed this. Keep up the great work, I will be sure to check out your other videos as well 😊
Your explanation is as amazing as a rainbow cloud after a thunderstorm!!! I'm so glad I found this visual explanation!
Thank you so much, Professor Ryan. You just made my life easy. Best explanation, so simple to understand even for someone who doesn't have background knowledge in machine learning.
Well explained about standardization and normalization. Now I have full clarity on these topics. Thanks for taking the effort and explaining it this way.
Amazing explanation.. in just one run, I got your whole point in an easy way. Big thanks!
Impressed with your way of teaching. You are explaining very well with the right examples... awesome work from you...
One small request: the sequence of your 'Artificial Intelligence, Machine Learning, and Deep Learning' playlist is jumbled; please put the playlist in order for easier learning.
Great explanation, however I think saying scaling is not required for distance-based algorithms is wrong, as these algorithms are the ones most affected by the range of features. Can you comment on this?
I think the same
Exactly! Scaling is crucial for distance-based algorithms.
I believe that in the case of k-means, for example, the algorithm calculates distances column versus same column, as opposed to a neural network, where each column can have an impact on the target output.
As distances are measured on the same scale (column by column), of course one feature is going to affect clusterization more, for instance, but that's the point of k-means: we want to see which features describe the data distribution across dimensions.
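The effect this thread is debating can be sketched quickly. In the toy example below (hypothetical numbers, not from the video), the nearest neighbor of point `a` flips once the two features are min-max scaled, because the raw income axis swamps the age axis:

```python
import math

# Hypothetical points: (income in dollars, age in years)
a = (50_000, 25)
b = (52_000, 60)
c = (60_000, 26)

def euclid(p, q):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

# Unscaled, income dominates: a's nearest neighbor is b even though
# their ages differ by 35 years.
print(euclid(a, b) < euclid(a, c))  # True

# Min-max scale each feature to [0, 1] over these three points.
def minmax(col):
    lo, hi = min(col), max(col)
    return [(v - lo) / (hi - lo) for v in col]

incomes, ages = zip(a, b, c)
sa, sb, sc = zip(minmax(incomes), minmax(ages))
# After scaling, age counts too: a's nearest neighbor is now c.
print(euclid(sa, sc) < euclid(sa, sb))  # True
```

This is why distance-based methods like k-NN and k-means are usually scaled in practice, unless you deliberately want the larger-range feature to dominate.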
Thank you so much! I couldn't wait until the end of the video to thank you! You made it super clear.
A great scientist and teacher. Keep it up, sir. Thank you.
King!!! Very good explanation. I watched multiple videos on YT and asked ChatGPT many questions, but now, after your video, I finally understand it.
Came here from your udemy course. You are a life saver, prof!
This was such a crystal clear explanation! Thank you so much sir!
This was pretty clearly explained.
For anyone else looking for this, the standardization chapter begins at 6:49.
Amazing video! Clearly explained! Congratulations, Professor!
Many thanks for this video... one of the best explanations I've ever seen.
Awesome. I finally understand. Very good explanation, easy to follow.
Thanks a lot... worth watching... you explained each concept in a simple way...
Thank You Leonard Hofstadder..🙂
Hahaha thanks ❤️😂
Great video. I would say we do need scaling for distance-based algorithms, as they will give wrong results if features are on different scales. We don't need scaling for tree-based algorithms, as they are not susceptible to differences in feature scale.
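The tree-based half of this claim is easy to demonstrate: a tree's split depends only on the ordering of the feature values, and monotonic rescaling (min-max, z-score) preserves that ordering. A toy sketch with a hand-rolled one-level split (made-up data, not the video's):

```python
# Hypothetical 1-D data: a single feature and binary labels.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [0, 0, 0, 1, 1, 1]

def best_split(xs, ys):
    """Return the (left, right) label partition produced by the
    threshold that minimizes misclassification error."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None
    for k in range(1, len(xs)):
        left = [ys[i] for i in order[:k]]
        right = [ys[i] for i in order[k:]]
        # error if each side predicts its majority label
        err = (min(left.count(0), left.count(1)) +
               min(right.count(0), right.count(1)))
        if best is None or err < best[0]:
            best = (err, tuple(left), tuple(right))
    return best[1:]

raw = best_split(xs, ys)
# Min-max scale the feature; the ordering, and hence the split, is unchanged.
lo, hi = min(xs), max(xs)
scaled = best_split([(x - lo) / (hi - lo) for x in xs], ys)
print(raw == scaled)  # True: the tree finds the same partition
```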
The best simple explanation ever
Fantastic Explanation Sir ! Thanks so much !
The outlier thing is so crucial, actually, damn. I haven't seen this in a machine learning course before. Banger!
Many thanks for this video... One of the best explanations
Great explanation. Thank you very much, Sir!
Clear as crystal, thank you!
Fantastic explanation ! Thank you so much.
Thank you very much. I can't pass without thanking you and subscribing for the clarity you gave me on this topic.
Such a great explanation. Thank you very much
Thank you for your explanation; it was easy to understand.
This professor is so pleasant for all senses. Thanks for sharing knowledge selflessly :)
Very Clear Explanation.
Thank you :)
Awesome explanation. Thank you!
thank you boss man, just used normalization instead of standardization, life saver
Thank you for the clear explanation!
good. clearly explained. thanks
For ML context: if the data follows a Gaussian distribution (bell shape), use standardization (z-score); otherwise go with normalization (which improves cluster scaling as well).
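For anyone who wants to see the two transforms side by side, here is a minimal sketch (hypothetical values, including one outlier, to show why min-max is sensitive to extremes):

```python
import statistics

values = [12.0, 15.0, 14.0, 10.0, 200.0]  # note the outlier at 200

# Standardization (z-score): subtract the mean, divide by the std dev
mean = statistics.fmean(values)
std = statistics.pstdev(values)
z = [(v - mean) / std for v in values]

# Min-max normalization: map the observed range onto [0, 1]
lo, hi = min(values), max(values)
mm = [(v - lo) / (hi - lo) for v in values]

# Min-max squeezes the non-outlier points into a tiny sliver of [0, 1];
# standardization keeps their relative spread.
print([round(v, 2) for v in mm])
print([round(v, 2) for v in z])
```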
Great explanation!! Could you say more about when the input is image datasets - like CNNs?
Awesome explanation for a beginner like me. Wish I had access to the S&P 500 dataset.
Amazing explanation!
Excellent explanation.
Thank you so much, Prof!
Outstanding content.
Hello Professor, the video explained the concepts and their practical implementation in a concise manner. Awesome work!
Many thanks!
Hi Prof. Ryan,
Thank you for explaining the subject in a simple manner.
I have a Human Resources situation at hand. We have an employee appraisal system where the rating is on a 6-point scale (ranging from Poor Performer to Outstanding Performer). We have 15 departmental heads who rate their respective team members on this 6-point scale.
However, immense biases creep in during evaluation. Also, some evaluators are tougher or more lenient than others. Consequently, we end up with different ranges/averages.
As the ratings are linked to incentives, good performers sometimes lose out against their peers in other departments.
I intend to eliminate this bias/lack of neutrality in the ratings given by the 15 different departments (for 1000 employees). Can you suggest how I should go about this situation, please?
Regards...Muralidhar
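One common statistical approach to this kind of rater bias (my suggestion, not something from the video) is exactly the standardization idea: z-score each department's ratings separately, so a "4 from a tough rater" and a "4 from a lenient rater" become comparable:

```python
import statistics

# Hypothetical ratings from two departments on the same 6-point scale
ratings = {
    "dept_a": [2, 3, 3, 4, 5],   # lenient rater: higher average
    "dept_b": [1, 1, 2, 2, 3],   # tough rater: lower average
}

# Standardize within each department: (score - dept mean) / dept std
standardized = {}
for dept, scores in ratings.items():
    mu = statistics.fmean(scores)
    sigma = statistics.pstdev(scores)
    standardized[dept] = [(s - mu) / sigma for s in scores]

# The top performer in each department now gets a similar standardized
# score, regardless of how tough their rater was.
print(round(max(standardized["dept_a"]), 2),
      round(max(standardized["dept_b"]), 2))
```

In practice you would want enough ratings per rater for the mean and standard deviation to be stable, and this assumes performance is roughly comparable across departments.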
hey professor, that was a very cool and simple video to follow and understand, could i ask for where i cold find the notebook you used at the end to use?
This is my first time that I am watching your video.. You look very ..very much similar to Saif Ali Khan.. In fact the smile is also same. One like vote from me. A gentle smile on face make you different from all the others.
Excellent thanks!!!
Can you show an example of scaling with train test split? Do you scale the train and test data with the same scaler?
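Yes: fit the scaler on the training split only, then apply the same fitted parameters to the test split. A minimal z-score sketch of the idea (sklearn's `StandardScaler` follows the same `fit`/`transform` pattern):

```python
import statistics

train = [10.0, 12.0, 14.0, 16.0, 18.0]
test = [11.0, 20.0]

# "fit": learn the scaling parameters from the training split ONLY
mu = statistics.fmean(train)
sigma = statistics.pstdev(train)

# "transform": reuse those same parameters on both splits
train_scaled = [(v - mu) / sigma for v in train]
test_scaled = [(v - mu) / sigma for v in test]

# Fitting a second scaler on the test split would leak test information
# and put the two splits on inconsistent scales.
print(test_scaled)
```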
Informative!
Appreciated it, Thanks.
Excellent!
Superb illustration.
Thank you so much 😀
@professor-ryanahmed You're welcome, Prof. Please, the link to the dataset?
Gr8 explanation!!!
Can you share the dataset you used for this demo pls?
For supervised algorithms, can we use both as data input?
Thank you it is helpful
Good theoretical explanation... but I think scaling is used for k-means and k-NN.
Thank you Prof!
Thank you for this video
My pleasure
Thanks for sharing ❤
Top explanation along with code. Can you upload the notebook file with each video you explain? Thanks.
Great job 👏👏❤
I'm finding a lot of sources are saying feature scaling is advised when using k nearest neighbours. Is there more nuance to this point? Is scaling required after all?
Question: what if our model encounters a bigger value than what we had in the training data? How do we handle that?
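A quick sketch of what happens with min-max scaling in that case (hypothetical values): the out-of-range value simply maps outside [0, 1], and you typically either clip it or periodically re-fit the scaler (sklearn's `MinMaxScaler` has a `clip=True` option for this):

```python
# Min-max parameters learned from the training data only
train = [10.0, 20.0, 30.0]
lo, hi = min(train), max(train)

def scale(v):
    return (v - lo) / (hi - lo)

print(scale(25.0))  # 0.75  (inside the training range)
print(scale(40.0))  # 1.5   (outside the range -> exceeds 1)
```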
He used to be on Stemplicity as well.
Thank you!
As always: outstanding! Your enthusiasm is inspiring... On the other hand, it is clear why tree-based algorithms do not require feature scaling. However, distance-based algorithms such as K-Nearest Neighbors and K-means require Euclidean distance calculation, which means that feature scaling is necessary with them. Am I wrong?
I think you should scale features for K-means and K-NN. Think about it intuitively. If you are looking at two points and their x and y (feature) distances, how would you want to define their closeness? Do you want their features to be considered equally when calculating your distance, or is one feature more important than the other? If you want both x and y to be considered on an equal playing field, then you should scale them so that the computed distance reflects their importance.
Scale each feature by the method that makes the most sense for that feature. This is most likely [0, 1] across samples.
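The "equal playing field" point can be made concrete by decomposing a squared Euclidean distance into per-feature contributions (hypothetical points and data ranges):

```python
# Hypothetical points: (salary in dollars, age in years)
p = (70_000, 30)
q = (90_000, 40)

# Per-feature contribution to the squared Euclidean distance
contrib = [(a - b) ** 2 for a, b in zip(p, q)]
total = sum(contrib)
print([round(c / total, 4) for c in contrib])  # salary is ~100% of the distance

# Scale both features to [0, 1] using assumed overall data ranges
ranges = [(40_000, 120_000), (18, 70)]
ps = [(v - lo) / (hi - lo) for v, (lo, hi) in zip(p, ranges)]
qs = [(v - lo) / (hi - lo) for v, (lo, hi) in zip(q, ranges)]
contrib_s = [(a - b) ** 2 for a, b in zip(ps, qs)]
total_s = sum(contrib_s)
print([round(c / total_s, 4) for c in contrib_s])  # now both features matter
```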
thx Prof
SUPEEEEEEEER clear. Thanks!
Could you put a link to the CSV file so we can download it and try the exercise ourselves, please?
Wonderful... outstanding!
Firstly, I like your explanation very much.
Secondly, I would like to know: how do you plot the raw and rescaled data? Do you use the histogram function from pandas?
Thank you very much and keep up the good work!
I have already found it. :D
import seaborn as sns
sns.pairplot(df)  # pairwise scatter plots plus a histogram per column
Can you please share the GitHub repo link for accessing the data files used in the video?
Please, where's the link to the dataset? I'd really appreciate it if you could paste it here, Prof. Thanks a lot.
11:40 Wait, I am confused now, because I thought that since distances between data points are so important in algorithms such as kNN, SVM, etc., scaling is a MUST pre-processing step, but now you are saying that it is not required? Could you please clarify this?
You're the best, Prof!
How do I apply z-score normalization to live data??? 🙏🙏🙏
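One common pattern (an assumption on my part, not from the video): fit the z-score parameters on historical data once, persist them, and apply the same transform to each live value as it arrives:

```python
import json
import statistics

history = [10.0, 12.0, 14.0, 16.0, 18.0]  # hypothetical training history

# Fit once, offline:
mu = statistics.fmean(history)
sigma = statistics.pstdev(history)
params = json.dumps({"mu": mu, "sigma": sigma})  # persist (file, DB, ...)

# Later, in the live system, load the saved parameters and apply them
# to each incoming value:
p = json.loads(params)
live_value = 15.0
z = (live_value - p["mu"]) / p["sigma"]
print(round(z, 3))  # 0.354
```

sklearn users often persist the fitted scaler object itself instead of raw parameters; the point is the same: never re-fit on live data.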
Helpful!
thx
You've not missed a single base, brother. What an explanation!
Where can I get the dataset?
Just came from a KMeans clustering course that demonstrates how normalization results in better clusters. But at 11:40, you say KMeans clustering doesn't require standardization or normalization. I'm confused.
You could've added this to your Udemy course.
Good
The best, hello!
top of the top
dataset please
At 11:27, in the last bullets, you mentioned that scaling is not required for K-NN and SVM, which is not correct. K-NN and SVM exploit distances or similarities, so they do require scaling.
very true!!
Thank you. Where can I download the notebook code?
I also have this question
I feel very stupid because I see what we should use but not WHY :(
Distance-based methods assume that features are normalized? So feature scaling is required? Please confirm.
Tree-based methods do not need scaling.
I really liked your explanation, thanks
P.S.
Are you Egyptian?
I mean your accent is perfect, but your pauses while speaking give the impression that you're from the great Egypt.
A whole semester in 20 minutes
Dear Ryan, how do I test a model trained on scaled data?
I used this approach and the predicted value is very different:
X_testing = np.array([[550,440,110,0,0,0,0.33,400,8.8,0,863,771]])  # 78.6
ypred = model.predict(scaler.fit_transform(X_testing))  # predicted should be ~78, but I got [[0.17291696]]
Without fit_transform the value is also different.
Many thanks for your reply.
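The likely bug here: calling `fit_transform` on a single test row refits the scaler on that one sample, so every standardized feature becomes 0 and the model sees a meaningless input. A minimal sketch of wrong vs right (a hand-rolled z-scorer standing in for sklearn's `StandardScaler`; the values are made up):

```python
# Hypothetical 2-feature training data and one test row
X_train = [[550.0, 400.0], [600.0, 500.0], [500.0, 300.0]]
X_test_row = [560.0, 440.0]

def fit(X):
    """Learn per-column mean and std (what StandardScaler.fit does)."""
    cols = list(zip(*X))
    mus = [sum(c) / len(c) for c in cols]
    sigmas = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5
              for c, m in zip(cols, mus)]
    return mus, sigmas

def transform(row, mus, sigmas):
    return [(v - m) / s for v, m, s in zip(row, mus, sigmas)]

mus, sigmas = fit(X_train)

# WRONG: fitting on the single test row makes the "mean" the row itself,
# so every standardized feature is 0. This is what fit_transform does
# to X_testing in the question above.
row_mus, _ = fit([X_test_row])
print(row_mus == X_test_row)  # True: (value - mean) == 0 everywhere

# RIGHT: reuse the statistics fitted on the TRAINING data, i.e. with
# sklearn call scaler.transform(X_testing), never fit_transform.
right = transform(X_test_row, mus, sigmas)
print(right)
```

Also, if the target y was scaled during training, `model.predict` returns values on the scaled scale (which would explain 0.17 instead of ~78); apply the target scaler's `inverse_transform` to the prediction to get it back in original units.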