Feature Selection techniques in Python | feature selection machine learning | machine learning tips
- Published: 26 Mar 2022
Hello,
My name is Aman and I am a Data Scientist.
About this video:
In this video, I explain feature selection techniques in Python in detail. I cover feature selection in machine learning, including its types and categories, and I also show a Python demo of feature selection techniques. The following points are covered in this video:
1. Feature Selection techniques in Python
2. feature selection machine learning
3. machine learning tips
4. feature selection unfold data science
5. Python feature selection techniques
About Unfold Data Science: This channel helps people understand the basics of data science through simple examples, explained in an easy way. Anybody without prior knowledge of computer programming, statistics, machine learning, or artificial intelligence can get a high-level understanding of data science through this channel. The videos are not very technical in nature, so viewers from different backgrounds can grasp them easily.
If you need Data Science training from scratch, please fill in this form (please note: training is chargeable):
docs.google.com/forms/d/1Acua...
Book recommendation for Data Science:
Category 1 - Must Read For Every Data Scientist:
The Elements of Statistical Learning by Trevor Hastie - amzn.to/37wMo9H
Python Data Science Handbook - amzn.to/31UCScm
Business Statistics By Ken Black - amzn.to/2LObAA5
Hands-On Machine Learning with Scikit Learn, Keras, and TensorFlow by Aurelien Geron - amzn.to/3gV8sO9
Category 2 - Overall Data Science:
The Art of Data Science By Roger D. Peng - amzn.to/2KD75aD
Predictive Analytics By Eric Siegel - amzn.to/3nsQftV
Data Science for Business By Foster Provost - amzn.to/3ajN8QZ
Category 3 - Statistics and Mathematics:
Naked Statistics By Charles Wheelan - amzn.to/3gXLdmp
Practical Statistics for Data Scientists By Peter Bruce - amzn.to/37wL9Y5
Category 4 - Machine Learning:
Introduction to Machine Learning with Python by Andreas C. Müller - amzn.to/3oZ3X7T
The Hundred Page Machine Learning Book by Andriy Burkov - amzn.to/3pdqCxJ
Category 5 - Programming:
The Pragmatic Programmer by David Thomas - amzn.to/2WqWXVj
Clean Code by Robert C. Martin - amzn.to/3oYOdlt
My Studio Setup:
My Camera : amzn.to/3mwXI9I
My Mic : amzn.to/34phfD0
My Tripod : amzn.to/3r4HeJA
My Ring Light : amzn.to/3gZz00F
Join Facebook group :
groups/41022...
Follow on medium : / amanrai77
Follow on quora: www.quora.com/profile/Aman-Ku...
Follow on twitter : @unfoldds
Get connected on LinkedIn : / aman-kumar-b4881440
Follow on Instagram : unfolddatascience
Watch Introduction to Data Science full playlist here : • Data Science In 15 Min...
Watch python for data science playlist here:
• Python Basics For Data...
Watch statistics and mathematics playlist here :
• Measures of Central Te...
Watch End to End Implementation of a simple machine learning model in Python here:
• How Does Machine Learn...
Learn Ensemble Model, Bagging and Boosting here:
• Introduction to Ensemb...
Build Career in Data Science Playlist:
• Channel updates - Unfo...
Artificial Neural Network and Deep Learning Playlist:
• Intuition behind neura...
Natural Language Processing playlist:
• Natural Language Proce...
Understanding and building recommendation system:
• Recommendation System ...
Access all my codes here:
drive.google.com/drive/folder...
Have a different question for me? Ask me here : docs.google.com/forms/d/1ccgl...
My Music: www.bensound.com/royalty-free...
Thank you so much for this Aman. Please cover the wrapper and embedded categories too. I am currently a data scientist and these techniques are very helpful for my job. Thanks
Thank you so much for this and yes sir, please cover the rest of the two methods in detail.
Thanks for the video, plz cover other two methods as well
Thank you so much, I always learned new things with all your courses.
great teaching sir g
Thank you for your hard work!!!!
Today's topic is very good, with a clean and clear explanation. 👏👏 Please do the next videos also, Aman sir 👍 Waiting for the next series of feature selection videos 😎
Thank you.
Nice tutorial. Thanks for your hard work and efforts you put.
I have one question though. Which category does PCA belong to?
Thanks. This was very useful. Please cover wrapper and embedded too if possible
Thanks a lot Aman for this detailed and simple explanation and putting a lot of efforts for us.
Can you cover the following topics sometime in the future? This would be very helpful for people learning data science in fields like chemical engineering, pollution control in manufacturing, the power sector, etc.
1. How to deal with test and validation sets in time series regression data that have different distributions.
2. An approach for time series batch-wise processes, e.g. batch-wise manufacturing of alcohol or any other chemical product, where we have to build a model for either preventive maintenance or for increasing the run length and efficiency of upcoming batches.
Thanks Sudhanshu, sure.
Thanks Aman. Please cover the wrapper and embedded categories too.
Thanks Aman, can you please create a separate playlist on this topic? It would be a great thing 🙂
Good Suggestion, let me create one. There are some old videos also which can be put.
Thanks aman. Please cover the other two categories as well!
Thanks Vijay. Sure
Hi Aman, we use the chi2 test on categorical variables. But here petal length and width are numerical variables. Can you please explain this?
Yes please create RFE and Wrapper video
Sir, the tutorial is such a good one, but I would like to know which method is better for feature selection with the random forest algorithm.
Thank you sir for your hard work and for this video.🙂
Welcome Shubham.
Thanks a lot Aman bhai
Welcome Mayank.
Aman, I have a question: can we directly call fit_transform on X with VarianceThreshold to get the desired columns? Why do we first have to call .fit() and then get_support() if we can get the result directly from fit_transform()?
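A minimal sketch for the question above, using hypothetical toy data rather than the dataset from the video. It suggests that both routes return the same reduced array; the separate fit() plus get_support() step mainly helps when you want to see which columns were kept, because fit_transform() returns a plain NumPy array without column labels by default.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Hypothetical toy data: the middle column is constant, so its variance is 0.
X = np.array([
    [1.0, 5.0, 0.1],
    [2.0, 5.0, 0.4],
    [3.0, 5.0, 0.2],
    [4.0, 5.0, 0.9],
])

# Route 1: fit first, then inspect which columns survive, then transform.
selector = VarianceThreshold(threshold=0.0)
selector.fit(X)
mask = selector.get_support()        # boolean mask, one entry per input column
X_two_step = selector.transform(X)

# Route 2: fit and transform in a single call.
X_one_step = VarianceThreshold(threshold=0.0).fit_transform(X)

print("kept columns:", mask)
print("same result:", np.array_equal(X_two_step, X_one_step))
```

If X is a pandas DataFrame, the mask can be used as `X.columns[mask]` to recover the names of the retained features.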
Can embedded lasso and ridge regularization be used in multiclass classification like the iris data? Please reply.
Do we need separate feature selection for categorical variables w.r.t. a target variable that is categorical?
Hello Aman, I have a doubt regarding the threshold used in VarianceThreshold. What does this threshold signify? And how do we decide which threshold value to choose, and when?
Sir your videos are very very helpful
Thanks Mayank.
Very good Aman
Thank you
Hello sir, if there are so many techniques for feature selection, how do we know which technique to use when? The chi-square and ANOVA tests look similar; which should we prefer, and when? I have used Pearson's correlation to overcome multicollinearity at times. How do we perform feature selection when there are around 150 features?
Hi Aman, thanks again for this video.
I have a question.
Who suggests the threshold value for correlation? Is it the business expert?
For n categorical features, who decides the number of features to keep?
Overall, who decides the threshold value for each feature selection method? Can we choose it on our own, or is this decision taken by the business expert?
Hi Ajay, domain knowledge plus experience come into the picture, whoever makes the suggestion. Suppose there is medical data with 200 distinct features: if I want to create buckets, domain knowledge comes in handy. Suppose I have 500 columns and 1 million rows and run a correlation analysis: experience will help me find a suitable threshold, though research suggests 0.85/0.90.
Sometimes we can take a lower value like 0.75, depending on what the features look like. Let's say I have different info in most features for example
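The correlation cutoff discussed in this reply can be sketched roughly as follows. This is only an illustration: the helper name `drop_highly_correlated`, the synthetic columns, and the choice to drop the later column of each correlated pair are assumptions, not the code from the video. The default of 0.85 is taken from the range mentioned in the reply.

```python
import numpy as np
import pandas as pd

def drop_highly_correlated(df, threshold=0.85):
    """Drop one column from every pair whose absolute Pearson correlation exceeds threshold."""
    corr = df.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop), to_drop

# Hypothetical data: "a_noisy" is almost a copy of "a", while "b" is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = pd.DataFrame({
    "a": a,
    "a_noisy": a + rng.normal(scale=0.05, size=200),
    "b": rng.normal(size=200),
})

reduced, dropped = drop_highly_correlated(demo, threshold=0.85)
print("dropped:", dropped)
print("remaining:", list(reduced.columns))
```

Lowering `threshold` to something like 0.75 removes more columns, which is where the domain judgment described above comes in.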
Hello, thanks for the explanation. I have one question: does using the best features help to reduce the size of the training data set? Say I do not have a large data set, but I can create an independent variable that is highly correlated with the dependent variable. Will that help me reduce my training data set? Your response will be highly valuable.
Yes, but the model may not be suitable for practical purposes.
Hi Aman,
Great tutorial as usual but I have one question.
If chi-square is used for categorical variables, then why are you using the chi-square test for a continuous variable?
That variable may be categorical. Which part of the video?
@UnfoldDataScience In the Iris dataset, where you are using SelectKBest with chi-square (@13:50). The Iris dataset has continuous variables, if I am not wrong.
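A rough sketch related to this thread, not the exact code from the video. scikit-learn's `chi2` scorer accepts any non-negative numeric features, so it runs on the Iris measurements, but it treats the values like counts or frequencies. For continuous features with a categorical target, the ANOVA F-test (`f_classif`) is a commonly used alternative; the comparison below is an assumption about what a viewer might want to check.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2, f_classif

iris = load_iris()
X, y = iris.data, iris.target
names = iris.feature_names

def top_k_features(score_func, k=2):
    """Return the names of the k best features under the given scoring function."""
    selector = SelectKBest(score_func=score_func, k=k).fit(X, y)
    return [name for name, keep in zip(names, selector.get_support()) if keep]

# chi2 requires non-negative inputs; Iris measurements satisfy that.
chi2_features = top_k_features(chi2)
# The ANOVA F-test is designed for continuous features and a categorical target.
anova_features = top_k_features(f_classif)

print("chi2 picks: ", chi2_features)
print("ANOVA picks:", anova_features)
```

Comparing the two lists on your own data is a quick way to see whether the choice of test changes which features are kept.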
yes please cover the wrapper and embedded
Thanks Ankit. Sure.
How to find the target variable and feature selection when there are multiple numerical and categorical variables?
How do we select the optimal threshold value?
There can't be a one-size-fits-all value. It will depend on how many features you are losing at a given threshold, how many you think you need based on domain understanding, and so on; multiple things come into the picture. So start from zero, move up a little at a time, and see how it turns out.
Sir, we want feature selection methods for machine learning using Python. We want it fast.
Kindly share the Google Drive link for this code.
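The "start from zero and move up" advice can be sketched as a small sweep over candidate thresholds, counting how many features survive at each one. The helper name `features_kept_per_threshold`, the toy matrix, and the threshold grid are hypothetical; the sketch assumes the threshold in question is the one for VarianceThreshold.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

def features_kept_per_threshold(X, thresholds):
    """Map each variance threshold to the number of features that survive it."""
    kept = {}
    for t in thresholds:
        try:
            selector = VarianceThreshold(threshold=t).fit(X)
            kept[t] = int(selector.get_support().sum())
        except ValueError:
            # scikit-learn raises ValueError when no feature passes the threshold.
            kept[t] = 0
    return kept

# Hypothetical toy data with variances 0, 5/36, 0.25 and 25 (population variance).
X = np.column_stack([
    np.full(6, 3.0),
    [0, 0, 0, 0, 0, 1],
    [0, 1, 0, 1, 0, 1],
    [0, 10, 0, 10, 0, 10],
]).astype(float)

kept = features_kept_per_threshold(X, [0.0, 0.1, 0.2, 1.0, 30.0])
for threshold, count in kept.items():
    print(f"threshold={threshold}: {count} features kept")
```

Reading the counts next to what you know about the domain shows where raising the threshold starts to discard features you would rather keep.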
github.com/UnfoldDataScience
filter category
Sure