Very helpful! Clear explanation! Thank you
Glad it was helpful!
I totally agree
Thank you sir, very much clear to the point explanation
Glad it helped
Nicely and neatly explained the concept!
Glad you liked it!
Thanks a lot this 11 minute video beats every other explanation available on this topic on internet
Thank You
Ashok, very nicely explained. Thank you very much for clearing the concept.
Welcome and Best of Luck!
Thanks for giving such a comprehensive explanation of Random State.
My pleasure!
Subscribed after your explanation. thank you.
Thanks and welcome
I'm impressed with the way you explain in such a human manner
it is super easy to understand
thank you
Thank You! Keep Watching
wow, your teaching and explanation both are great, awesome..
You cleared my doubt sir..
this video is very helpful for me.
Thanks a lot sir jii👌👌👌👌😊😘🥰😍🙏🙏🙏
Glad it was helpful!
Simple and easy to grasp ..cheers
I was searching for this topic but was not satisfied anywhere. Sir has explained it very well
Thank You, Glad to hear
Very helpful Sir 🙏
Glad to hear that. Thank you
Very clear explanation, thank you sir
Glad it helped.
Nobody has ever explained this concept the way you have explained. Thank you. Learnt something new.
Glad it was helpful Thank you!
Thank you for the very useful and informative video
You are welcome
Thank you very much - Finally I understood what is random state. Stay blessed n happy
You're most welcome
Thanks for the explanation, it helped me a lot!!! Blessings
Glad it helped you.
it was really good explanation
Thank you!
thanks crisp and clear
You're welcome
Clear explanation with a simple example. Thank you!
you're welcome
Excellent explanation
Glad it was helpful!
good explanation .. thank you !
You are welcome!
great explanation
Thank you
Sir No need of Udemy , Coursera courses After watching your video .....Awesome content
Hi Vidharthi Ranjan, Glad to hear that. Thank You!
Thanks for your efforts. It is now clear to me.
Glad it helped
Well explained!
Thank you
so beautifully explained. thanks a lot
Glad you liked it
very helpful thanks
You're welcome!
Thank you for the explanation. Could you please tell me what software you use to make this wonderful writing?
I need that too
Man, pretty good explanation.
Thank You!
Thank you . Very well explained.
You are welcome!
very clear explanation thanks for sharing your content !
Glad you enjoyed it!
Great.
Thank u bro
Welcome
Simply.. Wao
Thanks
nice and clear explanation. thank you!
You are welcome!
Thanks a lot
Most welcome
nice explanation
Keep watching
Nice explanation! Thank you. Is there a technique to select the best random state?
Hi Pushkar Patil, if you see that it has a significant impact on the model's performance, you can include it in hyperparameter tuning. Generally, it doesn't affect performance.
superb bro!u explained it brilliantly!
Thank you so much!
thank you
You're welcome
I like what you teach
Thank you
got it, thank u
You are welcome.
Is the random state concept similar to the seed approach that we use?
Yes, but here it is applied to your data.
Random state ensures that the splits you generate are reproducible. Scikit-learn uses random permutations to generate the splits, and the random state you provide is used as the seed for the random number generator.
Thank you for such a clear explanation. I used random forest with the same random state on my data, normalized once with z-score and once with min-max, and got the same results (F1 score and accuracy). I don't understand why the results are the same; could you guide me?
Hi maryam sadat seifi, thanks for your comment.
The F1 score is the harmonic mean of precision and recall, while accuracy is the fraction of all cases that are correctly identified. Accuracy is used when the true positives and true negatives matter most, while the F1 score is used when the false negatives and false positives are crucial. In your case, you get the same F1 and accuracy.
Suppose you have something like this:
>>> trueY = [0,1,0,1]
>>> predY = [0,1,1,0]
Here accuracy and f1_score (binary) are the same: both are 0.5.
But when you have something like this:
>>> trueY = [0,1,0,1,0]
>>> predY = [0,1,1,0,0]
Here you will have accuracy=0.6 and f1_score(binary)=0.5.
I hope that helps.
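The two toy cases in the reply above can be checked by hand. The following is a minimal plain-Python sketch, not scikit-learn's implementation; the helper names `accuracy` and `f1_binary` are hypothetical and exist only for this illustration.

```python
def accuracy(true_y, pred_y):
    """Fraction of positions where the prediction matches the label."""
    correct = sum(t == p for t, p in zip(true_y, pred_y))
    return correct / len(true_y)


def f1_binary(true_y, pred_y, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(true_y, pred_y))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        # No true positives: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# First case from the reply: both metrics come out equal.
print(accuracy([0, 1, 0, 1], [0, 1, 1, 0]), f1_binary([0, 1, 0, 1], [0, 1, 1, 0]))

# Second case: one extra true negative raises accuracy but leaves F1 unchanged.
print(accuracy([0, 1, 0, 1, 0], [0, 1, 1, 0, 0]), f1_binary([0, 1, 0, 1, 0], [0, 1, 1, 0, 0]))
```

The extra correctly predicted 0 in the second case is a true negative, which counts toward accuracy but does not enter precision or recall.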
Of the many videos and explanations I have seen, this is the best. I have a question: how can I find how many random states can be created for a dataset? Is there an API available?
Hi VIGNESH SRIDHARAN, you can use the same procedure shown in the video to find the number of available random states for a given dataset.
great video, thanks! But why do people use mostly 42 or 0 as a random state in Random Forest?
There is no specific reason. It doesn't have any impact on performance. It's just being followed by many.
@@DataMites great, thanks for the reply!
Can you tell me the significance of random state in KMeans in the scikit-learn library?
If you omit random_state in the code, a new random value is generated on every execution, and the train-test datasets will contain different values each time.
Is the model test score of (x_train, y_train) greater than that of (x_test, y_test)?
Hi Ravi Sharma, thank you for the comment.
It seems you are asking about the model score rather than the model test score. Yes, the model score (e.g. accuracy, F1-score, AUC, ROC) is generally higher on the training dataset than on the test dataset.
awesome
Thank you!
Wouldn't there be 119 possible states (not 120) when counting starts from zero, in the example mentioned in the video ?
Hi, yes, you will have a total of 120 different states (0 to 119).
I'm a new learner. Could you please tell me why we sometimes use train_test_split function, and sometimes not, thanks
Hi Bijaya Manandhar, you should always use a train-test split while training, to find out how your model actually performs on inputs that have not been used for training.
I realize that this depends on the data set, but would it be safe to assume that the higher the number of the random_state, the better "trained" the model would be?
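No, a higher random_state does not mean a better-trained model; the number only selects a different split. You can still try several values by trial and error.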
Hey, that's a very good explanation, but I have a doubt here. You said that instead of using random state in a loop, we should do hyperparameter tuning of the parameters in the model. So while tuning the parameters, should we use random state or not?
Also, what if I haven't set the random state and my accuracy varies widely on each run? What does that indicate? Ideally the accuracy should not change much.
Hi akshay ranpise, you can use random state while tuning the parameters.
Yes, you can use the seed() method to overcome that. For more: docs.python.org/3/library/random.html
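As a small illustration of the seeding idea from the linked documentation, here is a sketch using Python's standard `random` module. Note that scikit-learn relies on NumPy's random generator rather than this module, so the snippet only shows the principle; the helper name `draw` is hypothetical.

```python
import random


def draw(seed, count=5):
    """Seed the global generator, then draw `count` integers from it."""
    random.seed(seed)
    return [random.randint(0, 99) for _ in range(count)]


# Seeding with the same value replays the same sequence of numbers.
print(draw(7) == draw(7))
```

If the seed is left out, the generator is initialised from a changing source, which is why unseeded runs give different numbers each time.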
Thank you for the explanation, though isn't ML supposed to predict the same value for any data? I mean, you train the model on a training set, and after validating it you use it in a real-world application where the random state does not necessarily exist! Or am I missing something here?
Hi, thank you for your question.
Here random state is used so that you get a reproducible random split into training and test sets. Random state 24 will produce a different outcome than 42, which lets you evaluate your experiment in distinct scenarios.
Suppose that with random_state 19 my model gets higher accuracy. Should I stick to that random state, i.e. should I deploy the model with that random_state? Or should my model perform well with all other random_state values?
Hi Antony Joy,
While deploying and predicting, you do not need to deal with the randomness you faced in training; you can deploy your model as it is.
sir how to find best random state ?
tell me with the code
@shushmakothapalli9657 When you set a random state to a specific value, the random number generator will produce the same sequence of random numbers each time you run the algorithm with that specific random state. E.g., X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42). Setting random_state=42 ensures that every time you run this code, you'll get the same split of data into training and testing sets.
Sir, why is the random state selected as 42 every time? What's the logic behind it?
When you use random_state, you get the same data points in the training and test sets, so no matter how many times you execute your code the result will be the same. It doesn't matter what value you give; you can use any number. Since many practitioners use 42, learners follow it too. Changing the value of random_state does not make the model systematically better or worse.
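To make the reproducibility concrete, here is a plain-Python stand-in for the idea behind a seeded split. It is a sketch only and not how train_test_split is implemented internally; the function name `split_indices` and the sample count of 10 are assumptions made for the illustration.

```python
import random


def split_indices(n_samples, test_size=0.2, seed=None):
    """Shuffle row indices with a seeded generator and cut off a test portion."""
    indices = list(range(n_samples))
    # A dedicated generator keeps the shuffle independent of global state.
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_size))
    return indices[n_test:], indices[:n_test]


train_a, test_a = split_indices(10, test_size=0.2, seed=42)
train_b, test_b = split_indices(10, test_size=0.2, seed=42)

# Same seed, same shuffle, therefore the same train and test rows.
print(train_a == train_b and test_a == test_b)
```

Calling the function with `seed=None` uses a fresh seed each time, which mirrors what happens when random_state is omitted.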
How do I determine which value of random state would give me the best score for a given model?
Hi Krishnendu Dey, you can loop over random states for it, but it's better to tune the hyperparameters of the train-test split and of the ML algorithm's estimator rather than the random state.
If I've 8000 rows with 30 columns...how can I find which is the best random_state
Hi, it doesn't matter whether random_state is 0, 1, or any other integer. What matters is that it is set to the same value whenever you want to validate your processing over multiple runs of the code.
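Rather than hunting for a single best random_state, one option is to repeat the hold-out evaluation over several seeds and look at the spread of scores. The sketch below assumes a toy label list and a majority-class baseline in place of a real model; the dataset, the function name `holdout_accuracy`, and the seed range are all hypothetical.

```python
import random
import statistics


def holdout_accuracy(labels, seed, test_size=0.25):
    """Split by seed, predict the training majority class, score on the test part."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(labels) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train_labels = [labels[i] for i in train_idx]
    # Baseline "model": always predict the most frequent training label.
    majority = max(set(train_labels), key=train_labels.count)
    hits = sum(labels[i] == majority for i in test_idx)
    return hits / n_test


labels = [0] * 12 + [1] * 8  # hypothetical toy dataset
scores = [holdout_accuracy(labels, seed) for seed in range(10)]

# The score depends on which rows land in the test set, so report the spread.
print(statistics.mean(scores), statistics.pstdev(scores))
```

A seed that happens to score highest here reflects a favourable split, not a better model, which is why the mean and spread are more informative than the maximum.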
Why do we so often choose 42 as the random state when training a machine learning model? Why don't we choose 12 or 32 or 5?
***Is there a scientific explanation?***
Hi Mehmet Arslan, You can choose any random state. Please watch the whole video to understand the concept of random state.
@@DataMites thanks i will watch
noice
Please correct the formula: nCr = n! / ((n-r)! r!)
Hi srikant raman, thanks for pointing it out. The calculation was done with the (n-r) value, but it seems there was an error in how the formula was shown.
@@DataMites No issues ! Greatly appreciate the response
wrong combi formula
nCr = n! / (r! (n-r)!). At 5:01 in the video, the formula should be changed to this one.
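The corrected formula can be checked against Python's built-in `math.comb`. The values n=10 and r=3 below are illustrative assumptions, not figures taken from the video; they are used only because they happen to give 120 combinations.

```python
import math


def n_choose_r(n, r):
    """nCr = n! / (r! * (n - r)!), written out with factorials."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))


n, r = 10, 3  # illustrative values only
print(n_choose_r(n, r), math.comb(n, r))
```

Both r! and (n-r)! belong in the denominator; dividing by (n-r)! alone would count ordered arrangements instead of combinations.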
good explanation. thank you!
You're welcome!
Thank you for such a clear explanation.
Glad it was helpful!
thank you
You're welcome