Krish sir, just one thing to say... I too sometimes teach school children, and the effort you are putting into making these videos free of charge is commendable... May God bless you, sir. I am gaining confidence after watching your videos and am on my way to becoming a data scientist.
You are the mentor every aspiring data scientist needs, Thanks!!
I am an MSc student from Ethiopia. To tell you the truth, I have learnt a lot from your videos. May God bless your mind!!
I can just see that your face is full of happiness when you explain a concept.
I guess you are like 🙏🙏
Thanks for putting your effort into making these in-depth videos, which clarify concepts in detail. Your videos are helping students like me who are very new to the ML and AI field.
I have watched 10 videos but haven't coded anything yet; still, I am sure that whenever I do code, I will perform much better, because these videos focus on the basics and go into the depths of ANNs. Thank you so much, sir. 🥰🥰😘🇮🇳🇮🇳
Krish sir, you are my favorite teacher... your lessons and explanations are simple and easy to understand; even a B-grade student like me can understand the concepts. Thank you, sir.
Great stuff. But I have to listen several times to understand given our different dialects. Much appreciation for your work and explanations!! Excellent!
This deep learning series is extremely good.
I found it extremely useful, and easier to understand than many well-known experts.
That's a good video, Krishna. I never thought about random forest using a similar mechanism when I first studied dropout. You've cleared up my concept with this video. Thanks!
I was always confused about deep learning; because of you I got clarity.
This man makes ML a cakewalk!
Thank you very much; you have been an angel for me. Please upload a video on the theory behind SVM, K-Means, or other unsupervised ML. Thanks a lot once again. Hari Om
Really Like the way you explain! I have just completed Udemy Bootcamp and you are definitely reinforcing what I have learned. Keep up the good work!
Hello Krish. Came to know about the use of the random forest idea in deep learning. Thanks.
Thank you. Much easier to understand than the one by Andrew Ng.
But can't ignore the fact that he is a god in AI.
Did you take and finish the Andrew Ng course?
@@nabiltech1366 Halfway. Did you finish?
@@MrBemnet1 No, bro. The way he teaches is very complicated for me, so I decided to learn a different way. Once I have a little knowledge that I understand, I will try to retake the course so it will be easier than before. What about you?
@@nabiltech1366 I don't get some of the concepts right away. I check other resources, then come back and view it again. I will finish everything within 2 weeks.
The effort in these Videos !!!
Thanks Krish !!!
Hats off to you, sir. Your explanation is top level. Thank you so much for guiding us...
Thanks a lot Krish for your best explanation.
Great explanations, thank you very much sir
Thanks for the sessions... These are precise and organized...
You have a knack of making things short and simple and easy to grasp :)
Sir, I think you're enjoying this teaching?
Your expressions indicate that you are enjoying the teaching...
Love the Deep Learning Series. Great Learning !!
Extraordinary step-by-step teaching style. You made all my concepts clear. Can you please add a practical implementation of a neural network model in which all these techniques are used, like dropout, loss function, learning rate, regularization, and optimizer, in one model implementation? Thanks in advance...
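In the meantime, here is a rough sketch of what such a combined implementation could look like in Keras. The data, layer sizes, dropout rates, L2 strength, and learning rate below are placeholder assumptions chosen only for illustration, not values from the video:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers, regularizers

# Placeholder data: 1000 samples, 20 features, binary labels (assumed, not from the video).
X = np.random.rand(1000, 20).astype("float32")
y = np.random.randint(0, 2, size=(1000,))

model = keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(20,),
                 kernel_regularizer=regularizers.l2(1e-4)),  # L2 regularization on this layer's weights
    layers.Dropout(0.5),                                     # dropout applied to the first hidden layer's outputs
    layers.Dense(32, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.3),                                     # a smaller dropout rate deeper in the network
    layers.Dense(1, activation="sigmoid"),
])

# Optimizer with an explicit learning rate, plus a loss function and metric.
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss="binary_crossentropy",
              metrics=["accuracy"])

model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2)
```

Each Dropout layer here regularizes only the layer whose outputs it follows, while the L2 penalty, learning rate, optimizer, and loss are set on the Dense layers and in `compile`.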
How simply he explained it.
Krish: You are the very best trainer
Hi Krish, thanks for making such nice videos and excellent explanations. Finally I have found something I was looking for to better understand deep learning.
simple and clear explanation
All your videos are very useful... thanks a lot for this good work.
Great service. Amazing Explanation!!
Hello Krishna, first of all thank you so much for the videos, as a lot of my queries are getting cleared up by watching them. I have a better understanding of neural networks now, with all the maths behind them. I have one query though for this particular video: what is batch normalization in neural networks, and how does it help in preventing overfitting in a neural network?
It's really a very good lecture series.
I have been watching your videos for a few months and I have learned a lot. Your channel deserves a subscription; I subscribed.
You explain very well! Thank you!
Your lectures are superb
i really love your energy
Very well explained. Thank you.
Sir you are amazing! , you have cleared everything.
Can there be a better explanation? Simply perfect!!
Thank you. It was so helpful.
I think during test time we should multiply the weights by the keep probability = (1 - dropout rate). Intuitively, the keep probability is roughly the fraction of the time that weight or connection was used while training our NN. Please correct me if I am wrong, Krish sir.
You teach very well... great stuff about data science on your channel. Thanks Harish!
It's Krish buddy
Thanks Krish
Hi Krish, great work, real smooth and informative explanation
Thanks a lot, sir, very good explanation.
Good work as usual, Krish... awaiting its implementation 🙏🙏
Great explanation 👍
Guys, please note: if you're dropping neurons (activations) at a rate of p, then the weights are multiplied by 1 - p at the test phase.
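To make that scaling concrete, here is a tiny NumPy sketch of the original (non-inverted) dropout scheme described above, assuming p is the probability of dropping a unit; the activation values are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
p_drop = 0.5                        # probability of dropping a unit
activations = np.array([0.2, 1.3, 0.7, 0.9])

# Training: each unit is dropped independently with probability p_drop.
mask = rng.random(activations.shape) >= p_drop
train_out = activations * mask      # dropped units contribute 0 this iteration

# Testing: nothing is dropped; instead the outputs (equivalently, the outgoing
# weights) are scaled by the keep probability 1 - p_drop so the expected
# magnitude matches what the next layer saw during training.
test_out = activations * (1 - p_drop)

print(train_out, test_out)
```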
Thank you from Iraq... good job, brother.
Krish... you make my life easier.
Amazing explanation, but what happens if p = 0 or p = 1?
The p-value in the dropout section for the middle layer would be 0.6 (blocking 60%), not 0.5. (A value of 1.0 means no dropout and a value of 0.0 is full dropout, i.e., no output from that layer.) You keep repeating that; please rectify it.
I have a doubt.
For the test data, for the neurons that were not activated we do p*w, but for the neurons that were activated, what do we do in that case?
Hi, in this video, when we apply the model to the test data, what will be the weights of the deactivated neurons?
The video explains the concept of dropout layers in deep neural networks, which helps prevent overfitting by randomly deactivating a subset of neurons during training.
Key moments:
00:00 Artificial neural networks with many weight and bias parameters can suffer from overfitting; dropout regularization helps prevent this by randomly dropping units during training.
- Explanation of overfitting in deep neural networks due to excessive parameters and the need for regularization techniques like dropout.
- Comparison between underfitting in single-layer neural networks and the role of multiple layers in preventing underfitting in deep neural networks.
- Introduction to dropout regularization as a technique to prevent overfitting by randomly dropping units during training, with a reference to the 2014 paper by Srivastava, Hinton, et al.
03:54 The video discusses the concept of dropout layers in neural networks, where a subset of features or neurons are randomly deactivated during training to prevent overfitting and improve model generalization.
- Explanation of how dropout layers work in neural networks by randomly deactivating a subset of features or neurons during training to improve model generalization.
- Comparison of dropout layers in neural networks to the concept of selecting subsets of features in random forests to create diverse decision trees for better model performance.
07:25 The dropout layer in neural networks randomly deactivates some neurons and activates others during training to prevent overfitting, similar to random forest's feature selection and majority voting. At test time, all neurons stay connected (none are deactivated), and the weights are multiplied by the dropout probability for prediction.
- Comparison of the dropout layer with random forest's feature selection and majority voting to prevent overfitting in neural networks.
- Explanation of how test data is handled with dropout: all neurons stay connected (none are deactivated), and the weights are multiplied by the dropout probability for prediction.
- Selecting the dropout ratio (p-value) through hyperparameter optimization to prevent overfitting in deep neural networks, with a recommendation for a p-value above 0.5.
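Since the summary mentions choosing the dropout ratio through hyperparameter optimization, the sketch below shows one simple way this might be done in Keras: try a few candidate rates and keep the one with the best validation accuracy. Note that Keras's `rate` argument is the fraction of units dropped, which is the opposite convention from a retention probability p; the candidate rates, layer sizes, and random data here are assumptions for illustration only:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

X = np.random.rand(500, 10).astype("float32")   # placeholder data
y = np.random.randint(0, 2, size=(500,))

best_rate, best_acc = None, 0.0
for rate in [0.2, 0.3, 0.5]:                     # candidate dropout rates to try
    model = keras.Sequential([
        layers.Dense(32, activation="relu", input_shape=(10,)),
        layers.Dropout(rate),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    hist = model.fit(X, y, epochs=3, batch_size=32,
                     validation_split=0.2, verbose=0)
    val_acc = hist.history["val_accuracy"][-1]   # validation accuracy after the last epoch
    if val_acc > best_acc:
        best_rate, best_acc = rate, val_acc

print("best dropout rate:", best_rate, "val accuracy:", best_acc)
```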
Such awesome content and explanations!!!
Thank you Krish for the video, this is excellent!! One question: dropout is applied at each epoch, so how does it combine the results from all the epochs?
Nice Explanation
I have a doubt -
On every iteration, does the dropout ratio of a particular layer remain the same or not? If not, do we take the average to multiply with the weights for the test data?
Great as always! Thank you :)
Just a question: during backpropagation, for each neuron we get updated weights. Now, when we go back to the start and again a random set of features is chosen, what happens to the backpropagated weights?
In your sketch, did you really drop a couple of inputs? Is this allowed in the dropout approach?
Sir, if we're dropping some inputs and also hidden-layer neurons, will it not affect our output? I mean the correct predictions.
Hi Sir,
I have a doubt.
If we take p = 0.5, will half of the features that were deactivated in the 1st epoch be reactivated in the 2nd epoch, and does the same go for the other features in the upcoming epochs as well?
Please explain
Amazing Sir
Great video 👏
In the next iteration, will the deactivated neurons get activated randomly???
Thank you for this excellent explanation! Could you link the original research paper/thesis you mentioned? (Or maybe I'm just not finding it in the description.)
Great effort, Krish! I like your passion. I have one confusion about the dropout ratio: why are you using a dropout ratio of 0.5 for the input layer? To my knowledge it should be higher (i.e., 1.0 or 0.9).
Can you please share the URL of any paper related to this regularisation technique?
Sir you are great 💖
Hello @Krish Naik, you mentioned in the video that for test data, w should be multiplied by p. Do we need to write code for that in the model, or does it happen automatically?
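For what it's worth, in frameworks such as Keras you normally don't code the test-time multiplication yourself: the built-in Dropout layer implements the "inverted dropout" variant, scaling the surviving activations by 1/(1 - rate) during training and leaving inputs untouched at inference, so prediction needs no extra scaling code. A minimal sketch (values are just for illustration):

```python
import tensorflow as tf

dropout = tf.keras.layers.Dropout(rate=0.5)
x = tf.ones((1, 4))

# Training mode: roughly half the entries are zeroed, survivors are scaled by 1 / (1 - rate).
print(dropout(x, training=True))

# Inference mode: the layer is a no-op, so no manual multiplication by p is needed.
print(dropout(x, training=False))
```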
Hi @Krish,
I got asked in an interview: what if we remove one hidden layer instead of using dropout? Wouldn't it be better to remove one hidden layer instead of dropout?
Can you please help me with the answer?
I just have a little query: if we keep activating and deactivating neurons while training, doesn't it cause overfitting at test time, when all the neurons are activated at once even though they were trained in different combinations during training?
Please suggest a good reference book for Deep learning.
Hi, you did not explain how the exploding gradient problem can be corrected. Is it through the same ReLU?
For the training data, suppose we ignore a few features and neurons as per the dropout ratio, calculate the weights, and update them with backpropagation. In the second step another set of features and neurons is selected randomly. Now, if we again calculate new weights, that doesn't make sense, right? This will keep repeating with different random combinations... Please correct me if I am wrong... Thanks in advance.
So after all of this is done, the best set of features is selected for that particular output value, I guess.
Can you please provide link for the Machine Learning playlist?
Thank you, sir.
Hi, sir. I would like to know: within each epoch of training, does dropout have any relation to the batch_size?
Sir, since we randomly select some features or neurons, the weights are updated according to that set of neurons in that particular forward and backward pass. So how is the model going to predict the right answer when all the neurons are activated together for the test data? We trained the weights when fewer neurons were active, so how will the model sum up all the weights to give the right prediction (with least error)?
Sir, while testing, will all the weights be updated as (p*w), or will the p value be updated as (p*w)? Please clarify this.
I think there is a mistake in the explanation when dealing with test time. If p is the probability of dropping a neuron, then the weights should be multiplied by 1 - p at test time.
I have a question: do we have to add separate dropout layers for different layers, or do we add it once for all layers?
Can you explain how it helps to avoid the overfitting problem?
Sir, I have a doubt: when the neurons are randomly selected based on the p value, then for the next epochs, from which neurons is the random selection performed, the previously activated ones or all of them?
If we apply the dropout ratio, is there any chance that the features selected the first time get selected the second time, or are new features selected?
Best explained:)
So when the neurons are reactivated, what are their weights?
Their weights are the same as before, because you didn't update them using backpropagation. You only update the weights corresponding to neurons that are active in an iteration. So in the next iteration, if we happen to activate a neuron which was not active in the last iteration, its weight will be the same as before until backpropagation updates it (because that neuron is active now and hence will get updated).
@@akashkewar thanks for the reply after 8 months😃😃😃♥️
@@shiffin_chippe :D "Better late than never". I hope you are doing fine in life and don't give up.
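To make that answer concrete, here is a small NumPy sketch showing that a fresh random mask is drawn every iteration and that only the currently active units' weights change; the weights, input, target, and learning rate are dummy values used purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
p_drop = 0.5
w = rng.normal(size=5)                 # weights of 5 units in some layer
x = rng.normal(size=5)                 # a dummy input
lr = 0.1

for step in range(3):
    mask = rng.random(5) >= p_drop     # a new random mask every iteration
    out = np.dot(w * mask, x)          # dropped units contribute nothing this step
    grad = (out - 1.0) * x * mask      # gradient of a squared error toward a dummy target of 1.0; zero for dropped units
    w -= lr * grad                     # only the active units' weights change; the rest keep their old values
    print(f"step {step}: active units {np.flatnonzero(mask)}, weights {np.round(w, 3)}")
```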
You have not explained why everything is connected for the test data; you only explained the calculation after everything is connected. I would like to know why everything is connected, and what happens if we use dropout on the test data.
x0, x1, x2 should not be dropped, according to Andrew Ng.
Wouldn't the weights in testing be w(1-p) rather than wp?
This sounds more like stochastic optimization than regularization.
Hi sir, amazing explanation.
Small doubt:
While multiplying the 'p' value with the weight 'w' for the test data, do we include (add) the bias value with the input?
We have to include the bias.
If p = 0.7, will 70% be selected (kept) or will 70% be dropped out?
I have the same question.
Krish Naik: that's like the coolest name
Krish, I have a doubt. Suppose I have 5 inputs and 5 neurons in my 1st hidden layer. At training time I set the dropout ratio to 0.5, and because of this suppose 2 inputs and 2 neurons got deactivated. In this case we have 3 inputs and 3 neurons left, so there are 9 weights to train. But at testing time we have to multiply the 'p' value by all 25 weights, since at testing time all inputs and neurons exist. So how does this work?
I think the dropout ratio for the other deactivated neurons in the test set would be 0, I guess; it doesn't quite make sense though.
great!!!