feels illegal to watch such great content for free
correct bro
Life is stochastic gradient descent:
- Thrashing? Slow down.
- Stuck in a rut? Change your learning rate.
- Once you get going, momentum carries you.
- Not every step you take is in the right direction, but trust the process to get you where you need to go.
Well said bro❤❤❤
You are so skilled; you really know how to teach such complicated topics with such ease. Glad I found your channel. I will recommend it to all my friends trying to learn machine learning. Thank you.
Well said 😊
After completing more than 50 videos of the playlist, now I like the videos first and then watch them.
28:00 the major time difference concept between the two versions of Gradient Descent is explained so well that I can never forget it
You are the best, sir...
👍👍 May you touch the skies in life 😎😎😎
27:38 the smile when SGD took more time...
And then it was well defended ::: HERO
You have such an important skill; I will recommend your channel to my friends.
One of the BEST mentors out there
You are very underrated sir... Hats off to your efforts
Best explanation!!
Sir, please explain other optimizers like Adam, Adagrad, etc.
Sir, nothing to say, as always the best... you are a great teacher ...💕💕💕💕💕💕💕💕
Very good teaching and explanation... thanks!
thanks Sir Nitish.
completed on 13th September 2024, 10:15PM
Thank You Sir.
God level teaching
GOD bless you, your videos are very helpful :)
best explanation!
Really very helpful, great
Amazing 😮.. thank you so so much
Great Video Great Teacher
Great Explanation sir.
So, if the value of idx gets repeated, since it's random, will it just keep updating the gradients on the same row? BTW, great explanation, sir.
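A quick sketch of the repetition being asked about, assuming the per-epoch loop picks idx with something like np.random.randint, as in the scratch implementation (the variable names below are just illustrative): sampling with replacement can pick some rows twice in an epoch and skip others, though over many epochs every row still gets seen; a per-epoch permutation visits each row exactly once.

import numpy as np

rng = np.random.default_rng(42)
n_rows = 100

# With replacement (np.random.randint-style): duplicates within one epoch are possible.
idx_with_replacement = rng.integers(0, n_rows, size=n_rows)
print("unique rows touched this epoch:", len(np.unique(idx_with_replacement)))  # roughly 63 on average

# Without replacement: every row is visited exactly once per epoch, in random order.
idx_permutation = rng.permutation(n_rows)
print("unique rows touched this epoch:", len(np.unique(idx_permutation)))  # always 100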
36:50 If the line is not crossing the curve, it can also be a concave curve.
I HAVE A DOUBT:
If we use learning schedules, the learning rate decreases after every epoch. But if our function is non-convex, then the chance increases that it gets stuck at a local minimum and never reaches the global minimum, because as the learning rate decreases it deviates less.
Does this mean that learning schedules are not good for non-convex functions?
Please reply if anyone knows about it.
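For context, a minimal sketch of the kind of learning schedule being discussed, assuming a simple t0 / (t + t1) decay that is often paired with SGD (the function name and constants here are illustrative, not from the video): the rate shrinks as updates accumulate, so later steps jitter less around the current minimum. For the convex loss of linear regression this only helps convergence; for non-convex losses the trade-off the comment describes is real.

# Illustrative learning-rate schedule: eta decays as the update count t grows.
# t0 and t1 are hypothetical constants chosen only for demonstration.
def learning_schedule(t, t0=5.0, t1=50.0):
    return t0 / (t + t1)

for epoch in range(3):
    for i in range(5):          # pretend there are 5 updates per epoch
        t = epoch * 5 + i
        eta = learning_schedule(t)
        print(f"update {t:2d}  eta = {eta:.4f}")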
Love this
Just a thought: suppose for 10 epochs and 100 rows, batch GD will effectively process 1000 row computations, and similarly SGD will also perform 1000 iterations. I think that because np.dot over the entire matrix is vectorized and fast, batch GD ends up faster than SGD for the same number of epochs.
The process of updating the parameters (coefficients and intercept) for each row takes more time in SGD. GD will update them only 10 times, whereas SGD has to update them 1000 times. Thus, on such a small dataset, SGD is slower.
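A rough sketch of the timing point made in this exchange, assuming a plain NumPy linear-regression setup (all names and values below are illustrative): batch GD does one vectorized update per epoch, while SGD does one tiny update per row, and on a small dataset the Python-loop overhead of the per-row version dominates.

import time
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                      # 100 rows, 5 features
y = X @ np.array([1.0, -2.0, 0.5, 3.0, 0.0]) + 4.0
w, b, lr, epochs = np.zeros(5), 0.0, 0.01, 10

start = time.perf_counter()
for _ in range(epochs):                            # batch GD: one vectorized update per epoch
    y_hat = X @ w + b
    w -= lr * (-2 / len(X)) * (X.T @ (y - y_hat))
    b -= lr * (-2) * np.mean(y - y_hat)
print("batch GD:", time.perf_counter() - start)

w, b = np.zeros(5), 0.0
start = time.perf_counter()
for _ in range(epochs):                            # SGD: one update per row, 100 per epoch
    for i in rng.integers(0, len(X), size=len(X)):
        err = y[i] - (X[i] @ w + b)
        w -= lr * (-2) * err * X[i]
        b -= lr * (-2) * err
print("SGD:", time.perf_counter() - start)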
Amazing
Here comes the saviour 👍❤️❤️❤️
Please provide all of the notes that you write in OneNote, day-wise.
Revising my concepts.
July 29, 2023😅
me too bro.. i have an interview next week
No words for you❤
Bestest❤
Sir, if I don't use a random index and just put 'i' instead, it gives the same result and also doesn't vary when I run it multiple times.
So can we use it?
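A small sketch of the difference, assuming the scratch SGD loop from the video (the helper below is hypothetical): using the loop counter i visits rows in a fixed order, so two runs from the same starting weights give identical results, while a random order changes the outcome from run to run.

import numpy as np

def sgd_one_epoch(X, y, w, lr=0.01, order=None):
    # order=None means plain sequential i; otherwise visit rows in the given order.
    idxs = range(len(X)) if order is None else order
    for i in idxs:
        err = y[i] - X[i] @ w
        w = w + lr * 2 * err * X[i]
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, 2.0, 3.0])

# Sequential 'i': two runs produce exactly the same weights (fully deterministic).
print(sgd_one_epoch(X, y, np.zeros(3)))
print(sgd_one_epoch(X, y, np.zeros(3)))

# Random order: the result changes with the random seed / run.
print(sgd_one_epoch(X, y, np.zeros(3), order=np.random.default_rng(1).permutation(20)))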
Sir, do I need to write the complete code if I want to use batch gradient descent?
You can use, for example, the SGDRegressor from sklearn.
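For reference, a minimal usage sketch of the SGDRegressor mentioned above (the dataset split and hyperparameter values are only examples); note that it implements stochastic rather than batch gradient descent, so for true batch GD you would still write the loop yourself, or use LinearRegression for the closed-form solution.

from sklearn.datasets import load_diabetes
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# max_iter = number of epochs, eta0 = initial learning rate (illustrative values)
reg = SGDRegressor(max_iter=100, learning_rate="constant", eta0=0.01, random_state=42)
reg.fit(X_train, y_train)
print(reg.score(X_test, y_test))  # R^2 on the test split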
What if we don't pick random points in stochastic GD?
thankyou!!
done✅
❤❤
TypeError                                 Traceback (most recent call last)
----> 1 reg.fit(X_train, y_train)
TypeError: fit() missing 1 required positional argument: 'y'
I am getting this type of error. Can anyone help?
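A common cause of this exact message, assuming reg refers to the scratch gradient-descent class built in the video (the class name and dummy data below are illustrative): the class was assigned without parentheses, so reg is the class itself; X_train then gets bound to self and y looks missing. Instantiating the class first fixes it.

import numpy as np

X_train, y_train = np.zeros((5, 2)), np.zeros(5)

class GDRegressor:
    def fit(self, X, y):
        print("fitting", X.shape, y.shape)

reg = GDRegressor            # missing (): reg is the class, not an instance
# reg.fit(X_train, y_train)  # -> TypeError: fit() missing 1 required positional argument: 'y'
                             #    (X_train is bound to self, so y appears missing)

reg = GDRegressor()          # instantiate first
reg.fit(X_train, y_train)    # works: prints "fitting (5, 2) (5,)"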
1:25 Problem with batch gradient descent
❤
I already get 43% with the normal linear regression method; after I used GD, I got the same 43%. Why should I use GD when it needs a lot of code?
GD is used in the case of large datasets, since it is faster there.
Khan sir duplicate 😍❤️🤭
SGD is stochastic but your high teaching standard is static...
31:45
Sir, if batch gradient descent is not useful, then why did you teach it for one hour?