Hi everyone! The code and explanations behind this video are here - github.com/VikParuchuri/zero_to_gpt/blob/master/explanations/linreg.ipynb . You can also find all the lessons in this series here - github.com/VikParuchuri/zero_to_gpt .
Please keep doing these, they are really excellent!
Is there a discord to discuss the projects on this channel?
I think the derivative of the loss function should be 160×(80w+11.99−y) instead of 2×(80w+11.99−y), because of the chain-rule factor of x = 80.
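For what it's worth, that factor of 160 does follow from the chain rule (2 × 80, since x = 80 here). A quick finite-difference check, using arbitrary values for w and y just for illustration:

```python
# The values x = 80 and b = 11.99 come from the comment above;
# w and y are arbitrary, chosen only to run the check.
w, y = 0.3, 50.0
x, b = 80.0, 11.99

loss = lambda w: (x * w + b - y) ** 2

eps = 1e-6
# Central difference approximates dL/dw numerically.
numeric = (loss(w + eps) - loss(w - eps)) / (2 * eps)
# Chain rule: d/dw (xw + b - y)^2 = 2 * x * (xw + b - y) = 160 * (80w + 11.99 - y)
analytic = 2 * x * (x * w + b - y)

print(numeric, analytic)  # the two values agree closely
```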
Hi Vik! Thanks so much for the amazing work! Your content is always one of my best choices when it comes to learning Data Science and ML. I have a doubt, though, about the video at minute 40:56. You mention that in the init_params function, if we subtract 0.5 from the result of np.random.rand(), it would rescale weights from -0.5 to 0.5. But wouldn't it just give us (randomly) some negative values (depending also on the chosen seed) whenever the ones returned by the np.random.rand() function are less than 0.5? Thanks so much again and please, keep on doing what you do! I've already come a long way thanks to all your work!
Thanks :) np.random.rand returns values from 0 to 1, so subtracting 0.5 rescales that range to -0.5 to 0.5.
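A minimal sketch of that rescaling (the seed here is arbitrary, just to make the run reproducible):

```python
import numpy as np

# np.random.rand draws uniformly from [0, 1).
# Subtracting 0.5 shifts the whole interval, giving uniform values in [-0.5, 0.5).
np.random.seed(0)
raw = np.random.rand(5)
shifted = raw - 0.5

print(raw)      # all values in [0, 1)
print(shifted)  # same values shifted down by 0.5, so in [-0.5, 0.5)
```

So every value moves down by 0.5; any raw value below 0.5 becomes negative, which is exactly the point: the initial weights are spread symmetrically around zero.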
love your work, so clear
👌 excellent
Finally I have managed to implement gradient descent for linear regression myself :-), almost without looking back at Vik's notebook. I can now say I understand how it works and all the math underlying it. Just curious: why are my final weights and bias very different compared to what sklearn calculates? I plotted all three - the original test labels, predictions from my own procedure, and predictions from sklearn - and I see that mine is less accurate than sklearn's. Why could that be?
Congrats on implementing it yourself! Scikit-learn doesn't use gradient descent to calculate the coefficients (I believe they use analytical solutions in most cases). This would lead to a different solution.
Even when using gradient descent, it is possible to use better initializations or optimizers (i.e., don't use plain SGD).
I would only be concerned if your error is significantly higher (say more than 50% higher), or your gradient descent iterations aren't improving over time.
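A minimal sketch of that comparison, using np.linalg.lstsq as a stand-in for scikit-learn's internal analytical least-squares solve (the dataset here is synthetic, made up just for illustration):

```python
import numpy as np

# Hypothetical 1-feature dataset: y = 3x + 5 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 5.0 + rng.normal(0, 1, size=200)

# Analytical least-squares solve (scikit-learn's LinearRegression does the
# equivalent internally rather than running gradient descent).
A = np.column_stack([x, np.ones_like(x)])
(w_exact, b_exact), *_ = np.linalg.lstsq(A, y, rcond=None)

# Plain batch gradient descent on MSE for comparison.
w, b, lr = 0.0, 0.0, 0.01
for _ in range(5000):
    err = w * x + b - y
    w -= lr * 2 * (err * x).mean()   # dMSE/dw
    b -= lr * 2 * err.mean()         # dMSE/db

print(w_exact, b_exact)  # exact solution
print(w, b)              # gradient descent estimate: close, not identical
```

With enough iterations the two sets of coefficients end up close, but gradient descent stops at an approximation, which is one reason the plotted predictions can look slightly less accurate.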
@@Dataquestio thanks. I played further with more iterations and got a better MAE than sklearn gives. As I understand it, this doesn't matter much due to possible overfitting, right?
@@Dataquestio playing further, I implemented a floating (adaptive) learning rate and got faster convergence as well as a far better MSE :-)
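A minimal sketch of one such scheme, assuming "floating learning rate" means a decaying schedule (the 1/t-style decay and the dataset here are illustrative, not the commenter's actual code):

```python
import numpy as np

# Hypothetical dataset: y = 3x + 5 plus noise.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 5.0 + rng.normal(0, 1, size=200)

w, b = 0.0, 0.0
base_lr = 0.02
for i in range(3000):
    lr = base_lr / (1 + 0.001 * i)   # step size shrinks over time
    err = w * x + b - y
    w -= lr * 2 * (err * x).mean()
    b -= lr * 2 * err.mean()

mse = ((w * x + b - y) ** 2).mean()
print(w, b, mse)  # mse settles near the noise variance
```

Larger early steps speed up the initial descent, and the shrinking steps later reduce oscillation around the minimum, which matches the faster convergence the commenter reports.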
Exactly at 19:44, you mention that the derivative of the loss function with respect to b is the same as the loss function, but I don't think so, because the derivatives are:
d/db ((wx+b) - y)^2 = 2((wx+b) - y)
and
dL/dw = 2x((wx+b) - y)
Can anyone help me out?
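For what it's worth, those two partial derivatives are indeed different: dL/dw carries an extra factor of x from the chain rule, while dL/db does not (the inner derivative of wx+b with respect to b is just 1). A quick finite-difference check with arbitrary values:

```python
# Arbitrary values, chosen only to run the check.
w, b, x, y = 0.7, 0.2, 3.0, 1.5

loss = lambda w, b: ((w * x + b) - y) ** 2

eps = 1e-6
# Central differences approximate each partial derivative numerically.
dL_db_num = (loss(w, b + eps) - loss(w, b - eps)) / (2 * eps)
dL_dw_num = (loss(w + eps, b) - loss(w - eps, b)) / (2 * eps)

err = (w * x + b) - y          # residual: 0.7*3 + 0.2 - 1.5 = 0.8
print(dL_db_num, 2 * err)      # dL/db matches 2 * err
print(dL_dw_num, 2 * x * err)  # dL/dw matches 2 * x * err
```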
Very useful, thanks!
amazing
Thanks for the tutorial! Could you also add access to the data 'clean_weather.csv'?
You should be able to download the file here - drive.google.com/file/d/1O_uOTvMJb2FkUK7rB6lMqpPQqiAdLXNL/view?usp=share_link
@Seekersbay Learn to say that politely rather than as a command when the man's actually putting content out there for everyone. Replace your 'should' with 'could' and add a please; it changes the tone a lot, Ser…
This is the easy form of the gradient; what about when we have a more difficult cost function?
That’s your job, to make the next step.
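One common option when a cost function is hard to differentiate by hand is a numerical (finite-difference) gradient, or an automatic-differentiation library. A minimal finite-difference sketch, where the dataset and the absolute-value penalty are made up purely for illustration:

```python
import numpy as np

def numerical_gradient(f, params, eps=1e-6):
    """Central-difference gradient of a scalar function f at params."""
    grad = np.zeros_like(params, dtype=float)
    for i in range(params.size):
        step = np.zeros_like(params, dtype=float)
        step[i] = eps
        # Perturb one parameter at a time to estimate each partial derivative.
        grad[i] = (f(params + step) - f(params - step)) / (2 * eps)
    return grad

# Toy data and an awkward cost: MSE plus an absolute-value penalty on w.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.1, 5.9])

def cost(p):
    w, b = p
    return ((w * x + b - y) ** 2).mean() + 0.1 * abs(w)

g = numerical_gradient(cost, np.array([0.5, 0.0]))
print(g)  # approximate gradient, usable inside a gradient descent loop
```

The trade-off is cost: each gradient evaluation needs two function calls per parameter, so for large models autodiff (as in PyTorch or JAX) is the practical route, while finite differences remain handy for checking hand-derived gradients.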
What I need is a step-by-step business analyst guide.