Gradient Boosting : Data Science's Silver Bullet

ritvikmath

72K views

Published: Dec 23, 2024

Comments • 87

@ew6392 2 years ago ⁺⁵⁵
Man I've discovered your channel and am watching your videos non-stop. No matter which topic, it is ALL as if a stream of light shines and makes it all understandable. You've got a gift.
@zhenwang5872 1 year ago
Agreed! You've got a gift to shine the light over topics.
@sELFhATINGiNDIAN 7 months ago
No
@KameshwarChoppella 7 months ago ⁺⁶
Non math person here and even i could understand this tutorial. Probably have to see it a couple more times because I'm a bit slow in my 40s now. But you really have a gift. Keep up the good work.
@shnibbydwhale 3 years ago ⁺¹¹
You always make your content so easy to understand. Just the right amount of math mixed with simple examples that clearly illustrate the main ideas of whatever topic you are talking about. Keep up the great work!
@mrirror2277 3 years ago ⁺⁹
Hey thanks a lot, was literally just searching about Gradient Boosting today and your explanations have always been great. Good pacing and explanations even with some math involved.
@nikhildharap4514 2 years ago ⁺¹
you are awesome man! I just love coming back to your videos every time. they are just the right length, and the perfect depth.. Kudos!
@jiangjason5432 1 year ago ⁺⁵
Great video! A bonus for using squared error loss (which is commonly used) as the loss function for regression problems: the gradient of squared error loss is just the residual! So each weak learner is essentially trained on the previous residual, which makes sense intuitively. (I think that's why each gradient is called "r"?)
@samirkhan6195 3 months ago
Yeah, squared error is easier to differentiate than alternatives like root squared error, and unlike mean squared error or root mean squared error it does not depend on the number of observations. If you want the gradient to exactly equal the residual, you can take (1/2)(squared error) as the loss function.
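A quick numerical check of the point made in this thread, offered as a sketch rather than anything taken from the video: for the loss (1/2)(y - ŷ)², the negative derivative with respect to the prediction ŷ equals the residual y - ŷ, while for plain squared error it is twice the residual. The function names and the sample values below are assumptions chosen for illustration.

```python
# Illustrative check; names and numbers are assumptions, not from the video.

def squared_error(y, y_hat):
    return (y - y_hat) ** 2

def half_squared_error(y, y_hat):
    return 0.5 * (y - y_hat) ** 2

def negative_gradient(loss, y, y_hat, eps=1e-6):
    """Negative derivative of the loss w.r.t. the prediction (central difference)."""
    return -(loss(y, y_hat + eps) - loss(y, y_hat - eps)) / (2 * eps)

y, y_hat = 3.0, 2.2
residual = y - y_hat

neg_grad_half = negative_gradient(half_squared_error, y, y_hat)
neg_grad_full = negative_gradient(squared_error, y, y_hat)

# The first two values should agree; the third should be about twice as large.
print(residual, neg_grad_half, neg_grad_full)
```

Note the sign: it is the negative gradient that matches the residual, which is why each new weak learner can be trained on residuals when the loss is squared error.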
@hameddadgour 9 months ago ⁺¹
This is a fantastic video. Thank you for sharing!
@ritvikmath 9 months ago
Glad you enjoyed it!
@soroushesfahanian5625 6 months ago ⁺⁶
The last part of 'Why does it work?' made all the difference.
@javierperezvargas9132 6 months ago ⁺¹
totally agree
@АннаПомыткина-и8ш 3 years ago ⁺¹
Your videos on data science are awesome! They help me to prepare for my university exam a lot. Thank you very much!
@arjungoud3450 2 years ago
Man U r the 5th person, none has explained as simple and clear as you, thanks a ton
@MiK98Mx 2 years ago ⁺²
incredible video, you make understandable a really hard concept. Keep teaching like this and big things will come!
@alicedennieau5459 2 years ago
Completely agree, you are changing our lives! Cheers!
@honeyBadger582 3 years ago ⁺⁴
Great video as always! I would love it if you could build on this video and talk about XGBoost and the math behind it next!
@pgbpro20 3 years ago
I worked on this 5(?) years ago, but needed a reminder - thanks!
@jakobforslin6301 2 years ago
You're an amazing teacher, thanks a lot from Sweden!
@luismikalim2535 1 year ago
Thanks for the effort u put in to help ur watchers understand, it really helped me understand the concept behind gradient descent!
@marcosrodriguez2496 3 years ago ⁺⁵
your channel is criminally underrated. Just one question. You mentioned using linear weak learners, i.e. f(x) is a linear function of x. In this case how would you ever get anything other than a linear function after any number of iterations? at the end of the day, you are just adding multiple linear functions. it seems this whole procedure would only make sense, if you pick a nonlinear weak learner.
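The observation in this comment can be checked with a small sketch. Assuming squared-error loss and a least-squares line as the weak learner (the helper `fit_line`, the toy data, and the learning rate are all illustrative assumptions), the boosted ensemble stays a single line: its slope and intercept are just running sums, and it converges toward the ordinary least-squares fit without capturing any curvature.

```python
# Sketch under stated assumptions: squared-error loss, linear weak learners.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Toy data with curvature that no single line can capture.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x for x in xs]

learning_rate = 0.5
total_slope, total_intercept = 0.0, 0.0

for _ in range(20):
    # Residuals of the current ensemble, which is itself just one line.
    residuals = [y - (total_slope * x + total_intercept) for x, y in zip(xs, ys)]
    slope, intercept = fit_line(xs, residuals)
    # Adding a scaled line to a line gives another line.
    total_slope += learning_rate * slope
    total_intercept += learning_rate * intercept

ols_slope, ols_intercept = fit_line(xs, ys)
print(total_slope, total_intercept, ols_slope, ols_intercept)
```

This is one reason nonlinear weak learners such as shallow decision trees are the usual choice in practice.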
@grogu808 1 year ago
Unbelievable variety of topics in this channel! What is your daily job? You have an amazing amount of knowledge
@Andres186000 3 years ago ⁺¹
Thanks for the video, also really like the whiteboard format
@Sanatos98 2 years ago
Pls don't stop making these videos
@adityamohan7372 10 months ago
Finally understood it really well, thanks!
@Ranshin077 3 years ago ⁺¹
Very awesome, thanks for the explanation 👍
@chau8719 14 days ago
Waw thank you so much for this amazingly clear video explanation 🤗!!! Instantly subscribed :)
@Halo-uz9nd 3 years ago
Phenomenal. Thank you again for making these videos
@jonerikkemiwarghed7652 2 years ago
You are doing a great job, really enjoying your videos.
@MiladDana-b7h 5 months ago
that was very clear and useful, thank you
@markus_park 1 year ago
Thank you so much! You just blew my mind
@ritvikmath 1 year ago
You're very welcome!
@joachimheirbrant1559 1 year ago ⁺¹
thanks man you explain it so much better than my uni professor :)
@ritvikmath 1 year ago ⁺¹
Glad to hear that!
@rajrehman9812 2 years ago ⁺⁴
Can mathematics behind ML be less dreadful and more fun? Well yes, if we have a tutor like him... amazing explanation ❤️
@garrettosborne4364 2 years ago
Best boosting definition yet.
@dialup56k 1 year ago
well done - gee there is something to be said about a good explanation and a whiteboard. Fantastic explanation.
@ritvikmath 1 year ago
Thanks!
@benjaminwilson1345 1 year ago
Perfect, really well done!
@ritvikmath 1 year ago
Thanks!
@domr.2694 2 years ago
Thank you for this good explanation.
@kaustabhchakraborty4721 1 year ago ⁺¹
Just asking: is the concept of gradient boosting similar to a Taylor series? Each term alone is not very good at approximating the function, but as you add more terms, the approximation gets better.
@Sam-uy1db 11 months ago
So so well explained
@Matt_Kumar 3 years ago ⁺²
Any chance you're interested in doing an intro video on the EM algorithm with a toy example? Love your videos, please keep them coming!
@GodeyAmp 7 months ago
Great video brother.
@estebanortega3895 2 years ago
Amazing video. Thanks.
@parthicle 2 years ago
You're the man. Thank you!
@Artem_Vashina 17 days ago
Hi! Why do we use f2(x) instead of the raw r1_hat? I mean, why predict the residuals and use those predictions if we already have the exact value of the gradient?
@sohailhosseini2266 1 year ago
Thanks for sharing!
@sophia17965 1 year ago
Thanks! great videos.
@ИльдарАлтынбаев-г1ь 8 months ago
Man, you are amazing!
@KevinGodfreyVerpula 1 month ago
One question: in Step 3, is your target variable the gradient with respect to the previous prediction? If so, don't you think there is a possibility of it becoming infinite, so that we try to fit something to infinity?
@jamolbahromov4440 2 years ago
Hi, thank you for this informative video. I have some trouble understanding the graph at 5:27. How do you map out the curve on the graph if you have a single pair of prediction and loss function values? Do you create some mesh out of the given pair?
@jeroenritmeester73 2 years ago ⁺¹
In words, is it correct to phrase Gradient Boosting as being multiple regression models combined, where each subsequent model aims to correct the error that the previous models couldn't account for?
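That phrasing matches the usual picture for squared-error loss, and it can be sketched in a few lines. The code below is a minimal illustration, not the video's implementation: the weak learner is a one-split decision stump, and the helper names (`fit_stump`, `boost`, `predict`), the toy data, and the hyperparameters are all assumptions.

```python
# Minimal sketch of boosting for squared-error loss with decision stumps.

def fit_stump(xs, targets):
    """Best single-threshold split: returns (threshold, left_value, right_value)."""
    best = None
    unique_x = sorted(set(xs))
    for lo, hi in zip(unique_x, unique_x[1:]):
        threshold = (lo + hi) / 2
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_value = sum(left) / len(left)
        right_value = sum(right) / len(right)
        sse = (sum((t - left_value) ** 2 for t in left)
               + sum((t - right_value) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    return best[1:]

def stump_predict(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value

def boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # errors left so far
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps

def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]

model = boost(xs, ys)
baseline_mse = mse(ys, [sum(ys) / len(ys)] * len(ys))
final_mse = mse(ys, [predict(model, x) for x in xs])
print(baseline_mse, final_mse)
```

Each stump only sees what the earlier ones left unexplained, so the training error of the combined model shrinks round by round.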
@ianclark6730 3 years ago
Love the videos! Great topic
@EW-mb1ih 3 years ago
let's talk about the first word in gradient boosting..... boosting :D Nice video as always
@zAngus 9 months ago
Thumbs up for the pen catch recovery at the start.
@ritvikmath 9 months ago
😂
@bassoonatic777 3 years ago
Excellently explained. I was just reviewing this, and it was very helpful to see how someone else thinks through it.
@user-xi5by4gr7k 3 years ago ⁺¹
Great video! Never seen gradient descent used with the derivative of the loss function with respect to the prediction. Not sure if I understand it 100% but If the gradient were, for example, -1 for ri, would the subsequent weak learner fit a model to -1? Or would the new weak learner fit a model to (old pred -(Learning Rate * gradient))? Would love to see a simple example worked out for 1 or 2 iterations if possible. Thank you! :)
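A small trace of two iterations may help with this question. In the standard formulation, the new weak learner is fit to the negative gradient (so a gradient of -1 gives a target of +1), and the ensemble prediction is then updated as old prediction plus learning rate times the learner's output. The sketch below assumes the loss (1/2)(y - ŷ)² and, purely to keep the arithmetic visible, a hypothetical weak learner that reproduces its targets exactly on the training points; the data and learning rate are made up.

```python
# Two hand-traceable boosting rounds; all values are illustrative assumptions.

y = [3.0, 5.0]
learning_rate = 0.5

# Round 0: start from the mean of the targets.
preds = [sum(y) / len(y)] * len(y)  # [4.0, 4.0]

for round_no in (1, 2):
    # Gradient of 0.5 * (y - p)^2 with respect to the prediction p is (p - y).
    gradients = [p - yi for yi, p in zip(y, preds)]
    # The weak learner is trained on the NEGATIVE gradient (pseudo-residuals).
    targets = [-g for g in gradients]
    # Simplifying assumption: the learner fits its targets perfectly.
    learner_output = targets
    # Ensemble update: old prediction + learning_rate * learner output.
    preds = [p + learning_rate * h for p, h in zip(preds, learner_output)]
    print(round_no, gradients, targets, preds)

# Round 1: gradients [1.0, -1.0], targets [-1.0, 1.0], preds [3.5, 4.5]
# Round 2: gradients [0.5, -0.5], targets [-0.5, 0.5], preds [3.25, 4.75]
```

With a real weak learner the outputs would only approximate the targets, but the roles of gradient, target, and update stay the same.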
@7vrda7 2 years ago
great vid!
@ganzmit 4 months ago
nice video series
@tehgankerer 2 months ago
It's not super clear to me how or where the learning rate comes into play here, and what its relation to the scaling factor gamma is.
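One common way the two quantities relate, sketched here as an assumption since the video's exact notation is not reproduced in this thread: gamma is the step size chosen by a line search to minimize the loss along the new learner's direction, and the learning rate is an additional fixed shrinkage factor multiplied on top, so the update is F_new = F_old + learning_rate * gamma * h(x). For squared-error loss the line search has a closed form. The helper `best_gamma` and the numbers are hypothetical.

```python
# Sketch: line-search gamma for squared error, then shrink by the learning rate.

def best_gamma(residuals, learner_outputs):
    """Gamma minimizing sum((r - gamma * h)^2): gamma = sum(r*h) / sum(h*h)."""
    numerator = sum(r * h for r, h in zip(residuals, learner_outputs))
    denominator = sum(h * h for h in learner_outputs)
    return numerator / denominator

residuals = [1.0, -1.0, 2.0]        # current errors of the ensemble
learner_outputs = [0.5, -0.5, 1.0]  # what the new weak learner predicts

gamma = best_gamma(residuals, learner_outputs)  # 3.0 / 1.5 = 2.0
learning_rate = 0.1                             # fixed shrinkage, set by the user

# Amount actually added to each prediction this round.
step = [learning_rate * gamma * h for h in learner_outputs]
print(gamma, step)
```

Gamma is recomputed every round from the data, while the learning rate stays constant and deliberately takes a smaller step than the line search suggests.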
@arjungoud3450 2 years ago
Can you please make a video on XGBoost and its advantages by comparing. Thank you.
@VictorianoOchoa 1 year ago
are the initial weak learners randomly selected? If so, can this initial random selection be optimized?
@chocolateymenta 2 years ago
great video
@mitsuinormal 1 year ago
Yeiii you are the best !!
@xmanxman1527 1 year ago
Isn't gradient the partial derivative with respect to feature(xi), not with respect to the prediction(y^)?
@chiemekachinaka5236 3 months ago
Thanks man
@rickharold7884 3 years ago ⁺¹
Hmmmm v interesting. Something to think about. Thx
@m.badreddine9466 11 months ago
move on so I can get screenshot 😂.
brilliant explanation ,well done
@ashutoshpanigrahy7326 2 years ago
after 4 hrs. of searching in vain, this has truly proven to be a savior!
@emirhandemir3872 5 months ago
The first time I watched this video, I understood shit! Now the second time, I studied the subject and learn more :), it is much more clear now :)
@lashlarue7924 1 year ago
Bro, it's late AF and I'm not gonna lie, I'm passing out now, but I'mma DEFINITELY catch this shit tomorrow. 👍
@ritvikmath 1 year ago
😂 come back anytime
@lashlarue7924 7 months ago
@ritvikmath Well, it's been a year, but I came back! 😂
@adinsolomon1626 3 years ago
Learners together strong
@gayathrigirishnair7405 1 year ago
Come to think of it, concepts from gradient boosting apply perfectly to less mathematical aspects of life too. Just take a tiny step in the right direction and repeat!
@ritvikmath 1 year ago
yes love when math reflects life!
@saravankumargowthamv9338 1 year ago
Very good content, but it would be great if you could stay at the corner so we can look at the board and follow along. Otherwise, great session.
@ritvikmath 1 year ago
Thanks for the suggestion !
@regularviewer1682 2 years ago
Honestly, StatQuest has a much better way of explaining this. First he explains the logic by means of an example and then he explains the algebra afterwards. I'd recommend his videos on gradient boosting for anyone who didn't understand this. Without having seen his videos on it I would have been unable to understand the algebra.
@SimplyAndy 2 years ago
Ripped...
@sharmakartikeya 3 years ago
Hello Ritvik, are you on LinkedIn? Would love to connect with you!
@NedaJalali-tz7vw 2 years ago
That was amazing. Thanks a lot.