I can't remember the last time I commented on a YouTube video, as it was much too long ago, but I just had to because your videos on deep learning are CRIMINALLY underrated. I have yet to find another resource that explains ResNet so intuitively, as you break down each concept into layman's terms and take your time explaining them. You have an amazing way of explaining concepts and I sincerely hope your videos get all the recognition they deserve!
This is so true!
I have seen a lot of online lectures, but yours are the best for two reasons: the way you speak is not monotonous, which gives me time to comprehend and process what you are explaining, and the second is the effort put into video editing to speed up the writing on the board, which doesn't break the flow of the lecture. Liked your video. Thanks 🙂!
Very, very good explanation. Almost all explanations of this forget about the influence of random weights on the forward propagation and focus solely on the backward gradient multiplication, which is why I never understood why you needed to feed the input forward. Thanks a lot.
ResNets are tricky to conceptualise as there are many nuances to consider. Dr Bryce, you have done a great job here offering such a brilliant explanation that is both logical and easy to follow. You definitely have a gift for explaining complex ideas. Thank you!
Thank you, professor. The best explanation; it includes the influence of random weights on forward propagation.
This is the clearest video I've ever seen explaining ResNet for a layman, while at the same time conveying all the important and relevant information about ResNet. I couldn't understand the paper, but with this video I finally understood it. Thanks a lot, Professor Bryce; I hope you create more such videos on deep learning.
I am writing a thesis on content-based image retrieval and I had to understand the ResNet architecture in depth, and by far this is the most transparent explanation ever!!
I am going to complete the entire playlist. Thanks, Bryce, you are a life saver
Love your explanation, very easy to understand the concept and the flow of the ResNet in 17 mins! Really appreciate it
Thank you for the clear and concise explanation.
Every single second of this video conveys an invaluable amount of information to properly understand these topics. Thanks a lot!
Awesome explanation. Got me through a learning hurdle that several others could not.
Great explanation, it helped me a lot. Thank you for taking the time to make this video!
The best explanation ever. Thank you professor
16 golden minutes.❤
Brilliant explanation! Thank you so much, Professor Bryce!
Wow, so clear! That was stellar, thank you!
Brilliant explanation.
That was amazing! Such a clear and concise explanation. Thanks!
Your explanation is great.
Best explanation of resnet on the internet
Thank you, Prof. Bryce, for explaining this with minimal technical complexity.
Thank you very much for putting the time and effort. This is one of the best explanations I've seen (including US uni. professors)
Thank you for such a clear explanation
You have my respect, Professor.
Thank you for your good explanation, it helped me a lot on my journey toward a deep understanding of all these mechanisms 😊
Thank you for the clear, concise, yet comprehensive explanation!
This tutorial is so clear that I can follow along as a non-native English speaker. Thanks a lot!
Very nice video!
Best explanation of ResNet I've come across so far.
Until now, this is the best Residual Network tutorial I have found. As constructive feedback, I would like you to dive more deeply into how shape mismatches are handled, because that part is not on par with the rest of the highly intuitive explanations of the various things happening in a ResNet.
Brilliant explanation, the 3D diagrams were excellent and I could understand some tricky concepts, thank you so much!
Such a great explanation. Love this!
Thank you, Professor Bryce; ResNets were brilliantly explained by you. I am looking forward to new videos on more recent deep learning architectures!
Thank you Professor! This introduction is really helpful and detailed!
So clear and well explained. Thank you!
Brilliant explanation. Thank you!
Thank you so much Mr Bryce.
Your explanations are very clear and well structured. Please never stop teaching.
Excellent class! I watched many videos before I came to this video and none explained the concept of residual networks as clearly as you did.
Greetings from México!
nice explanation, thank you very much Professor Bryce
Awesome explanation! Thanks a lot.
What an explanation
Brilliant explanation!!!
Amazing explanation. Thank you, sir.
Thanks so much! Very informative and brief explanation.
Awesome. Loved it, clear and concise!
Omg this is so helpful! Thank you so much !!!
Thanks for your video.
Great video on this, super informative.
Great explanation, congrats.
Another example of a random YouTuber with very few subscribers explaining a complex topic so brilliantly...
Thank you so much, sir.
Thank you so much for this video!
this was fantastic - thank you
thank you for the great explanation
your explanation is clear and concise! Thank you so much
Wow This explanation is amazing. So clear! I saw some videos about resNets but none of them describes what skip connections mean inside, what is their inside structure and working logic. But your explanation gives me much more. You explained the way of thinking and inside structure and advantages. Wow!
great explanation, thank you!
Thank you, sir, great explanation.
Amazing. Thanks a lot. Your explanation is so clear. Please keep making videos professor!🙏
Really Great explanation. Thanks Prof. ♥
you saved my life
AMAZING!!
thank you so so much for this video!
Awesome explanation!! Thank you for your effort :)
Thank you!!!
You are a star!
very nice! thank you!
tfw you click on a video and it's your old college professor lmao
That's awesome! Haven't seen anything by my professors other than what they shared in class, but you never know
Best Explanation
Great Explanation !!!!
Thx dude u are awesome !
you are brilliant!! Thank you for explaining this so well!!!!❤❤❤
Thank you 👏👏
Superb!
Awesome explanation
Got meaningful insights from this video.
Prof. Bryce is the GOAT!
Who is this teacher? Damn he is good. Thank you
Loss landscape looking super smooth .....
It was clear and useful. Thanks a lot.
Wow. Thank you.
This is such a clean and helpful video! Thank you very much! The only thing I still don't know: during backpropagation, do we now have two sets of gradients for each block, one for going through the layers and one for going around the layers? Then how do we know which one to use to update the weights and biases?
Good question. For any given weight (or bias), its partial derivative expresses how it affects the loss along *all* paths. That means we have to use both the around- and through-paths to calculate the gradient. Luckily, this is easy to compute because the way to combine those paths is just to add up their contributions!
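To make that concrete, here is a minimal sketch (assuming PyTorch; the scalar tensors x and w are hypothetical stand-ins, not anything from the video) showing that autograd already sums the around- and through-path contributions into a single gradient per parameter:

```python
import torch

# Illustrative stand-ins: x is the block's input, w plays the role of the block's weights.
x = torch.tensor([2.0], requires_grad=True)
w = torch.tensor([3.0], requires_grad=True)

through = w * x                 # path through the weight layer(s), i.e. F(x)
around = x                      # identity (skip) path
y = (through + around).sum()    # residual block output: F(x) + x
y.backward()

print(x.grad)  # tensor([4.]) -> w (through path) + 1 (around path), combined into one gradient
print(w.grad)  # tensor([2.]) -> only the through path touches w, so its gradient is just x
```

So there is never a choice to make between the two gradients: each weight gets exactly one update, built from the sum over all paths.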
Thanks
Couldn't understand how we can treat the shape mismatch (13:40).
Great lecture nonetheless, thank you sir !! Understood what Residual Networks are 🙏
great!
Thanks for the nice explanation.
But I have one query: at 16:00, where you said "each output neuron gets input from every neuron across the depth of the previous layer", doesn't that make every neuron along the output depth the same?
@csprof, By consistently including the original information alongside the features obtained from each residual block, are we inadvertently constraining our ResNet model to closely adhere to the input data, possibly leading to a form of over-memorization?
10:10
Concerns: shape mismatch
*nervous sweating*
Thank you very much. I am not sure yet how residual blocks lead to faster gradient passing when the gradient has to go through both paths. As I understand it, this adds more overhead to computing the gradient. Please correct me if I am wrong. Also, can you please explain more about how 1x1 convolution reduces the depth, or make a video if possible? For example, I am not sure how an entire depth of, say, 255 gives output to one neuron.
You're right that the residual connections mean more-complicated gradient calculations, which are therefore slower to compute for one pass. The sense in which it's faster is that it takes fewer training iterations for the network to learn something useful, because each update is more informative. Another way to think about it is that the function you're trying to learn with a residual architecture is simpler, so your random starting point is a lot more likely to be in a place where gradient descent can make rapid downhill progress.
For the second part of your question, whenever we have 2D convolutions applied to a 3D tensor (whether the third dimension is color channels in the initial image, or different outputs from a preceding convolutional layer) we generally have a connection from *every* input along that third dimension to each of the neurons. If you do 1x1 convolution, each neuron gets input from a 1x1 patch in the first two dimensions, so the *only* thing it's doing is computing some function over all the third-dimension inputs. And then by choosing how many output channels you want, you can change the size on that dimension. For example, say that you have a 20x20x3 image. If you use 1x1 convolution with 8 output channels, then each neuron will get input from a 1x1x3 sub-image, but you'll have 8 different functions computed on that same patch, resulting in a 20x20x8 output.
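If it helps, here is a minimal sketch of that last example (assuming PyTorch; the variable names are mine): a 1x1 convolution applied to a 20x20x3 image with 8 output channels.

```python
import torch
import torch.nn as nn

# One 20x20 RGB image in PyTorch's (batch, channels, height, width) layout.
image = torch.randn(1, 3, 20, 20)

# kernel_size=1: each output neuron sees only a 1x1 spatial patch,
# so it is purely a learned function across the 3 input channels.
pointwise = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=1)

out = pointwise(image)
print(out.shape)  # torch.Size([1, 8, 20, 20]) -- 8 different functions of the same 1x1x3 patch

# Choosing out_channels smaller than in_channels is how a 1x1 convolution
# reduces the depth, e.g. nn.Conv2d(256, 64, kernel_size=1) in a bottleneck block.
```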
Isn't this similar to RNNs, where subsets of data are used for each epoch, and in a residual network a block of layers is injected with a fresh signal, much like boosting?
Can you please talk about GANs and, if possible, Stable Diffusion?
I am still confused.
What are the prerequisites for understanding this video?
I would recommend that you read the original paper, "Deep Residual Learning for Image Recognition". I found the explanations there pretty clear, plus there are videos on YouTube explaining the paper.
Do you mean that ResNet is just a skip connection, not an individual network?
👍
Speaking about the error value and calling it a loss value, using that term out of its original context, makes this confusing to the new learner...
Free books
Anna university
great explanation, simple and straightforward.
Brilliant explanation. Thank you!