As a lecturer myself, I'm blown away by the quality of the slides.
True that, my friend. He knows how to create slides!
As a computer science graduate student, it's unbelievable how simply Alexander breaks down these concepts; I'm getting them from a fresh perspective.
This is probably the best explanation of CNN that I have been through yet, it cleared almost all of my queries.
Congratulations! Very well organized lecture. High standard and easy to follow at the same time. I am a 68-year-old mechanical engineer with average capabilities, but I enjoyed every minute watching and listening to your presentation. Well done! Thank you very much indeed.
I think that's why it's MIT. Usually I think of a lecturer as a scholar, someone with knowledge but not necessarily a good presenter; one of my lecturers spends most of class time talking about how good he is instead of teaching.
@@shacklemanwarts1305 and just like that, he's gonna turn 69
Menhhh! This is arguably the best video on convolutional neural networks!!!
Amazing lecture! The best on CNNs. Probably the only lecture I've seen that made clear why each operation is done and how it influences the outcome. Thanks a ton. Much needed.
I never really understood convolution, but now I finally do! Thanks for this awesome lecture.
Wow, that explanation with the X in the matrix of 1s and -1s, and how we look for the small sub-features and compose a bigger picture out of them, was awesome.
19:02 Feature Extraction
19:50 CNNs for classification
You cannot get a better lecture on CNNs. Thank you, MIT.
This lecture is so clear and powerful. I simply love it. Respect
Just going to say... it was mind-blowing... hats off.
By far one of the very best lectures of Computer vision.
Handsome guy + Tech = Wholesome! Love from India!!!!!!!!!!
Hello Alexander, your presentation is very clean. I've seen many videos on CNNs, and your presentation is actually very "clever" in the sense that you explain precisely what is problematic and why we solve issues one way or another. Pedagogically I admire your work, congrats.
A simple way to teach, with awesome presentation content. Thank you for your hard work. Happy coding!!
Best lecture on CNNs that I have seen. Thanks, mate!
Awesome. The best explanation I have ever seen. Congrats!
Awesome content and such a clear explanation. Thank you very much for sharing these lectures with us.
The need for convolution is beautifully illustrated in the image at 13:28.
This was amazing. Thank you Alexander!
Excellent exposition! The illustration of filters has helped me better understand the application of weights to learning. Please can you explain more about activation functions and their role with an example.
Clear and concise video on CNNs. No need to check another one on this topic. I am a student working on plant disease detection and this will be very useful. Thanks!!
Wonderful, loved the explanations!
CNNs always intimidated me! This video made it really clear. 😀😀
I was first introduced to CNNs yesterday and I was super overwhelmed, especially because I was introduced to them with a pre-trained image classification model instead of breaking it down piece by piece like this. Now I feel way better about it.
Amazing lecture, thank you so much!
I know I will watch this several times over.
This is actually a better lecture than Stafford's. I hope the other lectures are available as well.
Very useful video to start!
Thank you for sharing a great lecture!
It helped me so much
Amazing lecture , thank you for sharing
Thanks. Very nicely explained.
Revisited all the basics. Great content!
Great presentation, but those bright blue slides with white text gave me a tiny shock every time they came up because I thought I was looking at a bluescreen.
Was that intentional to keep people focused?
waiting anxiously :)
Amazing. Thank you Alexander.
Awesome content. Very intuitive explanations.
Awesome video! 20:28 doesn't pooling already bring in non-linearity?
Taking rectify for example: rectify(x) = max(0, x) leads to non-linearity, just like max pooling. Right?
Why exactly do we use pooling, by the way?
Reasons I (think I) found in Goodfellow et al. (rough sketch below):
* whole regions are collapsed into a single summary statistic, so the network is more robust to noise
* obviously we store less data and thus (most likely) need fewer parameters in the next layer (less time complexity)
Are there any additional reasons?
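A quick NumPy sketch of that point (my own illustration, not from the lecture): 2x2 max pooling collapses each region into a single summary statistic, shrinking the feature map and making it less sensitive to small shifts or noise inside a region.

```python
import numpy as np

def max_pool_2x2(x):
    """Max-pool a (H, W) feature map with a 2x2 window and stride 2."""
    h, w = x.shape
    # Group the map into 2x2 blocks and take the max of each block.
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

feature_map = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
])
print(max_pool_2x2(feature_map))
# [[4 2]
#  [2 8]]  -> 4x fewer values feeding the next layer
```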
Great lecture, and thank you for sharing your knowledge!
Can't wait to see this😋😋
Thank you, sir, good explanation of convolution.
Thanks, Alexander and MIT.
Omg 😍mind blowing content
Feels like waiting in the dining room... ready to have a big meal :)))
what
@@ananyab4883 That was about the timing. We had been waiting for the class video to be uploaded :)
Pretty awesome that human brains can, in a split second, KNOW that the 16x12 greyscale image was of Abraham Lincoln... shame he didn't actually remark on that or invite the audience to guess who it was or anything...
Great point! The fact that I took this for granted demonstrates your point even further. But indeed, this is awesome :)
@@AAmini It is, given there are only 192 pixels there, humans know THOUSANDS of faces, and it's an instant process, not something we consciously reason about or iterate over. We definitely have Abraham Lincoln neurons in our brains XD
Great content and an even better explanation.
yo this lecture is awesome
This is just amazing. The slides and the presentation are of such a high quality! As a complete beginner in the field, it's truly helpful.
Great lecture! Thanks for sharing
Great lesson, teacher, thank you a lot.
Excellent lecture, so intuitive and easy to understand! 👍
Wait, where can I see the labs? I want to practice too
Great content make more videos like this 👍👍
One doubt: at minute 16:20, the filter has some numbers. Where do you get these numbers from?
This is one epic lecture on CNNs.
Why would you lose spatial information when flattening into a fully connected layer? I understand that the number of parameters needed to retain spatial information when using fully connected layers is massive, but spatial information is not lost when you flatten the data. The indices still relate to a region of the image.
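To make that point concrete, here's a tiny NumPy sketch (my own, not from the lecture): after flattening, index i of the flat vector still maps back to pixel (i // W, i % W), so the positional information is recoverable; what a plain dense layer lacks is any built-in notion of neighborhood, not the indices themselves.

```python
import numpy as np

H, W = 3, 4
img = np.arange(H * W).reshape(H, W)   # toy "image"
flat = img.reshape(-1)                 # what flattening produces

i = 7                                  # any flat index...
print(flat[i], img[i // W, i % W])     # ...maps back to the same pixel: 7 7
```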
waiting for the new updates!
I love you, Alexander! Thanks for sharing these videos with us!
Amazing!
thanks sir
Great video
awesome, thanks a lot!
I have a question. How exactly does the second convolutional layer interface with the output of the first conv layer? Or, how is the filter of a layer-2 node slid across the planes that make up the output of the first conv layer?
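Not an official answer, but here's a minimal Keras-style sketch (my own; the layer sizes are just for illustration) of the usual mechanics: each filter in the second conv layer has depth equal to the number of feature maps produced by the first layer, so it slides spatially while reading all of those planes at once.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(3, kernel_size=3, activation="relu"),  # output volume: 26x26x3
    tf.keras.layers.Conv2D(8, kernel_size=3, activation="relu"),  # each filter here is 3x3x3
])
model.summary()
# Second conv layer parameters: 8 * (3*3*3 weights + 1 bias) = 224
```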
Awesome!
What is it about the negative numbers that makes the process want to reject them via the ReLU function? What is the reasoning behind rejecting those values?
Great, it will help my project work, thank you.
Very informative. Thanks!
Thank you!
ahhhh, can't wait to see this
God bless you ...
How can we choose the number of filters in a CNN?
by the way, thanks for the extraordinary explanation
15:49 How do we get that nine? When we perform the multiplication, it is not 9...
The '9' is obtained when you add up all the elements of the elementwise-product matrix [[1, 1, 1], [1, 1, 1], [1, 1, 1]].
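A tiny NumPy sketch of that single step (my own; the exact patch values are just illustrative): multiply the patch and the filter elementwise, which gives the all-ones matrix when they match, then sum everything to get 9.

```python
import numpy as np

patch  = np.array([[ 1, -1,  1], [-1,  1, -1], [ 1, -1,  1]])  # 3x3 image patch (illustrative)
kernel = np.array([[ 1, -1,  1], [-1,  1, -1], [ 1, -1,  1]])  # filter identical to the patch

products = patch * kernel        # [[1 1 1] [1 1 1] [1 1 1]] since every product is 1
print(products.sum())            # 9
```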
Great video!
Neural Network: "Red light stop, green light go, yellow light go very fast."
21:04
20:43
good one
Can you please clarify whether the ReLU activation is applied individually to the outcome of every convolution operation for every filter? For example, if a 28x28 image is transformed into a 26x26x3 convolution layer (with three separate 3x3 filters), is the ReLU activation applied independently 2028 (26*26*3) times?
Yes, for each W*x it's applied independently.
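A small NumPy sketch of that answer (my own illustration): ReLU is applied elementwise to the whole output volume, i.e. one independent max(0, x) per value, 26*26*3 = 2028 of them in this example.

```python
import numpy as np

conv_output = np.random.randn(26, 26, 3)     # stand-in for the three filters' outputs
activated = np.maximum(0, conv_output)       # elementwise ReLU: 2028 independent max(0, x) ops
print(activated.shape, int((conv_output < 0).sum()), "values were set to zero")
```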
nice
I LOVE IT
Awesome lecture! Thanks too, hopefully I can learn at MIT someday
Question: Are the filter weights shared between all next-layer neurons, or does each of them have a different weight set?
I tend to think that the second option is the right one, but I'm not sure.
23:30 still not clear?
Can't wait to see this😋😋
Great video!
great lecture, and "Khaste Nabashi" ("may you not be tired" in Persian) :)
Thanks
Yeah!
Why does he keep saying connected to the hidden layer? Aren't the flattened images connected to the input layer, not the hidden layers?
The input layer for the dense neural network here is referred to as a hidden layer, maybe since it's not actually taking the raw input and has already started applying weights right in that layer.
Very intuitive content! Thanks!
Lol. Waiting for the next great course!
Continuous label haha that was funny
Can anyone explain why there is an additional dense layer in the classification head?
I mean, the architecture for classification is composed of: Flatten -> Dense (w/ ReLU) -> Dense (w/ softmax).
Why don't we just go: Flatten -> Dense (w/ softmax)?
27:15 Listen to that slide again, especially at 27:46. The CNN extracts features from your image, but depending on the task, the use/importance of each feature is going to be different. That's the role of the dense layers in-between.
@@hoaqyn8544 thanks I've got the concept!
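For anyone who wants to compare the two options in code, here's a minimal Keras-style sketch (my own; the 128 units and 10 classes are just placeholder choices): the extra Dense layer with ReLU lets the network recombine the extracted features non-linearly before the final softmax, whereas Flatten -> Dense(softmax) is only a linear map on the flattened features.

```python
import tensorflow as tf

# Option A: go straight from flattened features to class scores (a linear classifier).
head_shallow = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Option B: insert a hidden Dense(ReLU) layer that re-weights and mixes the
# extracted features non-linearly before classification.
head_deeper = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
```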
I was trying to understand your face recognition model and came across this line in the code: "loader = mdl.lab2.TrainingDatasetLoader(path_to_training_data)". What does this line do exactly? I see that it calls a package created by you guys, but what is it doing with the images?
This line simply loads the training images into memory. The actual code to feed the images through the model, train, test, visualize, etc. comes later in the code. If you're interested, we open-sourced the entire MIT Deep Learning (mdl) package so you can see exactly what the internal calls do: github.com/aamini/introtodeeplearning/tree/master/mitdeeplearning
Great streamlined content
Finally. Letsss gooo
Learning how to do MNIST 🤖
The "convolution" component of CNNs is nothing more than a performance optimization.
💚💛❤
With such lecture quality I feel you could send a donkey to MIT and it would come out a scientist.
5:47
18:35