Convolution Neural Networks - EXPLAINED

CodeEmporium

140K views

Published: Dec 1, 2024

Comments •

@HafeezUllah 3 years ago ⁺¹
I had no idea about CNNs at all; this was great and has given me immense confidence in learning about CNNs. Great video. Explained beautifully from scratch to end.
@taihatranduc8613 4 years ago ⁺⁴
You made me realize there are indeed other YouTubers who "don't really know much about" what they're saying (0:17). You explain it the best way on YouTube, especially the structure of the CNN.
@prodbreeze 5 months ago ⁺¹
YOU HAVE MADE ME ACTUALLY LIKE ML DL for the first time
@darasingh8937 3 years ago ⁺¹
Thanks a lot for not having a superficial touch of the topic. Keep it up!
@sneha_more 2 years ago
The way you explained made me feel like I didn't know so much about CNN. I wonder when you read so many papers. Thanks for sharing your knowledge. Helps a lot.
@ozancanacar8237 6 years ago ⁺³¹
Thank you so much! Everyone just explains like: "So this is convolution and that generates these numbers and these are our feature cubes and you apply pooling and get that... let's jump into the Python code I wrote in 5 weeks but imma explain in 15 seconds". You've explained all these concepts clearly and one by one. Can you make a video about training the CNN? It would be awesome.
@SuperMaDBrothers 2 years ago
5 weeks? Nah bro they're not as dumb as you are lol. But seriously code is a shit way of explaining something. You should check out lectures from universities though, this video was pretty shit too
@neillunavat 4 years ago ⁺¹
You explain better than well established organizations boi!! Keep it up.
@insidiousmaximus 3 years ago ⁺⁴
Mate, I have been working as a junior AI engineer for over a year now and I have successfully deployed custom-built CNNs on NVIDIA hardware, but I am still learning from your videos! Just discovered and watched them all back to back. Best videos I have found and I watch a hell of a lot of videos on this topic! I have also read some hardcore books on it. Your videos are par excellence, please keep making them! Would love to see some practical examples. There are many tutorials on how things like segmentation and superpixels WORK but nobody wants to show us how to actually implement them into a custom network and display the results, i.e. detect flame or smoke. When it comes to practical solutions nobody really goes beneath the provided API examples! Very frustrating.
@abhilasht6471 5 years ago ⁺¹⁴
Thank you so much for an amazing video. Even after going through several videos I did not get the concept clear; after this video all of my doubts are clear.
please make hands-on tutorials it's a humble request, hope to see you soon
small correction @16:40 calculation of 12.5, not 13.5 == (26-2+1)/2 = 12.5
@JohnUsp 4 years ago ⁺⁹
17:00 - From 13x13x32 to conv3x3,64. How is the volume/depth of 32 handled? I understand the result of 11x11x64 (filters), but are those 32 layers summed/packed and sent to conv3x3x64?
@thomasmarsden1870 3 years ago ⁺³
lmao I have the same question. pretty sure there are 64, 3*3*32 filters.
@xxdxma6700 2 years ago
Such an amazing video man. The best educational I have watched in a while
@swedenontwowheels 1 year ago
very well explained! good job! thank you so much for putting the effort in this video!
@CodeEmporium 1 year ago
Thanks so much!
@anemoiacApache 5 years ago ⁺¹¹
Should've found this a month ago before i proceeded to try and learn this on the fly and just embarrassed myself in front of my department
@abdulcustom 3 years ago ⁺¹²
This is a great video. I have one small doubt. @17:11 How do you apply 64 kernels on 32 response maps and get 64 response maps in the next layer?
@gentix8564 2 years ago ⁺²
Remember the depth of each filter is 32. So actually, you apply 64 3*3*32 filters, which is why the output depth is 64.
@npip99 2 years ago
Thank you for this question! Wondering the same thing!
@npip99 2 years ago
Ah thank you, so each takes the 3x3 over all of the previous filters.
@ttb1513 1 year ago
17:27 Out.width = 13 - 2 + 1 = 11. Something is wrong here, as 13-2+1 is 12.
@TheRealJackfrog 4 years ago
Well done! Your voice and method left me wanting a more detailed explanation from you.
@TheRealJackfrog 4 years ago
Maybe you could give that explanation over a cup of hot chocolate by the fire as we cuddle up, listening to the latest episode of the Lex Fridman Podcast together. We laugh as Lex goes off on some profound tangent about how the human mind is hard to understand. "That's not the only thing that's hard" I think to myself, as you spoon me ever so gently. It's a perfect night. Just you and me, by the fire, as the sky darkens outside the cabin windows. I know that you could never leave me wanting more...
Sorry, I got a paper due in 9 days that I don't want to write.
@Bilangumus 1 year ago
Still relevant today, thanks.
@Geoters 6 years ago ⁺¹⁰
Sorry, one point is not clear. After the first convolution (and maxpool) we end up with 13x13x32. Then conv3x3,64 is applied. How does it work? We had 32 layers (feature maps). If we apply conv3x3,64 to each layer we would end up with 32x64 layers. But we end up with only 64 layers. Thanks
@CodeEmporium 6 years ago ⁺¹
When we have a 13×13×32 volume, and apply one filter of 3×3×32, then we get an 11×11 feature map. So if we apply 64 such filters to the 13×13×32 volume, we end up with 64 such 11×11 feature maps. In other words, an output of 11 × 11 × 64
@Geoters 6 years ago
Sorry, allow me to rephrase the question. At 4:50 you apply the conv filter 3x3x1 to a 5x5x1 image. Basically just weighting and adding pixels that fit into the 3x3 square. How would you apply a 3x3x1 filter to a 5x5x2 image (2 layers of 5x5x1)? By weighting and adding pixels from both layers?
@CodeEmporium 6 years ago
Depth of the filter and the input should be the SAME. A 3 x 3 x 1 filter convolves with a 5 x 5 x 1 image as they have the same depth (1). But in the case of 5 x 5 x 2, we NEED to apply a filter of shape 3 x 3 x 2. A 3 x 3 x 1 filter will only convolve with one of the 5 x 5 x 1 layers. We don't take the average of both layers as they represent different data. Hope that makes sense.
@Geoters 6 years ago
15:35. After the first convolution and pooling we end up with 13x13x32. So how do we apply convolution 3x3x64 to it? We have 32 layers of a 13x13 grid. So now we apply the 3x3 convolution filter 64 times and end up with 64 layers. How do we do it since we have 32 layers in the source?
@CodeEmporium 6 years ago ⁺³
We don't apply convolution with a 3 x 3 x 64 filter. We apply convolution for 64 filters of shape 3 x 3 x 32, each with the input 13 x 13 x 32. The result of each convolution will be an 11 x 11 output. Since we have 64 such convolution operations, we end up with 11 x 11 x 64. Just note the OUTPUT depth is equal to the number of filters chosen for convolution. And the depth of the filter is equal to the depth of the INPUT.
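The shapes in this reply can be checked with a minimal NumPy sketch. The function name `conv2d_valid`, the random inputs, and the plain-loop implementation are assumptions made for illustration (stride 1, no padding, no bias); this is not the code used in the video.

```python
import numpy as np

def conv2d_valid(volume, filters):
    """Slide each filter over the volume with stride 1 and no padding.

    volume:  (height, width, depth_in)
    filters: (n_filters, k, k, depth_in), filter depth must equal input depth
    returns: (height - k + 1, width - k + 1, n_filters)
    """
    height, width, depth_in = volume.shape
    n_filters, k, _, filter_depth = filters.shape
    assert filter_depth == depth_in, "filter depth must match input depth"

    out_h, out_w = height - k + 1, width - k + 1
    out = np.zeros((out_h, out_w, n_filters))
    for i in range(out_h):
        for j in range(out_w):
            patch = volume[i:i + k, j:j + k, :]  # shape (k, k, depth_in)
            # Each filter multiplies the whole patch (all input channels)
            # and sums everything into a single number.
            out[i, j, :] = np.tensordot(filters, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
volume = rng.normal(size=(13, 13, 32))     # output of the first conv + pool
filters = rng.normal(size=(64, 3, 3, 32))  # 64 filters, each 3 x 3 x 32
feature_maps = conv2d_valid(volume, filters)
print(feature_maps.shape)  # (11, 11, 64)
```

Each filter spans all 32 input channels, so one filter yields one 11 x 11 map, and the 64 filters together give the 11 x 11 x 64 volume.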
@sharpshootoyaj 4 years ago
This is genuinely a brilliant explanation. Many thanks
@SuryadiputraLiawatimena 6 years ago ⁺⁸
Please explain again why we have 32 and 64 layers (feature maps). Where do these numbers come from? Are they calculated or just picked? Thanks.
@manishsharma2211 4 years ago
Sir, it depends on how many feature vectors you need. These numbers are the most commonly used.
@ravikumarhaligode2949 3 years ago
I also have the same query: how to decide how many filters are required
@mehdisoleymani6012 2 years ago ⁺¹
Be careful!!! Thank you. At 17:28 in the clip there is a mistake in the equation (13-3+1=11 is true, however you have typed 13-2+1=11).
@sciWithSaj 3 years ago
Thank you very much.
Cleared lots of doubts.
@Hassan.Wahba.97 3 years ago ⁺¹
I just noticed that we round up when pooling, we don't floor, 'cause (26 - 2 + 1)/2 is 12.5, not 13.5
@psychotropicalfunk 2 years ago
7 months later but I noticed the same. Either that or by mistake calculated using the first output and took 28 instead of 26: (28-2+1)/2 = 13.5
@shrutiprasad3354 4 years ago
greatest of all the other videos
@CodeEmporium 4 years ago
Thanks for the compliments :)
@IndiaNirvana 10 months ago
Great videos. One small question: at 5:07, how did you select the weights of the 3 by 3 filter?
@ahmedsabbir5862 4 years ago ⁺²
@17:25, Output (width) = (13-3)/1 + 1. So the result will be 11
@CodeEmporium 4 years ago ⁺²
You are right. Will like this so others can see it. Nice catch!
@ahmedsabbir5862 4 years ago
@@CodeEmporium You're welcome. You should do some tutorials on Kaggle Problem solving, it will be helpful.
@fahnub 2 years ago
this is just so good. thank you for this.
@robertcohn8858 4 years ago
I think the value of this video is not so much that you will be able to sit down and use CNN from the get-go. Rather, it demonstrates some of the key concepts quite well (convolving layers for example). Looking at the final example is helpful and should probably be viewed several times to get the full meaning. But in all, the video is - when used with other information sources - a good start to learning CNN.
@manishsharma2211 4 years ago
Bang on. Explained very well.
@reasoning9273 1 year ago
Actually, CNNs were introduced a bit earlier. I recall it was LeCun's 1989 paper.
@TawhidShahrior 2 years ago
man you are a genius.
4 years ago
You provide references, thank you very much. Your videos are great.
@sokiprialajonah4932 4 years ago ⁺¹
This video really helped me a lot.
@smealzzon 5 years ago ⁺¹
Great video, filled in a lot of gaps of understanding.
@raghavamorusupalli7557 2 years ago
Location independence is an important feature
@mohammedbenaissa1278 21 days ago
I have never understood CNNs like I do after this video.
@sujithtumma6754 2 years ago
Awesome explanation. Loved it. Just a little correction: at 17:24 I think "hwidth" is 3, not 2.
@CodeEmporium 2 years ago
Thanks for the catch! Yeah there are definitely a few typos here that you and some others called out. (Also thanks for the compliments) :)
@GKS225 3 years ago
Awesome video! Keep it up!
@honeyrulesintheworld 2 years ago
Hi, can you tell me how to find the confusion matrix for image retrieval using a CNN?
@artinbogdanov7229 4 years ago
Great explanation. Thank you!
@mpcr9799 4 years ago
I know how a filter in a Convolutional Neural Network "scans" the input image and multiplies the values of the kernel with the corresponding receptive field in the input image and adds it all up to get a new pixel in the output activation map. But I'm unsure how the numbers in a filter are decided.
Is the kernel a patch from the image that is chosen? Like a 5x5 patch of the image that the network must decide to be good to be used as a filter? Or are they random numbers that backpropagation will soon change to fit best with the data? And would these numbers in the filter be considered as the weights of the network?
Thanks for any help.
@barnabyroberts7950 3 years ago
The values in the kernel are randomly initialised and altered via backpropagation. If you know about simple densely connected networks, then you can consider a single weight in this type of network to be analogous to a 2D kernel that convolves a single channel in the input image. If you consider a 3-channel image as the input to a layer, and a single channel as the layer output, then the output (a 2D image) is obtained by convolving each input channel with its own K*K kernel and summing (superimposing) the resulting 3 images. This is analogous to a simple densely connected network except each weight in the layer is a K*K kernel rather than a scalar. However, it makes more sense to consider a K*K*3 kernel rather than summing 3 K*K kernels for the 3 input channels. If N is the number of input channels, M the number of output channels and K the width of a kernel, then you have K*K*N*M parameters for a single layer.
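The two claims in this reply (summing per-channel K*K convolutions equals applying one K*K*N kernel, and a layer has K*K*N*M weights) can be illustrated with a small NumPy sketch. The helper names `conv_weight_count` and `corr2d`, the 6x6x3 random image, and the omission of bias terms are assumptions for illustration.

```python
import numpy as np

def conv_weight_count(k, n_in, m_out):
    """Number of kernel weights in one conv layer (biases not counted)."""
    return k * k * n_in * m_out

def corr2d(channel, kernel):
    """Valid 2D sliding-window sum of products for one channel, stride 1."""
    k = kernel.shape[0]
    out_h, out_w = channel.shape[0] - k + 1, channel.shape[1] - k + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(channel[i:i + k, j:j + k] * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(6, 6, 3))   # 3-channel input
kernel = rng.normal(size=(3, 3, 3))  # one K*K*N kernel with K=3, N=3

# View 1: convolve each channel with its own 3x3 slice, then sum the 3 images.
per_channel_sum = sum(corr2d(image[:, :, c], kernel[:, :, c]) for c in range(3))

# View 2: treat the kernel as a single 3x3x3 block applied to 3x3x3 patches.
direct = np.empty((4, 4))
for i in range(4):
    for j in range(4):
        direct[i, j] = np.sum(image[i:i + 3, j:j + 3, :] * kernel)

print(np.allclose(per_channel_sum, direct))
print(conv_weight_count(3, 32, 64))  # weights in a 3x3 conv from 32 to 64 channels
```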
@miladmfarid 3 years ago ⁺¹
16:47 You explained the pooling width output and in the equation used (26-2+1)/2, which will be 12.5, but you said it will be 13.5! And I don't know how you get to 13. Can you please explain?
@anjanichowdaryoleti5425 3 years ago
{[input length - pooling window length] ÷ stride} + 1 formula
@anjanichowdaryoleti5425 3 years ago
Then {[26 - 2] ÷ 2} + 1 = 13
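The formula in this reply can be written as a small helper and chained through the layers discussed in the thread. This is only a sketch: the name `output_size`, the 28-pixel starting width (suggested by other comments here), and the use of floor division with no padding are assumptions.

```python
def output_size(input_size, window, stride=1):
    """Spatial size after a conv or pooling window, assuming no padding."""
    return (input_size - window) // stride + 1

conv1 = output_size(28, 3)        # 3x3 convolution, stride 1
pool1 = output_size(conv1, 2, 2)  # 2x2 pooling, stride 2
conv2 = output_size(pool1, 3)     # second 3x3 convolution, stride 1
print(conv1, pool1, conv2)  # 26 13 11
```

With the +1 outside the fraction, the pooling step gives 13 directly, and the second convolution gives 13 - 3 + 1 = 11, matching the corrections other commenters posted.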
@ishaquenizamani9800 2 years ago
Your videos are great, please make a video on U-Net
@manoharrengasamy4174 2 years ago
Thanks, good explanation of filters. Can you refer links on how filters/kernels are prepared? For an object, what is the minimum number of filters required? Also on the development and updating of filters up to the latest YOLO model.
@rangaeeee 3 years ago
The "About CNNs" URL is broken... Pls update to the latest one
@swarajshinde3950 4 years ago
Yann Lecun is great
@natjimoEU 4 years ago
great video mate.
@MrStudent1978 6 years ago
Excellent explanation
@yashpandit832 4 years ago ⁺¹
One doubt: In the last image shown will what will the width of each filter be in the second conv. layer? My understanding is that it will be 32 as the input width is 32 i.e. the filter of 3x3x32. Am I right or is there something wrong I have understood? Plz help.
@prateeksasan9759 4 years ago ⁺¹
I have the same question. Have you figured it out?
@himanshusrihsk4302 5 years ago
Please make a video on visual question answering
@sambarajuchiluveru8444 6 years ago ⁺¹
Hello dear, thank you for the video. I have a question: how to deal with pooling in the one-dimensional input case?
@konstantin7596 1 year ago
I think at 16:32 the +1 should be outside the fraction in the end again?
@SkullcandygirlSuchi 3 years ago
Hey can I get the whole content with diagram
@ocnarfchan4857 4 years ago
How does back propagation work for Convolutional Neural Network?
@GamingGleeSquad 5 years ago ⁺¹
Why is the filter size 3x3 @8:06? Can we take some different size for the filter?
@nikolasdrn 5 years ago
Yes, you can
@abhijitmahapatra8024 4 years ago
Hello AJ, today I discovered your channel( subscribed long back but never explored this much) and guess what you provide much simple intuition of topics that’s hard to grasp within minutes. Can you do the same for some Machine learning part like ARIMA and other predictive models..!! Anyhow great content. Really appreciate your effort and knowledge.
@CodeEmporium 4 years ago ⁺¹
I've been playing around with time series models recently too. Not sure if there is enough drive for a video at this time. But will definitely keep this in mind
@abhijitmahapatra8024 4 years ago
CodeEmporium That would be a great help. Thanks for the reply AJ, can't thank you enough for your efforts.
@krishnamishra8598 4 years ago
Why do we use convolution? Why not just a simple ANN in the case of images? The main question is: what is the need for convolution in a CNN? Please answer.
@amithm3 3 years ago
An ANN takes 1D input and thus loses the spatial details of the image, but in a CNN those are extracted and presented to the ANN in a more meaningful and trainable manner
@scientistgeospatial 5 years ago
Well done! Thanks buddy.
@nurfaizahmusa496 5 years ago ⁺¹
Great video, this is really helpful and detailed. Loved it!!!
@sathishp6257 6 years ago ⁺¹
17:25 How come h(width) is 2 and after doing the arithmetic Out(width) is 11? As per my observation, while doing conv3x3,64 the kernel size (h(width)) should be 3, right?
@CodeEmporium 6 years ago ⁺¹
When we have a 13×13×32 volume, and apply convolution with one filter of 3×3×32. This will give us an 11×11 feature map (as the stride is 1). Apply 64 such kernels, we get 64 such 11×11 feature maps i.e. a 11×11×64 volume.
@wlxxiii 5 years ago
Mistake in the slide: should be 13 - 3 + 1 = 11
@deepakkumarshukla 1 year ago
@@CodeEmporium Where does this 3*3*32 filter come from? Did I miss something or is something missing in the images shown?
@alexfourie6491 4 years ago
Nice video, quick question though. How do you determine the weights in each filter? I would assume they are randomly assigned like the weights in a normal neural network on the first feed-forward pass.
Follow up question:
How would one then go about updating the weights in each filter?
Thank you
@MustafaHoda 6 years ago
The 32 filters that are demonstrated at 8:46, are those filters in the other layers behind the first the same or different?
@Nuns341 2 years ago
How does h-height change from 3 to 2?
@changqunzhang1277 2 years ago
Thank you very much! This is a great video containing a lot of helpful information. Really appreciate the time and effort you spent on making this video. Here is a question: when conv 3*3, 64 is applied on 13*13*32 images, isn't the result 11*11*(64*32)? For each of the 32 layers, the 64 filters are applied. One more thing, I believe 13-2+1 = 11 is not correct (should be 12) @17:29
@chriswalsh5925 2 years ago
Yes! I thought the same... it is confusing enough as it is! :D ... maybe a mistake or something not mentioned about how the convolution works?
@인나-h2f 4 years ago
thank u, teacher
@elrosspangue7443 6 years ago
Question, why is there an increase of kernels for every convolution layer and where are those kernels coming from? What is the basis of those kernels?
@CodeEmporium 6 years ago ⁺²
The network tries to understand features of the input (image). The shallower layers extract low-level features (edges, strokes, shadowing, texture, etc.). The deeper we go, the higher-level the extracted features become (could be anything, most likely not human-interpretable). Such higher-level features are more complex. Hence we need more parameters to learn them. So the deeper we go, the more kernels we use.
@elrosspangue7443 6 years ago
@@CodeEmporium Follow up question, where can I get the parameters? What is the basis of these parameters? Are parameters and features the same?
Just also wanna give appreciation and thanks to your videos and answer! The backstory of this questions is, me and my thesismates are creating a CNN model that revolves on genre classification with some enhancement of new techniques and methodologies. This video was actually our basis from learning how CNN works and it's specifics in terms of layers - from nothing to almost intuitively knowing the basics.
@clearwavepro100 5 years ago
gonna need to subscribe bc multiple videos about audio and cnns ! :) yes!
@adam_sporka 4 years ago
Thank you very much!
@samarpitasnani7996 3 years ago
Can I get the slides of this?
@videoinfluencers3415 4 years ago ⁺¹
Whoaa!!!!
@malihafarahmand75 4 years ago
how to calculate 512 and 512 dense
@bankawat1 4 years ago
good one
@anandachetanelikapati6388 4 years ago
May I know how to calculate the input, output and learnable parameters in the following case?
Assumptions:
- Input size is (32, 32, 3)
- No padding for all convolutions
-----------------------------------------------------------------------------------------------
Layer  Type   Kernel  Stride  Neurons/feature maps  Input size   Output size  No. of parameters
-----------------------------------------------------------------------------------------------
1      Conv   (3, 3)  (1, 1)  16                    (32, 32, 3)
2      Pool   (2, 2)  (2, 2)  16
3      Conv   (5, 5)  (1, 1)  32
4      Pool   (2, 2)  (2, 2)  32
5      Conv   (3, 3)  (1, 1)  64
6      Dense  --      --      128
7      Dense  --      --      2
-----------------------------------------------------------------------------------------------
thank you
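One way to fill in the empty columns of this table is to apply the size and parameter formulas discussed elsewhere in this thread, layer by layer. The sketch below is only that, under stated assumptions: every conv and dense layer has one bias per output unit, pooling has no parameters and floors odd sizes, and the last conv output is flattened before the first dense layer. The names `conv_out` and `summarize` are hypothetical helpers.

```python
def conv_out(size, kernel, stride):
    """Output width/height for a window with no padding (floor division)."""
    return (size - kernel) // stride + 1

# (type, kernel size, stride, number of filters or units), as in the table
LAYERS = [
    ("conv", 3, 1, 16),
    ("pool", 2, 2, None),
    ("conv", 5, 1, 32),
    ("pool", 2, 2, None),
    ("conv", 3, 1, 64),
    ("dense", None, None, 128),
    ("dense", None, None, 2),
]

def summarize(input_shape, layers):
    """Return (type, input shape, output shape, parameter count) per layer."""
    h, w, c = input_shape
    flat = None  # set once the volume has been flattened for dense layers
    rows = []
    for kind, k, s, units in layers:
        if kind == "conv":
            in_shape = (h, w, c)
            params = k * k * c * units + units  # weights + one bias per filter
            h, w, c = conv_out(h, k, s), conv_out(w, k, s), units
            out_shape = (h, w, c)
        elif kind == "pool":
            in_shape = (h, w, c)
            params = 0                          # pooling has nothing to learn
            h, w = conv_out(h, k, s), conv_out(w, k, s)
            out_shape = (h, w, c)
        else:  # dense
            if flat is None:
                flat = h * w * c                # flatten the last volume
            in_shape = (flat,)
            params = flat * units + units       # weights + biases
            flat = units
            out_shape = (units,)
        rows.append((kind, in_shape, out_shape, params))
    return rows

rows = summarize((32, 32, 3), LAYERS)
for number, row in enumerate(rows, start=1):
    print(number, *row)
```

A framework that handles odd sizes or biases differently would report different numbers, so the printed values should be read against the assumptions above.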
@mohammedhassan7770 5 years ago
Good job, thanks.
@RKYT0 3 years ago
hey man,
is it somehow possible to ask you some questions in terms of my master thesis? ;)
@amithm3 3 years ago
finally the video i wanted, how to convert the deep volume matrix into ANN input. I have one doubt, suppose we have an image of 28x28 pixel and the first cnn layer with 3 kernel, we will get 3 feature maps, now in the next layer if we have "64" kernels how many feature map do we get, is it 64 * 3 or is it just x no of feature maps. if it is only 64 no of maps then how do we convolve the 3 feature maps into 64 feature maps using only 64 kernels, should we sum the 64 * 3 maps we get into 64 maps??
@MrRameeez 5 years ago
What is a dense layer, and why is it 512??
@LovedbyGod4ever 3 years ago
Thank u bro
@baskorobaskoro7972 6 years ago
How are the values in a filter (kernel) set? Are they set randomly?
@CodeEmporium 6 years ago ⁺¹
Initially, yes. They take on random values, which are later "learned".
@SuryadiputraLiawatimena 6 years ago
How are they "learned"? Do you have this CNN code in Keras?
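One way to see what "learned" means here is a toy example: start from a random 3x3 kernel and nudge it by gradient descent until its output matches the output of a fixed target kernel. This is a NumPy sketch rather than Keras code, and the target kernel (a Sobel-style edge filter), the 8x8 random image, the mean-squared-error loss, the learning rate, and the step count are all assumptions chosen for illustration.

```python
import numpy as np

def correlate_valid(image, kernel):
    """Valid 2D sliding-window sum of products, stride 1."""
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))

# Hypothetical "true" filter that produced the training target.
target_kernel = np.array([[1.0, 0.0, -1.0],
                          [2.0, 0.0, -2.0],
                          [1.0, 0.0, -1.0]])
target = correlate_valid(image, target_kernel)

kernel = rng.normal(size=(3, 3))  # random initial values
learning_rate = 0.02
losses = []
for step in range(500):
    error = correlate_valid(image, kernel) - target
    losses.append(np.mean(error ** 2))
    # Gradient of the mean squared error w.r.t. each kernel weight:
    # correlate the input with the error map.
    grad = 2.0 * correlate_valid(image, error) / error.size
    kernel -= learning_rate * grad

print(losses[0], losses[-1])
print(np.round(kernel, 2))
```

In a real CNN the same idea is applied to every filter in every layer at once, with the error signal passed backwards through the network by backpropagation.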
@danishnawaz7869 5 years ago
Thank you!
@santhoshkolloju 6 years ago
Hey, can you do an intuitive explanation of CNNs on text data
@CodeEmporium 6 years ago ⁺¹
Sure. Maybe a future video.
@giahuytrinh7195 2 years ago
ty
@amirulsadikin8716 5 years ago
Thank you so much... you saved me a lot of reading time....
@CodeEmporium 5 years ago
Perfect! Glad it helped
@StevenSmith68828 5 years ago
Where does 32 come from?
@louerleseigneur4532 4 years ago
thanks
@mnsnliu9317 5 years ago
good
@anwarulislam6823 2 years ago
Someone sending me conversation like AI Chatbot through all of actions in neural networks by inner voice using brain!!! Is it possible or not, if it is than how can I control this thing??
#Thanks in advanced.
@elinaakhmedova9407 6 years ago
Thanks for this video! You are cool, keep going 🤗
@CodeEmporium 6 years ago
Yay! Thanks! Imma keep it up ;)
@reggaebin 4 years ago
@17:25 13-2+1=11 is not correct.
@mdyzma 5 years ago ⁺¹
17:21 your filter in round 2 convolution is (3, 3). So it should be 13-3+1=11. Not 13-2+1, which is 12.
@XX-vu5jo 3 years ago
And my fake PhD supervisor don’t even know or understand a single thing about this!!!! Damn those quacks! My country sucks!
@shreyjain6447 3 years ago ⁺¹
Which country?
@sunidhinayak6413 5 years ago
can you please make a video on Keras - container
@XX-vu5jo 3 years ago
Dude study on your own lol
@RedShipsofSpainAgain 6 years ago
16:34 shouldn't that be 12.5, not 13.5? (26-2+1)/2 = 12.5
@gh0oo 5 years ago
Yes
@macsenwyn7223 4 years ago
13-2+1 is not 11 its 12
@zhenzhen8766 3 years ago
memo 13:30
@Leon-pn6rb 4 years ago ⁺⁷
poorly explained the layers. The same surface level explanation with no intuition behind it for the core concepts
The easier concepts were explained well but that wasn't why people watch these vids
@bishwasapkota9621 4 years ago
Poorly explained!! Anyway a good try
