C4W1L06 Convolutions Over Volumes

  • Published: Dec 2, 2024

Comments • 79

  • @purpleturtledotcom 4 years ago +157

    Found this gem after wasting my time on several 'fancy' deep learning video tutorials.
    "If you can’t explain something in simple terms, you don’t understand it."
    - Feynman

    • @leilalovegood4131 4 years ago +3

      can't agree more

    • @nithinsai2250 3 years ago +4

      Yeah, they all just use fancy words like Keras, TensorFlow, blah blah blah.

  • @ericksonramos4622 6 years ago +41

    THANK YOU for ending my 4 days 9 hours search on understanding the CNN first-layer input data structure/computation... Moving on to the next step.

  • @muneshchauhan 2 years ago +11

    The way Andrew deconstructed the 3D convolution into a simple series of steps just goes to show how great teachers can accelerate learning manyfold.

  • @JoaoPedro-pi9ee 3 years ago +7

    Best explanation I've found about convolutions over multiple channels. Thanks.

  • @__dekana__ 1 year ago +2

    He explains this so well that I want to binge the entire playlist.

  • @the_random_noob9860 9 months ago +1

    Blessed are the people who are passionate about neural networks and made it into Stanford to attend lectures given by this legend.

  • @cypherecon5989 12 days ago

    Such a calm, clear, and nicely illustrated explanation. Thanks.

  • @tomWil245 1 year ago

    Finally, someone who can clearly explain the material!

  • @majinfu 4 years ago +1

    Thank you so much! This video helped me understand CNNs a lot!

  • @harshdevmurari007 2 years ago

    The most effective way of explaining the depth (number of channels) of a CNN.

  • @mitakshra1 3 years ago

    Thank you, sir. It's great to have people like you in this life.

  • @sammyj29 3 years ago +3

    By far the best explanation I have ever seen. So simple and crisp!
    I had one doubt though, professor: can we use a CNN with data other than images? If so, what does the filter size represent then? And how do we interpret the features of the data in terms of the number of input channels?

  • @sau002 6 years ago +17

    Excellent. Convolution over volumes was bugging me for a long time.

  • @cem_kaya 2 years ago +1

    Thanks for clarifying that the filter is as deep as the number of channels.

  • @redash3861 4 years ago

    Dude, I was really searching for this for 2 days, but there was no clear explanation of volumes. Thanks a lot.

  • @DrN007 4 years ago +5

    Great! So a conv64 basically applies 64 different filters on segments of the input.

  • @ketilmalde3402 5 years ago +9

    The formula in the summary is wrong; it should be an (n x n x c) input, an (f x f x c x z) filter, and an ((n-f+1) x (n-f+1) x z) output, for z filters (output channels) and c input channels. So the convolution weights form a 4D tensor.

    • @the_random_noob9860 9 months ago

      You just accounted for the fact that there could be more than one filter, and therefore the same number of output channels. I think the prof wrote the summary with regard to having one filter. Not necessarily wrong, I guess.
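
The shape rule debated in this thread can be checked with a minimal sketch, assuming stride 1, no padding, and a plain nested-list layout. The names `conv_volume` and `shape` are hypothetical helpers written for this illustration, and the filters are stored as z x f x f x c rather than the f x f x c x z order used in the comment above.

```python
import random


def conv_volume(image, filters):
    """Valid convolution (CNN-style cross-correlation, stride 1, no padding).

    image[i][j][k] is n x n x c; filters[q][a][b][k] holds z filters,
    each f x f x c. Returns an (n-f+1) x (n-f+1) x z nested list.
    """
    n, c = len(image), len(image[0][0])
    z, f = len(filters), len(filters[0])
    size = n - f + 1
    output = [[[0.0] * z for _ in range(size)] for _ in range(size)]
    for q, filt in enumerate(filters):
        for i in range(size):
            for j in range(size):
                # One output value: f*f*c products summed into a single number.
                output[i][j][q] = sum(
                    image[i + a][j + b][k] * filt[a][b][k]
                    for a in range(f) for b in range(f) for k in range(c)
                )
    return output


def shape(nested):
    """Dimensions of a rectangular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


random.seed(0)
n, f, c, z = 6, 3, 3, 2  # sizes from the lecture example, with two filters
image = [[[random.random() for _ in range(c)] for _ in range(n)]
         for _ in range(n)]
filters = [[[[random.random() for _ in range(c)] for _ in range(f)]
            for _ in range(f)] for _ in range(z)]

print(shape(conv_volume(image, filters)))      # (4, 4, 2)
print(shape(conv_volume(image, filters[:1])))  # (4, 4, 1): the one-filter case
```

With a single filter the last dimension is 1, which matches the 4 x 4 result in the video; with z filters the outputs are stacked into z channels.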

  • @mohammadkhubaibnasir6198 5 years ago +2

    The first nine numbers from the red channel, then the 3 beneath from the green channel, then the 3 beneath from the blue channel? I didn't understand that. Aren't we taking 3x3 from each color channel?

  • @-MuhamadFahmiAmmar 1 year ago

    WOW, I finally get it, thanks

  • @devanshgoel3433 2 years ago

    Thank you, sir! You are the real hero.

  • @mohitpandey5190 5 years ago +1

    Deep learning's one and only Jeetu bhaiya :)

  • @nikhilbadveli6 2 years ago +2

    Can we use different filter sizes in the multiple filter case? And what will be the output shape then?

  • @PietroMarcon 7 years ago +1

    So, in each pixel of the 4x4 convolved matrix, you put the sum of the products of the kernel values with the corresponding 3x3 patch of the input image, for every channel (RGB)? Meaning you sum the dot products (kernel times the corresponding image pixels) of all channels to get the value of one pixel in the convolved matrix?

  • @rhysm8167 11 months ago

    Great video. Thank you!

  • @BSelm05 5 years ago

    very clear explanation thank you

  • @amitnair92 4 years ago

    OK, so at first I was a little confused about what adding up all the filters at the end means. Say the values at position (0,0) for R, G, B are 20, 10, 30 after applying the filter; adding all the channels means [20, 10, 30] and not [60]. Correct me if I am wrong.

  • @ritwikamajumdar5967 8 months ago

    Thank you so much sir

  • @tetouaniabdellah6714 10 months ago

    Thanks for the video. I have a question: why does convolving 6x6x3 * 3x3x3 give 4x4 (which is 2D)? We convolved 3D objects, so shouldn't the output be 3D?

  • @T-She-Go 5 years ago

    You are my hero. Thank you so much

  • @GagarineYuri 4 years ago +1

    @3:11: So do we add the 3 convolutions to get the values of the 4x4 feature map?

    • @robbellis5944 4 years ago +3

      Yes. Instead of thinking of it as 3x 2D convolutions added together, try thinking of it as 1x 3D convolution. It's still an element-wise product and sum of the cube of filters (or kernel) and a 3D portion of the stack of images.

    • @muhammadmaazwaseem7452 1 year ago

      Why do we add the 3 convolutions? Why not take their average value?

    • @muhammadmaazwaseem7452 1 year ago

      @robbellis5944 Why do we add the 3 convolutions instead of taking their average value?
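
The equivalence described in this thread can be checked numerically. The following is a small sketch, assuming stride 1, no padding, a made-up integer image stored channel-first, and the same arbitrary edge kernel in every channel; `conv2d` and `conv_volume` are hypothetical helper names, not functions from any library.

```python
def conv2d(channel, kernel):
    """Valid 2D convolution (CNN-style cross-correlation) of one channel."""
    n, f = len(channel), len(kernel)
    size = n - f + 1
    return [
        [
            sum(channel[i + a][j + b] * kernel[a][b]
                for a in range(f) for b in range(f))
            for j in range(size)
        ]
        for i in range(size)
    ]


def conv_volume(channels, kernels):
    """One filter over all channels at once: f*f*c products per output pixel."""
    n, f, c = len(channels[0]), len(kernels[0]), len(channels)
    size = n - f + 1
    return [
        [
            sum(channels[k][i + a][j + b] * kernels[k][a][b]
                for k in range(c) for a in range(f) for b in range(f))
            for j in range(size)
        ]
        for i in range(size)
    ]


# Made-up 6x6 integer image with 3 channels: channels[k][i][j].
channels = [[[(3 * i + 5 * j + 7 * k) % 11 for j in range(6)]
             for i in range(6)] for k in range(3)]
# The same 3x3 vertical-edge kernel in every channel (an arbitrary choice).
kernels = [[[1, 0, -1] for _ in range(3)] for _ in range(3)]

# Three separate 2D convolutions, added pixel by pixel.
per_channel = [conv2d(channels[k], kernels[k]) for k in range(3)]
summed = [[sum(per_channel[k][i][j] for k in range(3)) for j in range(4)]
          for i in range(4)]

# One pass over the whole 3x3x3 block.
at_once = conv_volume(channels, kernels)

# Averaging instead of adding only rescales every output value by 1/3.
averaged = [[value / 3 for value in row] for row in summed]

print(summed == at_once)              # True
print(len(at_once), len(at_once[0]))  # 4 4
```

On the averaging question: since the average is just the sum divided by 3, the same constant factor could be folded into the filter weights, so the two choices differ only by a scale.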

  • @elgs1980 4 years ago

    5:51, I don't understand why the result is not 4x4x3, but 4x4. So where are the 3 layers?

    • @agueconfle4889 4 years ago +1

      It seems the per-layer dot products are added up into a single number. That means you have 3x3x3 (27) multiplications whose results are summed.

    • @elgs1980 1 year ago

      3:21 answered my question: add up all those numbers.

  • @franco521 5 years ago +4

    Why is the RGB convolution output not a 4x4x***3*** image?

    • @Vishnupratap 5 years ago

      The filter is applied to all 3 layers at once in a step, to get a single output. Simple

    • @sammathew243 5 years ago

      Since after applying the 3-channel filter across the 3 channels of the image, we get a single output, so **1** is the 3rd dimension of the output!

    • @bofloa 4 years ago +1

      @Vishnupratap You know that it's not possible to apply the filter to all 3 layers at once programmatically; it must be done iteratively. But I think what Andrew did not say is that when you apply the filter to each layer you get a single value per layer, and the sum of the 3 results goes into the 4x4 matrix. That is why you don't get 4x4x3 but 4x4x1...

  • @luisanaya8210 5 years ago

    Thank you, very well explained :)

  • @kebakent 2 years ago +2

    It's funny how concepts like this can be so confusing when you don't know it. I had no idea the conv layers had an extra unconfigurable dimension and going from 3d to 2d confused me.

  • @littletiger1228 10 months ago

    beautiful

  • @jacksonvaldez5911 1 year ago

    Why is the output 2 dimensions? If you convolve over a 2D image with a 2D filter, you get a 2D output. Wouldn't this mean that if you convolve over a 3D image (R, G, B) with a 3D filter, the output should be 3 dimensions as well?
    Edit:
    I think I get it now. It's because the size of the 3rd dimension is the same for both the filter and the RGB image, so it only has to convolve over the z axis once, producing a 3rd dimension of size 1 in the output. So technically the output is 3 dimensions; it's just that the 3rd dimension has a size of 1, which is basically just 2D.
    If you convolved over an RGB image with a 2x2x2 filter, then the output would be 3 dimensions.
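
The reasoning in this comment can be illustrated with a short sketch that slides a filter along all three axes. This assumes stride 1 and no padding, and the names `conv3d_valid`, `shape`, and `ones` are hypothetical helpers for this example. Note that an ordinary 2D convolutional layer does not behave like the 2x2x2 case: its filter depth always equals the number of input channels, so only the first call below corresponds to the video.

```python
def conv3d_valid(volume, kernel):
    """Slide `kernel` over `volume` along all three axes (stride 1, no padding).

    volume[i][j][k] is H x W x D and kernel[a][b][c] is fh x fw x fd;
    the result is (H-fh+1) x (W-fw+1) x (D-fd+1).
    """
    H, W, D = len(volume), len(volume[0]), len(volume[0][0])
    fh, fw, fd = len(kernel), len(kernel[0]), len(kernel[0][0])
    return [
        [
            [
                sum(volume[i + a][j + b][k + c] * kernel[a][b][c]
                    for a in range(fh) for b in range(fw) for c in range(fd))
                for k in range(D - fd + 1)
            ]
            for j in range(W - fw + 1)
        ]
        for i in range(H - fh + 1)
    ]


def shape(nested):
    """Dimensions of a rectangular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


def ones(h, w, d):
    """An h x w x d volume filled with 1s (placeholder data)."""
    return [[[1] * d for _ in range(w)] for _ in range(h)]


rgb = ones(6, 6, 3)
# Filter depth equals image depth: only one position along the depth axis.
print(shape(conv3d_valid(rgb, ones(3, 3, 3))))  # (4, 4, 1)
# A shallower 2x2x2 filter can also slide along depth, giving a 3D output.
print(shape(conv3d_valid(rgb, ones(2, 2, 2))))  # (5, 5, 2)
```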

  • @bobo0612 4 years ago

    brilliant!

  • @aiinabox1260 4 years ago

    Awesome. I have 4 questions that I've been scratching my head over for the last 2 weeks. In my conv layer 1, I specified 32 filters. Does that mean 32 different features will be extracted from each image sequentially? I am using 28x28x1 grayscale images. Is it possible to apply the filters in parallel? Next, in the case of multiple filters, are the filters applied to the image in parallel or sequentially? How do I make the conv layer use multiple filters? The next question is: how do I override the default filter with a custom filter type?

    • @aiinabox1260 4 years ago

      @MattAufF5 Thanks a lot. But I still have one nagging question... Let's say 32 filters (feature detectors) are applied to a single image; won't that cause any contention?

    • @aiinabox1260 4 years ago

      @MattAufF5 Awesome, thanks a ton

  • @strongsyedaa7378 3 years ago

    How does a 3x3 convolution give 4x4?

  • @MuhannadGhazal 4 years ago

    6:02, I was expecting the output to be 4 x 4 x 3. Why was it just 4 x 4?

    • @adhoc3018 3 years ago +1

      I think that it is because he is using the 3 filters as a cube. Thus, after the multiplication, you should sum everything. For the output to be 4 x 4 x 3, I think it would be necessary to have 3 filters for each channel.

  • @sandipansarkar9211 3 years ago

    nice explanation

  • @pedrovelazquez138 3 years ago

    Thank you!!!

  • @johanverm90 5 years ago

    Thanks a lot!!

  • @abrahamowos 2 years ago

    Are the filter values trainable?

    • @fndTenorio 1 year ago +1

      That is the whole point.

  • @wiz7716 6 years ago

    Why are you stacking the features on each other? I don't get it!
    Normally, don't we just SUM UP the features so we have only one layer of features (e.g. horizontal + vertical edges)?

    • @ericksonramos4622 6 years ago

      That really confused me as well. I had to step back and understand how a computer reads an image. A computer reads an image as, for example, a 6x6x3 volume. Breaking it down, you have a 6x6 matrix for red, a 6x6 for green, and a 6x6 for blue. The colors are referred to as 'depth' or 'channels'. With that said, when you convolve the filters with the input image, you have to apply them to all 3 channels (colors). That's why one filter is, again as an example, 3x3x3. Watch just the introduction part of this video: ruclips.net/video/umGJ30-15_A/видео.html

    • @amitnair92 4 years ago

      @ericksonramos4622 So what does adding all three filters at the end mean? Does the RGB [23, 45, 23] convert to a single value, 91?

    • @ericksonramos4622 4 years ago

      @amitnair92 I don't quite follow what you said. Please elaborate. You don't add the filter data together. You slide, or convolve, the filters over the input image.

  • @אליהולוי-ד4ה 3 years ago

    Thank you so much!

  • @jayshah4016 6 years ago +3

    Are these 3D convolutions?

  • @kavitabhosale4861 6 years ago

    Is it possible to use a 1 x 1 x 155 input and a 1 x 1 x 155 filter for pixel classification?

    • @ragibishrak1310 6 years ago

      Kavita Bhosale I might be wrong, but I think that won’t be of much use. Since such network will just learn to match the input with the training images. It won’t be able to extract lower level features such as edges etc. It probably will show impressive performance on the training set but would not generalise well. Hoping for feedback from specialists on the topic.

    • @md.jahidhasan9337 6 years ago +1

      A 1px x 1px input is much, much too tiny; it won't generalize.

  • @rs9130 4 years ago +2

    The output of the RGB channels after convolution must be 4x4x3, right?

  • @thealgorithm7633 4 years ago

    Is it possible for the number of filter channels to be greater than the number of input channels?

  • @salmahayani5683 5 years ago

    Hello, is it possible to use 256*256*3 images with the LeNet architecture?

  • @ВасЯПронин-щ2э 2 years ago

    You need to explain the content

  • @jesuispac 5 years ago

    a god

  • @latifahouria9120 5 years ago

    I am a beginner in the field of deep learning. Is there anyone who can help me with my project?