C4W2L06 Inception Network Motivation

  • Published: 16 Jan 2025

Comments • 78

  • @Lhtokbgkmvfknv
    @Lhtokbgkmvfknv 1 year ago +1

    You ease our minds that other professors have complicated, and we are thankful for that!! 🙏

  • @jimmylee2197
    @jimmylee2197 6 years ago +24

    At 2:05, how does a pooling operator change the channel count from 192 to 32? Does pooling over channels make sense?

    • @hantong3108
      @hantong3108 6 years ago +1

      I think NIN was applied after the pooling to make the number of channels match, but I'm not sure why the height and width are still 28*28 after pooling.

    • @zuozhou8329
      @zuozhou8329 6 years ago +3

      Pooling and CONV are actually similar: the output size of each filter can be calculated as (n(l-1)+2p(l)-f(l))/s(l)+1. Sometimes we want to keep the output the same shape as the input, i.e. n(l)=n(l-1) where n(l)=[n(l-1)+2p(l)-f(l)]/s(l)+1. If we set s=1, this simplifies to n(l-1)=n(l-1)+2p(l)-f(l)+1, which means we can keep the shape by setting the padding to p=0.5[f(l)-1]. For example, if we have a pooling filter shaped 5 by 5, our padding should be 2.

    • @zuozhou8329
      @zuozhou8329 6 years ago

      The size of the input to the pooling would be 28 * 28 * 192, and the number of output channels would be 32.
      But I don't have an answer to your second question, sorry.

    • @parnianshahkar7797
      @parnianshahkar7797 6 years ago +1

      That's because we are using 'same' padding! And by the way, what is a NIN?

    • @xinye8585
      @xinye8585 6 years ago

      I am also confused about it.
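
To make the thread above concrete, here is a small arithmetic sketch (an illustration, not code from the course): with stride 1, 'same' padding p = (f-1)/2 keeps the 28x28 spatial size, and pooling never mixes channels, so the reduction from 192 down to 32 channels has to come from a 1x1 convolution applied after the pooling (the network-in-network, or NIN, idea).

```py
# Output spatial size of a conv or pooling layer: floor((n + 2p - f)/s) + 1
def out_size(n, f, p=0, s=1):
    return (n + 2 * p - f) // s + 1

# "Same" padding for stride 1: p = (f - 1)/2 leaves the spatial size unchanged.
def same_padding(f):
    return (f - 1) // 2

# A 3x3 max-pool with same padding on a 28x28x192 block keeps 28x28 ...
spatial = out_size(28, f=3, p=same_padding(3))   # 28
# ... and pooling acts per channel, so the depth is still 192.
channels_after_pool = 192
# A follow-up 1x1 conv with 32 filters then reduces the depth: 192 -> 32.
channels_after_1x1 = 32
print(spatial, channels_after_pool, channels_after_1x1)  # 28 192 32
```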

  • @arunyadav8773
    @arunyadav8773 7 years ago +20

    best explanation available online 👍

  • @fumihio
    @fumihio 5 years ago +9

    How do we choose the number of filters? 3:50
    Why does 1x1 use 64, 3x3 use 128, and so on?

    • @akashkewar
      @akashkewar 5 years ago +10

      Well, there is no hard rule for that. The rule of thumb is: the more filters, the more features you are extracting. Also keep in mind that not all features are important (some people think that more features always make a better model; this is not true at all), and too many can lead to overfitting and computational overhead. The choice is totally problem-specific, and the filter count is a hyperparameter in itself. Lastly, you could do a hyperparameter search to get the best filter counts (though that would be insane, because you have multiple layers and each layer has its own filters).

    • @tanhoang1022
      @tanhoang1022 3 years ago

      You can change p (the padding) to keep the same 28x28.

    • @gourabmukhopadhyay7211
      @gourabmukhopadhyay7211 2 years ago

      @@akashkewar Hey, could you explain how 28*28 remains 28*28 after a 3*3 filter, and likewise for the others? I get why it stays 28*28 for 1*1, since (28-1+1)=28. By the same reasoning, shouldn't it be (28-3+1)=26 for 3*3?

    • @RohanPaul-AI
      @RohanPaul-AI 2 years ago +2

      @@gourabmukhopadhyay7211 Good point indeed.
      This (28*28 remaining 28*28 after 3*3 filters) is achieved by setting padding='same',
      so the output shape is always 28 * 28.
      Run the code below to see the result.
      ```py
      from keras.layers import Conv2D
      from keras.models import Sequential
      models = Sequential()
      models.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 192), padding='same'))
      models.add(Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same'))
      models.add(Conv2D(128, kernel_size=(3, 3), activation='relu', padding='same'))
      models.summary()
      ```
      OUTPUT
      ```
      Model: "sequential_1"
      _________________________________________________________________
      Layer (type)                 Output Shape              Param #
      =================================================================
      conv2d_3 (Conv2D)            (None, 28, 28, 32)        55328
      conv2d_4 (Conv2D)            (None, 28, 28, 64)        18496
      conv2d_5 (Conv2D)            (None, 28, 28, 128)       73856
      =================================================================
      Total params: 147,680
      Trainable params: 147,680
      Non-trainable params: 0
      _________________________________________________________________
      ```

    • @gourabmukhopadhyay7211
      @gourabmukhopadhyay7211 2 years ago +1

      @@RohanPaul-AI Yes, I had also figured out that the padding was 'same'. But thank you for taking the time to comment here; it helped me confirm.

  • @harshniteprasad5301
    @harshniteprasad5301 1 year ago +1

    7:59 the output of the convolution should be 24*24*32, since convolving 28*28 with a 5*5 filter returns (28-5+1) = 24.

    • @aangulog
      @aangulog 1 year ago +2

      Not necessarily; padding can lead to a matrix of the same dimensions.

    • @harshniteprasad5301
      @harshniteprasad5301 1 year ago

      @@aangulog But it wasn't mentioned that we are using padding. Yes, you are correct though; we can get that output using padding.

    • @aangulog
      @aangulog 1 year ago

      @@harshniteprasad5301 Maybe it's implied, because you could say the same about the stride.
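
Both sides of this thread check out with the standard output-size formula; a quick sketch (not from the video, which leaves the padding implicit):

```py
def conv_out(n, f, p=0, s=1):
    # Standard conv/pool output size: floor((n + 2p - f)/s) + 1
    return (n + 2 * p - f) // s + 1

print(conv_out(28, f=5))        # 24: no padding ("valid"), as the first comment computes
print(conv_out(28, f=5, p=2))   # 28: "same" padding, as the inception module assumes
```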

  • @marimbanation4118
    @marimbanation4118 5 years ago +26

    give this guy a good mic

  • @JuliusUnscripted
    @JuliusUnscripted 4 years ago +3

    Question regarding the 1x1 conv strategy at 6:00:
    I understand that this trick reduces the number of parameters, but what I don't understand is how it is comparable to the original 5x5 conv.
    From my understanding, this would create completely different features, because it does not use the original input of the layer but the output of the 1x1 conv. So what's the point?
    Update: Ah okay, he mentions this thought at the end of the video. It seems there is no big impact on performance "if you choose the reduction right".

  • @muhammadharris4470
    @muhammadharris4470 6 years ago +11

    3:12 rap right there :D

  • @gorgolyt
    @gorgolyt 4 years ago +2

    Absolutely lucid, as ever. 👏

  • @suryanarayanan5158
    @suryanarayanan5158 4 years ago +4

    What does "same" mean? Does it mean having the same height and width as the previous layer?

    • @alanamonteiro5381
      @alanamonteiro5381 4 years ago +5

      Exactly, as he mentions, you will need to add padding for that

    • @Joshua-dl3ns
      @Joshua-dl3ns 4 years ago

      It essentially just applies the filter, then pads so that the output image has the same width and height as the input.

  • @CppExpedition
    @CppExpedition 3 years ago

    Andrew is HUGEEE!☺

  • @rafibasha4145
    @rafibasha4145 3 years ago +1

    @8:84, why do we need to multiply by the output 28*28*16 instead of by 1*1*192*16?
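
One way to read that step (this is an interpretation of the lecture's arithmetic, not code from it): the cost of a conv layer is the number of output values times the multiplies per output value, so you take the output volume 28*28*16 and multiply it by the 1*1*192 filter volume; both factors appear in the product.

```py
# Multiply count for the 1x1 bottleneck conv 28x28x192 -> 28x28x16:
output_volume = 28 * 28 * 16     # every output value needs one dot product
mults_per_output = 1 * 1 * 192   # the dot product spans a 1x1x192 filter
total_mults = output_volume * mults_per_output
print(total_mults)  # 2408448, the ~2.4M multiplies quoted in the video
```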

  • @rehabnafea5058
    @rehabnafea5058 2 years ago

    That was very useful for me, thank you so much

  • @SagesseValdesDongmoVoufo
    @SagesseValdesDongmoVoufo 10 months ago

    thank you very very much🥲🥲🥲🥲🥲🥲

  • @ahmeddrief3103
    @ahmeddrief3103 1 year ago

    Why did you use max pooling with 'same' padding? I thought the point of max pooling was to shrink the dimensions??

  • @anandtewari8014
    @anandtewari8014 3 years ago

    GREAT SIR

  • @shvprkatta
    @shvprkatta 4 years ago

    amazing sir..thank you

  • @mike19558
    @mike19558 2 years ago

    Really helpful!

  • @sandipansarkar9211
    @sandipansarkar9211 4 years ago

    Nice explanation; I need to watch it again.

  • @YigitMesci
    @YigitMesci 6 years ago +1

    What I don't understand is how an input image could have 192 channels..? Is there a common type of usage where inputs don't consist of only R, G and B channels?

    • @kartikmadhira
      @kartikmadhira 6 years ago +4

      I think the input layer he is talking about belongs to an inception module that resides somewhat deeper in the inception network. If you look closely at the overall inception model, there are a lot of hidden layers before this module kicks in. So it's actually the 'general' inception module he is talking about rather than the overall architecture itself.

    • @kushalmahindrakar01
      @kushalmahindrakar01 6 years ago

      If you are familiar with the idea of edge detectors, then these 192 channels are used to detect
      many different features of the image, or in other words to extract features. I guess you are watching the videos from the middle; I suggest you go through the whole playlist and watch the videos one by one.

    • @angelachikaebirim8894
      @angelachikaebirim8894 5 years ago +1

      Also don't forget that this 28x28x192 input could be the concatenated output of the previous inception module, and it probably occurs quite deep in the model; that's why the number of channels is high.

    • @codderrrr606
      @codderrrr606 1 year ago

      I was having the same doubt, but here 192 represents the concatenation of results from different kernels passed over the image.

  • @Ghumnewali
    @Ghumnewali 2 years ago

    Now I gotta watch Inception again.. 🤔

  • @oktayvosoughi6199
    @oktayvosoughi6199 1 year ago

    What I can not understand is how, after applying a 5x5 or 3x3 filter, we still have a 28x28 output; as we saw in an earlier lecture, the output size is (n_h + 2p - f)/s + 1.

  • @manuel783
    @manuel783 4 years ago +2

    Inception Network Motivation *CORRECTION*
    At 3:00, Andrew should have said 28 x 28 x 192 instead of 28 x 28 x 129. The subtitles have been corrected.

  • @strongsyedaa7378
    @strongsyedaa7378 3 years ago

    Why does he use so many filters? Can anyone explain?

  • @kivique519
    @kivique519 5 years ago +4

    Why is the output dimension still 28*28?

    • @justforfun4680
      @justforfun4680 5 years ago +3

      Same padding. You add exactly enough padding so that your output dimension is the same as your input.

    • @valentinfontanger4962
      @valentinfontanger4962 4 years ago +1

      @@justforfun4680 I also think so

  • @kirandeepsingh9144
    @kirandeepsingh9144 4 years ago

    I have a question. Say at the first convolution layer we apply 32 filters to a grayscale image; the output of the first layer would then be 32 matrices, or say 32 filtered images. If at the second layer we apply 64 filters, does that mean we are applying 64 different filters over each of the 32 filtered images? And would the output of the second layer be 64*32=2048 filtered images? Please clarify if anyone can.

    • @manu1983manoj
      @manu1983manoj 4 years ago

      You apply filters for feature extraction, not to recreate filtered images.

    • @kirandeepsingh9144
      @kirandeepsingh9144 4 years ago

      @@manu1983manoj Then what would it be?

    • @manu1983manoj
      @manu1983manoj 4 years ago

      @@kirandeepsingh9144 Based on the filters, it will extract features, which gives you scaled-down matrices. The dimensions will depend on the filter dimensions.

    • @sahajpareek6352
      @sahajpareek6352 2 years ago

      The 64 filters must have a lower dimensionality than the 32 activation maps. A simple rule is that when you decrease the dimensionality of a filter, the number of activation maps (outputs) from that filter increases, assuming a constant stride. Basically, to extract more precise features out of the input activation maps, you increase the number of filters and reduce their dimensionality.
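
A numeric check of the question at the top of this thread (a sketch under the usual conv-layer definition): each of the 64 second-layer filters spans all 32 input channels, so the layer outputs 64 feature maps, not 64*32 = 2048, and the parameter count reflects the full 3x3x32 filter volume.

```py
def conv2d_params(f, in_ch, out_ch):
    # out_ch filters, each of size f x f x in_ch, plus one bias per filter.
    return (f * f * in_ch + 1) * out_ch

# Layer 1: 32 filters of 3x3 on a 1-channel grayscale image -> 32 feature maps.
print(conv2d_params(3, 1, 32))    # 320
# Layer 2: 64 filters of 3x3x32 -> 64 feature maps (each filter sees all 32 maps).
print(conv2d_params(3, 32, 64))   # 18496, matching the Keras summary shown earlier
```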

  • @맥그슨
    @맥그슨 6 years ago +1

    Is there a way to reduce 28 * 28 * 16 even further?
    Is it possible to reduce it to 28 * 28 * 1?

    • @dom23rd
      @dom23rd 5 years ago

      Yeah, I'm wondering too. Does it hurt the data to reduce the 3rd dimension so much at the bottleneck layer?

  • @rushiagrawal9667
    @rushiagrawal9667 5 years ago +1

    It would have been nice if a comparison of the computations required for the 1x1 and 3x3 convolutions had been provided.
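
For what it's worth, the video's own 28x28x192 -> 28x28x32 example gives such a comparison for the 5x5 case; the arithmetic below (a check of those numbers, not code from the course) shows the 1x1 bottleneck cutting the multiply count roughly tenfold.

```py
def conv_mults(out_h, out_w, out_ch, f, in_ch):
    # Each output value needs f*f*in_ch multiplies.
    return out_h * out_w * out_ch * f * f * in_ch

direct = conv_mults(28, 28, 32, 5, 192)          # ~120M: 5x5 conv straight on 192 channels
bottleneck = (conv_mults(28, 28, 16, 1, 192)     # ~2.4M: 1x1 reduce to 16 channels
              + conv_mults(28, 28, 32, 5, 16))   # ~10M:  5x5 conv on the reduced volume
print(direct, bottleneck)  # 120422400 12443648
```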

  • @essamaly5233
    @essamaly5233 3 years ago

    Some nasty and offensive commercials come up while viewing this video; I think Andrew *should* do something about it.

  • @agneljohn6093
    @agneljohn6093 4 years ago

    When I try to work on the Coursera "Artificial Intelligence using TensorFlow" course and run assignment number 3, it says the kernel died and will restart automatically.

  • @pallawirajendra
    @pallawirajendra 6 years ago +1

    He keeps skipping most of the topics.

    • @kushalmahindrakar01
      @kushalmahindrakar01 6 years ago +1

      No, he does not skip any topics. These videos are from Coursera and have questions in between, which is why there are cuts in the video.