219 - Understanding U-Net architecture and building it from scratch

  • Published: 26 May 2021
  • Understanding U-Net architecture and building it from scratch.
    This tutorial should clear any doubts you may have regarding the architecture of U-Net. It also walks you through building your own U-Net using functional blocks for the encoder and decoder.
    Example use case: Segmentation of mitochondria using only 12 images and about 150 labeled objects.
    Dataset: www.epfl.ch/labs/cvlab/data/d...
    Code generated in the video can be downloaded from here:
    github.com/bnsreenu/python_fo...
  • Science

Comments • 111

  • @cvformedicalimages6466 · 1 year ago +3

    Thanks for the detailed explanation. This is the first time I am understanding how a U-Net works! Thanks 🙂

  • @XX-vu5jo · 3 years ago +10

    I would love to see a video on 3D U-Net from scratch as well. That would really help in understanding it better.

  • @deividrumiancev7356 · 4 months ago +2

    Great tutorial!! Way better to learn here than from my uni lecturers and teachers!! Keep it up mate! You are the best!

  • @1global_warming1 · 1 year ago +1

    Thank you very much for such a clear explanation of how to build a U-net architecture from scratch

  • @abeldechenne6915 · 1 day ago

    that was crystal clear, thank you for the good explanation!

  • @channagirijagadish1201 · 8 months ago +1

    Excellent Tutorial. Much appreciated!

  • @SeadoooRider · 2 years ago

    Your channel is gold. Thank you 🙏

  • @Vikram-wx4hg · 1 year ago +5

    Yes, really enjoyed it!
    Sreeni, you are a fantastic teacher and your tutorials bring out the concepts with remarkable simplicity and clarity.

  • @rishabgangwar9901 · 3 years ago +2

    Thank you so much sir for crystal clear explanation

  • @rohit_mondal__ · 2 years ago

    Your explanation is actually very good sir. Thank you. Happy to have subscribed to your channel .

  • @AmitChaudhary-qx5mc · 3 years ago

    Sir, I am very grateful for your explanation of semantic segmentation.
    You make everything so easy and sublime.

  • @dyahtitisari7206 · 1 year ago

    Thank you so much, Sir. It's a great explanation.

  • @madeleinedawson8539 · 1 year ago

    Loved the video!!! So helpful

  • @msaoc22 · 6 months ago

    Thank you for the nice, simple explanation.

  • @IqraNosheen-ek3nk · 1 year ago

    Very good explanation, thanks for making the video.

  • @ericthomas4072 · 8 months ago

    Very helpful! Thank you!

  • @antonittaeileenpious8653 · 2 years ago +2

    Sir, according to what I have understood, in all the layers we extract some features and apply max pooling to reduce the extracted features, and in the upsampling we increase the spatial dimensions. Where do we actually classify the labelled pixels, vary their weights, and apply a particular threshold to get to our desired ROI?

  • @edmald1978 · 3 years ago

    Thank you very much for this video; the way you explain is really amazing. Thank you for your great channel!

  • @shivamchaurivar2794 · 3 years ago +2

    I really love your videos. I hope you make a video on stateful LSTM; it's very tough to find a good video on it.

    • @nikhilmudgal8541 · 3 years ago +1

      Seems interesting. I hardly find any videos explaining Stateful LSTM myself

  • @abderrahmaneherbadji5478 · 3 years ago

    Great explanation

  • @caiyu538 · 2 years ago

    excellent lectures.

  • @orioncloud4573 · 1 year ago

    thx for the clear application.

  • @pycad · 3 years ago

    Thank you for this great explanation

  • @geethaneya2452 · 3 years ago

    I would like to see video on TransUNet. That will really help to understand its concept better.

  • @amintaleghani2110 · 3 years ago

    @DigitalSreeni, thank you for your effort making this informative video. I wonder if we can use ResNet for time series data prediction. If so, could you please make a video on the subject? Thanks again.

  • @hadyanpratama · 3 years ago

    Thank you, very clear explanation

  • @RRP3168 · 2 years ago +3

    Great video, but I have a question: What if I want to segment my own images, how do I get the masks for training the UNET?

  • @torikulislam23 · 2 years ago

    Well, thank you, it was really obliging ❤️

  • @cutedevil173 · 3 years ago

    Hi, it's really interesting and educational. It would be really helpful if you trained a U-Net on the Automated Cardiac Diagnosis Challenge (ACDC) using a NIfTI-type dataset.

  • @anshulbisht4130 · 1 year ago

    Loved your code. I knew the U-Net architecture, but when you showed it with running code and images, it was awesome. I will reimplement it with some other data and try to see if it works. Just one confusion: what is the ground truth when we are applying Adam, and how is the loss calculated for backprop to work?

  • @rezatabrizi4390 · 3 years ago

    thank you so much

  • @mithgaur7419 · 3 years ago

    I came looking for copper and I found gold; it would've saved me a lot of time if I had found this channel earlier. Thanks for the awesome content. I'm currently working on a U-Net project using Google Colab and I can't figure out how to define a distribution strategy for TPU. What is the correct way to do it with this code?

  • @anorderedhole2197 · 1 year ago

    I tried making images with very narrow masks, with a line one pixel in thickness. I noticed that when I resize the images the line gets broken up. Does this become more severe when the image is downsampled in the U-Net model? Does the mask need a very broad pixel width to be useful?

  • @fatmagulkurt2080 · 3 years ago +2

    Thank you for your effort to teach. I really appreciate your videos; I am learning so much about coding. But I couldn't find any code anywhere for classifying multiclass images with DenseNet201. Also, how can I do 5-fold validation when running these deep learning codes? I wish you can help me; it would be so helpful for me.

  • @ARCGISPROMASTERCLASS · 1 year ago

    Excellent, happy to subscribe to your channel.

  • @lemondragon8184 · 4 months ago

    awesome

  • @pallavi_4488 · 2 years ago

    doing an amazing job

  • @jetsdiver · 1 year ago

    For segmentation, for example to detect things like flood, fire, smoke, or clouds: is it better to use grayscale or colored images?

  • @arshadgeo8829 · 1 year ago

    Hello Sreeni, I wanted to ask a favor: I would like to see the complete implementation of SegNet for satellite imagery, and an idea for SegNet+ResNet (with or without transfer learning). Can you help me out?

  • @dhaferalhajim · 1 year ago

    What's the number of classes in this structure? I saw one in the input and output

  • @07jyothir · 3 years ago

    Sir, Recently joined as your student. Couldn't thank you enough for this teaching. Could you please explain how to create and use a custom dataloader for large datasets?

    • @DigitalSreeni · 3 years ago +1

      I plan on recording a video soon but not sure when it is going to happen. Until then you may find this useful: ruclips.net/video/VNGRlf6ZlQA/видео.html

  • @talha_anwar · 2 years ago

    The decoder part should be the same as the encoder, but in the reverse direction. But when we concatenate, how is this maintained?

  • @random-yu5hv · 3 years ago

    I really appreciate your videos. Will you check segAN network in medical image segmentation? Best regards.

    • @DigitalSreeni · 3 years ago +1

      GANs are generative networks so I am reluctant to use them for segmentation. Besides, U-nets do a great job so I haven’t found a reason to find an alternative.

  • @nayamascariah776 · 3 years ago

    Your videos are really amazing; I am really thankful for your efforts. Sir, I have one doubt: if I want to add the dice coefficient as a loss function, how can I add it?

    • @DigitalSreeni · 3 years ago +1

      Please check my video 215 for an answer. I also covered it as part of videos 210, 211, and 214. But I wrote my own few lines for dice coefficient in video 215, so you may find it useful.
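For readers who land here without watching video 215, the dice coefficient mentioned above can be sketched in a few lines. This is a minimal NumPy illustration of the metric itself, not the exact code from the video (a Keras loss would re-express the same arithmetic with backend tensor ops so gradients flow):

```python
import numpy as np

def dice_coefficient(y_true, y_pred, smooth=1.0):
    """Dice = 2*|A intersect B| / (|A| + |B|); `smooth` avoids division by zero on empty masks."""
    y_true = np.asarray(y_true, dtype=np.float32).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float32).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)

def dice_loss(y_true, y_pred):
    # Minimizing (1 - Dice) pushes predictions toward maximum overlap with the mask.
    return 1.0 - dice_coefficient(y_true, y_pred)
```

A perfect overlap gives a dice of 1 (loss 0); no overlap gives a dice near 0 (loss near 1).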

  • @apekshagopale7095 · 1 year ago

    Can you please tell how to create masks for SAR images?

  • @davidyao2856 · 2 months ago

    Can this be applied to a DICOM-type dataset?

  • @computingyolo5545 · 3 years ago

    There is one aspect that is blocking me, at line #12:
    small_dataset_for_training/images/12_training_mito_images.tif
    small_dataset_for_training/masks/12_training_mito_masks.tif
    It's not specified in this lesson whether the large image and large mask stacks have to be left undefined as the address. In other words, how could I address folders with many pictures and masks to be picked up? A simple example, please? Brilliant explanation, Doctor, long life to you!

  • @sahartaheri1032 · 2 years ago

    great thanks

  • @deepak_george · 3 years ago

    Good work @digitalsreeni! Which tool do you use to view the image mask? In a normal image viewer it shows all black.

    • @DigitalSreeni · 3 years ago +1

      Use ImageJ.

    • @deepak_george · 3 years ago

      @@DigitalSreeni Where is the option in ImageJ to configure to see the mask? Couldn't find the video in which you mentioned this.

  • @xichen7867 · 1 year ago

    Hello teacher! Can you add Chinese subtitles or offer a course on a Chinese video site? Your courses are of very high quality! Thank you!

  • @anikashrivastava8228 · 5 months ago

    Sir, can we separate a U-Net, in the sense that we train a U-Net and then save the weights of the encoder, bottleneck, and decoder separately, and then use them separately? Would we get the same reconstruction of a test dataset doing it with the U-Net (the entire architecture) as when we feed it to the encoder, then the bottleneck, then the decoder? Please help.

  • @tarasankarbanerjee · 1 year ago

    Dear Sreeni, thanks a lot for this awesome video. Just one question, shouldn't the 'decoder_block' call the 'conv_block' twice?

    • @tahaben-abbou7029 · 1 year ago

      No, actually the encoder block already has two conv layers; the decoder should call it one time, not two. Thank you.

    • @tarasankarbanerjee · 1 year ago

      @@tahaben-abbou7029 Thanks Taha for your comments. But if you look at the UNet architecture, the Decoder block also has 2 conv layers; just like the Encoder block. Hence the question.
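The confusion in this thread dissolves once the blocks are written out. Here is a from-memory sketch in the video's functional-block style (names like `conv_block` and `decoder_block` follow the tutorial, but this is a reconstruction, not the exact repo code): the two conv layers the diagram shows for the decoder live *inside* `conv_block`, so the decoder calls it only once.

```python
from tensorflow.keras.layers import (Activation, BatchNormalization, Concatenate,
                                     Conv2D, Conv2DTranspose)

def conv_block(inputs, num_filters):
    # Two 3x3 conv layers per block, as in the original U-Net paper.
    x = Conv2D(num_filters, 3, padding="same")(inputs)
    x = BatchNormalization()(x)
    x = Activation("relu")(x)
    x = Conv2D(num_filters, 3, padding="same")(x)
    x = BatchNormalization()(x)
    x = Activation("relu")(x)
    return x

def decoder_block(inputs, skip_features, num_filters):
    # Upsample, merge the skip connection, then ONE conv_block call --
    # which itself contains the two conv layers of the decoder stage.
    x = Conv2DTranspose(num_filters, 2, strides=2, padding="same")(inputs)
    x = Concatenate()([x, skip_features])
    return conv_block(x, num_filters)
```

So both commenters are right: the decoder stage does have two conv layers, and `decoder_block` still calls `conv_block` exactly once.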

  • @user-cm8qc1ug5b · 3 months ago

    Thanks a lot!
    But there's a question I can't understand: why do we use padding="same" in a decoder block when we have upsampling situations? I mean our shape is not the same, it becomes larger. Can somebody help please?

  • @drforest · 4 months ago

    Thanks! If you had changed all the numbers of layers to, say, 50, 100, 200, etc., would that work, just with different designated layer numbers and whatever associated change in performance? It feels like that might have made the numbers a little easier to follow. But great work.

  • @talha_anwar · 2 years ago

    best

  • @rajithakv4449 · 3 years ago

    Sir, I have used the U-Net model for segmentation of filamentous structures. Though it gives a good prediction, the predictions are wider than the ground truth. What could be the reason for this? Also, the IoU value is around 0.33. I have also added dropout with 0.5.

    • @DigitalSreeni · 3 years ago

      Try increasing the threshold value for your filamentous class; I assume the probability around the wider regions is lower. If that is not the case, then please verify your labels; maybe they are also exaggerated? If not, check whether you are working on images of similar size showing features in similar dimensions. Finally, try 3D U-Net, as the prediction can benefit from additional information from the 3rd dimension.
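The threshold adjustment suggested in the reply above amounts to one comparison on the sigmoid probability map. A minimal illustrative sketch (the helper name and example values are ours, not from the video; 0.5 is the usual default):

```python
import numpy as np

def probabilities_to_mask(prob_map, threshold=0.5):
    """Binarize a sigmoid probability map; raising the threshold
    trims low-confidence (often over-wide) boundary pixels."""
    return (prob_map >= threshold).astype(np.uint8)

# Raising the threshold from 0.5 to 0.7 drops the borderline 0.6 pixel:
probs = np.array([[0.2, 0.6],
                  [0.8, 0.95]])
```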

  • @biplugins9312 · 3 years ago

    My only choice is to run your software on Colab. It uses the latest TensorFlow and I had no desire to drop back to version 1.x.
    To correct an error, I had to change the import path under keras.utils, and instead of trying to import from
    unet_model_with_functions_of_blocks, I did a %run on the program from inside Colab. The changes are:
    !pip install patchify
    %run '/content/drive/My Drive/Colab Notebooks/unet_model_with_functions_of_blocks.py'
    #from unet_model_with_functions_of_blocks import build_unet
    from keras.utils.np_utils import normalize
    I don't know why, but on Colab it seems to be running at about half the speed you are seeing in Spyder.
    Epoch 25/25
    40/40 [==============================] - 58s 1s/step - loss: 0.0383 - accuracy: 0.9853 - val_loss: 0.1793 - val_accuracy: 0.9589
    It complained that "lr" and "fit_generator" were deprecated so I fixed them to:
    model.compile(optimizer=Adam(learning_rate = 1e-3), loss='binary_crossentropy', metrics=['accuracy'])
    history = model.fit(my_generator, validation_data=validation_datagen,
    but it didn't help. In any case, it does work in colab, with the latest tensorflow.

  • @lucasdiazmiguez8680 · 1 year ago

    Hi! Very nice video. Just a question: do you have the link to the original paper?

  • @rishabgangwar9901 · 3 years ago

    I wanted to know more about the .tif format.

  • @olubukolaishola4840 · 3 years ago +1

    👏🏾👏🏾👏🏾👏🏾👏🏾👏🏾

  • @XX-vu5jo · 3 years ago

    Are you familiar with the attention module? Is it possible to implement such with u net? Would love to watch a video about it.

    • @DigitalSreeni · 3 years ago

      It is coming soon, please stay tuned.

    • @XX-vu5jo · 3 years ago +1

      @@DigitalSreeni i am always tuned in woah! Thanks

  • @CRTagadiya · 3 years ago

    Could you please add this video under your image segmentation playlist?

  • @akshaybatra1777 · 1 year ago

    Does U-Net only work with 3 channels? I have breast mammography in DICOM format with 1 channel (grayscale). Can I still use U-Net?

    • @DigitalSreeni · 1 year ago

      You can use it for any number of input channels.

    • @akshaybatra1777 · 1 year ago

      What about the image size? My images are 4000x3000. Is it possible to use U-Net on them?

  • @sorasora3611 · 2 years ago

    How do I write the U-Net algorithm step by step?

  • @nandankakadiya1494 · 3 years ago

    Thank you for the great explanation, sir. The code is not available on GitHub. It would be great if you upload it.

    • @DigitalSreeni · 3 years ago +1

      It will be there soon... usually 6 to 8 hr. delay as I need to upload manually.

    • @nandankakadiya1494 · 3 years ago

      @@DigitalSreeni ok thanks for the great tutorial

  • @Luxcium · 3 months ago

    21:09 I do prefer the functional programming approach… classes are useful to describe functors, monads, maybe and even some “eithers” 😏😏😏😏 this is way easier to understand for me but I don’t say FP is better than OOP or any such… 😅😅😅😅

  • @jithinnetticadan4958 · 3 years ago

    Will this work for 256*256 RGB images, or should I increase the layers and start from 32/16?

    • @DigitalSreeni · 3 years ago

      U-net is a framework where you convert an autoencoder architecture into U-net by adding skip connections. There is no right or wrong and the network can be customized for your specific application. The example I provided will work for 256x256 RGB images, you just need to define the number of channels as 3.

    • @jithinnetticadan4958 · 3 years ago

      Thanks for the reply.
      I tried using the same, but a single epoch takes up to 30 minutes to complete (without GPU). Is that normal?

    • @DigitalSreeni · 3 years ago

      Depends on the amount of data. It will be painfully slow without GPU. Try using Google colab where you get a free GPU.

    • @jithinnetticadan4958 · 3 years ago

      Thanks a lot. Actually my dataset contains 7200 images including the masks, so it's impossible to make use of Google Colab; the only option is to reduce the size of my dataset.

    • @jithinnetticadan4958 · 3 years ago

      Also, sir, in your video you had mentioned increasing the layers, so I tried increasing the layers by 2 (16, 32), but the number of parameters remains the same. What could be the reason?

  • @antonittaeileenpious8653 · 2 years ago

    Sir, is the last layer a fully connected layer?

    • @DigitalSreeni · 2 years ago

      U-Net is a fully convolutional network, so there are no fully connected layers.

  • @guitar300k · 2 years ago

    Is U-Net the best for image segmentation?

    • @DigitalSreeni · 2 years ago

      It is the most widely used framework for image segmentation where a lot of papers have been published. So we know it works.

  • @himanimogra6824 · 1 year ago

    Can we pass an input size of 224 * 224 to U-Net?

    • @himanimogra6824 · 1 year ago

      224*224*1

    • @DigitalSreeni · 1 year ago +1

      Yes. You can pass any image size - U-Net is fully convolutional.

    • @himanimogra6824 · 1 year ago

      @@DigitalSreeni Thank you for the reply, sir.
      I have one more doubt: when I am training my model, my kernel keeps dying at the start of the 1st epoch itself. What should I do? I have resized my images to 224*224*224 dimensions.

  • @effeff3253 · 2 years ago

    Can you please explain these two doubts:
    1) Why is the number of feature maps reduced to half in each layer of the expansion phase?
    2) Say for the 1st layer of the expansion phase, the input is 16x16 with 1024 feature maps; then how does it become 32x32 with 512 feature maps after applying a simple up-convolution of 2x2? I mean, up-convolution simply copies the data into a larger block, so the number of feature maps should have nothing to do with this copying and stay at 1024. When doing the 2x2 up-conv, which 512 feature maps have been taken out of the 1024 feature maps?

    • @DigitalSreeni · 2 years ago

      The number of feature maps has nothing to do with the convolution kernel. The number of feature maps is defined by you, as part of your model. If you define your Conv. as - Conv2D(512, (2, 2), strides=2), you are defining the number of feature maps as 512 and kernel size for the convolution operation as 2x2 and stride as 2. This means your output would have 512 feature maps and the output image dimensions would be whatever you get with a 2x2 kernel and stride 2. Most people have a misunderstanding about this concept and I am glad you asked.

    • @effeff3253 · 2 years ago

      @@DigitalSreeni Thanks for replying, but my doubts still remain. For example, in the first layer of the contraction phase, the output is 64 images of 256x256. When it is subjected to max pooling, the size of each image tile is reduced to half, i.e. we now have 64 images of size 128x128. Now in the 2nd layer, I have 128 filters. Are these 128 filters applied to each of the 64 images of 128x128? If so, for each of the 64 images of size 128x128 I have 128 output images, i.e. a total of 64x128 images of size 128x128, which keeps growing after each convolution operation.
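The feature-map confusion in this thread can be resolved with plain shape arithmetic. The key point: each filter is a (kernel x kernel x in_channels) *volume*, so the 128 filters at the second level each convolve across all 64 incoming channels at once, and the output has exactly 128 channels, never 64 x 128. The helper below is purely illustrative (our own, not from the video):

```python
def conv2d_output_shape(h, w, in_channels, num_filters, kernel=3, stride=1, padding="same"):
    # Each filter spans ALL in_channels, so output channel count == num_filters.
    # in_channels affects the number of weights per filter, not the output shape.
    if padding == "same":
        out_h, out_w = -(-h // stride), -(-w // stride)  # ceiling division
    else:  # "valid"
        out_h, out_w = (h - kernel) // stride + 1, (w - kernel) // stride + 1
    return out_h, out_w, num_filters

def maxpool_output_shape(h, w, channels, pool=2):
    # Pooling halves spatial size but never changes the channel count.
    return h // pool, w // pool, channels
```

Tracing the example above: 64 maps of 128x128 through a 128-filter "same" convolution gives `conv2d_output_shape(128, 128, 64, 128)` = (128, 128, 128), exactly the shape shown in the U-Net diagram.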

  • @HafeezUllah · 2 years ago

    Thank you for this great explanation