226 - U-Net vs Attention U-Net vs Attention Residual U-Net - should you care?

  • Published: 5 Oct 2024
  • Is there a clear advantage to modified U-Net models such as Attention U-Net and Residual U-Net over the standard U-Net? Watch the video to find out.
    Code generated in the video can be downloaded from here:
    github.com/bns...
    Dataset from: www.epfl.ch/la...
    Images and masks are divided into patches of 256x256.

Comments • 95

  • @sanosay
    @sanosay 1 year ago +2

    I feel sooo disappointed that I only stumbled on your videos now and not before my PhD.
    Amazing work in general, thank you a lot for sharing!
    ps: great repo as well

  • @maryselvi2580
    @maryselvi2580 2 years ago +1

    Thank you Sreeni. Very useful and informative. Your voice and way of speaking are also addictive.

  • @kavithashagadevan7698
    @kavithashagadevan7698 3 years ago +3

    This is very informative. Thank you very much for creating great content.

  • @Thetejano1987
    @Thetejano1987 3 years ago

    Cool breakdown, and love the honesty in saying the differences are likely not statistically significant. Would have liked to see a quick comparison of inference times of each at the end, but great video nonetheless.

    • @DigitalSreeni
      @DigitalSreeni  3 years ago

      Thanks. Just trying to keep it real :)

  • @vassilistanislav
    @vassilistanislav 3 years ago +12

    Dear Sreeni, if possible can you please cover the topics of 3D reconstruction, 3D labeling, and multi-layer classification of 3D models.

  • @Mach89
    @Mach89 1 year ago

    Another great tutorial!
    Just one tip from my side: instead of "if batch_norm is True:" you could simply write "if batch_norm:", since batch_norm is a boolean anyway. :)
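
    A minimal sketch of that tip; conv_block and its arguments are hypothetical stand-ins for the video's code:

      from tensorflow.keras.layers import BatchNormalization

      def conv_block(x, batch_norm=True):
          # Truthiness test is the idiomatic form; "if batch_norm is True:"
          # is redundant when batch_norm is a real boolean.
          if batch_norm:
              x = BatchNormalization(axis=3)(x)
          return x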

  • @ZhiGangMei
    @ZhiGangMei 3 years ago +2

    One suggestion regarding the IoU score: use the IoU score for the mitochondria class instead of the meanIoU score. The meanIoU score is the average of the IoU scores of mitochondria and background.
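
    A minimal numpy sketch of that suggestion: score only the foreground (mitochondria) class instead of averaging it with the background (function name hypothetical):

      import numpy as np

      def class_iou(y_true, y_pred, class_id=1):
          # IoU for a single class; meanIoU would average this over all classes,
          # and the easy background class inflates that average.
          t = (y_true == class_id)
          p = (y_pred == class_id)
          union = np.logical_or(t, p).sum()
          return np.logical_and(t, p).sum() / union if union else 1.0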

  • @AlainFavre-n4k
    @AlainFavre-n4k 1 year ago

    Ah, sorry, I did not see this video. The link was not obvious in your list of tutorials. Very nice results.

  • @TECHNEWSUNIVERSE
    @TECHNEWSUNIVERSE 3 years ago +3

    Amazing work sir, you got a new subscriber for life. Can you please do a video on how to evaluate multiple object detectors like SSD, YOLO, R-FCN and Faster R-CNN in terms of speed and mAP? I have searched all of YouTube and didn't find such content. Thank you in advance, Sir.

  • @ParniaSh
    @ParniaSh 2 years ago +8

    Amazing video! It's very useful. I've subscribed to your channel. A minor thing you might want to change: concatenation and addition are different operations. The first one stacks two feature maps, but the second one performs mathematical addition. Therefore, you might want to rename the variable on line 150 from concat_xg to add_xg.

    • @DigitalSreeni
      @DigitalSreeni  2 years ago +6

      Thanks for the tips! I really appreciate it. I sometimes tend to use conversational English terms that may end up confusing someone with limited knowledge in the field.

    • @fahd2372
      @fahd2372 10 months ago

      He's wrong about the gating signal too; that's not how you get it. The gating signal is literally just the upsampled layer from beneath. It's the same tensor that you concatenate with the attention's output. I don't know where he got this code from, but it's not correct.
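
    On the concatenation-versus-addition point above, a minimal Keras sketch of the difference (tensor names hypothetical):

      from tensorflow.keras.layers import Input, concatenate, add

      x = Input(shape=(64, 64, 32))
      g = Input(shape=(64, 64, 32))
      stacked = concatenate([x, g], axis=3)  # (None, 64, 64, 64): channels stacked
      summed = add([x, g])                   # (None, 64, 64, 32): element-wise sum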

  • @syedjaved833
    @syedjaved833 1 year ago

    Thank you for the lovely content you create, Sir. It is really helpful and I appreciate the effort you have put in here.

  • @cplusplus-python
    @cplusplus-python 3 years ago +1

    Awesome job, thank you. I was wondering: what if we face data that is not labeled? It would be great if you could give us hints on how to deal with unsupervised deep models. Thanks again, Professor.

    • @DigitalSreeni
      @DigitalSreeni  3 years ago +1

      Unsupervised image segmentation can be done by using clustering on features, but that approach will not result in robust segmentation. It works fine on images that are fairly simple, but not very well in most scenarios.
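
    A minimal sketch of the clustering idea, assuming scikit-learn and a grayscale image; clustering raw pixel intensities is the simple case that only works on easy images:

      import numpy as np
      from sklearn.cluster import KMeans

      def kmeans_segment(image, n_clusters=2):
          # Cluster pixel intensities; swapping in texture or CNN features
          # would make this more robust than raw intensities.
          pixels = image.reshape(-1, 1).astype(np.float32)
          labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(pixels)
          return labels.reshape(image.shape)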

  • @ninjaclappy
    @ninjaclappy 8 months ago

    Thanks for the nice explanation!
    There are two details I was wondering about. The gating_signal function is applied to change the number of features in the gating (so that it matches the number of features in x). However, the same thing happens in the attention_block too. Doesn't that mean the gating_signal function is redundant in this case? Shouldn't it be enough to apply the 1x1 convolution in the attention_block to change the number of feature maps? Or is the additional 1x1 convolution (with ReLU and batchnorm) implemented in case one wants to reduce the number of channels before the attention_block (to save computational resources)?
    Also, the gating is upscaled, which also seems redundant, since x is downsampled to match the spatial dimensions of the gating. Is that implemented for a general case (e.g., if you want to use a gating signal from a deeper part of the network)?
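
    For reference, a minimal attention gate along the lines suggested above, with the channel matching done by the 1x1 convolutions inside the block and no separate gating_signal function; a sketch after Oktay et al., not the video's exact code:

      from tensorflow.keras.layers import Conv2D, Activation, add, multiply, UpSampling2D

      def attention_block(x, g, inter_channels):
          # x: skip connection (H, W, Cx); g: gating signal one level deeper (H/2, W/2, Cg)
          theta_x = Conv2D(inter_channels, 1, strides=2)(x)  # bring x to g's spatial size
          phi_g = Conv2D(inter_channels, 1)(g)               # match channels with a 1x1 conv
          f = Activation('relu')(add([theta_x, phi_g]))      # additive attention
          psi = Conv2D(1, 1, activation='sigmoid')(f)        # per-pixel attention coefficients
          psi_up = UpSampling2D(size=(2, 2))(psi)            # back to x's spatial size
          return multiply([x, psi_up])                       # gate the skip connection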

  • @I77AGIC
    @I77AGIC 1 year ago

    If you had added augmentation and trained until the validation losses fully flattened out, you would see a bigger difference between the three types. Any time you compare models with different numbers of parameters, you have to compare them after training is fully done. Sometimes the model that would be best in the long run actually starts out worse.

  • @vijayakotari2933
    @vijayakotari2933 3 years ago

    Thanks a lot for the clear explanation, sir. Keep going!

  • @angelceballos8714
    @angelceballos8714 3 years ago

    Great video, thanks!

  • @hfarmani
    @hfarmani 1 year ago

    If I could, I would give you thousands of likes

  • @dossierfichier7313
    @dossierfichier7313 3 years ago +2

    Hi Sir, first of all thanks for this huge work. I want to know if we can use this for multiclass segmentation and different applications. Is there a method to search for the best parameters, like activation function, learning rate, dropout, init weights, deep block count, and so on? And is it usable with the segmentation models library?

  • @KibitokBett
    @KibitokBett 1 year ago

    Thanks a lot for the amazing video. Can you make a video on a Siamese network with U-Net for damage detection with pre- and post-scenario images?

  • @happy_kids4670
    @happy_kids4670 1 year ago

    That was so useful, thanks so much sir, but I can't find the generated code in the repository.

  • @anandsrivastava5951
    @anandsrivastava5951 2 years ago

    Dear Sreeni,
    Greetings and really nice explanation! Great job!
    Is it possible that you could share the code you used in the video?
    Many thanks in advance!

  • @marjanfaraji6610
    @marjanfaraji6610 2 years ago

    Hi sir, thank you for your fantastic video, but I couldn't find the code at the mentioned link. Can you guide me on where to access the code?

  • @Seye-lb2rf
    @Seye-lb2rf 7 months ago

    Thank you for the video. I'm unable to find the code in the shared GitHub repository. Has it been removed, or could you please point me in the right direction? Thanks.

  • @nisrinadinda5253
    @nisrinadinda5253 2 years ago

    Hi sir, amazing explanation! In the video, why did you set axis=3? Is it because the image has 3 channels? If I use grayscale images, should I use axis=1?

    • @DigitalSreeni
      @DigitalSreeni  2 years ago

      Please read my notes in the code I shared. Here is what I wrote in the code...
      "Note: Batch normalization should be performed over channels after a convolution,
      In the following code axis is set to 3 as our inputs are of shape
      [None, height, width, channel]. Channel is axis=3."
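
    A minimal sketch, assuming channels-last [batch, height, width, channels] tensors; note that a grayscale input still keeps its (single) channel last, so axis=3 (or the default axis=-1) applies either way:

      from tensorflow.keras.layers import Input, Conv2D, BatchNormalization

      inp = Input(shape=(256, 256, 1))        # grayscale: the channel dim is still last
      x = Conv2D(64, 3, padding='same')(inp)  # -> (None, 256, 256, 64)
      x = BatchNormalization(axis=3)(x)       # normalize over the channel axis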

  • @rs9130
    @rs9130 3 years ago

    Hello again, Sreeni.
    upsample_g and upsample_psi are new compared to the last video. Can we skip them?
    Thank you.

  • @nghethuatsong
    @nghethuatsong 2 years ago

    Thank you for your useful video. How do we get the data folder shown at 14:42? Because I downloaded .tif files. Please guide me. Thank you.

  • @AlainFavre-n4k
    @AlainFavre-n4k 1 year ago

    I've tried to use this: a waste of time, so many problems with the microscopist package...

  • @a.h.s.2876
    @a.h.s.2876 10 months ago

    Thanks

  • @rs9130
    @rs9130 3 years ago

    Can you please make a video on how to use other backbones like VGG16 or ResNet with U-Nets? Not using custom libraries, but excluding the top layers and adding them to the U-Net. Also, an FCN implementation please. Thank you.

  • @mansisharma1245
    @mansisharma1245 3 years ago

    I have a query. When I run lines 62 and 63 of the unet_model code, it shows the error "ValueError: With n_samples=0, test_size=0.1 and train_size=None, the resulting train set will be empty. Adjust any of the aforementioned parameters". How should I resolve this error?
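
    That ValueError means train_test_split received zero samples, i.e. nothing was loaded before the split; a minimal sketch of checking this first (paths hypothetical):

      import glob
      from sklearn.model_selection import train_test_split

      image_paths = sorted(glob.glob('data/images/*.tif'))
      print(len(image_paths))  # if this prints 0, fix the directory or extension first
      X_train, X_test = train_test_split(image_paths, test_size=0.1, random_state=42)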

  • @kapilarora2764
    @kapilarora2764 2 years ago

    Hello Sir,
    You give a very clear understanding of the concepts. Thanks a lot!
    I am trying to implement the above code and I encountered the following error while performing model.fit():
    Epoch 1/50
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input> in <module>()
          5     batch_size=batch_size,
          6     validation_data=(X_test, y_test_cat),
    ----> 7     shuffle=False)
          8
          9 stop1 = datetime.now()

    1 frames
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in autograph_handler(*args, **kwargs)
       1145     except Exception as e:  # pylint:disable=broad-except
       1146       if hasattr(e, "ag_error_metadata"):
    -> 1147         raise e.ag_error_metadata.to_exception(e)
       1148       else:
       1149         raise

    TypeError: in user code:
        File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1021, in train_function *
            return step_function(self, iterator)
        File "<ipython-input>", line 43, in dice_coef_loss *
            return -dice_coef(y_true, y_pred)
        File "<ipython-input>", line 27, in dice_coef *
            intersection = K.sum(y_true_f * y_pred_f)
    TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int64 of argument 'x'
    I am not able to understand this error.
    Please help me resolve it.
    Thanks
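
    The traceback says y_true is int64 while y_pred is float32, so the element-wise product in dice_coef fails; a minimal sketch of the usual fix, casting the labels first (a guess at the cause from the traceback, not the video's confirmed code):

      from tensorflow.keras import backend as K

      def dice_coef(y_true, y_pred, smooth=1.0):
          y_true_f = K.flatten(K.cast(y_true, 'float32'))  # cast labels to match predictions
          y_pred_f = K.flatten(y_pred)
          intersection = K.sum(y_true_f * y_pred_f)
          return (2.0 * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)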

  • @nghethuatsong
    @nghethuatsong 3 years ago

    Thank you DigitalSreeni. I downloaded the dataset already but I can't open it. The file format is "volumedata.tif". How can I split it into an images folder and a masks folder? Please help me with the data. Thank you.
    14:21 Can you explain clearly how we split the dataset in order to have an images folder and a masks folder? Thank you.

  • @salimibrahim459
    @salimibrahim459 3 years ago

    Just curious, would these models work well for bone marrow fibrosis segmentation?

  • @mansisharma1245
    @mansisharma1245 3 years ago

    Sir, I want to calculate the average precision of these models. I am getting the PR curve as a horizontal line; my precision has only one value, which is 1. Is it OK to have a horizontal line for the PR curve?

  • @ZhiGangMei
    @ZhiGangMei 3 years ago

    Dear Sreeni, I used your code on my datasets for semantic segmentation. I noticed some issues with the Jaccard coefficient during training. After running 100 epochs, the Jaccard coefficient is only 0.19 for the test dataset. However, the meanIoU score calculated using the trained model is about 0.8. I don't know why there is such a huge difference between the Jaccard coefficient and the meanIoU score for the same test dataset; I thought they were the same. Could you let me know your suggestions/comments?

    • @DigitalSreeni
      @DigitalSreeni  3 years ago

      Jaccard and IoU should be the same. If you see different values, please check your calculation. It almost looks like you are getting 1-IoU (1-0.8 = 0.2) for your Jaccard.
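
    A minimal numpy sketch showing the two are the same quantity; if a training metric reports ~0.19 while meanIoU reports ~0.8, look for a 1 - IoU flip (e.g. a Jaccard loss logged as the score):

      import numpy as np

      def jaccard(y_true, y_pred):
          # Jaccard index == IoU == |intersection| / |union| on binary masks
          intersection = np.logical_and(y_true, y_pred).sum()
          union = np.logical_or(y_true, y_pred).sum()
          return intersection / union

      # jaccard_loss = 1 - jaccard(...) would turn 0.8 into 0.2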

  • @ndin1620
    @ndin1620 3 years ago

    Hi, amazing explanation! But I can't find this code in your GitHub repo. What's the file name? Thank you so much.

    • @DigitalSreeni
      @DigitalSreeni  3 years ago +2

      I just checked, the code is there. Please look for files with the same number as the video. For example, this video will be associated with any files containing 226 in the prefix. In fact, look for 224_225_226 prefix as the same code applies to all these videos.

  • @toutou18061
    @toutou18061 3 years ago

    Thank you very much. How do you add a dense layer to your Attention ResUNet?

    • @DigitalSreeni
      @DigitalSreeni  3 years ago

      Unet is fully convolutional, not sure why you’d like to add dense layers.

  • @vimalshrivastava6586
    @vimalshrivastava6586 2 years ago

    Awesome video👌👌

  • @mrspixel1
    @mrspixel1 3 years ago

    Hi, I wanted to say thank you for your effort and these videos! Great help.
    I wanted to ask if you have any leads on 3D CNNs for MRI images that are NIfTI .nii files.

    • @DigitalSreeni
      @DigitalSreeni  3 years ago +1

      I’m working on it. I anticipate releasing videos in mid August.

  • @aritahalder9397
    @aritahalder9397 7 months ago

    Why are you doing upsampling_g? This was not there in the previous video. Also, aren't phi_g and theta_x the same dimension? What is the need for the transpose Conv2D?

    • @ninjaclappy
      @ninjaclappy 7 months ago

      Hi! I was wondering about the same thing (see my comment from last month). Maybe the code is designed to work for a general case, e.g., if you want to calculate the attention based on a deeper layer. In that case you have to upsample the gating too in order to match dimensions.

    • @ninjaclappy
      @ninjaclappy 7 months ago

      I also thoroughly tested the Attention ResUNet with this attention, and without the (seemingly) redundant gating signal and upsampling. The performance was very similar. I now work without the upsampling and gating signal, since that seems more in line with the original publication mentioned here. However, they also used different methods for the attention (e.g., in one case they used the deepest layer as the gating for ALL other layers).

  • @vincente_z6139
    @vincente_z6139 2 years ago

    Thanks!

    • @DigitalSreeni
      @DigitalSreeni  2 years ago

      Thanks for your generous contribution Vincente. Please keep watching.

  • @sharifimroz6231
    @sharifimroz6231 2 years ago +1

    Greetings! First of all, MILLIONS OF THANKS for the diamond-grade resources that you have created for free. Would you make a video on BREAST CANCER ULTRASOUND image segmentation using U-Net? I will be eagerly waiting for your response.

    • @DigitalSreeni
      @DigitalSreeni  2 years ago +1

      Can you please refer me to the data set with labels? I can make a video if the training data is readily available.

    • @vincente_z6139
      @vincente_z6139 2 years ago

      Hi, I don't know if my answer will be helpful because you asked your question seven months ago.
      I'm working on liver cancer CT scan image segmentation using U-Net, so if you're still working on your project, you can contact me if necessary.

    • @sharifimroz6231
      @sharifimroz6231 2 years ago

      Send me your email

  • @josebarrera6313
    @josebarrera6313 2 years ago

    Dear Sreeni, why is the number of classes 1? Shouldn't it be 2 for binary classification?

    • @DigitalSreeni
      @DigitalSreeni  2 years ago +2

      Why should it be 2? If you are trying to classify an image as either a cat or a dog do you really need 2 parameters? What if I tell you that the image does not belong to a cat class, doesn't that mean it is a dog class? In other words, if I have a single output and if the probability 0 corresponds to dog and 1 corresponds to cat, that is enough for me to perform binary classification. Anything above probability of 0.5 is a cat and below 0.5 is a dog. Most people get confused about this topic and you are not alone. So thanks for asking the question. By the way, you can formulate it as a multiclass question with 2 outputs but then you need to convert your data into categorical. Treating it as binary (single output) is the easiest way.

    • @josebarrera6313
      @josebarrera6313 2 years ago

      @@DigitalSreeni Thank you very much for your answer!
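
    A minimal sketch of the two equivalent output heads described above (x is a hypothetical stand-in for the network's final feature map):

      from tensorflow.keras.layers import Input, Conv2D

      x = Input(shape=(256, 256, 64))  # stand-in for the final feature map

      # Binary: one sigmoid channel per pixel; threshold at 0.5; binary_crossentropy
      binary_out = Conv2D(1, 1, activation='sigmoid')(x)

      # Same task as 2-class multiclass: softmax over two channels;
      # categorical_crossentropy with one-hot (categorical) masks
      multiclass_out = Conv2D(2, 1, activation='softmax')(x)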

  • @aleenasuhail4309
    @aleenasuhail4309 2 years ago

    I am using the same model for 512x512 2D brain MRA images (MIPs). Using binary focal loss gives me absurd results (the loss starts at 0.019, and val recall, precision, and the Jaccard score are 0 or near 0). What could possibly be wrong?

    • @vincente_z6139
      @vincente_z6139 2 years ago

      Hi, I don't know if my answer will be helpful because you asked your question seven months ago.
      First, I think you should check the input tensors of your metric methods (if you're using tensorflow-gpu); and second, check your learning rate value.

  • @MrPinku18
    @MrPinku18 3 years ago

    Dear Sreeni, many thanks for the informative video. I tried to implement this code for a multiclass problem by changing sigmoid to softmax, but it shows the error "ValueError: Dimensions must be equal, but are 32768 and 163840 for '{{node mul_1}} = Mul[T=DT_FLOAT](Reshape, Reshape_1)' with input shapes: [32768], [163840].". Could you please let me know if it's a problem with my data? Many thanks for your help.

    • @DigitalSreeni
      @DigitalSreeni  3 years ago

      Tough to troubleshoot from just the error message. You are working with arrays, so please trace back to where you are getting a mismatch. Also, I hope you are converting your masks into categorical, since you're trying to do multiclass segmentation.

    • @MrPinku18
      @MrPinku18 3 years ago

      @@DigitalSreeni Thank you for getting back to me, Sreeni. I will have another look at it; if I get the same error I will use the Discord platform to discuss it.
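
    On the shape mismatch above: 163840 / 32768 = 5, which suggests five prediction channels against integer-labeled masks; a minimal sketch of the usual conversion (class count guessed from the shapes):

      import numpy as np
      from tensorflow.keras.utils import to_categorical

      n_classes = 5
      y_train = np.random.randint(0, n_classes, size=(10, 64, 64))  # stand-in integer masks
      y_train_cat = to_categorical(y_train, num_classes=n_classes)  # (10, 64, 64, 5)
      # The one-hot masks now match the softmax output's channel dimension.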

  • @rs9130
    @rs9130 3 years ago

    Hello,
    your old U-Net implementation had
    Total params: 1,254,622
    Trainable params: 1,254,622
    Non-trainable params: 0
    while this one has
    Total params: 31,404,502
    Trainable params: 31,392,666
    Non-trainable params: 11,836
    How is this?

    • @DigitalSreeni
      @DigitalSreeni  3 years ago +1

      The new U-Net uses a different number of filters. In the new one we start with 64 filters and work our way up to 1024 and back to 64. In the old one we start with 16 filters, go up to 256, and come back to 16 before the final output layer. The new U-Net from this video uses a lot more filters in each convolution, hence the increased number of trainable parameters. You can define any number of filters based on the complexity of the features in your images.

    • @rs9130
      @rs9130 3 years ago

      @@DigitalSreeni thank you very much. Great content
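
    To make the two configurations concrete, a sketch of the filter progressions (hypothetical lists):

      old_filters = [16, 32, 64, 128, 256]     # earlier video's U-Net encoder
      new_filters = [64, 128, 256, 512, 1024]  # this video's U-Net encoder
      # A 3x3 conv has about 9 * c_in * c_out weights, so widening every level
      # by 4x multiplies the conv parameter count by roughly 16x.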

  • @rs9130
    @rs9130 3 years ago

    Please make a video on custom data generators for large datasets; my memory gets full while loading data. Also, the best datatype for images and mask labels is not clear. Which dtype is best for faster processing? Also, please make a video on the use of tensor data or TFRecords. Is that an efficient way, or is numpy better? Thank you.

    • @DigitalSreeni
      @DigitalSreeni  3 years ago +1

      I just did, while explaining BraTS segmentation. Please watch the video ruclips.net/video/PNqnLbzdxwQ/видео.html

    • @rs9130
      @rs9130 3 years ago

      @@DigitalSreeni Thank you, I will check it.
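
    For the data generator question above, a minimal sketch of a Keras Sequence that reads one batch from disk at a time instead of holding the whole dataset in memory (paths and shapes hypothetical; uint8 on disk with a per-batch cast to float32 is a common choice):

      import cv2
      import numpy as np
      from tensorflow.keras.utils import Sequence

      class PatchGenerator(Sequence):
          def __init__(self, image_paths, mask_paths, batch_size=16):
              self.image_paths, self.mask_paths = image_paths, mask_paths
              self.batch_size = batch_size

          def __len__(self):
              return int(np.ceil(len(self.image_paths) / self.batch_size))

          def __getitem__(self, idx):
              s = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
              # read grayscale uint8 patches, normalize only the current batch
              X = np.array([cv2.imread(p, 0) for p in self.image_paths[s]], dtype=np.float32) / 255.0
              y = np.array([cv2.imread(p, 0) for p in self.mask_paths[s]], dtype=np.float32) / 255.0
              return X[..., None], y[..., None]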

  • @salmahayani2710
    @salmahayani2710 3 years ago

    Hello dear Sreeni, I want to know if this comparison would turn out the same for the 3D case. Did you try it for 3D?

    • @DigitalSreeni
      @DigitalSreeni  3 years ago +1

      I did not do the same exercise for 3D but I have no reason to suspect why it would be any different.

    • @salmahayani2710
      @salmahayani2710 3 года назад +1

      @@DigitalSreeni Thankx for ur answer sir, i did the first test and it seems to be same logic

  • @shamlabeevia9436
    @shamlabeevia9436 1 year ago

    How can I get your preprocessed data?

  • @texasfossilguy
    @texasfossilguy 2 years ago

    Can you input arrays with more than 3 channels into these networks? RGB + HSV + LAB, for example? I assume the computations might be much greater for that, but dropout of channels could lessen that cost...

    • @DigitalSreeni
      @DigitalSreeni  2 years ago

      Yes, of course. Not much added cost to computation. Deep learning creates hundreds of features anyway so a few additional channels is not a big deal.
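
    A minimal sketch of the multi-channel idea, stacking RGB, HSV, and LAB into a nine-channel input (OpenCV; file name hypothetical):

      import cv2
      import numpy as np
      from tensorflow.keras.layers import Input

      bgr = cv2.imread('image.png')  # OpenCV loads color images as BGR
      hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
      lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
      stacked = np.dstack([bgr, hsv, lab]).astype(np.float32)  # (H, W, 9)

      inp = Input(shape=(256, 256, 9))  # only the first conv layer grows with channels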

  • @alisultan3174
    @alisultan3174 5 months ago

    WoW

  • @padmavathiv2429
    @padmavathiv2429 3 years ago

    Hello sir... I still can't find this modified code in your GitHub repo.
    Thanks in advance

    • @DigitalSreeni
      @DigitalSreeni  3 years ago

      Just checked, the code is there. Same files for videos 224, 225, and 226.

    • @padmavathiv2429
      @padmavathiv2429 3 years ago +1

      @@DigitalSreeni Yes... got it, thank you sir.

  • @mansisharma1245
    @mansisharma1245 3 years ago

    Is it possible to have different file formats for images and masks? My images are JPG and my masks are PNG.

    • @DigitalSreeni
      @DigitalSreeni  3 years ago +1

      The images and masks can be any format. Once you read them, they will be numpy arrays anyway.

    • @mansisharma1245
      @mansisharma1245 3 years ago

      @@DigitalSreeni Thank you Sir.
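
    A minimal sketch of the point above: once read, the on-disk format no longer matters, both become numpy arrays (paths hypothetical):

      import cv2

      image = cv2.imread('images/img_001.jpg')   # JPEG image
      mask = cv2.imread('masks/img_001.png', 0)  # PNG mask, read as grayscale
      print(type(image), type(mask))             # both <class 'numpy.ndarray'>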

  • @nathanhorton2613
    @nathanhorton2613 2 years ago

    The video quality is too poor; you need to fix it.

    • @texasfossilguy
      @texasfossilguy 2 years ago

      Change your settings to advanced and pick 720p or higher resolution; it's under the "..." icon in the YouTube app.

  • @teddy911
    @teddy911 3 years ago +1

    22:26 if you care about the result