159b - Pretrained CNN (VGG16 - imagenet) features for semantic segmentation using Random Forest

  • Published: Jan 3, 2025

Comments • 97

  • @cirobrosa
    @cirobrosa 1 year ago +1

    The greatest teaching skills in one guy. Many thanks!

  • @shabinaa6407
    @shabinaa6407 3 years ago +2

    Sir, the real, hands-on training with coding that you provide makes your videos exceptional.

  • @paulikhane
    @paulikhane 3 years ago +1

    Just found your channel and must say I am enjoying the way you make it easy. I would like to be in your network so I can share ideas with you.

    • @DigitalSreeni
      @DigitalSreeni 3 years ago

      Everything in life is easy if you understand the fundamentals. I am trying my best to communicate the fundamentals so you feel more comfortable with coding. Thanks for watching; you can follow me on Twitter or connect over LinkedIn. (Both: @digitalsreeni)

  • @tapansharma460
    @tapansharma460 3 years ago

    Sir, all your videos are awesome. I have suggested to many of my friends that they follow Digital Sreeni.

  • @samarafroz9644
    @samarafroz9644 4 years ago +2

    Thank you so much, sir, you're the best.

  • @sharmakartikeya
    @sharmakartikeya 2 years ago +2

    1. Is this also suitable for large datasets? If so, then why do we even bother using UNet?
    2. What are its limitations that are overcome by training a UNet model from scratch?

  • @gurdeepsinghbhatia2875
    @gurdeepsinghbhatia2875 4 years ago

    So unique. Thanks so much, sir. Huge respect and support.

  • @hiankun
    @hiankun 2 years ago

    This is gold. Really.

  • @jordancaraballo-vega1265
    @jordancaraballo-vega1265 3 years ago

    The NVIDIA RAPIDS framework provides cuML for GPU-accelerated machine learning algorithms. cuDF is the GPU implementation of pandas (DataFrames).

  • @8147333930
    @8147333930 2 years ago

    thank you so much Sreeni😇

  • @tonihullzer1611
    @tonihullzer1611 3 years ago +1

    Wow, thank you. I am confused: as you showed at the beginning, your masks have only some cells highlighted, yet in the end the predicted mask segments more or less all the cells of the brain. Since only a few cells were labeled in training, I don't understand how the network learned not to classify the others as background.

    • @DigitalSreeni
      @DigitalSreeni 3 years ago +1

      This is what machine learning is - you train a machine to do tasks by learning from a few examples and then extending the learning to other cases.

    • @hiankun
      @hiankun 2 years ago +1

      I am also surprised by the same thing. It didn't fit my understanding, and I feel that I have to do some experiments to update my understanding of ML. :-D

  • @mulugetashitie7282
    @mulugetashitie7282 1 year ago

    Interesting,
    but there is an error on features=new_model.predict(X_train):
    KeyError: 'pop from an empty set'. What could it be?

    • @DigitalSreeni
      @DigitalSreeni 1 year ago

      It says 'empty set' - looks like there is an issue reading images or masks.

  • @chaosdesigner123
    @chaosdesigner123 4 years ago +1

    I was a bit confused at 9:11, where you said that OpenCV reads images in BGR and you want to convert them to RGB. But what you are doing in line 40 is converting from RGB to BGR, which is exactly the opposite. Am I misunderstanding something here?

    • @samtux762
      @samtux762 3 years ago

      cv2 reads images in BGR order. RGB2BGR does the same thing as BGR2RGB: both swap the first and last channels.
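The point made in this reply can be checked without OpenCV: swapping the first and last channels is its own inverse, so one and the same operation maps BGR to RGB and RGB to BGR. The sketch below uses plain NumPy; the helper `swap_red_blue` and the tiny `img_bgr` array are made up for illustration and are not from the video's code.

```python
import numpy as np

def swap_red_blue(img):
    """Reverse the channel axis of an H x W x 3 image (BGR <-> RGB)."""
    return img[..., ::-1]

# Hypothetical 1x2 image stored in BGR order: one pure-blue and one pure-red pixel.
img_bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

img_rgb = swap_red_blue(img_bgr)      # BGR -> RGB
round_trip = swap_red_blue(img_rgb)   # RGB -> BGR, using the very same operation
```

Applying the swap twice returns the original array, which is why either conversion flag gives the expected result on a 3-channel image.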

  • @SachinKumar-jy1jj
    @SachinKumar-jy1jj 3 years ago

    God bless you man

  • @mohammadhosseinsadeghi6285
    @mohammadhosseinsadeghi6285 4 years ago +2

    Thank you so much for your amazing educational channel.
    In this video, you dropped pixels that had a zero label to speed up the process.
    But if we drop background pixels in training, how can the model learn to distinguish background pixels?

    • @DigitalSreeni
      @DigitalSreeni 4 years ago +4

      In this example pixel value 0 does not represent background, it represents unlabeled pixels so I dropped them as they do not represent any real features (including background) in my image.
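As a rough illustration of dropping unlabeled pixels, here is a NumPy-only sketch. The video appears to do this on a pandas DataFrame; the arrays, sizes, and variable names below are assumptions chosen for the example, with label 0 meaning "unlabeled" as described in the reply.

```python
import numpy as np

# Hypothetical data: 6 pixels with 4 features each (the video uses 64 VGG16 features).
features = np.arange(24, dtype=np.float32).reshape(6, 4)
labels = np.array([0, 1, 0, 2, 2, 0])   # 0 = unlabeled, 1 and 2 = annotated classes

labeled = labels != 0                   # boolean mask of annotated pixels
X_for_training = features[labeled]      # keep only rows that carry a real label
y_for_training = labels[labeled]
```

Background only needs to be learned if it is annotated as its own non-zero class; rows with label 0 carry no ground truth either way.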

  • @rezaebrahimi6757
    @rezaebrahimi6757 2 years ago

    Hello Mr. Sreeni. When I changed the images and ran the code, this error appeared: ValueError: Length of values (3059712) does not match length of index (1019904)
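One thing worth checking for this error: 3059712 is exactly 3 × 1019904, and 1019904 equals 1024 × 996. That pattern is what you would see if a mask were read as a 3-channel image while the feature table has one row per pixel. Whether that is the actual cause here is an assumption; the sketch below only shows the arithmetic and one possible fix on a made-up mask.

```python
import numpy as np

n_rows_in_table = 1019904    # length of the index, from the error message
n_label_values = 3059712     # length of the values, from the error message
ratio = n_label_values // n_rows_in_table

# Hypothetical 4x5 mask saved as RGB: flattening it yields three values per pixel.
mask_rgb = np.zeros((4, 5, 3), dtype=np.uint8)
mask_gray = mask_rgb[..., 0]  # keep one channel so there is one label per pixel
```

Printing `mask.shape` right after reading each mask is a quick way to confirm or rule this out.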

  • @sando_7
    @sando_7 3 years ago

    This video is extremely helpful for the beginner! Thank you very much :)

  • @roby1251
    @roby1251 2 years ago

    Hello Sreeni,
    I'm a bit confused about your statement that UNets do not work well if you don't have 'tremendous amounts' of training data.
    However, according to the UNet paper by Ronneberger, UNets are specifically designed to "be trained end-to-end from very few images". In fact, it repeatedly states that this architecture has been created precisely because biomedical tasks have very little training data; the authors intended to tackle the issue that successful training of deep networks requires many thousands of annotated training samples.
    So, as far as how I understand it, UNets can be used as a way of bypassing the issue of limited training data. Or did I actually completely misunderstand what Ronneberger et al. said in their paper, did I confuse some things there? Do you think there are any contradictions on this matter? Please help me out on this. Thanks in advance!

    • @DigitalSreeni
      @DigitalSreeni 2 years ago +2

      "UNets do not work well if you don't have 'tremendous amounts' of training data" - Here, tremendous refers to the data size in comparison with the data required for Random Forest. With traditional machine learning, you just need a few scribbles of ground truth from your images. With U-net, you need a lot more than a few scribbles. I did a video on the topic of limited training data for U-net. In the video, I demonstrated using only 12 images (fully annotated) for U-net segmentation. While we ended up with acceptable results, there was still a lot of room for improvement. In summary, U-net does require a lot more training data than Random Forest. And U-net may be efficient with smaller training datasets compared to other semantic segmentation deep learning architectures.

  • @NS-te8jx
    @NS-te8jx 2 years ago

    But why are you combining the 8 images together? 8 images of size 1024x996x3 are convolved to 1024x996x64, so technically you get 8 images of 1024x996 with 64 feature channels. It is logical to flatten each 1024x996 image. But why would you combine all 8 training images when flattening? I don't understand that; could you explain?
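For context on this question: a pixel-wise classifier such as Random Forest treats every pixel as one training sample with 64 feature values, so pixels from all images are pooled into a single table. The sketch below illustrates that reshaping with small stand-in shapes; the array names and sizes are assumptions, not the video's code.

```python
import numpy as np

# Small stand-in for a stack of 8 feature maps of 1024 x 996 x 64.
n_images, height, width, n_features = 8, 6, 5, 64
features = np.random.rand(n_images, height, width, n_features)
masks = np.random.randint(0, 3, size=(n_images, height, width))

X = features.reshape(-1, n_features)   # one row per pixel, one column per feature
y = masks.reshape(-1)                  # one label per pixel, in the same order as X
```

Row `height * width` of `X` is the first pixel of the second image, so the image boundaries simply disappear and the classifier sees one long list of labeled pixels.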

  • @sharmakartikeya
    @sharmakartikeya 2 years ago

    For some reason, my GPU memory overflows if I have more than 8 images in my dataset. I have a GTX 1650 Ti with 4 GB of VRAM, and the images are reduced to 256x256. I tried using batch size = 1 as well, but it made no difference. Please help.

  • @nkechiesomonu8764
    @nkechiesomonu8764 1 year ago

    Dr. Sreeni, good day, sir. Thank you for the video. I have a question, please: what effect does the convolutional filter size have on SVM? Thanks.

  • @seilkwon1095
    @seilkwon1095 3 years ago

    Thank you so much. I'm getting the following type error on the line "feature = new_model.predict(X_train)" (the shape of my 'X_train' is (163, 450, 300, 3)): "TypeError: tf__predict_function() missing 19 required positional arguments: 'x', 'y', 'batch_size', 'epochs', 'verbose', 'callbacks', 'validation_split', 'validation_data', 'shuffle', 'class_weight', 'sample_weight', 'initial_epoch', 'steps_per_epoch', 'validation_steps', 'validation_batch_size', 'validation_freq', 'max_queue_size', 'workers', and 'use_multiprocessing'". Would you perhaps know what my problem is? Thank you.

  • @yeakub_sadlil
    @yeakub_sadlil 1 year ago

    There are 10k videos of echoNet dataset. How can I segment so many images?

  • @caiyu538
    @caiyu538 3 years ago

    Would you please provide a training video about the APEER tool? Thank you.

  • @ZhiGangMei
    @ZhiGangMei 4 years ago

    Excellent video. I wonder if there is a way to quantitatively evaluate the model accuracy for image segmentation, as is done for classification.

    • @DigitalSreeni
      @DigitalSreeni 4 years ago +1

      Yes, use IoU for evaluating semantic segmentation. I’ll try to make a video on this topic.
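To make the IoU (intersection over union) suggestion concrete, here is a minimal NumPy sketch that computes it per class on two small label maps. The function name `iou_per_class` and the toy arrays are assumptions for illustration; this is not the code from the author's later video.

```python
import numpy as np

def iou_per_class(y_true, y_pred, n_classes):
    """Return a list with the intersection-over-union of each class label."""
    scores = []
    for c in range(n_classes):
        true_c = (y_true == c)
        pred_c = (y_pred == c)
        union = np.logical_or(true_c, pred_c).sum()
        inter = np.logical_and(true_c, pred_c).sum()
        # A class absent from both maps has no defined IoU.
        scores.append(inter / union if union > 0 else float("nan"))
    return scores

y_true = np.array([[0, 0, 1], [1, 1, 2]])
y_pred = np.array([[0, 1, 1], [1, 1, 2]])
scores = iou_per_class(y_true, y_pred, n_classes=3)
mean_iou = float(np.nanmean(scores))
```

Here a single wrong pixel lowers the IoU of class 0 to 0.5 and of class 1 to 0.75, while plain pixel accuracy would still be 5/6, which is why IoU is the stricter measure.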

  • @aliali-sm3dq
    @aliali-sm3dq 1 year ago

    Hi, thank you for sharing this video. Can we use this model (VGG16 + RF) for RGB pictures? I think in this model we cannot do upsampling and transpose convolution, am I right?

  • @kaluleramanzani9212
    @kaluleramanzani9212 4 years ago

    Thank you so much. Does every image in the training set need its own mask? In other words, does the number of masks have to be the same as the number of training samples?

    • @DigitalSreeni
      @DigitalSreeni 4 years ago

      Every image must have a corresponding labeled mask. So yes, the number of images and masks need to be the same.

  • @AhmedEmamAI1
    @AhmedEmamAI1 4 years ago

    You are greaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaat

    • @DigitalSreeni
      @DigitalSreeni 4 years ago +1

      Lots of love from US to Germany. I miss travel, otherwise I'd be at Hofbräuhaus in München by now....

    • @AhmedEmamAI1
      @AhmedEmamAI1 4 years ago

      @DigitalSreeni Christmas market is missing you already :D

  • @mohyminislam5113
    @mohyminislam5113 3 years ago +1

    Dear Sir,
    All very nice and amazing lessons. Thank you so much.
    I would like to know: in tutorial 91 you mentioned that by 'block5_conv3' the CNN has learned almost everything about how to classify the image, and in this tutorial you only went up to 'block1_conv2'. So my questions are: 1. Is this number of features really enough to classify a new image? 2. If I want to take more features, perhaps up to 'block5_conv3', how can I create the dataset?
    TIA
    Isalm

    • @DigitalSreeni
      @DigitalSreeni 3 years ago +1

      block1_conv2 in VGG16 gives 64 features by keeping my original image size the same. So I don't have to do any additional reshaping of my image arrays. This is why I picked that block. 64 filters (features) are more than enough, in my experience. You can try deeper features but you need to reshape arrays to match input shape.

    • @ปริษาดํารงศิริ
      @ปริษาดํารงศิริ 3 years ago

      @DigitalSreeni Do you have an example of how to reshape this?
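One possible way to handle the reshaping asked about here, offered as a sketch under assumptions rather than the author's method: feature maps taken after a pooling layer are smaller than the input, so they can be upsampled back to the input size before being flattened into per-pixel rows. The example uses nearest-neighbor repetition in NumPy, with a random array standing in for a half-resolution, 128-channel feature map; the helper name `upsample_nearest` is hypothetical.

```python
import numpy as np

def upsample_nearest(feature_map, factor):
    """Repeat each pixel `factor` times along height and width (H x W x C input)."""
    return feature_map.repeat(factor, axis=0).repeat(factor, axis=1)

# Hypothetical stand-in for a feature map at half resolution (after one 2x2 pooling).
height, width = 8, 6
deep = np.random.rand(height // 2, width // 2, 128)

deep_full = upsample_nearest(deep, factor=2)          # back to height x width x 128
X_deep = deep_full.reshape(-1, deep_full.shape[-1])   # one row per original pixel
```

This only lines up cleanly when the image height and width are divisible by the pooling factor, and the upsampled features are blockier than those from block1_conv2.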

  • @sudeepph9350
    @sudeepph9350 3 years ago

    I just want to know how to find the accuracy of this model and how to plot it. Will you please help me do this?

  • @jamesren9100
    @jamesren9100 4 years ago

    Hi Sreeni, thank you for sharing this video. I was wondering: if I have a picture that has more than 3 channels, is there any way I can get pre-trained weights? If not, what may be the best way to extract features? Can you please help me? Thank you!

  • @rajeshwarsehdev2318
    @rajeshwarsehdev2318 3 years ago

    Can we use this approach, with a few layers of a pretrained model, to build a multi-class classifier?

  • @edmald1978
    @edmald1978 3 years ago

    Hi Sir, when you use this pre-trained model, you need to pre-process the images in the same way that the images were pre-processed to train VGG16. Why did you not do this in your tutorial? Is it necessary? Thank you in advance.

    • @DigitalSreeni
      @DigitalSreeni 3 years ago

      You don't have to preprocess images the same way as the original model unless you are using the original model in its entirety for prediction. Here, I am just using pre-trained weights as feature extractors so it does not matter whether I scale or normalize or follow my own pre-processing steps.

  • @jithunair2042
    @jithunair2042 4 years ago

    Thanks for this informative content.
    Can we use any label/image format other than TIFF in this code? My dataset is not microscopy related; I tried to label it using the APEER tool but failed. So I was wondering if I could use another tool, like labelme, to annotate my data and then use your code for semantic segmentation. Looking forward to your reply. Thanks in advance!

  • @louiyo
    @louiyo 4 years ago

    Hello sir, I wanted to ask if this combination of VGG16 and Random Forest could be used for road segmentation. I have satellite images and masks showing where the roads are located. Could it work?
    Thank you!

    • @DigitalSreeni
      @DigitalSreeni 4 years ago

      Yes, absolutely. I see no reason why it would not work.

    • @louiyo
      @louiyo 4 years ago

      @DigitalSreeni Our dataset is (300, 608, 608, 3), which results in an out-of-memory problem... Maybe I will try with batches. Thank you
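The memory problem is easy to see from the sizes: 64 float32 features per pixel for a (300, 608, 608) stack come to roughly 28 GB (300 × 608 × 608 × 64 × 4 bytes). A batched loop, as the commenter suggests, could look like the sketch below. The helper names and the `fake_extractor` standing in for `new_model.predict` are assumptions for illustration only.

```python
import numpy as np

def extract_in_batches(images, extract_fn, batch_size):
    """Yield (features, start index) for consecutive slices of `images`."""
    for start in range(0, len(images), batch_size):
        batch = images[start:start + batch_size]
        yield extract_fn(batch), start

# Hypothetical stand-in for new_model.predict: returns 4 features per pixel.
def fake_extractor(batch):
    return np.repeat(batch[..., :1].astype(np.float32), 4, axis=-1)

images = np.zeros((10, 16, 16, 3), dtype=np.uint8)   # small stand-in for (300, 608, 608, 3)
batch_shapes = [feats.shape
                for feats, _ in extract_in_batches(images, fake_extractor, 4)]
```

Within each batch you would keep only the labeled pixels before moving on, so the full feature stack never has to sit in memory at once.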

  • @jayayadav-ih1yz
    @jayayadav-ih1yz 1 year ago

    Hi,
    I really appreciate your videos, which are very helpful for beginners. I would like to know if an array of float64 can be given as input to VGG16.

  • @patilvinod555
    @patilvinod555 3 years ago

    I just want to know how to drop more labels from the DataFrame.

  • @abderrahmaneherbadji5478
    @abderrahmaneherbadji5478 4 years ago

    Thank you so much for your great efforts.
    How can one get the top-5 accuracy of a classifier (e.g. RF or SVM)?

    • @DigitalSreeni
      @DigitalSreeni 4 years ago

      Just use top_k_categorical_accuracy from tensorflow or keras.
      Link: www.tensorflow.org/api_docs/python/tf/keras/metrics/top_k_categorical_accuracy
      I will probably record a video, but here is how to implement it...
      from keras.metrics import top_k_categorical_accuracy
      def top_5_categorical_accuracy(y_true, y_pred):
          return top_k_categorical_accuracy(y_true, y_pred, k=5)
      # Add this as a metric to track during training
      model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy', top_5_categorical_accuracy])
      When I run it on my system I see this during training...
      Epoch 1/2
      386/1000 [==========>...................] - ETA: 2:37 - loss: 1.8852 - accuracy: 0.3315 - top_5_categorical_accuracy: 0.8160
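The Keras metric above applies while training a neural network. For a classifier such as Random Forest or SVM, the same idea can be computed from a matrix of class probabilities; in scikit-learn such a matrix comes from `predict_proba` (for `SVC` this needs `probability=True`), with columns ordered as in `classes_`. The NumPy sketch below is an assumption-laden illustration: the function name and the toy probabilities are made up, and the true labels are expected to be column indices.

```python
import numpy as np

def top_k_accuracy(y_true, proba, k):
    """Fraction of samples whose true class index is among the k most probable classes."""
    top_k = np.argsort(proba, axis=1)[:, -k:]        # indices of the k largest scores per row
    hits = (top_k == y_true[:, None]).any(axis=1)
    return float(hits.mean())

# Hypothetical probabilities for 3 samples over 4 classes (column index = class index).
proba = np.array([[0.10, 0.20, 0.30, 0.40],
                  [0.70, 0.15, 0.10, 0.05],
                  [0.25, 0.35, 0.30, 0.10]])
y_true = np.array([2, 0, 0])
top1 = top_k_accuracy(y_true, proba, k=1)
top2 = top_k_accuracy(y_true, proba, k=2)
```

With these numbers only the second sample is correct at top-1, while the first also counts at top-2 because its true class has the second-highest probability.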

  • @marcsilviu5665
    @marcsilviu5665 3 years ago

    At line 108 in your code I get this error message: "ValueError: Length of values (0) does not match length of index (12238848)". I have been trying to get rid of it for some time now. Do you think you can help?

    • @DigitalSreeni
      @DigitalSreeni 3 years ago

      Sounds like an issue with reading your masks. Please make sure they are being read properly, with the right dimensions. Check whether you see what you expect in the output from the previous line, where you print out the unique pixel values.

    • @marcsilviu5665
      @marcsilviu5665 3 years ago +1

      @DigitalSreeni You were perfectly right. Thank you!

    • @DigitalSreeni
      @DigitalSreeni 3 years ago +1

      I’m glad it helped 😌

  • @kashifullah7487
    @kashifullah7487 4 years ago

    Sir, can I do this for landslide prediction based on remote sensing images?
    I have 256 landslide points and 15 predictor factors, and I want to predict landslide hazard zones.

    • @DigitalSreeni
      @DigitalSreeni 4 years ago

      I am not familiar with your application but if you want to segment pixels in an image to display regions of specific interest (e.g. landslide prone) then this approach may work for you.

    • @kashifullah7487
      @kashifullah7487 4 years ago

      Thank you, sir. I will try it.

  • @stefanAH97
    @stefanAH97 4 years ago

    Transfer learning is amazing; thank you for the explanation. I wonder if I can run this on a CPU at a fair speed for video processing.
    A dedicated video on using APEER would be very nice. If you plan to make one, please use common images as examples, like cars and airplanes, cats and dogs, nuts and screws, etc., so it would be easier for us to run the same procedures as shown in the video :)

    • @DigitalSreeni
      @DigitalSreeni 4 years ago

      I ran the code in my video on a CPU, so I hope it works even for video processing. There are a lot of videos on APEER on its channel; just look for apeer_micro on YouTube.

  • @gerhardheinzerling9880
    @gerhardheinzerling9880 4 years ago

    Thank you so much for this excellent video! It would be great if you could add an "evaluation module" at the end of your code. :-)

    • @DigitalSreeni
      @DigitalSreeni 4 years ago +1

      Evaluation for semantic segmentation using model.accuracy is useless, you need IoU. I will record a video on that topic soon.

  • @shubhangichaturvedi2251
    @shubhangichaturvedi2251 4 years ago

    Sir, I was trying this on 1000 images in Google Colab, but the RAM is getting exhausted. How can I resolve this?

    • @DigitalSreeni
      @DigitalSreeni 4 years ago +3

      Machine learning demands a lot of resources, and unfortunately the only way is to find a way to get additional resources. That said, I wonder why you need 1000 images. Try 100 images first and see how the result looks; if it is not good, then increase a bit more. I have never used 1000 images for semantic segmentation; that is a lot of data and may not be needed to begin with.
      I got excellent results with 10 images, each 1k x 1k in size.

  • @savin1999
    @savin1999 3 years ago

    Where can we find the dataset?

  • @AhmedEmamAI1
    @AhmedEmamAI1 4 years ago

    Can you make some examples on hyperspectral images?

  • @AbdulQayyum-kd3gf
    @AbdulQayyum-kd3gf 4 years ago

    Great video and content. How can we train a model from scratch, or fine-tune DL models, and extract features to pass to a traditional ML model for semantic segmentation? Can you make a video on that? Sometimes transfer learning might not perform well on medical images. Thanks in advance.

    • @DigitalSreeni
      @DigitalSreeni 4 years ago

      If you want to train a model from scratch for semantic segmentation, then please watch my U-Net videos. Also, watch my videos on traditional segmentation, videos 67 and 67b. Training your own neural network and using it as a feature generator doesn't make sense, as VGG16 and others spent thousands of hours doing the same on many images. Of course, you can always train a network yourself, save the model, and follow the process from this tutorial.

  • @anishjain3663
    @anishjain3663 4 years ago

    Sir, great video series. How do I do image segmentation for 3D images? Or could you suggest some guides to follow?

    • @DigitalSreeni
      @DigitalSreeni 4 years ago +1

      For 3d images you can consider them as a stack of 2d images. That way you can process one 2d image at a time and put them into a stack. If you have features in the 3rd dimension where having the extra dimension helps then you need to consider using 3d kernels and computation will be very heavy. Keras already has 3d conv that you can use out of the box.
      from keras.layers import Conv3D
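The "stack of 2D images" option described in this reply can be sketched as a loop over slices. In the example below a plain threshold stands in for the trained per-slice model, and the function names and volume size are assumptions made for illustration.

```python
import numpy as np

def segment_volume(volume, segment_slice):
    """Apply a 2D segmentation function to every slice of a Z x H x W volume."""
    return np.stack([segment_slice(plane) for plane in volume], axis=0)

# Hypothetical per-slice segmenter: a plain threshold standing in for the trained model.
def threshold_slice(plane):
    return (plane > 0.5).astype(np.uint8)

volume = np.random.rand(5, 32, 32)   # 5 slices of 32 x 32 pixels
labels = segment_volume(volume, threshold_slice)
```

Each slice is segmented independently, so this keeps memory use low but ignores any structure along the third dimension, which is the case where 3D kernels become worth their cost.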

  • @syedamjad1271
    @syedamjad1271 4 years ago

    Hi Sreeni! Thanks for sharing this video; you teach better than my college professors. I love to learn from you. How can I use SVM in this case, since you used Random Forest here? Can I use the pretrained CNN (VGG16 - imagenet) for the classification of microscopic images?

    • @DigitalSreeni
      @DigitalSreeni 4 years ago +1

      Syed, thanks for your compliments. For SVM, just swap Random Forest with SVM, that easy!!!
      Yes, of course this process can be used for microscope images. Pretrained CNN (VGG16) is trained to 'understand' various image features. With this approach we are just using those as feature generators (digital filters).
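Swapping the classifier can be sketched with scikit-learn, where both estimators share the same `fit`/`predict` interface. The data below is synthetic and the choice of `LinearSVC` is an assumption made for this example: kernel `SVC` becomes slow when there are millions of pixel samples, so a linear SVM is a more practical stand-in.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import LinearSVC

# Hypothetical per-pixel table: 200 pixels x 8 features, two well-separated classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (100, 8)), rng.normal(3.0, 0.3, (100, 8))])
y = np.array([1] * 100 + [2] * 100)

models = {
    "random_forest": RandomForestClassifier(n_estimators=20, random_state=0),
    "linear_svm": LinearSVC(),
}
# Both estimators share the same fit/predict interface, so nothing else changes.
predictions = {name: m.fit(X, y).predict(X) for name, m in models.items()}
```

The rest of the pipeline (feature extraction, flattening, reshaping predictions back to an image) stays the same whichever estimator is used.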

    • @syedamjad1271
      @syedamjad1271 4 years ago

      @DigitalSreeni Thank you. So I can use the same model (pretrained CNN (VGG16 - imagenet)) as the feature generator, and for classification I can use SVM. Please correct me if I am wrong. Also, please let me know how to generate a graph comparing the training and validation accuracy of the model, that is, epoch accuracy and epoch loss.

    • @syedamjad1271
      @syedamjad1271 4 years ago

      @DigitalSreeni Hi Sreeni Sir, will you please let me know how to use VGGNet on this image dataset www.kaggle.com/c/recursion-cellular-image-classification/data and classify the cells? Can you please make a tutorial? I would be very grateful to you.

  • @anishjain3663
    @anishjain3663 4 years ago

    Would you recommend some guides, blogs, or videos for me?

    • @DigitalSreeni
      @DigitalSreeni 4 years ago

      Well, my videos should be useful :)
      Otherwise, just search for content on YouTube. For most people it is easier to learn via videos than by reading text.

  • @ravikshdikola6089
    @ravikshdikola6089 3 years ago

    Does this model outperform U-Net or not? 🤔

    • @DigitalSreeni
      @DigitalSreeni 3 years ago

      Please use this approach if you have limited training images. If you have lots of training data, I recommend trying U-net.

  • @kanui3618
    @kanui3618 4 years ago

    Can I detect a person with this method?

    • @kanui3618
      @kanui3618 4 years ago

      Can you make this tutorial with a non-microscopy dataset?

    • @DigitalSreeni
      @DigitalSreeni 4 years ago

      It doesn't matter what data you have, this approach should work. So please label your own images and try it yourself. If you have natural scenes with a lot of information in the scene, then this approach may not be ideal. You need a full deep learning approach (e.g. U-Net).

  • @SourceCodeProjects
    @SourceCodeProjects 4 years ago

    I would suggest changing your channel name. I think it is one reason for the low viewer numbers. Your videos are gold, btw.

    • @DigitalSreeni
      @DigitalSreeni 4 years ago +1

      Thanks for the tip. I was considering it and thinking about naming the channel the same as my social media profile name. Looks like a lot of people think this is only for microscopy-related topics.

  • @mehdisoleymani6012
    @mehdisoleymani6012 2 years ago

    Thanks a lot for your great courses. Would you be able to answer my question? How should we add non-image features (such as object prices) to the flatten layer of our CNN model? And how does the model know which input image the newly added features belong to?

  • @goksuceylan8844
    @goksuceylan8844 3 years ago +1

    *Pooing*