PyTorch Tutorial 15 - Transfer Learning

  • Published: 4 Oct 2024

Comments • 107

  • @ross951
    @ross951 3 years ago +54

    For those new to transfer learning: ideally we would like to freeze all of the layers other than the newly added head layer, and train for n epochs, then unfreeze the preceding layers, and train the entire network using a sliced learning rate, where the parameters of the later layers are updated faster than the parameters of the earlier layers. This is how libraries like fastai handle transfer learning out of the box.
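
    A minimal sketch of that schedule, assuming the resnet18 setup from the video (the layer grouping and learning rates are illustrative, not fastai's exact recipe):

    import torch
    import torch.nn as nn
    from torchvision import models

    model = models.resnet18(pretrained=True)
    model.fc = nn.Linear(model.fc.in_features, 2)    # new head for 2 classes

    # Phase 1: freeze everything except the new head, train for n epochs
    for param in model.parameters():
        param.requires_grad = False
    for param in model.fc.parameters():
        param.requires_grad = True
    head_opt = torch.optim.SGD(model.fc.parameters(), lr=1e-2, momentum=0.9)

    # Phase 2: unfreeze, then train everything with sliced learning rates
    for param in model.parameters():
        param.requires_grad = True
    full_opt = torch.optim.SGD([
        {'params': list(model.conv1.parameters()) + list(model.bn1.parameters()),
         'lr': 1e-4},                                 # earliest layers: slowest
        {'params': model.layer1.parameters(), 'lr': 1e-4},
        {'params': model.layer2.parameters(), 'lr': 3e-4},
        {'params': model.layer3.parameters(), 'lr': 1e-3},
        {'params': model.layer4.parameters(), 'lr': 3e-3},
        {'params': model.fc.parameters(), 'lr': 1e-2},   # head: fastest
    ], momentum=0.9)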

    • @patloeber
      @patloeber  3 years ago +2

      thanks for the tip!

  • @georgesebastian2005
    @georgesebastian2005 4 years ago +8

    Best PyTorch tutorial series available on the Internet! Could you please make a video on training models with TensorBoard as well?

    • @patloeber
      @patloeber  4 years ago +4

      Hi, thank you for the feedback :) Yes, this is indeed on my list and I want to do the tutorial in the next few weeks...

  • @shelfinyoung56
    @shelfinyoung56 3 years ago

    The best PyTorch tutorial I've ever seen, very fundamental and good for PyTorch beginners.

    • @patloeber
      @patloeber  3 years ago

      thanks, glad you like it!

  • @안경환-x8t
    @안경환-x8t 3 years ago

    I'm from Korea. These were the most wonderful lectures ever!! Thank you.

    • @patloeber
      @patloeber  3 years ago

      glad you like it! greetings from Germany

  • @cibinjohnjoseph7229
    @cibinjohnjoseph7229 4 years ago +2

    Thanks for creating this playlist. It has been really helpful. Well done, and keep going...

  • @pranjalsaxena2680
    @pranjalsaxena2680 4 years ago +1

    Finally, it was a great journey with you bro. :) Thanks a ton

    • @patloeber
      @patloeber  4 years ago +2

      I'm glad you like it :)

  • @bluebox6307
    @bluebox6307 3 years ago +1

    I didn't know about this concept before and was super impressed. That is so damn amazing! Nice explanation as always :D

    • @patloeber
      @patloeber  3 years ago +1

      glad you like it! yes transfer learning is a very important concept!

  • @fabienmathieu7430
    @fabienmathieu7430 2 years ago

    Thank you for this really nice, simple and efficient course, which has become my reference for learning PyTorch and practicing deep learning.

  • @abderrahmanebououden5173
    @abderrahmanebououden5173 4 years ago

    Thanks so much, bro, it is a really good explanation and very helpful.

  • @VjayVenugopal
    @VjayVenugopal 4 years ago +4

    For those who don't know how the standard_dev and mean are calculated beforehand:
    1) apply the ToTensor() transformation to the dataset
    2) load the DataLoader
    3) keep the batch size as one for the train dataset
    data_iter = iter(train_data_loader)
    images, labels = next(data_iter)       # first batch: (images, labels)
    print(images.mean(), images.std())

    • @zt0t0s
      @zt0t0s 3 years ago

      If I use a pretrained network, trained with certain normalisation factors (mean and std), should I use those values or the values calculated from my dataset with the method explained above? @python engineer

    • @VjayVenugopal
      @VjayVenugopal 3 years ago

      @@zt0t0s Yeah, you should calculate individual mean and std for each dataset.

    • @straighter7032
      @straighter7032 1 year ago

      Why calculate the mean and std for one batch only?
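
      Using a single batch only works if that batch happens to cover the whole training set. A sketch that aggregates over every batch instead (train_data_loader is the loader named above; ToTensor() applied, Normalize() not yet):

      import torch

      n_pixels = 0
      channel_sum = torch.zeros(3)
      channel_sq_sum = torch.zeros(3)
      for images, _ in train_data_loader:        # images shape: (B, 3, H, W)
          b, c, h, w = images.shape
          n_pixels += b * h * w
          channel_sum += images.sum(dim=[0, 2, 3])
          channel_sq_sum += (images ** 2).sum(dim=[0, 2, 3])

      mean = channel_sum / n_pixels
      std = (channel_sq_sum / n_pixels - mean ** 2).sqrt()   # E[x^2] - E[x]^2
      print(mean, std)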

  • @camus6525
    @camus6525 4 years ago +2

    Awesome job, thanks!
    An LSTM PyTorch tutorial would be welcome...

    • @patloeber
      @patloeber  4 years ago

      Thank you! And thanks for the suggestion, I will add it to my list

    • @krishnachauhan2822
      @krishnachauhan2822 3 years ago

      @@patloeber Hey sir, great content on PyTorch. When are you making a video on LSTMs and autoencoders?

  • @MW-vg9dn
    @MW-vg9dn 4 years ago

    Excellent course! Danke!

    • @patloeber
      @patloeber  4 years ago +1

      glad you like it :)

  • @noamills1130
    @noamills1130 1 year ago +1

    The link in the description that says "More about Transfer Learning" is actually the link to the source code that Patrick copied and provided minimal revisions to (including deleting the comments specifying the original author and licensing info). The original code has a BSD license which requires attribution. Saying "more information here" in the description isn't the same as saying "source code from here". Please provide appropriate attribution in your videos and on your github.

  • @TheOraware
    @TheOraware 3 years ago +3

    Thanks for your teaching. Normally when we transform the train data we use the same transformation on the validation data, but here I see the train data uses transforms.RandomResizedCrop(224) while the validation data uses transforms.Resize(256). Same with the flip: it is applied to the train data but not the validation data. I am confused.
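
    That difference is intentional: random crops and flips are data augmentation and belong only in training, while validation should be deterministic so the metric is stable. A sketch of the standard pattern (sizes and the ImageNet mean/std as in the torchvision tutorial this video follows):

    from torchvision import transforms

    data_transforms = {
        'train': transforms.Compose([
            transforms.RandomResizedCrop(224),   # random crop = augmentation
            transforms.RandomHorizontalFlip(),   # random flip = augmentation
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]),
        'val': transforms.Compose([
            transforms.Resize(256),              # deterministic resize...
            transforms.CenterCrop(224),          # ...and center crop
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]),
    }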

  • @nougatschnitte8403
    @nougatschnitte8403 2 years ago +2

    Hey, appreciate the tutorials, but you keep saying that you explained everything for the training loop in the previous tutorials. However, I find it differs significantly from what you covered so far. E.g. an explanation of what use it is to set the model to training or eval mode is completely missing. I am wondering what that is for, since we didn't need it in the previous tutorials.
    Also the line best_model_wts = copy.deepcopy(model.state_dict()) leaves a question mark with me. Could you clear that up? :)
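
    For anyone else wondering: model.train() / model.eval() switch layers like dropout and batch norm between training and inference behaviour, and the deepcopy takes a snapshot of the best weights so a later, worse epoch cannot overwrite them. A runnable sketch of the pattern (the tiny model and random metric are stand-ins for the real network and epoch loop):

    import copy
    import torch
    import torch.nn as nn

    model = nn.Linear(4, 2)          # stand-in for the real network
    num_epochs = 3

    best_acc = 0.0
    best_model_wts = copy.deepcopy(model.state_dict())   # a snapshot, not a reference

    for epoch in range(num_epochs):
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()    # dropout active, batch norm updates running stats
            else:
                model.eval()     # dropout off, batch norm uses fixed running stats
            epoch_acc = torch.rand(1).item()    # placeholder for the real epoch loop
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

    model.load_state_dict(best_model_wts)       # restore the best validation epoch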

  • @stanislav4607
    @stanislav4607 2 years ago +2

    Great stuff, but after tutorial 14 about CNNs your code explanations got more chaotic/confusing. Until tutorial 14 the code was perfectly explained. I used to write the code along with you up to tutorial 14 so that I could build some muscle memory and intuition about how to code NNs. Now I'm a little confused.

  • @deepudeepak1390
    @deepudeepak1390 3 years ago

    Thank you so much!! So clean and clear... One request from my side: please make some videos on transfer learning in NLP. Please!

  • @alanzhu7538
    @alanzhu7538 3 years ago +1

    Just wondering why the mean and std are np.arrays with three elements?

  • @darthdaenerys
    @darthdaenerys 2 years ago

    Thanks for sharing. Can you make a video on GANs?

  • @aldocamargo3357
    @aldocamargo3357 1 year ago

    Thanks for the effort to put all these videos together. Could you tell me where I can get the code for the tutorials?
    Have a nice one,
    Aldo

  • @SandeepSingh-hr8lm
    @SandeepSingh-hr8lm 4 years ago +1

    Great job!!! Thanks. Could you please upload LSTM with "Attention" using PyTorch. Keep it up!!!!

    • @patloeber
      @patloeber  4 years ago

      Thanks for the suggestion, I will consider this!

  • @tusshar747
    @tusshar747 4 years ago +1

    Nice tutorial. It would be nice if you could explain transfer learning on an imbalanced dataset using sampling methods.

    • @patloeber
      @patloeber  4 years ago

      Thanks for the suggestion!

  • @jasephmason7200
    @jasephmason7200 4 years ago

    Nice tutorial

  • @prabaldutta1935
    @prabaldutta1935 3 years ago

    Thanks for your videos!
    Could you make a video showing how to use nn.Block for building networks instead of nn.Module?

    • @patloeber
      @patloeber  3 years ago

      Thanks. I do not know nn.Block. Can you point me to the resource?

    • @prabaldutta1935
      @prabaldutta1935 3 years ago

      @@patloeber the approach is very similar to nn.Module colab.research.google.com/github/d2l-ai/d2l-en-colab/blob/master/chapter_natural-language-processing-pretraining/bert.ipynb#scrollTo=5uDSQyw9j7UC

    • @patloeber
      @patloeber  3 years ago

      @@prabaldutta1935 Thanks. I'll have a look at that. But this is not PyTorch code, this is mxnet: from mxnet.gluon import nn, and nn.Block...

  • @davidwu3247
    @davidwu3247 3 years ago

    dude i love you

  • @ahnafmunir6106
    @ahnafmunir6106 3 years ago

    I couldn't find any previous video in this playlist that discusses dividing the data into training and evaluation sets. Did I miss anything?

  • @owenlie
    @owenlie 2 years ago

    The normal way of doing something like this, as far as I know, is having *train,* *validation,* and *test* sets. But here I see only train and val. I've watched this video several times, but I still don't understand whether val stands for 'validation' or for 'evaluate'.

  • @736939
    @736939 2 years ago

    How to deal with tabular data and transfer learning? I have a model that has already been trained, and suddenly the number of features increases. How do I deal with the trained model and a varying head for the features (the first, dynamic layer of the dense neural network)? Thank you.

  • @theonethatcant
    @theonethatcant 4 years ago +1

    Very useful tutorial! I especially appreciate the explanation of how to keep the training from updating pre-trained weights. One question: the validation accuracy here is higher than the training accuracy. I used a different dataset and also got higher accuracy on my validation set. Then with two other datasets, same thing. I don't see anything wrong with the code, but that's not the typical expectation, no? Or am I overlooking something?

    • @theonethatcant
      @theonethatcant 4 years ago

      Ah, I forgot that the transforms applied to the "train" dataset introduce more variability

    • @patloeber
      @patloeber  4 years ago

      Good observation!

  • @jphitidis
    @jphitidis 2 years ago +1

    Amazing tutorial. Could you also 'freeze' the earlier weights by simply only passing model.fc.parameters() to the optimizer?
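
    Mostly yes: parameters the optimizer never sees are never updated, but backward still computes gradients for them unless requires_grad is also turned off, so the explicit freeze additionally saves compute and memory (the two are often combined). A sketch of both mechanisms, assuming the resnet18 setup from the video:

    import torch
    import torch.nn as nn
    from torchvision import models

    # Option A: explicit freeze via requires_grad
    model_a = models.resnet18(pretrained=True)
    model_a.fc = nn.Linear(model_a.fc.in_features, 2)
    for param in model_a.parameters():
        param.requires_grad = False
    for param in model_a.fc.parameters():
        param.requires_grad = True
    opt_a = torch.optim.SGD(
        (p for p in model_a.parameters() if p.requires_grad), lr=0.001)

    # Option B: only hand the head to the optimizer
    model_b = models.resnet18(pretrained=True)
    model_b.fc = nn.Linear(model_b.fc.in_features, 2)
    opt_b = torch.optim.SGD(model_b.fc.parameters(), lr=0.001)
    # B also never updates the backbone, but backward still computes and
    # stores gradients for it on every step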

  • @renatoviolin
    @renatoviolin 4 years ago +1

    Thanks for sharing. I'm coming from TensorFlow and am amazed at how easy PyTorch is.
    A doubt: when predicting, wouldn't it be necessary to apply sigmoid to the outputs?
    _, prediction = torch.max(F.sigmoid(outputs), 1)

    • @patloeber
      @patloeber  4 years ago +2

      Thanks for watching! Good question! It depends on the loss function, because some loss functions in PyTorch already apply an activation function like sigmoid or softmax. In this case nn.CrossEntropyLoss() applies the softmax, that's why we don't need another one. I think I explained this in tutorial #11 or some other one.
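
      A quick check, since softmax (like sigmoid) is monotonic and therefore never changes the argmax:

      import torch

      outputs = torch.randn(4, 2)    # fake logits for 4 samples, 2 classes
      _, pred_raw = torch.max(outputs, 1)
      _, pred_soft = torch.max(torch.softmax(outputs, 1), 1)
      print(torch.equal(pred_raw, pred_soft))   # True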

  • @harissajwani2583
    @harissajwani2583 4 years ago

    Thanks for your videos.
    Can you make a video on "distillation learning in PyTorch", which is somewhat similar to transfer learning?
    It follows the teacher-student principle.

    • @patloeber
      @patloeber  4 years ago +1

      I will have a look into that

    • @sandorkonya
      @sandorkonya 3 years ago

      @@patloeber plus one for distillation learning!

  • @ibtissamsaadi6250
    @ibtissamsaadi6250 2 years ago

    Really, it is the most amazing tutorial ever!! Thank you so much. I have one question: I used a pre-trained vision transformer model from timm and fine-tuned the last layer for num_features and our number of classes, but when I run the code it takes a lot more time than normal training!! I don't know what the problem is exactly. Can you help me?

  • @jacobjonm0511
    @jacobjonm0511 2 years ago

    3:39, that is more like a fly than a bee.

  • @javeriaehsan369
    @javeriaehsan369 1 year ago

    How did you calculate the mean and std? I have to do transfer learning with AlexNet for MNIST data, but how do I calculate the mean and std for that?

  • @swastiktiwari7066
    @swastiktiwari7066 3 years ago

    In the code, shouldn't labels range from 0 to 1 instead of 1 to 2, and therefore the following line be added after labels = labels.to(device): labels = labels - 1?

    • @patloeber
      @patloeber  3 years ago

      Why should it be 0 and 1? I think it won't make a difference in this example, you just need two different class labels

    • @swastiktiwari7066
      @swastiktiwari7066 3 years ago

      @@patloeber _, preds = torch.max(outputs, 1): doesn't this return preds values as 0 or 1, which we then compare against labels in running_corrects += torch.sum(preds == labels.data)? I was getting an error on running the code, which went away when I did labels = labels - 1

  • @ubaidmanzoorwani6254
    @ubaidmanzoorwani6254 4 years ago

    Where do you find the methods/attributes on the model, e.g. model.fc or model.fc.in_features?

  • @muizuvais
    @muizuvais 3 years ago

    How do you do a single prediction?
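
    One common way, as a sketch: model, data_transforms, device and class_names are the objects from the video, and the file name is illustrative.

    import torch
    from PIL import Image

    model.eval()
    img = Image.open('some_bee.jpg')              # any single image
    x = data_transforms['val'](img).unsqueeze(0)  # add a batch dimension
    x = x.to(device)
    with torch.no_grad():
        output = model(x)
        _, pred = torch.max(output, 1)
    print(class_names[pred.item()])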

  • @sendjasniabderrezzaq9347
    @sendjasniabderrezzaq9347 3 years ago

    Thank you so much for these tutorials. I have a question regarding the normalization of the data: are the mean and std you gave the same for any use case, or are they model-dependent?

    • @patloeber
      @patloeber  3 years ago

      They are dataset-dependent. You should use the mean and std dev of the training dataset (in this video it was pre-calculated).

    • @sendjasniabderrezzaq9347
      @sendjasniabderrezzaq9347 3 years ago +1

      @@patloeber Thanks for the reply.

  • @bht9871
    @bht9871 1 year ago

    Your code only allows training 2 classes; what about multiple classes?
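
    The same code scales to more classes: add one image folder per class and widen the new head. A sketch with e.g. 5 classes:

    import torch.nn as nn
    from torchvision import models

    num_classes = 5    # one image folder per class for ImageFolder
    model = models.resnet18(pretrained=True)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    # nn.CrossEntropyLoss() works unchanged for any number of classes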

  • @deepakkumar5394
    @deepakkumar5394 4 years ago

    This is a great tutorial. I just need one clarification: instead of using cross entropy, can we treat it as binary classification? How can we do this with the same folder structure?

    • @patloeber
      @patloeber  4 years ago +1

      You can use sigmoid as the last layer and then BCELoss, the same as in the logistic regression tutorial #8.
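
      A sketch of that variant: the folder structure and ImageFolder stay the same, only the head and the loss change (BCEWithLogitsLoss is the numerically safer fused sigmoid + BCELoss):

      import torch
      import torch.nn as nn
      from torchvision import models

      model = models.resnet18(pretrained=True)
      model.fc = nn.Linear(model.fc.in_features, 1)    # a single logit
      criterion = nn.BCEWithLogitsLoss()               # applies sigmoid internally

      outputs = model(torch.randn(4, 3, 224, 224)).squeeze(1)
      labels = torch.tensor([0., 1., 1., 0.])          # float labels for BCE
      loss = criterion(outputs, labels)
      preds = (torch.sigmoid(outputs) > 0.5).long()    # hard 0/1 predictions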

    • @deepakkumar5394
      @deepakkumar5394 4 years ago

      @@patloeber thanks

    • @deepakkumar5394
      @deepakkumar5394 4 years ago

      Also, request you to please add a video on imbalanced class weighting for both binary and multi-class classification problems.

    • @patloeber
      @patloeber  4 years ago

      thanks for the suggestion!

  • @zt0t0s
    @zt0t0s 3 years ago

    If I use a pretrained network, trained with certain normalisation factors (mean and std), should I use those values or the values calculated from my dataset with the method explained above?

    • @patloeber
      @patloeber  3 years ago +1

      the values the pretrained network uses

  • @helimehuseynova6631
    @helimehuseynova6631 2 years ago

    Hi, how can we plot train accuracy/loss and val accuracy/loss with your code?
    Thanks

  • @krishnachauhan2822
    @krishnachauhan2822 3 years ago

    Is there an option for an audio equivalent of ImageFolder in PyTorch?

  • @rs9130
    @rs9130 3 years ago

    How can I use pretrained weights like VGG16 in an FCN architecture?
    Is this correct?
    self.conv1_1 = vgg16.features[0]
    Please help.
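
    That grabs only the first conv layer. A more common pattern (a sketch, not the canonical FCN implementation) is to reuse the whole vgg16.features block as the encoder and add your own per-pixel head:

    import torch.nn as nn
    from torchvision import models

    class VGGFCN(nn.Module):
        def __init__(self, num_classes):
            super().__init__()
            vgg16 = models.vgg16(pretrained=True)
            self.encoder = vgg16.features            # all pretrained conv layers
            self.classifier = nn.Conv2d(512, num_classes, kernel_size=1)
            self.upsample = nn.Upsample(scale_factor=32, mode='bilinear',
                                        align_corners=False)

        def forward(self, x):
            x = self.encoder(x)       # (N, 512, H/32, W/32)
            x = self.classifier(x)    # per-pixel class scores
            return self.upsample(x)   # back to the input resolution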

  • @krishnachauhan2850
    @krishnachauhan2850 3 years ago

    Is there an option for an audio equivalent of ImageFolder in PyTorch? Please reply.

  • @HKHforpeace
    @HKHforpeace 4 years ago

    Brilliant. Can I ask which text editor it is?

  • @kunalgoyal9453
    @kunalgoyal9453 4 years ago

    Why is the train loss higher than the validation loss?

    • @patloeber
      @patloeber  4 years ago

      Is it? I did not notice... but I guess that can happen during shuffled training...

  • @MdCor2012
    @MdCor2012 4 years ago

    Are the "bee and ant images" a common dataset which we can download to follow your tutorial? Danke :)

    • @patloeber
      @patloeber  4 years ago +4

      Yes, you can download it from the official PyTorch website: download.pytorch.org/tutorial/hymenoptera_data.zip

  • @alecsmart1244
    @alecsmart1244 3 years ago

    Hello, I had a few errors and have checked the code twice...
    FIRST - I had to change the argument "scheduler" at the end to match step_lr_scheduler (not sure if that was correct, but it was not defined previously, so it was either that or change step_lr_scheduler to just "scheduler"). THEN...
    SECOND - when I ran it after the above change, I got this error (took out some of the path for privacy):
    ['ants', 'bees']
    Epoch 0/1
    ----------
    train Loss: 0.6369 Acc: 0.6311
    Traceback (most recent call last):
    File "/python_engineer/pytorch15-transfer.py", line 137, in
    model = train_model(model, criterion, optimizer, scheduler=step_lr_scheduler, num_epochs=2)
    File "/python_engineer/pytorch15-transfer.py", line 77, in train_model
    for inputs, labels in dataloaders[phase]:
    File "/.virtualenvs/torch38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
    data = self._next_data()
    File "/.virtualenvs/torch38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 403, in _next_data
    data = self._dataset_fetcher.fetch(index) # may raise StopIteration
    File "/.virtualenvs/torch38/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
    File "/.virtualenvs/torch38/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
    data = [self.dataset[idx] for idx in possibly_batched_index]
    File "/.virtualenvs/torch38/lib/python3.8/site-packages/torchvision/datasets/folder.py", line 139, in __getitem__
    sample = self.transform(sample)
    File "/.virtualenvs/torch38/lib/python3.8/site-packages/torchvision/transforms/transforms.py", line 61, in __call__
    img = t(img)
    File "/.virtualenvs/torch38/lib/python3.8/site-packages/torchvision/transforms/transforms.py", line 212, in __call__
    return F.normalize(tensor, self.mean, self.std, self.inplace)
    File "/.virtualenvs/torch38/lib/python3.8/site-packages/torchvision/transforms/functional.py", line 280, in normalize
    raise TypeError('tensor should be a torch tensor. Got {}.'.format(type(tensor)))
    TypeError: tensor should be a torch tensor. Got .
    Any ideas? I do not know how to fix this.

    • @patloeber
      @patloeber  3 years ago +1

      Did you compare the code with mine on GitHub? I guess for the second one you have to apply the ToTensor transform to get an actual tensor from the image.

    • @ShermanSitter
      @ShermanSitter 3 years ago +1

      @@patloeber Thank you, that was the issue! Btw, I am having so much fun with PyTorch thanks in part to your amazing tutorials. Very clear, well constructed, and makes learning it easy and fun!

    • @patloeber
      @patloeber  3 years ago

      @@ShermanSitter happy to hear that :)

  • @krishnachauhan2850
    @krishnachauhan2850 3 years ago

    Can someone help me with this? I tried this, but there are some bugs in my case. I am working on speech classification with speech spectrograms. They are two-dimensional, and I think ResNet is trained for RGB images (3 channels)? Please guide.
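
    torchvision's ResNets do expect 3-channel input. Two common workarounds, as a sketch (Option B discards conv1's pretrained weights):

    import torch
    import torch.nn as nn
    from torchvision import models

    model = models.resnet18(pretrained=True)
    spec = torch.randn(8, 1, 224, 224)     # a batch of 1-channel spectrograms

    # Option A: make the spectrogram look like RGB
    out = model(spec.repeat(1, 3, 1, 1))   # copy the channel three times

    # Option B: replace the first conv so it accepts 1 channel
    model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
    out = model(spec)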

  • @hafsayousif2474
    @hafsayousif2474 3 years ago

    Hi, I get this error (the traceback ends in multiprocessing's _check_not_importing_main):
    RuntimeError:
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.
    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:
    if __name__ == '__main__':
    freeze_support()
    ...
    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

    • @patloeber
      @patloeber  3 years ago +1

      use if __name__ == '__main__': and also try setting num_workers=0
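
      i.e. the script body goes behind the standard guard that spawn-based multiprocessing (Windows/macOS) requires:

      def main():
          # build datasets, dataloaders (num_workers=0 avoids the issue entirely),
          # the model, and run training here
          ...

      if __name__ == '__main__':
          main()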

  • @yeshuang2226
    @yeshuang2226 4 years ago

    Traceback (most recent call last):
    File "15_transfer_learning.py", line 156, in
    step_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)
    NameError: name 'optimizer_ft' is not defined
    Please update your sample code, thank you. (Should it be optimizer instead of optimizer_ft?)

    • @patloeber
      @patloeber  4 years ago +1

      Thanks for the hint. I will update this. You can also open an issue on GitHub for such findings

  • @rafailmahammadli2213
    @rafailmahammadli2213 2 years ago

    Hi, is it necessary to have a test set? As I see it, you created only train and validation datasets.
    Thanks

    • @patloeber
      @patloeber  2 years ago +1

      For training and parameter optimization you use the train and validation sets. The test set should be used afterwards on completely new and unseen data (for example when submitting to a Kaggle competition).

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 4 years ago

    Why is the first one models.resnet18() and the second one torchvision.models.resnet18()? Is there a difference?

    • @patloeber
      @patloeber  4 years ago

      It's the same. I imported "from torchvision import models" so I could use it directly. I should have used it for both instances...

  • @Idontknow-vh2hl
    @Idontknow-vh2hl 3 years ago

    How to do ResNet-34?

  • @Vanilla102
    @Vanilla102 2 years ago

    Big fan of this series, but this video in particular was a bit painful. The code is not explained enough even if you followed the previous videos in the series (especially the training loop is rather different, and normalization of the data is kinda glossed over), and some of the numbers are a bit magic (why does the scheduler step the learning rate every 7 epochs?). As others have pointed out, around this video and somewhat in the CNN video, the explanations started degrading a bit. I get you don't wanna re-explain everything that has already been shown, but there really are quite a few things here that just appear out of nowhere. Just some (hopefully constructive) criticism.

  • @tryfonmichalopoulos5656
    @tryfonmichalopoulos5656 3 years ago

    You literally just copy-pasted the tutorial from the PyTorch documentation... you should at least mention that, because this same example is really well documented on the site...

    • @patloeber
      @patloeber  3 years ago

      Yeah it was very close to the docs article, I referenced it in the description. Should have mentioned it in the video, too...

  • @saurrav3801
    @saurrav3801 3 years ago

    Bro, my GPU is out of memory. How do I clear GPU memory?
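
    Common remedies, as a sketch (a smaller batch size is usually the real fix):

    import torch

    # 1) use a smaller batch_size in the DataLoader
    # 2) don't keep tensors that carry autograd history between iterations:
    #    running_loss += loss.item()    # not: running_loss += loss
    # 3) drop references you no longer need (del model) and then release
    #    PyTorch's cached GPU blocks:
    torch.cuda.empty_cache()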