PyTorch Tutorial 13 - Feed-Forward Neural Network

  • Published: 19 Nov 2024

Комментарии • 133
Comments • 133

  • @patloeber
    @patloeber  4 years ago +79

    10:35 I forgot to send the model to the device! Please call model = NeuralNet(input_size, hidden_size, num_classes).to(device)

    • @krishnachauhan2822
      @krishnachauhan2822 4 years ago

      Sir, why is the accuracy different every time?

    • @amazing-graceolutomilayo5041
      @amazing-graceolutomilayo5041 3 years ago +5

      @@krishnachauhan2822 Due to shuffling of data and random initialisation of weights

    • @r_pydatascience
      @r_pydatascience 3 years ago

      I followed with great attention, but I am lost here:
      labels = labels.to(device)
      Error: 'int' object has no attribute 'to'
      loss = criterion(outputs, labels)
      Error: cross_entropy_loss(): argument 'target' (position 2) must be Tensor, not int

    • @pr_mittal
      @pr_mittal 2 years ago

      @@r_pydatascience The labels need to be converted to tensors; both errors have the same cause. Maybe you forgot to pass a transform when loading the dataset (see the sketch after this thread).

    • @trenvert123
      @trenvert123 1 year ago

      @@krishnachauhan2822 When you train a NN, you're trying to drive the loss to a local minimum. The issue is that there are often many different local minima it can fall into. So unless you use a random seed, so that you get the same random numbers every time, your model could end up in any one of these similar local minima each time it trains (see the sketch below).
      It's kind of interesting. It's like teaching a bunch of children numbers: even though they can all recognise numbers by the end, they all think in different ways and may have different levels of mastery.
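
A minimal sketch tying together the two fixes suggested in this thread, assuming the video's variable names: torch.manual_seed fixes the weight initialisation and shuffle order so the accuracy is the same on every run, and iterating through the DataLoader (rather than indexing the dataset directly) is what yields label tensors instead of plain ints.

    import torch
    import torchvision
    import torchvision.transforms as transforms

    torch.manual_seed(0)  # same weight init and shuffling order on every run

    train_dataset = torchvision.datasets.MNIST(root='./data', train=True,
                                               transform=transforms.ToTensor(),
                                               download=True)
    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=100,
                                               shuffle=True)

    images, labels = next(iter(train_loader))
    print(labels.dtype)  # torch.int64: the loader collates int labels into a tensor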

  • @prudvi01
    @prudvi01 3 years ago +15

    Really felt satisfying to be able to put together all that I learned in the previous videos. Thank you for this series!

  • @leonardmensah6781
    @leonardmensah6781 2 years ago +6

    @Python Engineer
    Please, why do you have to call zero_grad before calculating the loss?
    Shouldn't we calculate the loss and take the optimizer step before calling zero_grad?

  • @shubhamchechani3703
      @shubhamchechani3703 1 year ago +12

    For me, it works with "samples, labels = next(examples)"
    Otherwise, it throws me an error: "AttributeError: '_SingleProcessDataLoaderIter' object has no attribute 'next'"

    • @panosmallioris3346
      @panosmallioris3346 1 year ago +3

      This will work:
      examples = iter(train_loader)
      samples, labels = next(examples)

    • @priyanshumohanty5261
      @priyanshumohanty5261 1 year ago

      That's because of a difference between PyTorch versions.

    • @tushar6416
      @tushar6416 4 months ago

      If you use the dunder method .__next__(), it should work. I had the same issue.
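
A small self-contained sketch of the version-robust pattern from this thread; the random tensors below stand in for the video's MNIST train_loader. The .next() method was removed from DataLoader iterators in newer PyTorch, so use the built-in next() (or the __next__ dunder) instead.

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # stand-in data; in the video this is the MNIST train_loader
    train_loader = DataLoader(TensorDataset(torch.randn(10, 784),
                                            torch.randint(0, 10, (10,))),
                              batch_size=5)

    examples = iter(train_loader)
    samples, labels = next(examples)       # built-in next(): works on all versions
    samples, labels = examples.__next__()  # the equivalent dunder call
    # samples, labels = examples.next()    # removed in newer PyTorch: AttributeError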

  • @juleswombat5309
    @juleswombat5309 4 years ago +2

    Nice.
    And a good call explaining why we avoid a softmax activation in the model, because the cross-entropy criterion applies that for us.

    • @patloeber
      @patloeber  4 years ago

      Yes :) Glad you like it

  • @uniwander
    @uniwander 2 years ago +2

    You have been doing a great job teaching pytorch to beginners like me! Keep it up!

  • @unknown3158
    @unknown3158 3 years ago +1

    A quick suggestion: you could add a plot/image of the number and show what the NN predicts.

  • @adityajindal3738
    @adityajindal3738 3 years ago +2

    It's really awesome content. Just a suggestion, bro: you could collect the relevant doubts from the comment section into an FAQ, which would help beginners solve their common doubts.

  • @danielstoops1188
    @danielstoops1188 4 years ago +3

    Really fantastic series, keep up the amazing work! Looking forward to your future videos!

    • @patloeber
      @patloeber  4 years ago +1

      Thank you for watching :)

  • @marfblah33
    @marfblah33 3 years ago +2

    Hi, love following your series! Thanks!!!
    Can you please elaborate on the torch.max part:
    what exactly do you call "values" (you ignore those and store them in "_")? Values of what?
    Why is an index the same as the predicted label?
    And what is the "1" passed along with the model output?
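
A minimal sketch answering the questions above: torch.max along dim=1 returns a (values, indices) pair per row. For classifier outputs, the values are the largest raw scores (logits) and the indices are the classes that achieve them, which is why the index is the predicted label.

    import torch

    outputs = torch.tensor([[0.1, 2.5, 0.3],
                            [1.7, 0.2, 0.9]])    # 2 samples, 3 classes
    values, predictions = torch.max(outputs, 1)  # the "1" is the class dimension
    print(values)       # tensor([2.5000, 1.7000]): the max logits (the "_" in the video)
    print(predictions)  # tensor([1, 0]): the predicted class indices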

  • @uncoded0
    @uncoded0 1 year ago

    Excellent, clear, and without extra stuff! Thank you!

  • @thanhquocbaonguyen8379
    @thanhquocbaonguyen8379 3 years ago +1

    thank you for your instructions. it is really helpful for my assignment.

  • @karanbania4449
    @karanbania4449 1 year ago +3

    I believe that images.reshape(-1, 28*28) won't work for everyone; you can define flatten = nn.Flatten() and call images = flatten(images) before reshaping, and it'll work.

    • @darylallen2485
      @darylallen2485 10 months ago

      Does it work on data types other than images too?
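
A hedged sketch of the nn.Flatten alternative suggested above. Flatten is not image-specific: by default it collapses every dimension after the batch dimension, so it works on any tensor shaped [batch, ...], and for MNIST batches it gives the same result as reshape.

    import torch
    import torch.nn as nn

    flatten = nn.Flatten()                    # collapses dims 1..end by default
    images = torch.randn(100, 1, 28, 28)
    print(flatten(images).shape)              # torch.Size([100, 784])
    print(images.reshape(-1, 28*28).shape)    # torch.Size([100, 784]): same result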

  • @kevinkawchak
    @kevinkawchak 1 year ago

    Thank you for the discussion.

  • @fbaftizadeh
    @fbaftizadeh 4 years ago +1

    Thanks a lot, I am really enjoying your tutorials. Good job!

    • @patloeber
      @patloeber  4 years ago

      Thank you! Glad you like it :)

  • @guilhermelopes7809
    @guilhermelopes7809 1 year ago +1

    What a great video! Thank you very much :)

  • @forvm2051
    @forvm2051 4 years ago +2

    Thank you for this series!

  • @mrbb7843
    @mrbb7843 1 year ago +1

    Hi, how do I reshape the image in the training loop if I have an RGB picture?
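
A hedged sketch answering the RGB question: for a 3-channel image the flattened size is channels * height * width. The 3x32x32 shape below is an assumption (e.g. CIFAR-10); adjust it to your dataset, and size the first Linear layer to match.

    import torch

    images = torch.randn(100, 3, 32, 32)    # a batch of 100 RGB images
    flat = images.reshape(-1, 3 * 32 * 32)  # -> torch.Size([100, 3072])
    # the first layer must then be nn.Linear(3 * 32 * 32, hidden_size)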

  • @mohamedsaaou9258
    @mohamedsaaou9258 4 years ago +1

    Wonderful and amazing teaching, sir. Thanks a lot.

  • @alejandromartinezleon877
    @alejandromartinezleon877 2 years ago +3

    Hi, great tutorials! But I have a question. In previous tutorials you showed that the workflow is: [loss = criterion(outputs, labels), loss.backward(), optimizer.step(), optimizer.zero_grad()].
    However, in this one you changed the order to: [loss = criterion(outputs, labels), optimizer.zero_grad(), loss.backward(), optimizer.step()].
    Now I am wondering: if you set the gradients to zero, how can the optimizer update the parameters without any information about the gradient?

    • @priyanshumohanty5261
      @priyanshumohanty5261 1 year ago +1

      I presume that optimizer.zero_grad() merely resets the gradients, so it shouldn't matter whether it is called after optimizer.step() in one epoch/step or before loss.backward() in the next; the key is that any existing gradients are flushed out before the new gradients are computed and applied (see the sketch after this thread).

    • @naveenpala3416
      @naveenpala3416 9 months ago

      Well said @@priyanshumohanty5261
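
A runnable sketch of the ordering discussed in this thread, on a throwaway model: zero_grad() only clears leftover gradients, backward() computes fresh ones, and step() uses whatever backward() just produced. Clearing at the start of the loop (this video) or at the end (earlier videos) is therefore equivalent; the only requirement is that gradients are empty before the next backward() call.

    import torch
    import torch.nn as nn

    model = nn.Linear(4, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    criterion = nn.CrossEntropyLoss()
    x, y = torch.randn(8, 4), torch.randint(0, 2, (8,))

    for _ in range(3):
        optimizer.zero_grad()          # clear leftovers from the last iteration
        loss = criterion(model(x), y)
        loss.backward()                # gradients are fresh when step() runs
        optimizer.step()               # update uses the gradients just computed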

  • @yassine20909
    @yassine20909 2 years ago

    Great series, thank you, Python Engineer

  • @Al-ns9yw
    @Al-ns9yw 3 years ago +1

    How can the NeuralNet class be rewritten as a multiple-regression network? E.g. I want to train a network to predict 3 real values as the output, representing (x, y, z) coordinates. Could I use the class as is, or do I have to change the forward pass or output dimensions?
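
A hedged sketch of how the video's NeuralNet could be adapted for regressing 3 real values: the forward pass can stay the same, the output size becomes 3, and CrossEntropyLoss is swapped for a regression criterion such as MSELoss (no softmax or argmax anywhere). The sizes below are assumptions.

    import torch
    import torch.nn as nn

    class RegressionNet(nn.Module):
        def __init__(self, input_size, hidden_size, out_size):
            super().__init__()
            self.l1 = nn.Linear(input_size, hidden_size)
            self.relu = nn.ReLU()
            self.l2 = nn.Linear(hidden_size, out_size)

        def forward(self, x):
            return self.l2(self.relu(self.l1(x)))  # raw real-valued outputs

    model = RegressionNet(input_size=784, hidden_size=100, out_size=3)
    criterion = nn.MSELoss()              # regression loss instead of cross-entropy
    pred = model(torch.randn(5, 784))     # -> shape [5, 3], i.e. (x, y, z)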

  • @AchjaWassolls
    @AchjaWassolls 1 year ago

    Feels so good, tysm!

  • @bendibhafed1687
    @bendibhafed1687 2 years ago

    Amazing series, thank you so much

  • @computerscience8532
    @computerscience8532 3 years ago

    You are doing very well; I have gained a lot of knowledge from your tutorials, thank you so much. Please extend your tutorials to word vectors and seq2seq models.

  • @aesopw6324331415926
    @aesopw6324331415926 3 years ago +2

    hi, may I ask whether optimizer.zero_grad() should be called before or after loss.backward() and optimizer.step()?

    • @patloeber
      @patloeber  3 years ago +3

      I think both options are fine :)

  • @NandanAjayParalikar
    @NandanAjayParalikar 11 months ago

    Bro, please put an exercise at the end of every video. Also: superb job!

  • @songjunhuang4687
    @songjunhuang4687 3 years ago

    May I ask your Python version? When I try to write samples, labels = examples.next(), there is an error that says TypeError: object() takes no parameters.

  • @neithane7262
    @neithane7262 11 months ago

    Great video. I was wondering whether the BCELoss function applies the sigmoid function before computing the loss, just like cross-entropy applies the softmax function.
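
A short sketch answering the BCE question above: unlike nn.CrossEntropyLoss (which applies log-softmax internally), nn.BCELoss does not apply a sigmoid and expects probabilities in [0, 1]. The variant that applies the sigmoid for you is nn.BCEWithLogitsLoss.

    import torch
    import torch.nn as nn

    logits = torch.tensor([0.5, -1.2])
    targets = torch.tensor([1.0, 0.0])

    loss_a = nn.BCELoss()(torch.sigmoid(logits), targets)  # sigmoid applied manually
    loss_b = nn.BCEWithLogitsLoss()(logits, targets)       # sigmoid applied internally
    assert torch.allclose(loss_a, loss_b)                  # identical results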

  • @ArthurRabatin
    @ArthurRabatin 1 year ago

    Hello, I don't understand the purpose of torch.max; what does it do? Also, why is the index the prediction? Thank you, @patloeber!

  • @therohanjaiswal
      @therohanjaiswal 6 months ago

    The lecture is awesome, but I have one question: why are the labels not converted from shape [100] to [100, 1] here?

  • @BrianPondiGeoGeek
      @BrianPondiGeoGeek 2 months ago

    solid content.

  • @sivaramasastrygumma1362
    @sivaramasastrygumma1362 4 years ago +1

    I thought the reshape method couldn't be used in PyTorch; how is it working here?

  • @valarmorghulisx
    @valarmorghulisx 3 years ago

    Sir, I want to ask you something about training. First, at 06:30 we printed labels.shape as torch.Size([100]), and you said every class label has 100 images. So do we have 1000 images (since we have 10 classes) in the whole training dataset? In this example we did 2 epochs with 784-pixel flattened images, a batch size of 100, and 6 steps per epoch, so 1200 images were trained. Why is the train_loader length 6, and what is labels.shape? I'm a little bit confused; can you help me? :)

  • @caiyu538
    @caiyu538 2 years ago

    Thank you so much

  • @krishnachauhan2822
    @krishnachauhan2822 4 years ago

    I cannot find any data folder in my working directory after downloading as per the code, though the code works fine. Where am I wrong?

  • @danii5232
    @danii5232 3 years ago

    Hi, thanks for the video.
    Just one thing: how can I test a single custom image with the model? I always face problems with shapes. Is there a quick method?
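
A hedged sketch of testing a single custom image, assuming the video's trained model and device, and a hypothetical image file; the usual shape pitfall is the missing batch dimension, which the reshape to [1, 784] takes care of.

    import torch
    import torchvision.transforms as transforms
    from PIL import Image

    transform = transforms.Compose([transforms.Grayscale(num_output_channels=1),
                                    transforms.Resize((28, 28)),
                                    transforms.ToTensor()])

    img = Image.open('my_digit.png')      # hypothetical file path
    x = transform(img)                    # -> [1, 28, 28]
    x = x.reshape(-1, 28*28).to(device)   # -> [1, 784]: a batch of one
    with torch.no_grad():
        _, prediction = torch.max(model(x), 1)
    print(prediction.item())              # the predicted digit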

  • @arikfriedman4442
    @arikfriedman4442 3 years ago

    Thanks!
    What's the idea of dividing the test data into batches as well? I thought batches were only relevant during training...
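
For reference, a sketch of the evaluation loop this question refers to, with names following the video: the test set is batched purely for memory reasons (all 10,000 images at once may not fit on the device), and gradients are disabled because no training happens here.

    with torch.no_grad():
        n_correct, n_samples = 0, 0
        for images, labels in test_loader:
            images = images.reshape(-1, 28*28).to(device)
            labels = labels.to(device)
            outputs = model(images)
            _, predictions = torch.max(outputs, 1)
            n_samples += labels.shape[0]
            n_correct += (predictions == labels).sum().item()
        print(f'accuracy = {100.0 * n_correct / n_samples} %')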

  • @kainoah_dev
    @kainoah_dev 7 months ago +1

    examples.next() no longer works; I encountered this issue and was stuck on it. You might want to change it to `samples, labels = next(examples)`.

    • @NashGRS
      @NashGRS 15 days ago

      thank you, brother

  • @eb406
    @eb406 2 years ago

    Many thanks for your great video @PythonEngineer!
    Just one question: how do you choose the hidden_size? Does it only represent the number of neurons in the network?
    Best!

  • @syinx2301
    @syinx2301 4 years ago +1

    Thanks for the video, beautiful.
    But I'm a little confused about putting optimizer.zero_grad() in front of loss.backward() and optimizer.step(); am I missing something?
    I mean, why didn't we put it after loss.backward() and optimizer.step() like before? Thanks.

    • @patloeber
      @patloeber  4 years ago +5

      You can do it at the end of your loop (then it's empty for the next iteration) or at the beginning of your loop. Just make sure that the grads are empty before calling backward and the update step.

    • @syinx2301
      @syinx2301 4 years ago +1

      @@patloeber ohh okayy. Thanks

  • @UsmanMalik57
    @UsmanMalik57 1 year ago

    At 19:50, shouldn't it be `n_correct += (predictions == labels).sum().item()`?

  • @xQuiiTeGB
    @xQuiiTeGB 2 years ago

    During the test, shouldn't we add a softmax layer?
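
A small sketch of why no softmax is needed at test time: softmax is monotonic, so the argmax of the raw logits equals the argmax of the probabilities. Apply torch.softmax only if you actually need probability values.

    import torch

    logits = torch.tensor([[0.1, 2.5, 0.3]])
    probs = torch.softmax(logits, dim=1)
    assert torch.argmax(logits, 1) == torch.argmax(probs, 1)  # same prediction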

  • @DiegoAndresAlvarezMarin
    @DiegoAndresAlvarezMarin 3 years ago

    I am getting a SEGMENTATION FAULT.
    Any idea why?
    Accuracy of the network on the 10000 test images: 98.06 %
    Segmentation fault (core dumped)

  • @mohamadabdulkarem206
    @mohamadabdulkarem206 2 years ago

    Thank you a lot for the excellent explanation. I would like to ask: you said _, prediction = torch.max(outputs, 1); what does the number 1 mean here?

  • @navintiwari
    @navintiwari 3 years ago +1

    This is an excellent tutorial. May I please know what color theme are you using for the editor?

    • @patloeber
      @patloeber  3 years ago +2

      I think it is Monokai in this video. Right now I'm using Night Owl

    • @navintiwari
      @navintiwari 3 years ago

      @@patloeber Thank you so much! I discovered this channel recently and I love it:)

  • @sushilbastola8940
    @sushilbastola8940 4 years ago

    I watched all your videos. I need help with a transfer neural network for RUL estimation. How and where can I get help, sir?

  • @ahsanrossi4328
    @ahsanrossi4328 4 years ago

    Thanks Mate

  • @popamaji
    @popamaji 4 years ago +2

    How is the forward method called? Would you please explain, or point to a good short video?
    Thanks for your beautiful channel.

    • @cuentatrucha6310
      @cuentatrucha6310 3 years ago +1

      I don't know much about programming, but I think it's in the 'output = model(images)' line.
      output = model(images) is like NeuralNet.forward(images), I think.
      nn.Module probably has a function to do this automatically (automagically).

  • @sarahjamal86
    @sarahjamal86 4 years ago

    Very well done !
    Thanks a lot :-)

  • @xl0pate0lx
    @xl0pate0lx 2 years ago

    I am not super familiar with Python. Can someone explain what the "_," does in line 89 at minute 18:45? I could not find anything helpful online.

    • @uncoded0
      @uncoded0 1 year ago

      It's a convention marking a variable that will not be used.

  • @alirezamohseni5045
    @alirezamohseni5045 6 months ago

    very nice

  • @andyloram2356
    @andyloram2356 4 years ago +1

    Hey, great tutorial. Just one thing: how is it that when you do the forward part, you can call the forward method just by using outputs = model(images)? Thx

    • @patloeber
      @patloeber  4 years ago +2

      Yes, model(x) performs the forward pass. This is because PyTorch implements the __call__(self) function with the forward pass inside.

    • @tarekradwan8661
      @tarekradwan8661 4 years ago

      @@patloeber Is forward the keyword? I mean, if I have more than one function in the NeuralNet class, which one will be called when you call model(X)? THANKS IN ADVANCE

    • @sauravchoudhury1018
      @sauravchoudhury1018 3 years ago

      @@tarekradwan8661 The custom class inherits the forward method from nn.Module and overrides it.
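
A minimal sketch of the mechanism discussed in this thread: nn.Module defines __call__, which (after running hooks) dispatches to your forward method. That is why model(x) works, and why the method must be named exactly forward; other methods you add are not called automatically.

    import torch
    import torch.nn as nn

    class Tiny(nn.Module):
        def __init__(self):
            super().__init__()
            self.l1 = nn.Linear(4, 2)

        def forward(self, x):   # the name `forward` is the contract with nn.Module
            return self.l1(x)

    model = Tiny()
    out = model(torch.randn(1, 4))  # internally: nn.Module.__call__ -> forward(x)
    print(out.shape)                # torch.Size([1, 2])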

  • @seandarcy2612
    @seandarcy2612 1 year ago

    Made a typo in my code, writing optimizer.step instead of optimizer.step(), so the model would run but wouldn't converge at all.

  • @rkalghatgi3
    @rkalghatgi3 4 years ago

    Great tutorial. How did you decide on the number of layers for the network in this tutorial? Is there a general rule or guidance on the minimum required to build a network?

    • @patloeber
      @patloeber  4 years ago

      In this tutorial there was no real plan, but the input size and the number of classes must match the dataset. In general you can model your architecture on other popular networks and then tweak it for your needs.

  • @dinamoses4893
    @dinamoses4893 3 years ago

    Hey Python Engineer, what great and useful tutorials! You helped me a lot! I have two questions; I am a beginner at PyTorch and would really appreciate your help. If, after calculating the accuracy, I want to plot a graph to see the progression of the loss and accuracy through the epochs, how do I do so? Also, I want to visualize a few examples as we did earlier and compare them with the model's output. Thank you for your time!

    • @patloeber
      @patloeber  3 years ago +2

      the easiest solution would be to use TensorBoard (I have a video about this). Or you store the loss/accuracy of each epoch in a list and then plot it yourself with matplotlib (see the sketch below)
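
A hedged sketch of the second option mentioned above: collect the loss per epoch in a list during training, then plot it with matplotlib. The placeholder values stand in for the real loss.item() calls inside the training loop.

    import matplotlib.pyplot as plt

    losses = []
    for epoch in range(5):              # stand-in for the real training loop
        loss_value = 1.0 / (epoch + 1)  # placeholder; use loss.item() in practice
        losses.append(loss_value)

    plt.plot(range(1, len(losses) + 1), losses)
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.show()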

  • @Mr2009johnsteele
    @Mr2009johnsteele 4 years ago +1

    Why does he do optimizer.zero_grad() before calculating the gradients and taking a step rather than after?

    • @patloeber
      @patloeber  4 years ago +1

      Both are fine. Just make sure the gradients are empty again in the next iteration.

    • @jaivratsingh9966
      @jaivratsingh9966 3 years ago

      @@patloeber Hi, this is an excellent video; many thanks for it. I think I had a similar question and tried to verify it myself. It seems that optimizer.step() uses the gradient info internally to take the step. So if you zero the gradients before stepping, the first step goes to waste; however, the code still works because the next step() call may be using the gradient of the previous iteration. So there might be a "loss" of one batch of data, but it still works.

  • @atchutram9894
    @atchutram9894 3 years ago +1

    How come the optimizer step works well even after the zero_grad operation?

    • @patloeber
      @patloeber  3 years ago

      Normally it should not work well and produce different results. But for some simple examples you might not see that big of a difference

    • @namandixit4972
      @namandixit4972 3 years ago +1

      I had that exact same doubt. xD

  • @roshankumargupta9978
    @roshankumargupta9978 4 years ago

    At 14:06, I didn't get the size (100, 1, 28, 28). Could you please explain?

    • @patloeber
      @patloeber  4 years ago

      I explain the size at minute 06:30

  • @raminessalat9803
    @raminessalat9803 3 years ago

    One question: how does the code understand, when you call model(images), that it should use the forward method? I thought model was just an instance of the NeuralNet class.

    • @patloeber
      @patloeber  3 years ago +2

      PyTorch implements the __call__(self) method such that it uses the forward method inside...

    • @raminessalat9803
      @raminessalat9803 3 years ago

      @@patloeber Got it! thank you!

  • @Ftur-57-fetr
    @Ftur-57-fetr 3 years ago

    Thanks!!

    • @patloeber
      @patloeber  3 years ago

      thanks for watching!

  • @atanumondal1301
    @atanumondal1301 2 years ago

    what is the meaning of batch_size = 100?
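
A small sketch answering the question above: batch_size=100 means the DataLoader groups 100 samples per iteration, so each optimizer step is computed from 100 images at once (600 steps per epoch for MNIST's 60,000 training images). The random tensors below stand in for the real dataset.

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    data = TensorDataset(torch.randn(600, 784), torch.randint(0, 10, (600,)))
    loader = DataLoader(data, batch_size=100, shuffle=True)

    for images, labels in loader:
        print(images.shape, labels.shape)  # torch.Size([100, 784]) torch.Size([100])
        break                              # 600 samples / 100 per batch = 6 steps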

  • @harissajwani2583
    @harissajwani2583 4 years ago

    Is there a way to check the torch model's parameter updates along with the loss for each iteration?

    • @patloeber
      @patloeber  4 years ago

      Sure, you get them with model.parameters(). You can also have a look at model.state_dict() and optimizer.state_dict() and print what you need.

  • @amc8437
    @amc8437 3 years ago

    I am stuck here:
    class NeuralNetwork(nn.Module):
        def __init__(self, input_size, hiden_size, num_classes):
            super(NeuralNetwork, self).__init__()
            self.l1 = nn.Linear(input_size, hidden_size)
            self.relu = nn.ReLU()
            self.l2 = nn.Linear(hidden_size, num_classes).to(device)
    ---------------------------------------------------------------------------
    NameError                                 Traceback (most recent call last)
         34 test_loader = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False)
         35
    ---> 36 class NeuralNetwork(nn.Module):
         37     def __init__(self, input_size, hiden_size, num_classes):
         38         super(NeuralNetwork, self).__init__()
    in NeuralNetwork()
         46         out = self.l2
         47         return out
    ---> 48 model = NeuralNetwork(input_size, hidden_size, num_classes)
         49
         50 # loss and Optimizer
    NameError: name 'NeuralNetwork' is not defined

  • @shuaili5656
    @shuaili5656 3 years ago

    Thanks for these impressive tutorials! I have a question that has been confusing me for a while: when training the NN model, we give N * feature to the model directly, where N is the number of data points and feature is the dimension of the input, and then train the NN on the same data something like 1000 times. My question is: why don't we give only one single line of the data at a time and train for len(dataset) iterations? That seems more reasonable to me. I cannot figure out the benefit of training on the whole dataset 1000 times, or how to understand it mathematically. Thank you very much; I hope someone can explain this for me.

    • @keroldjoumessi
      @keroldjoumessi 3 years ago

      @Shuai I think your question is about the (batch/mini-batch) size and the number of epochs. Using just one data point and training the model len(dataset) times is like choosing SGD with batch_size=1 and epochs=len(dataset). That means we only use the prediction error of one single data point to update the weights at each step, while by increasing the mini-batch size we have more data with which to evaluate the loss.
      But it is entirely up to you; you can try out various batch sizes and epoch counts and look at the loss to see which batch size is most suitable for your problem.

  • @seeking9145
    @seeking9145 2 years ago

    7:00 "For each class label we have one value here"?
    But the value is 100 and we have 10 classes

  • @MohamedAli-dk6cb
    @MohamedAli-dk6cb 4 years ago

    I did everything exactly the same; it works fine, but I am getting a testing accuracy of 9.5% to 11%. Any explanation?

    • @patloeber
      @patloeber  4 years ago

      Hmm, weird. Can you compare with my code on GitHub? Maybe there is a slight difference.

    • @nelson3560
      @nelson3560 3 years ago +1

      I ran into the same problem, and based on @Python Engineer's advice and the code on GitHub, I found the problem; I suspect yours is similar. In my testing loop I assigned my model output to 'output', while in training I assigned it to 'outputs', and then when getting my predictions I passed in 'outputs'. A simple fix for me was to line up the variable names, and accuracy went to the expected ~95% with 2 epochs.

  • @tilkesh
    @tilkesh 4 months ago

    print('Thank you very much')

  • @amc8437
    @amc8437 3 years ago

    File "", line 69
    print(f 'epoch {epoch+1} / {num_epochs}, step {i+1}/{n_total_steps}, loss = {loss.item():.3f} ')
    ^
    SyntaxError: invalid syntax
    Why I am getting this error when I run the same code?

    • @patloeber
      @patloeber  3 years ago

      which Python version are you using? Make sure it supports f-Strings

    • @amc8437
      @amc8437 3 years ago

      @@patloeber I am using Python 3.7.10.
      This is the challenge I am having now:
      ---------------------------------------------------------------------------
      NameError                                 Traceback (most recent call last)
           32 test_loader = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False)
           33
      ---> 34 class NeuralNet(nn.Module):
           35     def _init_(self, input_size, hiden_size, num_classes):
           36         super(NeuralNet, self)._init_()
      in NeuralNet()
           44         out = self.l2
           45         return out
      ---> 46 model = NeuralNet(input_size, hidden_size, num_classes)
           47
           48 # loss and Optimizer
      NameError: name 'NeuralNet' is not defined

  • @lamho411
    @lamho411 2 years ago

    _, predictions = ... I have never seen the _, syntax before. Can someone point me to where i can read about it?

    • @arkanon8661
      @arkanon8661 1 year ago

      It's not really syntax; it's called a "throwaway" variable, generally used for any value that is unused.
      I personally avoid using it, as Python has another somewhat obscure use for the underscore, intended for the shell, where it represents the result of the previous expression.

  • @saurrav3801
    @saurrav3801 4 years ago

    Bro, can you please tell me how to print the predicted image?

    • @patloeber
      @patloeber  4 years ago +1

      You mean you want to look at the predicted outcome? I recommend using TensorBoard for this (have a look at tutorial #16).

  • @adityasahu96
    @adityasahu96 3 years ago

    My kernel keeps saying it's dead.

  • @pestlewebengland1346
    @pestlewebengland1346 7 months ago

    I like your tutorials, but the code-completion pop-ups constantly appearing are really painful. Very distracting. I see from the video description that you are promoting the code-completion tool, and it might be fine, but this is not a good advertisement for it, because it constantly interrupts trying to read your code.

  • @DanielWeikert
    @DanielWeikert 4 years ago +1

    I thought you also need to send the model to the device?

    • @patloeber
      @patloeber  4 years ago +1

      You are correct! Nice catch! Sorry I forgot this. In my example it did not produce an error since I didn't have GPU support on the MacBook anyway. But it should give an error if you have GPU support...

    • @brzrst802
      @brzrst802 4 years ago

      @@patloeber Could you please tell us how to fix it? Thank you so much!

    • @priteshsinghvi9067
      @priteshsinghvi9067 4 years ago +1

      @@patloeber Having an error and can't solve it; can you provide the right solution 🙂

    • @wafflecat8
      @wafflecat8 3 years ago

      @@priteshsinghvi9067 Under model = NeuralNet(input_size, hidden_size, num_classes), just put:
      model.to(device)
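
Putting the fixes from this thread together, a hedged sketch of the full device pattern (the model class and hyperparameters are the video's): both the model and every batch must live on the same device, otherwise PyTorch raises a device-mismatch error on GPU machines.

    import torch

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    model = NeuralNet(input_size, hidden_size, num_classes).to(device)

    # inside both the training and test loops:
    # images = images.reshape(-1, 28*28).to(device)
    # labels = labels.to(device)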

  • @zrmsraggot
    @zrmsraggot 3 years ago

    no module named torchvision

    • @patloeber
      @patloeber  3 years ago

      you need to install it with pip install torchvision, or conda...see the installation guide :)

  • @plasma7851
    @plasma7851 4 years ago +1

    Every 5-10 min: “Meet webflow ...“
    Skip ad