10:35 I forgot to send the model to the device! Please call model = NeuralNet(input_size, hidden_size, num_classes).to(device)
Sir, why is the accuracy different every time?
@@krishnachauhan2822 Due to shuffling of data and random initialisation of weights
I followed with great attention. But, I am lost here.
labels = labels.to(device)
Error: 'int' object has no attribute 'to'
loss = criterion(outputs, labels)
Error: cross_entropy_loss(): argument 'target' (position 2) must be Tensor, not int
@@r_pydatascience the labels need to be converted to a tensor. Both errors are there due to the same reason. Maybe you forgot to pass the transform when loading the dataset.
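Not the author, but a minimal sketch of the usual setup, assuming the MNIST pipeline from the tutorial: passing transforms.ToTensor() converts the images, and iterating the DataLoader (rather than the raw dataset) collates the integer labels into a tensor, so labels.to(device) and CrossEntropyLoss work:

import torch
import torchvision
import torchvision.transforms as transforms

# the transform turns PIL images into tensors; the DataLoader batches the
# integer labels into a LongTensor of shape (batch_size,)
train_dataset = torchvision.datasets.MNIST(root='./data', train=True,
                                           transform=transforms.ToTensor(), download=True)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=100, shuffle=True)

images, labels = next(iter(train_loader))
print(type(labels), labels.shape)  # <class 'torch.Tensor'> torch.Size([100])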
@@krishnachauhan2822 When you train a NN, you're trying to get the loss to a local minimum. The issue is that there are often many different local minima it can fall into. So unless you use a random seed, so that you get the same random numbers every time, your model could end up in any one of these similar local minima each time it trains.
It's kind of interesting. Like you're teaching a bunch of children numbers, and even though they can all tell numbers by the end of it, they all think in different ways, and may have different levels of mastery.
Really felt satisfying to be able to put together all that I learned in the previous videos. Thank you for this series!
@Python Engineer
please, why do you have to use zero_grad before calculating the loss function?
Shouldn't we calculate the loss and take the optimizer step before using zero_grad?
For me, it works with "samples, labels = next(examples)"
Otherwise, it throws me an error: "AttributeError: '_SingleProcessDataLoaderIter' object has no attribute 'next'"
this will work
examples = iter(train_loader)
samples, labels = next(examples)
That's because of a difference between PyTorch versions.
If you use the dunder method .__next__() it should work; I had the same issue.
Nice.
And a good call on explaining why we avoid a softmax activation in the model, because the cross-entropy criterion does that for us.
Yes :) Glad you like it
You have been doing a great job teaching pytorch to beginners like me! Keep it up!
A quick suggestion: you could add plot/image of the number and show what the NN predicts.
It's really awesome content. Just a suggestion, bro, you could add the relevant doubts from comment sections and make a FAQ section, which helps beginners solve their common doubts.
thanks for the tip!
Really fantastic series, keep up the amazing work! Looking forward to your future videos!
Thank you for watching :)
hi. love following your series! thanks!!!
can you please elaborate on the torch.max part...
what exactly do you call "values" (you ignore those and store them in "_")? values of what?
why is an index the same as the predicted label?
and what is the "1" passed along with the model output
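Not the author, but a minimal sketch of what torch.max is doing there, with made-up logits:

import torch

# outputs has shape (batch_size, num_classes); each row holds the raw scores (logits)
outputs = torch.tensor([[0.1, 2.5, -0.3],
                        [1.2, 0.0,  3.1]])

# torch.max along dim=1 (the "1" in the video) returns two tensors:
# the maximum value of each row, and the column index where it occurs
values, predictions = torch.max(outputs, 1)
print(values)       # tensor([2.5000, 3.1000])  -> ignored via "_" in the video
print(predictions)  # tensor([1, 2])            -> the index is the predicted class label,
                    #    because the output layer has one neuron per class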
Excellent, clear, and without extra stuff! Thank you!
thank you for your instructions. it is really helpful for my assignment.
glad to hear that
I believe that images.reshape(-1, 28*28) won't work for everyone; you can just define flatten = nn.Flatten() and then call images = flatten(images) instead of reshaping; it'll work.
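For reference, a small sketch of that idea (shapes assumed from the MNIST tutorial):

import torch
import torch.nn as nn

images = torch.randn(100, 1, 28, 28)    # stand-in for one batch from the loader

flatten = nn.Flatten()                  # flattens everything after the batch dimension
flat_a = flatten(images)                # shape (100, 784)
flat_b = images.reshape(-1, 28*28)      # same result as the reshape used in the video
print(flat_a.shape, torch.equal(flat_a, flat_b))  # torch.Size([100, 784]) True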
Does it work on data types other than images too?
Thank you for the discussion.
Thanks a lot, I am really enjoying your tutorials. Good job!
Thank you! Glad you like it :)
What a great video! Thank you very much :)
Thank you for this series!
Hi, how do I reshape the image in the training loop if I have an RGB picture?
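Not the author, but a minimal sketch of the idea, assuming 3-channel images of some height H and width W (the 32x32 below is hypothetical); input_size of the first linear layer then has to match 3*H*W:

import torch

H, W = 32, 32                           # hypothetical image size
images = torch.randn(100, 3, H, W)      # stand-in for one RGB batch from the loader
flat = images.reshape(-1, 3 * H * W)    # shape (100, 3072)
print(flat.shape)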
Wonderful, and amazing teaching sir, thanks a lot.
Thank you :)
Hi, great tutorials. But I have a question. In previous tutorials you showed that the workflow is: [loss = criterion(outputs, labels), loss.backward(), optimizer.step(), optimizer.zero_grad()].
However on this one you changed the order to: [loss = criterion(outputs, labels), optimizer.zero_grad(), loss.backward(), optimizer.step()].
Now I am wondering if you set to zero the gradients, then, how the optimizer could update the parameters without any information about the gradient?
I presume that optimizer.zero_grad() is merely for resetting the gradients, so it shouldn't matter whether it is done after optimizer.step() in an existing epoch/step or done before loss.backward() in the next epoch/step; the key here is that existing gradients should be flushed out, if any, before computing the new gradients and updating them
Well said @@priyanshumohanty5261
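For reference, a minimal sketch of the loop body (names taken from the tutorial, with model, criterion, optimizer and device assumed to be defined as in the video); both placements behave the same as long as the gradients are cleared before the next backward():

for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        images = images.reshape(-1, 28*28).to(device)
        labels = labels.to(device)

        outputs = model(images)
        loss = criterion(outputs, labels)

        # ordering used in this video: clear stale gradients, then backprop, then update
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # equivalent ordering from the earlier videos:
        #   loss.backward(); optimizer.step(); optimizer.zero_grad()
        # either way, the .grad buffers are empty before the next backward() call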
Great series, thank you python engineer
How can the NeuralNet class be rewritten as a multiple regression network? Eg. I want to train a network to predict 3 real values as the output that represent (x,y,z) coordinates. Could I use the class as is? Or do I have to change the forward pass or output dimensions?
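Not the author, but a minimal sketch of one way to adapt it: keep the same structure, set the output dimension to 3 for (x, y, z), and drop the softmax/cross-entropy in favour of a regression loss such as MSE (the class and variable names below are just for illustration):

import torch
import torch.nn as nn

class RegressionNet(nn.Module):
    def __init__(self, input_size, hidden_size, output_size=3):
        super(RegressionNet, self).__init__()
        self.l1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.l2 = nn.Linear(hidden_size, output_size)   # 3 outputs -> (x, y, z)

    def forward(self, x):
        out = self.l1(x)
        out = self.relu(out)
        return self.l2(out)                             # raw real-valued outputs, no activation

model = RegressionNet(input_size=784, hidden_size=100)
criterion = nn.MSELoss()                                # regression loss instead of CrossEntropyLoss

x = torch.randn(8, 784)                                 # dummy batch
target = torch.randn(8, 3)                              # dummy (x, y, z) targets
loss = criterion(model(x), target)
print(loss.item())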
Feels so good, tysm!
Amazing series, thank you so much
You are doing very well; I have gained a lot of knowledge from your tutorials, thank you so much. Please extend your tutorials towards word vectors and seq2seq models.
glad you like it :)
hi, may I ask whether optimizer.zero_grad() should be called before or after loss.backward() and optimizer.step()?
I think both options are fine :)
Bro, please put an exercise at the end of every video. Also, superb job.
May I ask your Python version? When I was trying to write samples, labels = examples.next(), there is an error that says TypeError: object() takes no parameters
Great video, I was wondering if the BCELoss function applies the sigmoid function before computing the loss, just like cross entropy applies the softmax function.
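Not the author, but as far as I know nn.BCELoss does not apply a sigmoid (it expects probabilities already), whereas nn.BCEWithLogitsLoss folds the sigmoid into the loss, analogous to how nn.CrossEntropyLoss folds in the softmax. A tiny sketch:

import torch
import torch.nn as nn

logits = torch.tensor([0.8, -1.2])    # raw model outputs
targets = torch.tensor([1.0, 0.0])

loss_a = nn.BCELoss()(torch.sigmoid(logits), targets)   # sigmoid applied manually
loss_b = nn.BCEWithLogitsLoss()(logits, targets)        # sigmoid applied internally
print(loss_a.item(), loss_b.item())                     # same value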
Hello, I don't understand the purpose of torch.max - what does it do? Also, why is the index the prediction? Thank you, @patloeber!
The lecture is awesome, but I have one question: why are the labels not converted from shape [100] to [100, 1] here?
solid content.
reshape method can't be used in pytorch, how is it working here??
Sir, I want to ask you something about training. Firstly, at 06:30 we printed labels.shape as torch.Size([100]). You said every class label has 100 images, so do we have 1000 images in the whole train dataset (since we have 10 classes)? In this example we did 2 epochs with 784-value flattened images and a batch size of 100, and 6 steps every epoch, so 1200 images were trained. Why is the train_loader length 6, and what is labels.shape? I'm a little bit confused, can you help me :)
Thank you so much
I cannot find any data folder in my working directory after downloading as per the code, though the code is working fine. Where am I wrong?
Hi, thanks for the video
Just one thing: how can I test a single custom image with the model? I always face problems with shapes. Is there any quick method?
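Not the author, but a minimal sketch of one way to do it, assuming the trained model and device from the tutorial and a hypothetical file digit.png:

import torch
from PIL import Image
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.Grayscale(),          # one channel, like MNIST
    transforms.Resize((28, 28)),     # MNIST input size
    transforms.ToTensor(),
])

img = Image.open('digit.png')        # hypothetical file
x = transform(img)                   # shape (1, 28, 28)
x = x.reshape(-1, 28*28).to(device)  # shape (1, 784): a batch of one

model.eval()
with torch.no_grad():
    output = model(x)
    _, prediction = torch.max(output, 1)
print(prediction.item())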
Thanks!
What's the idea of dividing the "test" data into batches as well? I thought batches are only relevant when training the data...
examples.next() will no longer work. I encountered this issue and was stuck with it. You might want to change this to `samples, labels = next(examples)`
thank you, brother
Many thanks for your great video @PythonEngineer !
Just one question: how do you choose the hidden_size? Does it only represent the number of neurons in the network?
Best !
thanks for the video, beautiful.
but I'm a little confused about putting optimizer.zero_grad() in front of loss.backward() and optimizer.step(); am I missing something??
I mean, why didn't we put it after loss.backward() and optimizer.step() like before? Thanks
You can do it at the end of your loop (then it’s empty for the next iteration) or at the beginning of your loop. Just make sure that the grads are empty before calling backward and update step
@@patloeber ohh okayy. Thanks
At @19:50, shouldn't it be `n_correct += (predictions == labels).sum().item()` ?
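For what it's worth, a tiny self-contained sketch of that counting line with stand-in tensors; .sum() gives a 0-d tensor and .item() turns it into a plain Python int:

import torch

predictions = torch.tensor([3, 1, 4, 1, 5])   # stand-in predicted class indices for one batch
labels      = torch.tensor([3, 1, 4, 0, 5])   # stand-in ground-truth labels

n_correct = (predictions == labels).sum().item()
n_samples = labels.shape[0]
print(n_correct, n_samples)                   # 4 5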
During the test, shouldn't we add a softmax layer?
I am getting a SEGMENTATION FAULT.
Any idea why?
Accuracy of the network on the 10000 test images: 98.06 %
Segmentation fault (core dumped)
Thank you a lot for the excellent explanation. I would like to ask: you said _, prediction = torch.max(outputs, 1); what does the number 1 mean here?
This is an excellent tutorial. May I please know what color theme are you using for the editor?
I think it is Monokai in this video. Right now I'm using Night Owl
@@patloeber Thank you so much! I discovered this channel recently and I love it:)
I watched all your videos. I need help on transfer neural networks for RUL estimation. How and where can I get help, sir?
Thanks Mate
how is the forward method being called? would you please explain or give a good and short vid?
thanks for your beautiful channel
I don't know very much about programming, but I think it is in the 'output = model(images)' line.
output = model(images) is like NeuralNet.forward(images), I think.
nn.Module maybe has a function to do this automatically (automagically).
Very well done !
Thanks a lot :-)
Thanks!
I am not super familiar with python, can someone explain to me what the "_," does in line 89 at minute 18:45 ? Could not find anything helpful online.
it's a convention, meaning a variable that will not be used
very nice
Hey great tutorial. Just one thing: how is it that when you do the forward part, you can call the forward method just by using outputs = model(images)? thx
yes, model(x) is performing the forward pass. This is because PyTorch implemented the __call__(self) function with the forward pass
@@patloeber is forward the keyword? I mean, if I have more than one function in the NeuralNet class, which one will be called when you call model(X)? Thanks in advance
@@tarekradwan8661 the custom class inherits forward method from nn.Module and overrides it.
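Not the author, but a rough sketch of that behaviour (simplified; the real nn.Module.__call__ also runs hooks):

import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super(TinyNet, self).__init__()
        self.l1 = nn.Linear(4, 2)

    def forward(self, x):             # this exact name is what nn.Module dispatches to
        return self.l1(x)

    def some_other_method(self, x):   # never called by model(x)
        return x * 2

model = TinyNet()
x = torch.randn(3, 4)
print(torch.allclose(model(x), model.forward(x)))   # True: model(x) routes to forward()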
Made a typo in my code saying optimizer.step instead of optimizer.step() so the model would run but wouldn't converge at all.
Great tutorial. How did you decide on the number of layers for the network in this tutorial? Is there a general rule or guidance on minimum required to build a network?
In this tutorial there was no real plan, but input size and number of classes must be according to the dataset. In general you can try to model architectures from other popular networks, and then tweak it for your needs
Hey Python Engineer, what great and useful tutorials! You helped me a lot! I have two questions; I am a beginner at PyTorch and I would really appreciate your help. If I want, after calculating the accuracy, to plot a graph to see the progression of loss and accuracy through the epochs, how do I do so? Also, I want to visualize a few examples as we did earlier and compare them with the model's output. Thank you for your time!
the easiest solution would be to use the Tensorboard (I have a video about this). Or you store the loss/accuracy of each epoch in a list, and then plot it yourself with matplotlib
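A minimal sketch of the second option, with dummy values standing in for what you would collect during training (e.g. losses.append(loss.item()) at the end of each epoch):

import matplotlib.pyplot as plt

losses = [2.1, 1.4, 0.9, 0.6, 0.45]          # collected once per epoch during training
accuracies = [0.35, 0.62, 0.78, 0.86, 0.91]

plt.plot(losses, label='loss')
plt.plot(accuracies, label='accuracy')
plt.xlabel('epoch')
plt.legend()
plt.show()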
Why does he do optimizer.zero_grad() before calculating the gradients and taking a step rather than after?
Both are fine. Just make sure the gradients are emtpy again in the next iteration
@@patloeber Hi, this is an excellent video! Many thanks for this. I think I had a similar question - I would try to verify it myself. However, it seems that optimizer.step() might be using gradient info inside to step. So if you zero it before stepping, then the first step goes to waste. However, the code still works because the next step() call may be using the gradient of the previous iteration. So there might be a "loss" of one batch of data, but it still works.
How come optimizer step is working well even after zero_grad operation?
Normally it should not work well and produce different results. But for some simple examples you might not see that big of a difference
I had that exact same doubt. xD
At 14.06, I didn't get the size (100,1,28,28). Could you please explain?
I explain the size at minute 06:30
One question: how does the code understand, when you're calling model(images), that it should use the feedforward method in it? Because I thought model is an instance of the NeuralNet class
Pytorch implemented the __call__(self) method such that it uses the forward method inside...
@@patloeber Got it! thank you!
Thanks!!
thanks for watching!
what is the meaning of batch_size = 100?
Is there a way to check the torch model parameter updates along with the loss for each iteration?
sure. you get it with model.parameters(). You can also have a look at model.state_dict() and optimizer.state_dict() and print what you need
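A small self-contained sketch of that idea (using a stand-in model so it runs on its own):

import torch
import torch.nn as nn

model = nn.Linear(4, 2)                        # stand-in for the tutorial's NeuralNet
x, y = torch.randn(8, 4), torch.randn(8, 2)

loss = nn.MSELoss()(model(x), y)
loss.backward()

# per-parameter values and gradients after the backward pass
for name, param in model.named_parameters():
    print(name, param.shape, param.grad.norm().item())

# full snapshot of the learnable state
print(model.state_dict().keys())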
I am stuck here;
class NeuralNetwork(nn.Module):
    def __init__(self, input_size, hiden_size, num_classes):
        super(NeuralNetwork, self).__init__()
        self.l1=nn.Linear(input_size, hidden_size)
        self.relu=nn.ReLU()
        self.l2=nn.Linear(hidden_size, num_classes).to(device)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
in
     34 test_loader = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False)
     35
---> 36 class NeuralNetwork(nn.Module):
     37     def __init__(self, input_size, hiden_size, num_classes):
     38         super(NeuralNetwork, self).__init__()

in NeuralNetwork()
     46     out=self.l2
     47     return out
---> 48 model=NeuralNetwork(input_size,hidden_size,num_classes)
     49
     50 #loss and Optimizer

NameError: name 'NeuralNetwork' is not defined
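Not from the thread, but judging from the "in NeuralNetwork()" frame of the traceback, a likely cause is that the model = NeuralNetwork(...) line is still indented inside the class body, where the class name is not bound yet. A minimal sketch of the intended structure (hyperparameters and forward method assumed from the tutorial):

import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
input_size, hidden_size, num_classes = 784, 100, 10

class NeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNetwork, self).__init__()
        self.l1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.l2 = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out = self.l1(x)
        out = self.relu(out)
        return self.l2(out)

# instantiate at the top level, outside and after the class definition
model = NeuralNetwork(input_size, hidden_size, num_classes).to(device)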
Thanks for these impressive tutorials! I have a question that has been confusing me for a while: when training the NN model we give N * feature to the model directly, where N is the number of data points and feature is the dimension of the input, and then train the NN on the same data something like 1000 times. My question is: why don't we give only one single row of the data each time, and train it len(dataset) times? That seems more reasonable to me. I cannot figure out the benefit of using the whole dataset to train 1000 times, or how to understand it in a mathematical way. Thank you very much, I hope someone can explain this to me
@Shuai I think your question is about the (mini-)batch size and the number of epochs. Using just one data point at a time and training the model len(dataset) times is like choosing SGD with batch_size=1 and epochs=len(dataset). That means we only use the prediction error of one single data point to update the weights at each step, while with a larger mini-batch size we have more data to evaluate the loss.
But it is entirely up to you; you can try out various batch sizes and epoch counts and look at the loss to see which batch size is most suitable for your problem
7:00 "For each class label we have one value here"?
But the value is 100 and we have 10 classes
I did everything exactly the same, it is working fine but I am getting the testing accuracy 9.5% up to 11%. Any explanation?
Hmm weird. Can you compare with my code on github? Maybe there is a slight difference
I also ran into the same problem, and based on @Python Engineer's advice and the code on GitHub, I found the problem, and I suspect yours would be similar. My problem was that in my testing loop I had my model output go to 'output', while in my training I had it go to 'outputs', and then, when getting my predictions, I passed in 'outputs'. A simple fix for me was to make sure my variable names line up, and accuracy goes to the expected ~95% with 2 epochs.
print('Thank you very much')
File "", line 69
print(f 'epoch {epoch+1} / {num_epochs}, step {i+1}/{n_total_steps}, loss = {loss.item():.3f} ')
^
SyntaxError: invalid syntax
Why I am getting this error when I run the same code?
which Python version are you using? Make sure it supports f-Strings
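Not the author, but judging from the pasted line, the space between f and the opening quote would by itself cause this SyntaxError; the prefix has to be attached to the string literal:

print(f'epoch {epoch+1} / {num_epochs}, step {i+1}/{n_total_steps}, loss = {loss.item():.3f}')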
@@patloeber I am using Python 3.7.10.
This is the challenge I am having now:
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
in
     32 test_loader = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False)
     33
---> 34 class NeuralNet(nn.Module):
     35     def _init_(self, input_size, hiden_size, num_classes):
     36         super(NeuralNet, self)._init_()

in NeuralNet()
     44     out=self.l2
     45     return out
---> 46 model=NeuralNet(input_size,hidden_size,num_classes)
     47
     48 #loss and Optimizer

NameError: name 'NeuralNet' is not defined
_, predictions = ... I have never seen the _, syntax before. Can someone point me to where i can read about it?
it's not really syntax, it's called a "throwaway" variable. Generally it is used for any value that is unused.
I personally avoid using it, as Python has another somewhat obscure use for underscores, intended for use in the shell, where it represents the result of the previous expression.
Bro can you please tell me how to print predicted img
you mean you want to have a look at the predicted outcome? I recommend using the Tensorboard for this. (Have a look at tutorial #16)
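Alternatively, a quick matplotlib sketch (assuming the trained model, device and test_loader from the tutorial):

import torch
import matplotlib.pyplot as plt

images, labels = next(iter(test_loader))
outputs = model(images.reshape(-1, 28*28).to(device))
_, predictions = torch.max(outputs, 1)

plt.imshow(images[0][0], cmap='gray')
plt.title(f'predicted: {predictions[0].item()}, actual: {labels[0].item()}')
plt.show()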
my kernel keeps saying its dead
I like your tutorials. But the code completion pop-ups constantly appearing are really painful. Very distracting. I see from the video description you are pushing the code completion tool .. and it might be ok .. but this is not a good advertisement for it, because it constantly interrupts trying to read your code.
I thought you also need to send the model to the device with .to(device)?
You are correct! Nice catch! Sorry I forgot this. In my example it did not produce an error since I didn't have GPU support on the MacBook anyway. But it should give an error if you have GPU support...
@@patloeber Could you please tell us how to fix it? thank you so much!
@@patloeber having an error, not able to solve it, can you provide the right solution 🙂
@@priteshsinghvi9067 Under model = NeuralNet(input_size, hidden_size, num_classes), just put:
model.to(device)
no module named torchvision
you need to install it with pip install torchvision, or conda...see the installation guide :)
Every 5-10 min: “Meet webflow ...“
Skip ad