Just finished learning multivariable calculus and I'm ready to rock and roll :)
Very interesting and insightful tutorial. Thanks Andreas!
Great introduction man!
Andreas, thanks for a good job! Very helpful!
Thanks for the video, keep creating, please!
Very cool video. I had a question about this particular line of code: "np.diagonal(np.take(y_pred_errors, y_true_errors, axis=1))". It seems like np.take is doing a lot of unnecessary sampling. Is there a way around this so we don't even need np.diagonal?
Edit: for anyone interested, I think this is a little better:
true_probability_errors = y_pred_errors[np.arange(len(y_true_errors)), y_true_errors]
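A small sketch of the comparison above, with made-up arrays (the names mirror the comment, the values are hypothetical). It suggests both expressions pick the same entries, while np.take first builds an n-by-n matrix that np.diagonal then mostly throws away:

```python
import numpy as np

# Hypothetical stand-ins: 4 misclassified samples, 3 classes.
y_pred_errors = np.array([
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
    [0.5, 0.4, 0.1],
])
y_true_errors = np.array([2, 1, 0, 1])

# Original approach: np.take builds a (4, 4) matrix, np.diagonal keeps 4 of its 16 values.
taken = np.take(y_pred_errors, y_true_errors, axis=1)
via_take = np.diagonal(taken)

# Suggested approach: pair row i directly with column y_true_errors[i].
via_indexing = y_pred_errors[np.arange(len(y_true_errors)), y_true_errors]

print(taken.shape)
print(via_take)
print(via_indexing)
```

With n error samples the intermediate matrix has n * n entries, so the direct indexing version should scale better when there are many errors.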
Great job Andreas
When you fit the model, it runs through 60000 samples, but in my program it shows 118 for each epoch. Am I doing it wrong?
That’s probably the number of batches per epoch (it depends on the batch size), not the number of samples. Otherwise, yeah, there might be something wrong.
@@AndreasZinonos But it's working, nothing is wrong in the other steps.
@@andreansihombing6780 try using my code and see if you get different outputs
Hi, I got the same issue. Setting batch_size = 1 fixed it for me.
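A quick arithmetic sketch of why 118 can show up instead of 60000. It assumes the progress bar is counting batches per epoch and that the batch size is 512; the asker's actual value isn't stated, 512 is just a value consistent with 118. The helper name steps_per_epoch is made up for illustration:

```python
import math

def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of batches needed to go through every sample once."""
    return math.ceil(num_samples / batch_size)

print(steps_per_epoch(60000, 512))  # 118
print(steps_per_epoch(60000, 32))   # 1875
print(steps_per_epoch(60000, 1))    # 60000
```

So batch_size = 1 makes the counter read 60000 again, but it only changes what is displayed and how often the weights are updated; it usually makes training much slower.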
Hi, can you make more projects with deep learning? This is really helping me learn.
Thank you, will try depending on how many views these get :)
about 10:04
For whoever uses Keras:
from keras.utils import to_categorical
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)
For whoever uses tensorflow.keras:
from tensorflow.keras.utils import to_categorical
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)
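For anyone wondering what to_categorical does to the labels, here is a NumPy-only sketch of the same idea (one-hot encoding). This is not Keras's actual implementation, and the label values are hypothetical:

```python
import numpy as np

num_classes = 10
y = np.array([5, 0, 4])  # hypothetical integer labels

# Row k of the identity matrix is the one-hot vector for class k.
y_one_hot = np.eye(num_classes)[y]

print(y_one_hot.shape)  # (3, 10)
print(y_one_hot[0])     # 1.0 at position 5, zeros elsewhere
```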
How do I test using photos?
very helpful
Sorry, but I don't understand the line x_train[y_train == i][0]. What does the zero mean?
[0] means take the first element. In most programming languages the indices of a list/array start from 0.
@@AndreasZinonos Yeah, I know, but I think x_train has 3 dimensions and we want to get the matrix that has the label i, whereas x_train[i][0] seems to be the first array of matrix i. Is that right? Sorry, I'm just a newbie :((
@@khoitaquang7351 So if we break it down:
X_train has three dimensions: (number of images, height, width).
X_train[y_train == i] means: select from the 3-dim array X_train the entries where the y_train label is equal to i. y_train and X_train have the same number of elements; X_train has the image matrices while y_train has only the labels. So if i = 2:
X_train[y_train == 2] first finds all positions in y_train where the label is 2 and then selects the corresponding images in the X_train array, each of which is a 2d array.
Adding the [0], so we have X_train[y_train == 2][0], simply returns the first element/image/2d matrix where the label is 2. We replace 2 with i since we are doing this in a loop.
The best way to see this yourself is to run every command in the notebook piece by piece and see what output you get.
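The breakdown above can be tried on a tiny made-up array (5 "images" of 2x2 pixels instead of MNIST; all values are hypothetical):

```python
import numpy as np

# Hypothetical tiny stand-in for MNIST: 5 "images" of 2x2 pixels.
X_train = np.arange(5 * 2 * 2).reshape(5, 2, 2)
y_train = np.array([2, 0, 2, 1, 2])

mask = y_train == 2                  # one True/False per image
images_of_2 = X_train[mask]          # keeps only images whose label is 2
first_image_of_2 = images_of_2[0]    # the [0] picks the first of those images

print(mask)
print(images_of_2.shape)       # (3, 2, 2)
print(first_image_of_2.shape)  # (2, 2)
```

Note that X_train[2][0] would be something different: the first row of pixels of the image at position 2, regardless of its label.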
Yeah, thank you very much, I've been stuck on it for an hour 😅
Which tools and libraries do you use to code?
Depends on what you want to do, but for deep learning PyTorch is my favourite. Also see Keras, TensorFlow. For general machine learning some good ones are numpy, matplotlib, pandas, sklearn.
@@AndreasZinonos thanks
Hey, is there any particular reason for choosing your units to be 128? I'm new to neural networks and I don't understand how to decide how many neurons we should have in our hidden layers. Please let me know.
It’s always trial and error, but these are numbers that were found to work well for such an experiment. Also the numbers are usually powers of 2.
Hello author. I am doing the same thing in the evaluate and inference stage, but it shows: ValueError: Input 0 of layer "sequential_9" is incompatible with the layer: expected shape=(None, 784), found shape=(None, 4704)
good job man
This is the error I get if I try to display the samples: "boolean index did not match indexed array along dimension 0; dimension is 60000 but corresponding boolean dimension is 10000"
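A possible cause, sketched with tiny made-up arrays: this message appears when a boolean mask built from one array is used to index an array of a different length, for example a mask from the test labels (10000 entries) applied to the training images (60000 entries). Whether that is what happened here is an assumption, since the asker's code isn't shown:

```python
import numpy as np

x_train = np.zeros((6, 2, 2))            # hypothetical: 6 training images
y_train = np.array([0, 1, 2, 0, 1, 2])   # 6 training labels
y_test = np.array([0, 1, 2])             # hypothetical: 3 test labels

message = None
try:
    x_train[y_test == 0]                 # mask of length 3 on an array of length 6
except IndexError as err:
    message = str(err)
    print(message)

sample = x_train[y_train == 0][0]        # mask and array lengths match
print(sample.shape)                      # (2, 2)
```

Pairing x_train with y_train (and x_test with y_test) in the display loop should avoid the mismatch.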
I have a question, please.
I don’t understand how the output layer gives the highest probability for a handwritten 0 at its 0th neuron, a handwritten 4 at its 4th neuron, and so on.
Let's say you have 3 classes for simplicity, the digits 0, 1, 2. The output vector will be a vector of size 3 that contains the probability of the input image being each of those classes/digits.
For example, if you give an image of a 2, your output vector could be something like: [0.03, 0.07, 0.90]. You can see the highest probability is for class/digit 2, and since the image is the digit 2, our classifier performs correctly. Also note that all probabilities should sum to 1 (basic probability theory), up to small numerical errors.
@@AndreasZinonos thank you very much for the answer.
I think I understand it now, but I still don’t get why, for class/digit 2, the 3rd neuron’s output (the 0.90 one) is the highest?
@@vaheyepremyan6379 well, if you compare the 3 values, 0.90 is bigger than both 0.03 and 0.07, so it’s the largest value.
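A rough sketch of why neuron k ends up meaning "digit k", under the assumption that the labels are one-hot encoded (so the target for digit 2 has its 1 at position 2) and the output layer uses softmax. Training pushes the outputs toward those targets, so the link between position and digit comes from how the labels are encoded. The raw scores below are hypothetical:

```python
import math

def softmax(scores):
    """Turn raw output-layer scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores from a 3-neuron output layer for an image of a "2".
scores = [0.5, 1.3, 4.0]
probs = softmax(scores)

# The predicted digit is the position of the largest probability.
predicted_digit = probs.index(max(probs))

# One-hot training target for digit 2: only position 2 is "on".
target_for_2 = [0.0, 0.0, 1.0]

print(predicted_digit)  # 2
print(sum(probs))
```

If the labels were encoded in a different order, the same network would learn to light up a different position for the same digit.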
I have a question... can you please tell me: if the model has already been trained, do we have to train it again? When are the trained weights stored? And how much time does it take?
If you don’t save the weights you lose them and you have to train again. You can save the model (see model.save and load_model in the Keras docs) and load it later, so you don’t need to train again in that case.
@@AndreasZinonos Oh... thank you so much!! I was about to lose it since I have been working on it and I got confused.
@@AndreasZinonos Can you tell me, for a handwritten calculator project, the MNIST dataset won't work since it has only numbers but no operators, right? So can I use MNIST for the numbers and another dataset for the operators? Is it possible to do that?
@@worldzodiac8378 so generally your test set needs to be as similar to the training set as possible. The less similar it is, the bigger the difference you will see in your error rates.
As you said, you would need to have operators in your training set for it to understand them in the test set, otherwise it would not have seen them before. I wouldn’t suggest combining datasets though. You need to find a dataset that has all of them or make your own.
@@AndreasZinonos Oh, I get the part about the error rates, but can you tell me: is it possible to train the same model first with the MNIST dataset and then with another dataset, and store it in a JSON file, or not?
The command model.fit(x=x_train, y=y_train, batch_size=batch_size, epochs=epochs)
gives me an error: in wrapper(*args, **kwargs)
971 except Exception as e: # pylint:disable=broad-except
972 if hasattr(e, "ag_error_metadata"):
--> 973 raise e.ag_error_metadata.to_exception(e)
974 else:
975 raise
in func_graph.py. Do you know how I can fix it?
I'm not sure, maybe something is going on with the variables you pass to the function? I would suggest downloading the notebook I uploaded to GitHub and comparing it with your code; it's a bit hard to figure out what's going on from the comment alone :)
Whenever I try using x_train or y_train, it says they are not defined
Make sure you have defined them above and have run the cell where they were defined. I would restart the kernel of the notebook and carefully run each cell to see where the problem is.
@@AndreasZinonos restarting my computer did the trick for some reason! Thanks for the reply, great tutorial btw
@@hervelabrie-durand4574 Good to hear! Thank you; please subscribe to support the channel :)
@@AndreasZinonos sure thing! I have another question: when I try to train the model I get “ValueError: Shapes (None, 10, 10) and (None, 10) are incompatible”. Yet I literally copied your code from GitHub. Any idea why?
@@hervelabrie-durand4574 hmm, that means two shapes that should match don't: the input data and the first layer of the network, or any 2 consecutive layers of the network, have different sizes. You probably missed something small. If you run the original notebook I have on GitHub, does it do that?
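One common way to end up with a (None, 10, 10) shape, offered as a guess since the asker's code isn't shown: the labels were one-hot encoded twice, for example by running the to_categorical cell a second time without reloading the data. A NumPy-only sketch of the effect, where one_hot is a made-up stand-in for to_categorical:

```python
import numpy as np

num_classes = 10

def one_hot(labels, num_classes):
    """NumPy stand-in for to_categorical: appends a class axis of size num_classes."""
    return np.eye(num_classes)[np.asarray(labels, dtype=int)]

y_train = np.array([5, 0, 4])             # hypothetical integer labels
once = one_hot(y_train, num_classes)      # what the model expects
twice = one_hot(once, num_classes)        # encoding applied a second time

print(once.shape)   # (3, 10)
print(twice.shape)  # (3, 10, 10)
```

Printing y_train.shape right before model.fit is a quick way to check; restarting the kernel and running each cell once should bring it back to (60000, 10).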
I tried with the same model but took a number that I drew in Paint. It always predicts 8.
Good that you’re trying to experiment! The problem here is most likely that the test sample you are using (the one from Paint) is from a very different distribution than the data you used to train on.
For example if all the training data has a black background like MNIST, but your test sample has a white background (from Paint), the model will fail to make an accurate prediction.
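A minimal preprocessing sketch for the background issue, assuming the drawing is already a 28x28 grayscale array with a white background, and that the model was trained on flattened inputs scaled to [0, 1] (the 784 input size comes from an error message elsewhere in this thread; the scaling is an assumption). The array paint_img is made up here instead of being loaded from a file:

```python
import numpy as np

# Hypothetical 28x28 grayscale drawing from Paint: white background (255), dark stroke (0).
paint_img = np.full((28, 28), 255, dtype=np.uint8)
paint_img[4:24, 13:15] = 0  # a crude vertical stroke, like a "1"

inverted = 255 - paint_img                   # black background, white stroke, as in MNIST
scaled = inverted.astype("float32") / 255.0  # bring pixel values into [0, 1]
model_input = scaled.reshape(1, 28 * 28)     # one flattened sample, shape (1, 784)

print(model_input.shape)
print(model_input.min(), model_input.max())
```

A real drawing would also need to be resized to 28x28 and roughly centered first; even then the prediction may be off, since hand-drawn Paint digits can still look quite different from MNIST.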
The background noise is annoying!
model.fit(x=x_train, y=y_train, batch_size=batch_size, epochs=epochs)
1 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in autograph_handler(*args, **kwargs)
1145 except Exception as e: # pylint:disable=broad-except
1146 if hasattr(e, "ag_error_metadata"):
-> 1147 raise e.ag_error_metadata.to_exception(e)
1148 else:
1149 raise