It looks like a recent numpy update changed np.load() to use allow_pickle=False by default. To accommodate this, I needed to change the following line in the balance_data.py script from:
train_data = np.load('training_data.npy')
to:
train_data = np.load('training_data.npy', allow_pickle=True)
Thank You!! xD
Appreciated, thanks!
Thanks I searched for that error!
Thanks!
1. Drive around for about 30 minutes using the directional keys
2. Run balance_data.py
3. Wonder why it doesn't work
4. Complain in the comments
5. Try again with WASD
Btw, your vids are awesome ;D
Hahaa 🤣
This series is just so amazing!
I love that you fail from time to time and your great explanations.
"every frame is its own snowflake"
This rocks!!!
Easier, faster and more efficient than the last method!!! GREAT WORK!
Note - we can all share our data on GitHub, and then everyone will have huge data sets to train from!
Anyone who wants to share some training data is welcome to; I will happily host it and validate it. I think I first want to come up with a final concept before I start building anything too large. I may try to further perfect this traffic speeder guy. I'm also curious about implementing the police evasion a bit more, and I'm not sure yet if I want to stay in 3rd person or move to 1st.
yeah!
In the next video, if you can map mouse input, it will bring your AI to the next level! :)
All the best! Hoping to see the next tutorial soon!
I love your videos. I'm not a great programmer at all but seeing someone with the skills you have still mess up and have fun with it makes me feel better about all the mistakes I make.
I laughed when he pasted in rights a third time at 10:43.
I'm following this series because I'm waiting for a neural-network-from-scratch series, and I wanna build an AI for my Android game that I made :) Much love Sentdex
I'm thinking that pre-allocating the memory for lefts, rights and forwards would be a lot faster. I was looking at this as inspiration for my own data (3-second audio files). I have about 700,000 of them, and pre-allocating memory helped make it blazingly fast. I was also using numpy arrays instead of lists though.
P.S. Still my favourite youtube channel. Sorry, Siraj.
A lot more cumbersome though :(
pythonprogramming.net/more-interesting-self-driving-python-plays-gta-v/?completed=/testing-self-driving-car-neural-network-python-plays-gta-v/
What is pre-allocation and how do you do it?
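In short: instead of growing Python lists with append, you create the full-size array once up front and then fill it in, which avoids repeated reallocation and copying. A minimal sketch (the sample count and frame shape are made up for illustration, not from the video):

import numpy as np

num_samples = 100000                    # assumed number of frames you plan to record
frame_shape = (60, 80)                  # assumed size of each greyscale frame

# pre-allocate the full arrays once instead of appending to lists
frames = np.empty((num_samples,) + frame_shape, dtype=np.uint8)
labels = np.empty((num_samples, 3), dtype=np.uint8)

for i in range(num_samples):
    frame = np.zeros(frame_shape, dtype=np.uint8)   # stand-in for a real captured frame
    key = [0, 1, 0]                                  # stand-in for the recorded key press
    frames[i] = frame
    labels[i] = key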
Maybe one shouldn't break the temporal/linear consistency of the data. Rather, pack the data into tuples of size 2 or 3, depending on your choice of threshold, following the Markov property. Then, rather than shuffling the entire thing, shuffle those tuples. For example:
1. Break the list into tuples of size, let's say, 3:
new_data = list(zip(data[::3], data[1::3], data[2::3]))
2. Shuffle the new data instead:
shuffle(new_data)
I might be wrong, but maybe this would be a better input to feed to a neural network than a single frame at a time.
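A slightly fuller sketch of that idea (not from the video; the chunk size and names are just for illustration): group consecutive frames into fixed-size chunks, shuffle the chunks, then flatten back so the order inside each chunk is preserved.

import numpy as np
from random import shuffle

train_data = list(np.load('training_data.npy', allow_pickle=True))

chunk_size = 3   # assumed group size, per the Markov-property argument above
# keep consecutive frames together; any leftover tail that doesn't fill a chunk is dropped
chunks = [train_data[i:i + chunk_size]
          for i in range(0, len(train_data) - chunk_size + 1, chunk_size)]
shuffle(chunks)                                      # shuffle whole chunks, not single frames
shuffled_data = [frame for chunk in chunks for frame in chunk]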
Is Training-data-vid.npy a separate file you trained, or did you add '-vid' to the original file to see what was going on? I'm a little confused on that, thanks!
he changes it back ruclips.net/video/wIxUp-37jVY/видео.html
For those who aren't able to get correct values for [W,A,D] / are always getting [0,1,0]:
1. Run the terminal/anaconda prompt as administrator & then run the python file.
2. Run the game as administrator
3. Turn on CAPS-LOCK
Hello, can anyone please explain the numbers which appear at 5:59? Do they contain the slopes, and what other information is in them?
Can anyone explain more about the purpose of balancing the data? Doesn't that make "left" and "right" more important and "straight" less important in the dataset, causing the model to generate more "left" and "right" signals than it should?
cheung dennis
It doesn't make the lefts or rights more important, it just shows more samples where the correct action is to turn left or right.
The problem is actually the reverse. Without balancing the data we're teaching the network that going straight is the correct answer 90% of the time. The easiest solution for the network to learn then is to just always say go straight, unless it's VERY confident that it should turn. Doing so guarantees it a 90%+ success rate - much better than having to actually learn what the correct option is.
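A quick way to see the problem this reply describes (just a sketch, not code from the video): count the labels in your own training data and check what accuracy a model would get by always answering with the most common key.

import numpy as np
from collections import Counter

train_data = np.load('training_data.npy', allow_pickle=True)

# each entry is assumed to be [image, one_hot_key] as in the series
counts = Counter(tuple(choice) for img, choice in train_data)
total = sum(counts.values())
label, n = counts.most_common(1)[0]
print(counts)
print('always-predict-majority accuracy:', n / total)   # often ~90% when mostly driving straight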
Hey Sentdex, I love this series. I have one question - how come you didn't divide your image data values by 255 so that they fit between 0 and 1? I thought this was important for the model to work with the data better. Was this step left out intentionally?
At what part was the labeling of images done?
Silly question, but wouldn't setting the lengths of lefts, rights and forwards equal just make each one equally probable, as if you'd pressed them an equal number of times, thus making it pointless to have done the training? What am I missing / not understanding?
Small question: I've created about 140 files, each of 500 iterations, but when I load the files I get different counter values. Anyone have a clue what is happening? Wondering if it is just a memory error or something. To clarify, the counter is for all of the files.
Any help? While running the balance script I get this error: AttributeError: 'NoneType' object has no attribute 'fileno'
You could mirror the 'right' and 'left' parts of the dataset, right? That way you should be able to augment the amount of non-forward data.
Let me know if this does not work.
btw, great tutorials :}
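A sketch of that augmentation (not from the video; it assumes the [A, W, D] one-hot order used in the series and that each key label is stored as a plain Python list): flip each left/right frame horizontally and swap its label.

import cv2
import numpy as np

train_data = list(np.load('training_data.npy', allow_pickle=True))

augmented = []
for img, choice in train_data:
    if choice == [1, 0, 0]:          # a 'left' example becomes a mirrored 'right' example
        augmented.append([cv2.flip(img, 1), [0, 0, 1]])
    elif choice == [0, 0, 1]:        # a 'right' example becomes a mirrored 'left' example
        augmented.append([cv2.flip(img, 1), [1, 0, 0]])

train_data = train_data + augmented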
Can you explain why you did forwd = forwd[:len(left)][:len(right)]?
The goal is to slice that list so it's only as long as the shortest list.
Let's say forward is 500 long, left is 205 long, and right is 298 long.
forward = forward[:len(left)] makes forward 205 long.
Then when we also do [:len(right)] , we're saying we'll slice up to 298, but the length is already 205, so the length is still 205. Hope that clears it up. If not, make some examples for yourself and play with it to see how it works.
equivalent of forward = forward[: min(len(left),len(right))]????
Yep, that's right.
Thanks man
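A tiny way to convince yourself of that equivalence (the numbers are just the ones from the example above):

forward = list(range(500))
left = list(range(205))
right = list(range(298))
print(len(forward[:len(left)][:len(right)]))       # 205
print(len(forward[:min(len(left), len(right))]))   # 205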
4:10 - 4:24 best part
Freeze-framing at 12:34, the math of your balance slices doesn't add up. Does it matter? What's making it not add up?
70365 - forwards
6708 - rights
+6427 - lefts
----------
83500 total!! Sweet, that's = len(final_data), but...
Taking the smallest (lefts) and trimming the others to its len() should give 3 x 6427 = 19281 < 22436. Also, 22436 % 3 != 0, so our final_data isn't really balanced?
What's causing this?
There should be rights = rights[:len(forwards)] instead of rights = rights[:len(rights)].
Thanks GeniusOD. I, and hopefully everyone else, caught that error in our code (if we coded along with the video). The GitHub code was corrected by Sentdex quickly, but I didn't notice he'd actually run the erroneous code in the video. Thanks for pointing that out.
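For anyone patching along at home, a sketch of the corrected trimming (variable names as in the video; double-check against the current GitHub balance_data.py):

# trim every class to the length of the shortest one
forwards = forwards[:len(lefts)][:len(rights)]
lefts = lefts[:len(forwards)]
rights = rights[:len(forwards)]    # the video ran rights[:len(rights)] here, which trims nothing

final_data = forwards + lefts + rights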
I think an LSTM + CNN will perform better than a simple CNN, because we should know our speed as a kind of short-term memory.
Where is the training data hosted?
I feel like training with imitation learning before actual DQN or DDPG is a good idea
Please do a neural network that silences the noise from your keyboard
+sminsms wouldn't that cancel out my neural network that amplifies my keyboard noise?
sentdex Shook 😲
Does the keyboard noise sound great only to me? :D
Can anyone please share their 'training_data.npy'?!
Could anyone please explain to me why the amount of data after balancing, which is 22436, is not equal to three times the count of the least common choice?
I think the shuffle here has a problem. I tried this code and found that after shuffling, the count of each choice changed. It confused me...
Convert train_data to a list before shuffling: train_data = list(train_data), then shuffle(train_data).
Really enjoy your videos; would appreciate it if you could balance your voice volume with your very loud keyboard. Maybe moving your mic or using a different keyboard would be super awesome! Still love your videos!
This works as of 2020-02-24:

import numpy as np
import cv2

# allow_pickle=True is needed on newer numpy versions to load the object array
training_data = np.load("training_data-vid.npy", allow_pickle=True)

for data in training_data:
    img = data[0]       # the captured frame
    choice = data[1]    # the recorded key-press label
    cv2.imshow("test", img)
    print(choice)
    if cv2.waitKey(25) & 0xFF == ord("q"):
        cv2.destroyAllWindows()
        break
After getting my training data in training_data.npy and running balance_data.py, I get a 'None' value for each iteration. Can someone tell me what my error is?
SOMEONE PLEASE REPLY
I’ve had this issue as well, I believe it’s a bug that’s ongoing
Do you have a link to your source code? No one is responding probably because it's hard to know what you did wrong when we can't see your code.
It's amazing
You imported the wrong shuffle function;
it should be np.random.shuffle, not random.shuffle.
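For context, random.shuffle swaps rows through numpy views, which can silently duplicate rows of a 2-D numpy array; np.random.shuffle shuffles along the first axis in place and avoids that. A minimal sketch (assuming the data was saved with np.save as in the series):

import numpy as np

train_data = np.load('training_data.npy', allow_pickle=True)
np.random.shuffle(train_data)   # keeps each [img, choice] pair intact, just reorders them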
Mine is always showing [0,1,0]
Mine is kind of similar, [1,0,0]. Did you get any solution?
Caps-Lock
I FOUND THE SOLUTION!!!
1. Run the terminal/anaconda prompt as administrator & then run the python file.
2. Run the game as administrator
3. Turn on CAPS-LOCK
play GTA-V FOR SCIENCE!!
right 😂
You could try balancing your data using weighted_cross_entropy_with_logits
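An alternative to throwing samples away is to keep everything and weight the loss instead. A sketch of computing inverse-frequency class weights in plain numpy (the exact weighted loss you feed them into, such as the function named above, depends on your framework):

import numpy as np
from collections import Counter

train_data = np.load('training_data.npy', allow_pickle=True)

# count how often each one-hot key label occurs in the unbalanced training data
counts = Counter(tuple(choice) for img, choice in train_data)
total = sum(counts.values())

# rare classes (turns) get larger weights than common ones (straight)
class_weights = {label: total / (len(counts) * n) for label, n in counts.items()}
sample_weights = np.array([class_weights[tuple(choice)] for img, choice in train_data])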
Have you trained it using an RNN?
Should work better I guess.
from pandas import 🐼🐼🐼🐼🐼
help
Host the data separately, don't bundle it with the code. Thanks.
It's already been hosted, and it's not bundled with the code.
where can I find the training data?