Balancing self-driving training data - Python plays GTA p.10

  • Published: 14 Dec 2024

Comments • 72

  • @pwal6773
    @pwal6773 5 years ago +34

    It looks like a recent NumPy update changed np.load() so that allow_pickle defaults to False. To accommodate this update, I needed to change the following line in the balance_data.py script from:
    train_data = np.load('training_data.npy')
    to:
    train_data = np.load('training_data.npy', allow_pickle=True)
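
A minimal sketch of the behaviour described above, assuming a NumPy release where allow_pickle defaults to False. The tiny object array and the temporary file are made up for illustration; they only stand in for the real training_data.npy.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for training_data.npy: rows of [image, one-hot choice].
data = np.empty((2, 2), dtype=object)
data[0, 0] = np.zeros((2, 2), dtype=np.uint8)
data[0, 1] = [0, 1, 0]
data[1, 0] = np.ones((2, 2), dtype=np.uint8)
data[1, 1] = [1, 0, 0]

path = os.path.join(tempfile.mkdtemp(), "training_data.npy")
np.save(path, data)  # object arrays are stored with pickle

try:
    np.load(path)  # refuses pickled objects when allow_pickle is False
    default_load_failed = False
except ValueError:
    default_load_failed = True

# Opting in explicitly loads the same file.
train_data = np.load(path, allow_pickle=True)
```

Only pass allow_pickle=True for files you created yourself or otherwise trust, since unpickling can execute arbitrary code.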

  • @XenotriX
    @XenotriX 7 years ago +55

    1. Drive around for about 30 minutes using the directional keys
    2. Run balance_data.py
    3. Wonder why it doesn't work
    4. Complain in the comments
    5. Try again with WASD
    Btw, your vids are awesome ;D

  • @alexnick7119
    @alexnick7119 6 years ago +7

    This series is just so amazing!
    I love that you fail from time to time and your great explanations.
    "every frame is its own snowflake"

  • @rohanarora2728
    @rohanarora2728 7 years ago +4

    This rocks!!!
    Easier, faster and more efficient than the last method!!! GREAT WORK!
    Note: we can all share our data on GitHub, so everyone will have huge data sets to train from!

    • @sentdex
      @sentdex 7 years ago +3

      Anyone who wants to share some training data is welcome to, I will happily host it and validate it. I think I first want to come up with a final concept before I start building anything too large. I may try to further perfect this traffic speeder guy. Also curious about implementing the evading police a bit more. I am also not sure if I want to stay in 3rd person or move to 1st.

    • @rohanarora2728
      @rohanarora2728 7 years ago

      Yeah!
      If you can map mouse input in the next video, it will bring your AI to the next level! :)
      All the best! Hoping to see the next tutorial soon!

  • @Damaged7
    @Damaged7 7 years ago

    I love your videos. I'm not a great programmer at all but seeing someone with the skills you have still mess up and have fun with it makes me feel better about all the mistakes I make.

  • @sethbettwieser
    @sethbettwieser 6 years ago +2

    I laughed when he pasted in rights a third time at 10:43.

  • @bohdankhv
    @bohdankhv 3 years ago

    I'm following this series because I'm waiting for the neural network from scratch series, and I wanna build an AI for the Android game that I made :) Much love, Sentdex

  • @theknight2510
    @theknight2510 7 years ago +2

    I'm thinking that pre-allocating the memory for lefts, rights and forwards would be a lot faster. I was looking at this as inspiration for my own data (3-second audio files). I have about 700,000 of them, and pre-allocating memory helped make it blazingly fast. I was also using numpy arrays instead of lists though.
    P.S. Still my favourite youtube channel. Sorry, Siraj.

    • @theknight2510
      @theknight2510 7 years ago

      A lot more cumbersome though :(

    • @andrybratun7064
      @andrybratun7064 6 years ago

      pythonprogramming.net/more-interesting-self-driving-python-plays-gta-v/?completed=/testing-self-driving-car-neural-network-python-plays-gta-v/

    • @shivamraisharma1474
      @shivamraisharma1474 5 years ago

      What is pre-allocation, and how do you do it?
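
A rough sketch of what pre-allocation means here, assuming frames are stored in a NumPy array as the commenter above mentions. The frame count and frame size are hypothetical, and this says nothing about how much faster it would be for the GTA data.

```python
import numpy as np

n_frames, height, width = 100, 60, 80  # hypothetical sizes

# Grow-as-you-go: a Python list that gets longer with every append.
grown = []
for i in range(n_frames):
    grown.append(np.full((height, width), i % 256, dtype=np.uint8))

# Pre-allocated: reserve one block of memory up front, then fill it by index.
frames = np.empty((n_frames, height, width), dtype=np.uint8)
for i in range(n_frames):
    frames[i] = i % 256  # the value is broadcast into the reserved slot
```

The catch, as noted above, is that you need to know the final size before you start, which makes the code more cumbersome.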

  • @tuhinmukherjee8141
    @tuhinmukherjee8141 3 years ago

    Maybe one shouldn't break the temporal/linear consistency of the data. Instead, pack the data into tuples of size 2 or 3, depending on your choice of threshold, following the Markov property. Rather than shuffling the entire thing, shuffle those tuples. For example:
    1. Break the list into tuples of size, let's say, 3:
    new_data = list(zip(data[::3], data[1::3], data[2::3]))
    2. Shuffle the new data instead:
    shuffle(new_data)
    I might be wrong, but maybe this could be a better input to feed to a neural network than a single frame at a time.
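
The two steps above can be sketched as a small helper. The name shuffle_in_chunks and the toy frame list are made up for illustration; note that zip needs to be wrapped in list() on Python 3 before random.shuffle can work on it.

```python
from random import shuffle


def shuffle_in_chunks(data, chunk_size=3):
    """Shuffle groups of consecutive samples, keeping the order inside each group."""
    # zip stops at the shortest slice, so leftover samples at the end are dropped.
    chunks = list(zip(*(data[i::chunk_size] for i in range(chunk_size))))
    shuffle(chunks)  # needs a real list; a bare zip object cannot be shuffled
    return chunks


frames = list(range(10))  # stand-in for ten consecutive samples
chunks = shuffle_in_chunks(frames)
```

With ten samples and a chunk size of 3, the tenth sample is silently dropped, which may or may not matter for a large data set.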

  • @cashdogg411
    @cashdogg411 6 years ago +2

    Is Training-data-vid.npy a separate file you trained, or did you add '-vid' to the original file to see what was going on? I'm a little confused on that, thanks!

    • @saint-jiub
      @saint-jiub 2 years ago

      he changes it back ruclips.net/video/wIxUp-37jVY/видео.html

  • @Parth.Deshpande
    @Parth.Deshpande 3 years ago

    For those who're not able to get correct values for [W,A,D] / getting [0,1,0] always.
    1. Run the terminal/anaconda prompt as administrator & then run the python file.
    2. Run the game as administrator
    3. Turn on CAPS-LOCK

  • @abdengineer6225
    @abdengineer6225 4 years ago

    Hello, can anyone please explain the numbers which appear at 5:59? Do they contain the slopes, and what other information is in them?

  • @dennischeung3745
    @dennischeung3745 7 years ago +1

    Can anyone explain more about the purpose of balancing the data? Doesn't that make "left" and "right" more important and "straight" less important in the dataset, causing the model to generate more "left" and "right" signals than it should?

    • @ashwhall
      @ashwhall 7 years ago

      cheung dennis
      It doesn't make the lefts or rights more important, it just shows more samples where the correct action is to turn left or right.
      The problem is actually the reverse. Without balancing the data we're teaching the network that going straight is the correct answer 90% of the time. The easiest solution for the network to learn then is to just always say go straight, unless it's VERY confident that it should turn. Doing so guarantees it a 90%+ success rate - much better than having to actually learn what the correct option is.
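
To put a number on that argument, here is a quick calculation using the class counts another commenter reads off the video at 12:34 (70365 forwards, 6708 rights, 6427 lefts). With those counts the "always go straight" shortcut scores roughly 84% rather than 90%, but the reasoning is the same.

```python
counts = {"forward": 70365, "right": 6708, "left": 6427}

total = sum(counts.values())
majority = max(counts, key=counts.get)

# Accuracy of a model that ignores the image and always picks the majority class.
baseline_accuracy = counts[majority] / total
print(f"always-{majority} accuracy: {baseline_accuracy:.1%}")  # always-forward accuracy: 84.3%

# After balancing, every class has the same count, so the same shortcut
# is right only one time in three.
balanced_baseline = 1 / len(counts)
```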

  • @TheDeadking100
    @TheDeadking100 3 years ago

    Hey Sentdex, I love this series. I have one question: how come you didn't divide your image data values by 255, so that they fit between 0 and 1? I thought this was important for the model to work with the data better. Was this step left out intentionally?

  • @jyashi1
    @jyashi1 5 years ago

    At what point was the labeling of the images done?

  • @uobscdarkside732
    @uobscdarkside732 7 years ago +2

    Silly question, but wouldn't setting the lengths of lefts, rights and forwards equal just give each one equal probability, as if you'd pressed them an equal number of times, thus making the training pointless? What am I missing / not understanding?

  • @h0len
    @h0len 6 years ago

    Small question: I've created about 140 files, each of 500 iterations, but when I load the files I get different counter values. Does anyone have a clue what is happening? I'm wondering if it is just a memory error or something. To clarify, the counter is for all of the files.

  • @abbasshodroj6805
    @abbasshodroj6805 5 years ago

    Any help? While running the balance script I get this error: AttributeError: 'NoneType' object has no attribute 'fileno'

  • @xR0G3R
    @xR0G3R 7 years ago

    You could mirror your 'right' and 'left' part of the dataset, right? That way you should be able to augment the number of the not-forward data.
    Let me know if this does not work.
    btw, great tutorials :}
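
A minimal sketch of the mirroring idea, under two assumptions that are not confirmed in this thread: each label is a one-hot list in [A, W, D] order, and an image can be treated as a list of pixel rows. The helper name mirror_sample is made up for illustration.

```python
def mirror_sample(image, choice):
    """Return a horizontally flipped copy of one (image, choice) sample."""
    flipped_image = [row[::-1] for row in image]  # reverse every pixel row
    flipped_choice = choice[::-1]  # assumed [A, W, D] order: left <-> right
    return flipped_image, flipped_choice


image = [[1, 2, 3],
         [4, 5, 6]]  # toy 2x3 "frame"
left_turn = [1, 0, 0]

mirrored_image, mirrored_choice = mirror_sample(image, left_turn)
```

Whether mirrored frames actually help is untested here; anything asymmetric on screen, such as the minimap or driving on one side of the road, gets mirrored too.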

  • @mugundhanbalaji
    @mugundhanbalaji 7 years ago +2

    can you explain why you did forwd= forwd[:len(left)][:len(right)]???

    • @sentdex
      @sentdex 7 years ago +9

      The goal is to slice that list so it's only as long as the shortest list.
      Let's say forward is 500 long, left is 205 long, and right is 298 long.
      forward = forward[:len(left)] makes forward 205 long.
      Then when we also do [:len(right)] , we're saying we'll slice up to 298, but the length is already 205, so the length is still 205. Hope that clears it up. If not, make some examples for yourself and play with it to see how it works.

    • @mugundhanbalaji
      @mugundhanbalaji 7 years ago +3

      equivalent of forward = forward[: min(len(left),len(right))]????

    • @sentdex
      @sentdex 7 years ago +4

      Yep, that's right.

    • @mugundhanbalaji
      @mugundhanbalaji 7 years ago

      Thanks man
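
The example from this thread, played out in code with the same lengths (500, 205 and 298). The list contents are placeholders; only the lengths matter.

```python
forward = list(range(500))
left = list(range(205))
right = list(range(298))

# Chained slices: the second slice asks for up to 298 items,
# but only 205 are left, and slicing past the end is not an error.
chained = forward[:len(left)][:len(right)]

# The equivalent single slice suggested above.
shortest = forward[:min(len(left), len(right))]

print(len(chained), len(shortest))  # 205 205
```

Neither form shortens forward if it is already the shortest of the three lists.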

  • @outroutono4937
    @outroutono4937 1 year ago

    4:10 - 4:24 best part

  • @WoodyWilliams
    @WoodyWilliams 7 years ago +1

    Freeze-framing at 12:34 the math of your balance slices doesn't add up. Does it matter? What's making it not add up?
    70365 - forwards
    6708 - rights
    +6427 - lefts
    ----------
    83500 total!! Sweet, that's = len(final_data), but...
    Taking the smallest (lefts) and trimming the others to its len() should create 3x6427 = 19281 < 22436. Also 22436 % 3 != 0. Our final_data isn't really balanced??
    What's causing this?

    • @geniousofdarkness
      @geniousofdarkness 7 years ago

      There should be rights = rights[:len(forwards)] instead of rights = rights[:len(rights)].

    • @WoodyWilliams
      @WoodyWilliams 7 years ago +1

      Thanks GeniusOD. I and hopefully everyone else caught that error in their code (if they coded alongside the video). The github code was corrected by Sentdex quickly but I didn't notice he'd actually run the erroneous code in the video. Thanks for pointing that out.
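
A sketch of a balancing step that trims every class to the size of the smallest one, so a slip like rights[:len(rights)] cannot leave one class longer than the others. This is not the corrected script from GitHub; the balance helper and the placeholder lists are made up for illustration and are only sized with the counts from the freeze-frame.

```python
def balance(lefts, rights, forwards):
    """Trim all three classes to the size of the smallest one."""
    n = min(len(lefts), len(rights), len(forwards))
    return lefts[:n], rights[:n], forwards[:n]


# Placeholder samples, sized like the counts quoted above.
lefts = ["left"] * 6427
rights = ["right"] * 6708
forwards = ["forward"] * 70365

lefts, rights, forwards = balance(lefts, rights, forwards)
final_data = forwards + lefts + rights

print(len(final_data))  # 19281
```

A quick sanity check after balancing is that len(final_data) is divisible by 3.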

  • @dosonleung536
    @dosonleung536 6 years ago +2

    I think LSTM + CNN will play better than a simple CNN, because the model should know its speed as short-term memory.

  • @akcricketlive6029
    @akcricketlive6029 6 years ago

    Where is the training data hosted?

  • @junweima
    @junweima 7 years ago

    I feel like training with imitation learning before actual DQN or DDPG is a good idea

  • @sminsms
    @sminsms 7 years ago +10

    Please do a neural network that silences the noise from your keyboard

    • @sentdex
      @sentdex 7 years ago +35

      +sminsms wouldn't that cancel out my neural network that amplifies my keyboard noise?

    • @FalloutNewNarwhal
      @FalloutNewNarwhal 6 years ago

      sentdex Shook 😲

    • @emretatbak
      @emretatbak 5 years ago

      Does the keyboard noise sound great only to me? :D

  • @sandeepganesh7397
    @sandeepganesh7397 4 years ago

    Can anyone please share their 'training_data.npy' ?!

  • @davidwang4461
    @davidwang4461 7 years ago

    Could anyone please explain to me why the number of data after balancing, which is 22436, is not equal to three times the least number of choices?

    • @davidwang4461
      @davidwang4461 7 years ago

      I think the shuffle here has a problem. I tried this code and found that after shuffling, the count of each choice had changed, which confused me...

    • @asharkhan6714
      @asharkhan6714 6 years ago

      Pass train_data to shuffle as a list: train_data = list(train_data), then shuffle(train_data)

  • @bchoor
    @bchoor 7 years ago

    really enjoy your videos; would appreciate if you can balance your voice volume with your very loud keyboard. maybe possibly moving your mic, or using a different keyboard would be super awesome! still love your videos!

  • @deknas1407
    @deknas1407 4 years ago

    This works as of 2020-02-24:
    import numpy as np
    import pandas as pd
    from _collections import _count_elements
    from random import shuffle
    import cv2
    trainin_data = np.load("training_data-vid.npy", allow_pickle=True)
    for data in trainin_data:
        img = data[0]      # the captured frame
        choice = data[1]   # the recorded key press
        cv2.imshow("test", img)
        print(choice)
        # press q to stop playback
        if cv2.waitKey(25) & 0xFF == ord("q"):
            cv2.destroyAllWindows()
            break

  • @TechAspiron
    @TechAspiron 6 years ago

    After getting my training data in training_data.npy and running balance_data.py, I get 'None' value for each iteration. Can someone tell me what my error is?

    • @TechAspiron
      @TechAspiron 6 years ago

      SOMEONE PLEASE REPLY

    • @dwightschrute782
      @dwightschrute782 6 years ago

      I’ve had this issue as well, I believe it’s a bug that’s ongoing

    • @gavargas22
      @gavargas22 5 years ago +1

      Do you have a link to your source code? Probably no one is responding because it is hard to know what went wrong without seeing your code.

  • @i_norwe_i
    @i_norwe_i 6 years ago +1

    It's amazing

  • @jfliu730
    @jfliu730 2 years ago

    you import the wrong shuffle function,
    it should be np.random.shuffle,not random.shuffle
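
A small sketch of the suggestion above, using a toy 2-D array instead of the real training data. As far as I understand it, random.shuffle swaps items with x[i], x[j] = x[j], x[i], and on a multi-dimensional NumPy array those items are views of rows, so a swap can overwrite one row with another and change the class counts. np.random.shuffle reorders rows in place without that problem.

```python
from collections import Counter

import numpy as np

data = np.arange(20).reshape(10, 2)  # ten distinct toy rows
rows_before = Counter(map(tuple, data.tolist()))

np.random.shuffle(data)  # shuffles in place along the first axis
rows_after = Counter(map(tuple, data.tolist()))

# Same rows, same counts, only the order may differ.
rows_preserved = rows_before == rows_after
```

Converting the array to a plain Python list first and then using random.shuffle is another way to avoid the issue.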

  • @yashshrivastava1648
    @yashshrivastava1648 6 years ago +1

    Mine always shows [0,1,0]

    • @cangunen2165
      @cangunen2165 5 years ago

      Mine is kind of similar, [1,0,0]. Did you find a solution?

    • @Parth.Deshpande
      @Parth.Deshpande 3 years ago +1

      Caps-Lock

    • @Parth.Deshpande
      @Parth.Deshpande 3 years ago

      I FOUND THE SOLUTION!!!
      1. Run the terminal/anaconda prompt as administrator & then run the python file.
      2. Run the game as administrator
      3. Turn on CAPS-LOCK

  • @tuhinmukherjee8141
    @tuhinmukherjee8141 3 years ago

    play GTA-V FOR SCIENCE!!
    right 😂

  • @cerpokas
    @cerpokas 7 years ago

    You could try balancing your data using weighted_cross_entropy_with_logits
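
A plain-Python sketch of the general idea behind a class-weighted loss, as an alternative to throwing samples away. This is not the TensorFlow function named above, just an illustration of weighting; the inverse-frequency weights and the helper name are assumptions, and the counts are the ones quoted elsewhere in these comments.

```python
import math

counts = {"left": 6427, "forward": 70365, "right": 6708}
total = sum(counts.values())

# Inverse-frequency weights: rarer classes get a larger weight.
weights = {name: total / (len(counts) * n) for name, n in counts.items()}


def weighted_cross_entropy(probs, true_class):
    """Cross-entropy of one prediction, scaled by the true class's weight."""
    return -weights[true_class] * math.log(probs[true_class])


# The same uncertain prediction costs more when the true answer is a rare class.
uniform = {"left": 1 / 3, "forward": 1 / 3, "right": 1 / 3}
loss_left = weighted_cross_entropy(uniform, "left")
loss_forward = weighted_cross_entropy(uniform, "forward")
```

With these weights a mistake on a left turn counts roughly eleven times as much as a mistake on a forward frame, so the full data set can be kept.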

  • @rohanshankar4576
    @rohanshankar4576 7 years ago

    Have you trained it using a RNN?
    Should work better I guess.

  • @DimulyaPlay
    @DimulyaPlay 3 years ago

    from pandas import 🐼🐼🐼🐼🐼

  • @Ятич
    @Ятич 2 months ago

    help

  • @herp_derpingson
    @herp_derpingson 7 years ago

    Host the data separately, dont bundle it with the code. Thanks.

    • @sentdex
      @sentdex 7 years ago +2

      It's already been hosted, and it's not bundled with the code.

    • @akcricketlive6029
      @akcricketlive6029 6 years ago

      where can I find the training data?