PyTorch Text Generator with Character-Level LSTM

  • Published: 24 Oct 2024

Comments • 52

  • @foobar1672
    @foobar1672 3 years ago +5

    8:00 -- In the function *get_random_batch()* you should generate a random chunk for EACH batch element separately. Just put these 3 lines of code inside a *for* loop over the batch in the *get_random_batch()* function:

        for i in range(self.batch_size):
            start_idx = random.randint(0, len(file) - self.chunk_len)
            end_idx = start_idx + self.chunk_len + 1
            text_str = file[start_idx:end_idx]

    Also, in the *generate()* function you should replace self.batch_size with 1 in this line of code, because *generate()* works on a single sequence:

        hidden, cell = self.rnn.init_hidden(1)

    Great tutorial. I've learned a lot. Thank you!
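
    Putting the fix together, a minimal sketch of the corrected *get_random_batch()* (an editor's sketch, not the video's verbatim code: it assumes the attributes self.batch_size and self.chunk_len, the loaded string file, a char_tensor() helper that maps characters to index tensors, and import random, torch; note the extra -1 so the chunk_len+1 slice stays in range):

        def get_random_batch(self):
            text_input = torch.zeros(self.batch_size, self.chunk_len, dtype=torch.long)
            text_target = torch.zeros(self.batch_size, self.chunk_len, dtype=torch.long)
            for i in range(self.batch_size):
                # sample a fresh random chunk for each batch element
                start_idx = random.randint(0, len(file) - self.chunk_len - 1)
                end_idx = start_idx + self.chunk_len + 1
                text_str = file[start_idx:end_idx]
                text_input[i, :] = self.char_tensor(text_str[:-1])   # characters 0..chunk_len-1
                text_target[i, :] = self.char_tensor(text_str[1:])   # shifted by one
            return text_input, text_target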

  • @ashwinjayaprakash7991
    @ashwinjayaprakash7991 3 years ago

    It was only there for a second, but I love the Firewatch background. 7:25

  • @suibaba
    @suibaba 4 years ago

    Sir, a great initiative on your part...
    Sir, one request please: would you make a speech-to-text recognition system, like the captions on YouTube videos? I searched on Google, YouTube, etc. but never found an easy approach to learn from. Watching your videos, things don't feel difficult, because sometimes the teaching and the content matter, and both are really great in your videos, sir...

    • @AladdinPersson
      @AladdinPersson  4 years ago +1

      That is an interesting idea, will look into it sir!

    • @suibaba
      @suibaba 4 years ago

      @@AladdinPersson Eager to watch this in the future, and thanks for the reply...

  • @akaskmsskssk6927
    @akaskmsskssk6927 3 years ago

    I actually used the same model to write stories. Nice tutorial!

  • @yurig1756
    @yurig1756 4 years ago +1

    Looking forward to a reinforcement learning video )))

    • @AladdinPersson
      @AladdinPersson  4 years ago +1

      I'm looking forward to learning more about RL as well; I have only dabbled in it before :)

    • @kae4881
      @kae4881 3 years ago

      Same, waiting for RL series!

  • @asiskumarroy4470
    @asiskumarroy4470 4 years ago +1

    I guess there is a typo in the forward method when feeding the inputs to the LSTM cell. I don't think there was any need for out.unsqueeze(1), since the output we get after passing x through the embedding layer already has the right dimensions. You could simply have used self.lstm(out, (hidden, cell)).
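
    For reference, a standalone shape check (an editor's sketch with made-up sizes, not the video's exact code): a batch_first LSTM expects 3-D input, so whether unsqueeze(1) is needed depends on whether the embedding output already carries a sequence dimension:

        import torch
        import torch.nn as nn

        embed = nn.Embedding(100, 16)             # vocab 100, embed_size 16
        lstm = nn.LSTM(16, 32, batch_first=True)  # wants (batch, seq_len, 16)

        x2d = torch.randint(0, 100, (4, 10))      # (batch, seq_len) of char indices
        out, _ = lstm(embed(x2d))                 # embed out is (4, 10, 16): 3-D, fine as-is

        x1d = torch.randint(0, 100, (4,))         # one character per sample
        out, _ = lstm(embed(x1d).unsqueeze(1))    # embed out is (4, 16): needs (4, 1, 16)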

  • @mukundsrinivas8426
    @mukundsrinivas8426 3 years ago +1

    Excellent videos... but keep the code shorter. That makes it easier to understand.

  • @palashkamble2325
    @palashkamble2325 4 years ago

    Hey man, I came here after watching Udacity's deep learning course. I must say that I got more intuitions and skills by watching your series of deep learning. It was so exciting to code along with your videos. Thank you so much for making this playlist. Keep doing the great work.
    One quick question though -
    What should the ideal embed_size be? Is it correlated with vocab_size, or do we just have to experiment with different values?

    • @AladdinPersson
      @AladdinPersson  4 years ago +2

      I'm very happy to hear that; it really encourages me to make more videos. Which Udacity course is that, and how did you find it? I'm looking to do some more courses just to be able to give people better advice on which courses I think are best :)
      The ideal value of the embed size is not correlated to the vocabulary size. When going through an embedding layer we create a (vocab_size, embed_size) dimensional matrix, and then for each word we pick out the corresponding embed_size-dimensional vector for that word. The reason is that we can then relate a word's significance to other words: we could imagine (for example) that the first dimension represents how much the input word correlates to gender, the second might be fruit, the third might be vegetable, etc. Now, we really have no idea how the network chooses these representations, but at least we can have some intuition. Having words go through this embedding means we can represent them much, much better (rather than inputting a one-hot vector with essentially no information). Alright, this was a rant; anyways, the embed_size is a hyperparameter that you can play around with. It's not correlated to anything except how many dimensions you want each word to be represented in, and 300 is a good default value.
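
      As a small illustration of that lookup (an editor's sketch, arbitrary sizes):

          import torch
          import torch.nn as nn

          vocab_size, embed_size = 10000, 300
          embedding = nn.Embedding(vocab_size, embed_size)  # a (10000, 300) matrix

          word_idx = torch.tensor([42])   # index of some word in the vocabulary
          vec = embedding(word_idx)       # picks out row 42 -> shape (1, 300)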

    • @palashkamble2325
      @palashkamble2325 4 years ago

      @@AladdinPersson Superb, got it. Thank you so much again. And this is the Udacity course I was talking about - www.udacity.com/course/deep-learning-pytorch--ud188

  • @adesiph.d.journal461
    @adesiph.d.journal461 3 years ago +1

    Hello! Great video. Just a small clarification: at 12:12, while you are creating the batches, why do you choose text_str[1:]? From my understanding, if you have text_str = "ABCD", then given "ABC" you would like to predict "D", so shouldn't that just be text_str[-1]? I guess I am missing something.

    • @abhisekpanigrahi1033
      @abhisekpanigrahi1033 1 year ago

      Hello, it is because we create a chunk of 250 inputs, so that at training time we feed the model one character at a time, for 250 steps, each step predicting the next character.
      And yes, since text_str has length 251, the first 250 characters go into the input batch and the last 250 go into the target batch,
      so the input covers characters 0-249 and the targets cover characters 1-250, i.e. the input shifted by one, processed character-wise at training time. I hope this is clear; otherwise let me know.
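
      In slice form (a sketch; file and start_idx as in the video's batch code):

          chunk_len = 250
          text_str = file[start_idx:start_idx + chunk_len + 1]  # 251 characters
          input_seq = text_str[:-1]   # characters 0..249
          target_seq = text_str[1:]   # characters 1..250 -- the input shifted by one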

  • @МаксимЧапаев-в6ь
    @МаксимЧапаев-в6ь 4 years ago

    Great video btw!

  • @abhisekpanigrahi1033
    @abhisekpanigrahi1033 1 year ago

    Hello, can you please explain how the input and output go into the model during training?

  • @Suman-zm7wx
    @Suman-zm7wx 2 years ago

    Hey, how about approaching the same task with something a bit more advanced: an LSTM VAE, maybe an LSTM GAN, or maybe a GAN-RL model? Do give it a second thought.
    Message from Mufasa:
    Love you man, thanks for dealing with these complicated topics with such ease.
    God bless you, my friend.

  • @DarkF4lcon
    @DarkF4lcon 3 years ago

    Hi, I had a question: how do you save this model to a file? I am getting an error about missing positional arguments when I try to save the RNN model.
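
    For what it's worth, the usual PyTorch pattern is to save the state dict of a model *instance* rather than the class itself (an editor's sketch; the constructor arguments here are placeholders for whatever your RNN takes):

        torch.save(model.state_dict(), "rnn.pth")      # save learned weights

        model = RNN(input_size, hidden_size, num_layers, output_size)  # rebuild the model
        model.load_state_dict(torch.load("rnn.pth"))   # restore weights
        model.eval()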

  • @knowaiknowfuture3076
    @knowaiknowfuture3076 4 years ago +1

    I think there is a typo in the definition of the LSTM layer (expected: input_size; you provided: hidden_size).

    • @AladdinPersson
      @AladdinPersson  4 years ago

      The notation I used was a bit misleading: we first run the input through the embedding layer (whose output size is hidden_size), and that output is then sent to the LSTM, so the LSTM's input size is indeed hidden_size here. I should perhaps have called it embed_size, which would have been clearer.
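
      In other words, a clearer naming of the two layers would be (a sketch):

          embed = nn.Embedding(n_characters, embed_size)  # each char -> embed_size vector
          lstm = nn.LSTM(embed_size, hidden_size)         # input_size must equal embed_size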

  • @thecros1076
    @thecros1076 4 years ago

    Hey man, the character-level LSTM can also be used to generate musical notes. Please make a short video on how to process the input data for such a project... e.g. we can take a MIDI file and convert it to a CSV file as input data for the LSTM, and it will then be able to generate new MIDI notes that can be played back in software. Also, I did not understand the data flow in the code (the dimensions and so on). If you have a good blog to read about this, please do share; I searched many blogs but did not get it. Please do help...

    • @AladdinPersson
      @AladdinPersson  4 years ago

      I vaguely remember doing that in some deep learning course, but I don't have much experience with that specific application, although it sounds very interesting; definitely an idea for the future! It would help if you were more specific about which dimensions you're confused about, or about any specifics you didn't understand, to help pinpoint what resources & explanations I can give.

    • @thecros1076
      @thecros1076 4 years ago

      @@AladdinPersson Actually, yesterday I came across a piece of software that converts MIDI to CSV... so the CSV file has the data stored in it and can be used as a sequence input to the LSTM. Do have a look at it; we will just need to change the input and all the other things will stay the same. Regarding the video: I was a little confused about the generator function, but after watching it about 5-6 times it became clear. Also, is it possible to input a one-hot encoded character?

  • @tedp9146
    @tedp9146 4 years ago

    One question about nn.Embedding:
    Does it take in just ONE one-hot char? So I can't input a list of one-hot vectors (which would represent the entire name)?

    • @AladdinPersson
      @AladdinPersson  4 years ago +1

      Let's say you have a vocabulary of length v; then you would have nn.Embedding(v, embed_size). If you send in a single element, the numerical index of the word/character in your vocabulary, the embedding takes out the row vector (of length embed_size) at that index. In other words, the embedding is simply a matrix, and all we're doing is mapping a single element to some embed_size-dimensional vector that can give a better representation. You can't input a one-hot vector directly (I'm pretty sure, although you can easily try), but you can simply input the index values, which is simpler than first creating one-hot encodings; a whole tensor of indices at once works too, as sketched below.
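
      A quick sketch of both points (an editor's example): you pass a whole tensor of indices at once, and the lookup is equivalent to multiplying one-hot vectors by the embedding matrix:

          import torch
          import torch.nn as nn
          import torch.nn.functional as F

          emb = nn.Embedding(10, 4)       # vocabulary 10, embed_size 4
          idx = torch.tensor([3, 7, 1])   # three characters at once (e.g. a name)
          print(emb(idx).shape)           # torch.Size([3, 4])

          one_hot = F.one_hot(idx, num_classes=10).float()
          assert torch.allclose(one_hot @ emb.weight, emb(idx))  # same result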

  • @batak2571
    @batak2571 1 year ago

    Hello, I can't find the code related to this video. Would you please drop a link?

  • @pradipsinhvansadia6437
    @pradipsinhvansadia6437 4 years ago

    Hey, I had one quick question. You could send the entire chunk of 250 characters into the model's run method in one go, instead of the loop, using sequence length = 250, right?

    • @AladdinPersson
      @AladdinPersson  4 years ago

      Yeah, I think that should be possible. I'm wondering a little about what you mean; could you explain in a bit more detail? Which lines do you want to change?
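
      A sketch of what that could look like (an editor's example with made-up sizes; batch_first=True assumed):

          import torch
          import torch.nn as nn

          batch_size, chunk_len, n_chars = 32, 250, 100
          embed = nn.Embedding(n_chars, 64)
          lstm = nn.LSTM(64, 256, batch_first=True)
          fc = nn.Linear(256, n_chars)

          x = torch.randint(0, n_chars, (batch_size, chunk_len))  # whole chunk at once
          out, (hidden, cell) = lstm(embed(x))                    # (32, 250, 256)
          logits = fc(out)                                        # (32, 250, 100)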

    • @vansadiakartik
      @vansadiakartik 4 years ago

      @@AladdinPersson I am very new to LSTMs and PyTorch. I was just wondering if the model would be exactly the same if we send the entire seq_len together.

    • @vansadiakartik
      @vansadiakartik 4 years ago

      Also, as this is a character-level LSTM, I don't see what embeddings can learn from characters. Would one-hot encoding be better for this model?

    • @AladdinPersson
      @AladdinPersson  4 years ago

      @@vansadiakartik Yeah, I agree. I'm not sure the embedding is particularly relevant here (but I guess it doesn't hurt either). I think you could use one-hot encoding here and it would be a good idea.
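
      A sketch of the one-hot variant (an editor's example; the LSTM input_size then becomes the vocabulary size):

          import torch
          import torch.nn as nn
          import torch.nn.functional as F

          n_chars = 100
          lstm = nn.LSTM(input_size=n_chars, hidden_size=256, batch_first=True)

          x = torch.randint(0, n_chars, (32, 250))             # character indices
          one_hot = F.one_hot(x, num_classes=n_chars).float()  # (32, 250, 100)
          out, _ = lstm(one_hot)                               # (32, 250, 256)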

  • @thecros1076
    @thecros1076 4 years ago

    We had all the English characters in string.printable... how can we get the characters of other languages, like Hindi?

    • @AladdinPersson
      @AladdinPersson  4 years ago

      I think you should be able to write them out yourself instead of using string.printable? Try it and let me know how it went :)

  • @ShivamVerma-gq2sm
    @ShivamVerma-gq2sm 3 years ago

    What is unidecode.unidecode() doing?
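
    For anyone else wondering: unidecode transliterates Unicode text into a close ASCII approximation, so every character in the training file ends up inside the string.printable vocabulary. For example:

        from unidecode import unidecode

        print(unidecode("café"))    # cafe
        print(unidecode("Москва"))  # Moskva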

  • @runankaroy2687
    @runankaroy2687 3 years ago +2

    Not explained clearly enough. So tough to understand.

  • @LennyThroughParadise
    @LennyThroughParadise 4 years ago

    Anyone tried to export this model to ONNX? I'm trying to but I can't figure it out.

    • @AladdinPersson
      @AladdinPersson  3 years ago

      No, sorry, I haven't used ONNX.

    • @LennyThroughParadise
      @LennyThroughParadise 3 years ago

      @@AladdinPersson I was trying to deploy this model, and apparently a PyTorch model with LSTM layers is the only exception where you cannot convert to ONNX.

  • @LennyThroughParadise
    @LennyThroughParadise 4 years ago

    Mavis is a cool name.

  • @МаксимЧапаев-в6ь
    @МаксимЧапаев-в6ь 4 years ago

    What is your IDE?

    • @AladdinPersson
      @AladdinPersson  4 years ago

      In the video I used Spyder, but I've since switched and recommend using PyCharm. I have a video on how to set up a deep learning environment: ruclips.net/video/2S1dgHpqCdk/видео.html

  • @shrimonmukherjee1746
    @shrimonmukherjee1746 3 years ago

    Where do I get your dataset names.txt?

    • @AladdinPersson
      @AladdinPersson  3 years ago

      On github: github.com/aladdinpersson/Machine-Learning-Collection/blob/master/ML/Projects/text_generation_babynames/data/names.txt

    • @shrimonmukherjee1746
      @shrimonmukherjee1746 3 years ago

      @@AladdinPersson thank you

    • @shrimonmukherjee1746
      @shrimonmukherjee1746 3 years ago

      Sir, can you implement the word2vec skip-gram model to generate word embeddings?

    • @shrimonmukherjee1746
      @shrimonmukherjee1746 3 years ago

      How can I apply a word-level RNN?

  • @deepshankarjha5344
    @deepshankarjha5344 4 years ago

    Expand this concept and you will get a GPT out of it!!

  • @tanakanaoshi4769
    @tanakanaoshi4769 2 years ago +1

    "Th" is not a "d" sound, bro!

  • @rapidretrovenue563
    @rapidretrovenue563 2 years ago

    That is a LOT of code.