8:00 -- In the function *get_random_batch()* you should generate a random chunk for EACH batch element separately. Just put these 3 lines of code inside the *for* loop over the batch in the *get_random_batch()* function:
for i in range(self.batch_size):
    start_idx = random.randint(0, len(file) - self.chunk_len)
    end_idx = start_idx + self.chunk_len + 1
    text_str = file[start_idx:end_idx]
Also, in the *generate()* function you should replace self.batch_size with 1 in this line, because *generate()* works with a single sequence:
hidden, cell = self.rnn.init_hidden(1)
Great tutorial. I've learned a lot. Thank you!
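For reference, here is a minimal standalone sketch of what the corrected sampling could look like. It uses plain lists instead of tensors, and the surrounding setup (the `file`, `chunk_len`, `batch_size`, and `all_characters` names) mirrors the tutorial, but the helper itself is an illustration, not the author's exact code:

```python
import random
import string

# Standalone stand-ins for the tutorial's attributes (assumed names/values).
file = "some long training text, repeated to have enough characters. " * 20
chunk_len = 10
batch_size = 4
all_characters = string.printable

def get_random_batch():
    # Sample a DIFFERENT random chunk for every element in the batch.
    text_input, text_target = [], []
    for _ in range(batch_size):
        # The extra -1 keeps end_idx inside the file, so every chunk
        # really has chunk_len + 1 characters.
        start_idx = random.randint(0, len(file) - chunk_len - 1)
        end_idx = start_idx + chunk_len + 1
        text_str = file[start_idx:end_idx]
        # Input = first chunk_len chars; target = the same chunk shifted by one.
        text_input.append([all_characters.index(c) for c in text_str[:-1]])
        text_target.append([all_characters.index(c) for c in text_str[1:]])
    return text_input, text_target
```

Each row of the input and target now comes from its own random position in the file, instead of every row repeating the same chunk.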
It was only on screen for a second, but I love the Firewatch background. 7:25
Sir, great initiative from your side...
Sir, one request please: would you make a speech-to-text recognition system, like in YouTube videos? I searched on Google, YouTube, etc. but never found an easy approach to learn from. Watching your videos, things don't feel difficult, because sometimes the teaching and the content are what matter, and both are really great in your videos, sir...
That is an interesting idea, will look into it sir!
@@AladdinPersson Eager to watch this in the future, and thanks for the reply...
I actually used the same model to write stories, nice tutorial!
Looking forward to a reinforcement learning video )))
I'm looking forward to learning more about RL as well, I have only dabbled in it before :)
Same, waiting for RL series!
I guess there is a typo in the forward method while feeding the inputs to the LSTM cell. I don't think there was any need to use out.unsqueeze(1), since the output we get after passing x through the embedding layer already has the right dimensions. You could have simply used self.lstm(out, (hidden, cell)).
Yeah you're right!
Excellent videos... but keep the code length smaller. Easier to understand.
Hey man, I came here after watching Udacity's deep learning course. I must say that I got more intuitions and skills by watching your series of deep learning. It was so exciting to code along with your videos. Thank you so much for making this playlist. Keep doing the great work.
One quick question though -
What should the ideal embed_size be? Is it correlated to vocab_size? Or do we just have to experiment with random values?
I'm very happy to hear that, and it really encourages me to make more videos. Which Udacity course is that, and how did you find it? I'm looking to do some more courses just to be able to give people better advice on which courses I think are best :)
The ideal value of the embed size is not correlated to the vocabulary size. When going through an embedding layer we create a (vocab_size, embed_size) dimensional matrix, and then for each word we pick out the corresponding embed_size-dimensional vector for that word. The reason for this is that we are then able to relate a word to other words: we could imagine (for example) that the first dimension represents how much the input word correlates to gender, the second might be fruit, the third might be vegetable, etc. Now, we really have no idea how the network chooses these representations, but at least we can have some intuition. So having the words go through this embedding means we can represent them much, much better (rather than inputting a one-hot vector with essentially no information). Alright, this was a rant; anyway, embed_size is a hyperparameter you can play around with. It's not correlated to anything except how many categories you want each word to be represented by, and 300 is a good default value.
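To make the lookup concrete, here is a tiny sketch (the vocabulary size and the token indices are arbitrary illustrations):

```python
import torch
import torch.nn as nn

# embed_size is a free hyperparameter; 300 is a common default.
vocab_size, embed_size = 1000, 300
embedding = nn.Embedding(vocab_size, embed_size)  # internally a (vocab_size, embed_size) matrix

word_ids = torch.tensor([5, 17, 42])  # three tokens, each given by its index in the vocabulary
vectors = embedding(word_ids)         # picks out one embed_size-dim row per token
print(vectors.shape)                  # torch.Size([3, 300])
```

Changing embed_size only changes the width of those learned vectors, not anything about the vocabulary itself.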
@@AladdinPersson Superb, got it. Thank you so much again. And this is the Udacity course I was talking about - www.udacity.com/course/deep-learning-pytorch--ud188
Hello! Great video. Just a small clarification: at 12:12, while you are creating the batches, why do you choose text_str[1:]? From my understanding, assume you have the text text_str = "ABCD"; given "ABC" you would like to predict "D", so shouldn't that just be text_str[-1]? I guess I am missing something.
Hello, it is because we create a chunk of 250 inputs, so at training time we feed each of the 250 characters to the model to predict the next character.
And yes, since text_str has length 251, the first 250 characters go into the input batch and the last 250 go into the target batch:
indices 0-249 for the input and indices 1-250 for the targets, which are processed character by character at training time. I hope this is clear; otherwise let me know.
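In other words, the input and the target are the same chunk offset by one character. A toy illustration with a 5-character chunk:

```python
text_str = "ABCDE"          # a sampled chunk of chunk_len + 1 = 5 characters
input_seq = text_str[:-1]   # "ABCD": everything except the last character
target_seq = text_str[1:]   # "BCDE": the same chunk shifted one step left

# At step t the model reads input_seq[t] and is trained to predict target_seq[t]:
pairs = list(zip(input_seq, target_seq))
print(pairs)  # [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E')]
```

So every character in the chunk is a training example, not just the last one.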
Great video btw!
Hello, can you please explain how the input and output go through the model while training?
Hey, how about approaching the same task with something a bit more advanced: an LSTM VAE, or maybe an LSTM GAN, or maybe a GAN-RL model? Do give it a second thought.
Message from Mufasa:
Love you man, thanks for dealing with these complicated topics with such ease.
God bless you, my friend.
Hi I had a question, how do you save this model to a file? I am getting an error that says missing positional arguments when I try to save the RNN model.
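One common cause of a "missing positional arguments" error is calling torch.save on the model class instead of a trained instance. The usual pattern is to save the instance's state_dict; a sketch below, using a stand-in nn.LSTM (the tutorial's RNN class would work the same way, and the file name is just an example):

```python
import torch
import torch.nn as nn

# Stand-in model; substitute your trained RNN *instance* here, not the class.
model = nn.LSTM(input_size=10, hidden_size=20)

# Save only the learned parameters, not the whole Python object:
torch.save(model.state_dict(), "rnn.pth")

# To load, first rebuild the model with the SAME constructor arguments:
restored = nn.LSTM(input_size=10, hidden_size=20)
restored.load_state_dict(torch.load("rnn.pth"))
```

After load_state_dict, the restored model has the same weights as the one you saved.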
I think there is one typo in defining the LSTM layer (expected: input_size; you provided: hidden_size).
The notation I used was a bit misleading, but we first run it through the embedding layer (whose output has size hidden_size), and that is then sent to the LSTM, which therefore has its input size equal to hidden_size. I should perhaps have called it embed_size; that would have been a bit clearer.
Hey man, the character-level LSTM can also be used to generate musical notes. Please make a short video on how to process the input data for such a project. For example, we could take a MIDI file and convert it to a CSV file as input data for the LSTM, and it would be able to generate new MIDI notes that can be played in software. Also, I did not understand the data flow in the code (the dimensions and all). If you have a good blog to read about this, please do share; I actually searched many blogs but did not get it. Please do help!
I vaguely remember doing that in some deep learning course, but I don't have much experience with that specific application, although it sounds very interesting; definitely an idea for the future! It would help if you could be more specific about which dimensions you're confused by; any specifics about what you didn't understand would help me pinpoint which resources & explanations I can give.
@@AladdinPersson Actually, yesterday I came across a piece of software that converts MIDI files to CSV files. The CSV file stores the data and can be used as a sequence input to the LSTM. Do have a look at it; we would just need to change the input and everything else would stay the same. Regarding the video: I was a little confused about the generator function, but after watching that part 5-6 times it became clear. Also, is it possible to input a one-hot encoded character?
One question to nn.Embedding:
Does it take in just ONE one-hot char? So I can't input a list of one-hot vectors (which would represent the entire name)?
Let's say you have a vocabulary of length v; then you would have nn.Embedding(v, embed_size). If you send in a single element, the numerical index of the word/character in your vocabulary, the embedding takes out the row vector (of length embed_size) at that index. In other words, the embedding is simply a matrix, and all we're doing is mapping a single index to an embed_size-dimensional vector that can give a better representation. You can't input a one-hot vector directly (I'm pretty sure, although you can easily try), but since you can simply input the index value, that is simpler than having to first create one-hot encodings.
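A small sketch of both points: the layer takes a whole sequence of indices at once, and index lookup matches what a one-hot multiply would compute (the sizes and indices here are arbitrary):

```python
import torch
import torch.nn as nn

v, embed_size = 30, 8
embedding = nn.Embedding(v, embed_size)

# A whole "name" as one tensor of character indices -- no one-hot vectors needed:
name_ids = torch.tensor([3, 0, 21, 8, 18])
out = embedding(name_ids)            # shape (5, 8): one vector per character

# Feeding an index is equivalent to multiplying a one-hot row by the weight matrix:
one_hot = torch.zeros(v)
one_hot[3] = 1.0
assert torch.allclose(one_hot @ embedding.weight, embedding(torch.tensor(3)))
```

So the answer is: pass the list of indices directly, and you get one embedding per character back.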
Hello, I can't find the code related to this video. Would you please drop a link?
Hey, I had one quick question. You can send the entire chunk of 250 together in the model's forward method instead of the loop, using sequence length = 250, right?
Yeah, I think that should be possible. I'm wondering a bit about what you mean; could you explain in a little more detail? Which lines do you want to change?
@@AladdinPersson I am very new to LSTMs and PyTorch. I was just wondering if the model would be exactly the same if we sent the entire seq_len together.
Also, as this is a character-level LSTM, I don't see what embeddings can learn from characters. Would one-hot encoding be better for this model?
@@vansadiakartik Yeah, I agree. I'm not sure the embedding is particularly relevant here (but I guess it doesn't hurt either). I think you could use one-hot encoding here and it would be a good idea.
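On the earlier question of sending the whole chunk at once: nn.LSTM accepts a full sequence in a single call and returns one output per time step, so the per-character loop isn't required. The sizes below are illustrative, not the tutorial's exact values:

```python
import torch
import torch.nn as nn

batch_size, seq_len, input_size, hidden_size = 2, 250, 100, 256
lstm = nn.LSTM(input_size, hidden_size, batch_first=True)

x = torch.randn(batch_size, seq_len, input_size)   # the whole 250-step chunk at once
h0 = torch.zeros(1, batch_size, hidden_size)       # (num_layers, batch, hidden)
c0 = torch.zeros(1, batch_size, hidden_size)

out, (hn, cn) = lstm(x, (h0, c0))
print(out.shape)  # torch.Size([2, 250, 256]), one hidden state per time step
```

The math is the same as the step-by-step loop; the single call is just faster because the time steps are processed inside one fused kernel.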
Like how we had all the English characters in string.printable: how can we get the characters of other languages, like Hindi?
I think you should be able to write them out yourself instead of using string.printable? Try it and let me know how it went :)
What is unidecode.unidecode() doing?
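For what it's worth: unidecode.unidecode() transliterates arbitrary Unicode text into the closest plain-ASCII equivalent, so every character in the input file ends up inside string.printable and can be indexed into the vocabulary. A rough stdlib-only approximation for accented Latin letters (the real unidecode handles far more scripts than this):

```python
import unicodedata

def ascii_fold(s: str) -> str:
    # Decompose accented characters ('à' -> 'a' + combining grave accent),
    # then drop the combining marks, keeping only the base letters.
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

print(ascii_fold("Ślusàrski"))  # Slusarski
```

Without this step, names with accented characters would raise an error when looked up in string.printable.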
Not explained clearly enough. So tough to understand.
Has anyone tried to export this model to ONNX? I'm trying to, but I can't figure it out.
No, sorry, I haven't used ONNX.
@@AladdinPersson I was trying to deploy this model, and apparently a PyTorch model with LSTM layers is the one exception that you cannot convert to ONNX.
Mavis is a cool name.
What is your IDE?
In the video I used Spyder, but I've since switched and recommend using PyCharm. I have a video on how to set up a deep learning environment: ruclips.net/video/2S1dgHpqCdk/видео.html
Where do I get your dataset names.txt?
On github: github.com/aladdinpersson/Machine-Learning-Collection/blob/master/ML/Projects/text_generation_babynames/data/names.txt
@@AladdinPersson thank you
Sir, can you implement the word2vec skip-gram model to generate word embeddings?
How can I apply a word-level RNN?
Expand this concept and you will get a GPT out of it!!
Th is not a d sound bro!
That is a LOT of code.