YOLO, BERT, Transformers!! Please bring explanations of these.
Watch the YOLO video on the codebasics channel.
Greetings from Austria, thanks for your knowledge sharing!
Hi Krish... Thanks for the video... waiting for more topics like attention mechanisms, transformers, etc.
I suggest you look at the output context of the LSTM. The output context of the LSTM is not a single vector; it is multiple vectors, one generated at each time step. This is where attention is implemented and where it differs from the plain RNN approach. I appreciate your tutorials.
Greetings!! Can you please upload more stuff on deep learning, like attention models, Transformers, and BERT, and cover unsupervised learning too if possible? It would be highly appreciated.
What is the function of the dense layer after the decoder? Aren't we actually interested in the decoder output? Why would adding a dense layer not hamper the actual output of the decoder? I would be very thankful if someone answered all my questions.
Can you please take a small sample text and walk through the encoding and decoding briefly, so that we can understand it? There are a few doubts regarding the timestep t.
Sir, take a sample text and walk through the encoding and decoding briefly, so that it makes sense.
Thanks Krish for this video... I was waiting for this!!
Can I use this for question answering instead of language translation??
I don't understand why my final testing loop is decoding every input to 'i want to go to room'.
I made a Hindi-to-English translation model and used the dataset from the blog, the one shown just below the English-French dataset.
Please provide the GitHub link for this code.
Hello, can anyone please tell me why this model is not accurate? I tuned various hyperparameters, but the accuracy is still not good. Can someone tell me what exactly to do to improve it?
I have followed every step, but my encoder_input_data is still the same for every sentence. Please help.
Can you share the notebook?
Okay, now I'll try to train English-Telugu and add it to my resume.
Do you have the dataset for that?
Sir, I have a doubt:
During validation, what if the decoder outputs a sequence shorter than the true output sequence? How is the loss calculated in such cases? Will categorical cross-entropy work?
Thanks Krish
Excuse me sir, I wonder why you use the one-hot encoding method instead of word embedding layers?
We can use them, but we need to be careful about the dimensions.
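For anyone curious, a minimal sketch of what an embedding-based encoder input could look like instead of one-hot vectors. The vocabulary size, embedding size, and variable names below are assumptions for illustration, not the video's code:

from tensorflow.keras.layers import Input, Embedding, LSTM

num_encoder_tokens = 10000   # assumed word-level vocabulary size
embedding_dim = 128          # assumed embedding size
latent_dim = 256

# The encoder input is now a sequence of integer token ids, shape (batch, timesteps)
encoder_inputs = Input(shape=(None,), dtype="int32")
enc_emb = Embedding(num_encoder_tokens, embedding_dim, mask_zero=True)(encoder_inputs)
# The LSTM consumes the embedded sequence; only the final states are kept
_, state_h, state_c = LSTM(latent_dim, return_state=True)(enc_emb)
encoder_states = [state_h, state_c]   # context handed to the decoder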
Why are we suddenly talking about characters, when in previous videos you mentioned one-hot encoding words?
Hey Krish, I am getting a cardinality issue with the inference model input with this code.
The model fits perfectly, but while predicting with the inference model I get this error.
Hello sir, thank you for this video. I did all the steps, but in the end I don't get the correct output. Can you help me please? Thanks.
In this video only predefined values are converted to their respective meanings. I don't understand how this helps us convert user input into another language.
What is the encoding scheme used in this tutorial, e.g. one-hot, word2vec, GloVe, etc.?
Hey man, kindly give the link to the code. If you did not upload it there, please upload it. It's a request.
@krishnaik06 This is a super helpful video. I have been following the NLP playlist. Do you mind sharing this code in the Git repo? The folder for seq2seq in your repo seems empty. Thanks :)
Hey Krish, I think you forgot to upload or provide a link to your notebook. Can you please upload it?
Sir, I am unable to locate the GitHub code... can you please share it?
Hi @krish, can you tell us how you created this new format? I mean, which application do you use? It looks so cool. I also want to record videos, but I'm not sure which software can give me these features. Any insight?
Can you please share the GitHub link for the above code?
Please make videos on semantic segmentation
Sir, thanks for this nice explanation. But I have one query: instead of text I have numeric indexes of the text, but these are not vectors. How can I translate those indexes into their corresponding text?
Is the Attention model posted? I have been waiting since this video was posted. Looking forward to it...
Not yet
Sir, can you please add the Jupyter notebook corresponding to the model?
Where is the dataset? How can I get it?
🔥🔥🔥🔥🔥🔥🔥🔥
First Like
You skipped the main part... the decoder input and decoder output, i.e. how we divide the target into input and output with a one-timestep offset.
Can you explain this line: encoder_input_data[i, t+1:, input_token_index[' ']] = 1.? Why should we use it?
Thank you
Please provide your GitHub repo where your code is present; please provide the code.
Krish, kindly upload the image captioning projects.
Dear sir, I would like to join your channel. I tried to contact you (FB/LinkedIn). But the site can't be reached. How can I contact you, sir? Please help me
Can you please share the code?
Bro, this video doesn't make any sense; sorry to say, but it's not at all intuitive.
If the basics aren't clear, then yes, it isn't.
It’s a three year old comment 😂😂
I have a simple doubt: is this character encoding or word encoding?
Can you help with an English-Urdu machine translation model?
Sir, I don't understand the input dimension, i.e. why it is shaped the way it is.
I've been waiting a long time for this video.
I have one doubt: if I need to update the neural network with an extra stacked LSTM layer, how could I do that? Because this is not like other sequential models.
You need to use the functional API for model creation.
@@babritbehera4087 Sorry, what is the functional API? I don't know it.
Just Google 'model building with the functional API'.
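For anyone landing here later, a rough sketch of stacking an extra LSTM in the encoder with the functional API. Variable names follow the Keras seq2seq blog example the video is based on; the sizes are assumptions:

from tensorflow.keras.layers import Input, LSTM

latent_dim = 256
num_encoder_tokens = 71   # assumed, depends on the dataset

encoder_inputs = Input(shape=(None, num_encoder_tokens))
# The first LSTM must return the full sequence so the next LSTM can consume it
x = LSTM(latent_dim, return_sequences=True)(encoder_inputs)
# The second LSTM returns the final states that initialise the decoder
encoder_outputs, state_h, state_c = LSTM(latent_dim, return_state=True)(x)
encoder_states = [state_h, state_c]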
@@babritbehera4087 Can I use this for question answering instead of language translation??
Honestly, I have watched many great videos from you, but this one is very poor and lacks explanation and clarity... I suggest you remake this session in the future. Anyway, thanks a lot for your efforts.
Hello sir! I tried running the code, but it crashes on Google Colab; the system crashes when initialising the zero matrices.
Probably because of too much data and too little RAM. Try fixing it by taking fewer samples of the dataset, around 10k; that is what I used on my 8 GB RAM machine.
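To put rough numbers on it, a back-of-the-envelope estimate of why the zero matrices exhaust the RAM. The sequence length and token count below are assumptions roughly matching the English-French data at 10k samples:

num_samples = 10000
max_decoder_seq_length = 60    # assumed
num_decoder_tokens = 93        # assumed
bytes_per_float32 = 4

# One one-hot decoder tensor of shape (samples, timesteps, tokens) in float32
one_tensor_bytes = num_samples * max_decoder_seq_length * num_decoder_tokens * bytes_per_float32
print(f"one decoder tensor ~ {one_tensor_bytes / 1e9:.2f} GB")   # ~0.22 GB
# There are three such tensors, and using all ~185k pairs multiplies each by
# roughly 18x (several GB apiece), which easily exhausts an 8 GB machine or a
# free Colab instance once training overhead is added.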
Does this apply to time series as well?
Sir, why have we not given the inputs like this: encoder_inputs = Input(shape=(max_encoder_sequence_length, num_encoder_tokens))?
Are encoder_outputs and h_t not the same thing?
Can you please provide the notebook link?
Where is the code?
Where can I get this code?
Please provide the github link for the code.
Yes, please provide the code.
Sir, is it compulsory to give the input as one-hot encodings? May we use word embeddings?
Can I use this for question answering instead of language translation??
Yes, we can use embeddings too.
Will decoder_input_data and decoder_target_data be the same?
They have the same shape and vocabulary, but decoder_target_data is shifted one timestep ahead of decoder_input_data and does not include the start character.
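A sketch of how the two decoder tensors are filled in the Keras blog example the video follows (variable names assumed from that example):

for i, target_text in enumerate(target_texts):
    for t, char in enumerate(target_text):
        # decoder_input_data starts with the "\t" start character at t = 0
        decoder_input_data[i, t, target_token_index[char]] = 1.0
        if t > 0:
            # decoder_target_data is ahead by one timestep and
            # does not include the start character
            decoder_target_data[i, t - 1, target_token_index[char]] = 1.0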
for line in lines[: min(num_samples, len(lines) - 1)] — why are we doing -1? Can anybody explain, and why are we taking min?
Even I couldn't fully understand the -1 part; logically it makes no difference here. As for the min() part, it just chooses the minimum between num_samples (10k, as set in the code) and len(lines) (approximately 185k). So min(10k, 185k) returns 10k, and the first 10k records are used to train the model. The underlying purpose is faster computation, since 10k samples take far less time to process than 185k.
I think that since we iterate the for loop over lines, the -1 doesn't matter when num_samples is the minimum; but when len(lines) is the minimum, the -1 matters because the file ends with a newline, so the last element after splitting is an empty string and would break the parsing.
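For reference, a small sketch of that slice, assuming the file is read the way the Keras blog example reads it:

# The data file (e.g. fra.txt from the blog) ends with a newline, so splitting
# on "\n" leaves an empty string as the last element.
with open("fra.txt", encoding="utf-8") as f:
    lines = f.read().split("\n")

num_samples = 10000
# len(lines) - 1 drops that empty last line; min() caps training at num_samples
# pairs so the one-hot tensors stay small and training stays fast.
for line in lines[: min(num_samples, len(lines) - 1)]:
    input_text, target_text = line.split("\t")[:2]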
Sir, I have written the code exactly the same... but my accuracy is just 0.0020... what could be the reason?
Can you please share the code that you wrote?
@Chaitanya Kaul
Were you able to solve the issue?
Can I use this for question answering instead of language translation??
No, as questions and answers are different in every sense.
Most likely a transformer network will satisfy your needs
Maybe better audio would be really helpful.
Can you explain the inference code please? Thanks.
Can anyone explain this line: encoder_input_data[i, t+1:, input_token_index[' ']] = 1.? Why should we use it?
probably too late but,
@@ShodaiThox Hey, did any of you guys understand why he did that?
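For later readers: in the Keras blog code the video follows, that line pads every timestep after the end of the sentence with the one-hot vector for the space character, instead of leaving those rows all zero. A sketch of the surrounding loop (names from that example):

for i, input_text in enumerate(input_texts):
    for t, char in enumerate(input_text):
        encoder_input_data[i, t, input_token_index[char]] = 1.0
    # Everything after the last real character, up to max_encoder_seq_length,
    # is filled with the one-hot vector for ' ', i.e. the sequence is padded
    # with spaces rather than left as all-zero timesteps.
    encoder_input_data[i, t + 1:, input_token_index[" "]] = 1.0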
Kindly upload more videos
Does it work for question answering also??
Great video, but you are mixing up the terms 'character' and 'word', which creates confusion.
17:36
Hi Krish,
2 questions:
1. Why are we not using word2vec here and instead using one-hot encoding?
2. When you say A, B, C as inputs to the encoders, are these characters or words? I am confused.
characters
You are using *word* when you mean characters.
Sir, please provide your notebook... the notebook you prepared would be really helpful.
Can I use this for question answering instead of language translation??
@@hasiburrahman96 yes
@@hasiburrahman96 Most people use transformers; however, an encoder-decoder can also do the job.
GitHub link??
How did you decide on 256 for the latent dimensionality? Please explain.
He took 256 somewhat arbitrarily; it's up to you, try different values. It is the dimensionality of the LSTM's hidden and cell states (the number of units), not the number of time steps.
Thanks for the video but you are not going deep into the code or the architecture. Please try to go a bit deeper.
Loss is very high
jeery, not jerry
Seemed like you yourself were not understanding the code you have written. Zero logic building and only reading out the code. Very upsetting 😑.
Explain a bit more slowly, man, we can't understand!