I am Bengali as well. I am glad that you are doing this, and I feel proud of you, brother.
You rocked it in so little time. This can only come with very keen knowledge.
Excellent vids, brother... keep going. You've compressed months and months of learning into easy-to-learn videos... please don't stop.
Thanks a lot for appreciating the effort :D
I like the video. Simple and straight to the point. Keep it up!
Thanks! :D
My 4-month struggle to learn NLP got done in 23 minutes. Thanks a ton, bro!
Glad to hear that bro! ❤
Thanks a lot man, after two days of boring videos on YouTube I finally found a good one to learn NLP. Great work man, keep going on machine learning and other data science topics; it's really rare to find such a great video.
Thanks a lot mate! Keep supporting...
Amazing. Waiting for more videos 😊
I am currently studying the Machine Learning A-Z course on Udemy. Trust me, the tutor has no idea how to explain stuff, and it is the best-selling course for Data Science on Udemy.
I scratched my head while going through his tutorial; then I came here and all my doubts were cleared. Coding in Python is not a difficult task, but understanding the concept is the most important thing. And I got that understanding of the Bag of Words model from your tutorial.
Thanks a lot for your help.
P.S. One who understands the concept and has strong fundamentals has the ability to explain stuff in the simplest manner possible. Please keep it simple like this in upcoming tutorials too. ALREADY SUBSCRIBED.
Thanks a lot @Sumit Chhabra. I'll try my best to maintain this level of simplicity :)
Thank you for the excellent explanation!
Really amazing and very clear. Keep this up. New subscriber.
Thanks for the sub!
Excellent. Thanks, Normalized Nerd!
Very nice explanation; it also covered many things in very little time. Keep it up, man 👍
Glad to hear that!
Thanks brother❤
Please make a video or two about neural machine translation, with an example.
What does CountVectorizer do in this model?
Does it convert the words in the instances/documents into 0/1? Please suggest.
Thanks for your help in advance.
It'll generate the feature matrix that I started drawing at 7:06
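Roughly, something like this (a toy sketch, not the video's exact data); note that by default CountVectorizer stores word counts, not just 0/1, and you can pass binary=True if you want pure 0/1 features:

```python
# Hedged sketch: CountVectorizer builds the vocabulary and the
# document-term feature matrix (counts by default, not 0/1).
from sklearn.feature_extraction.text import CountVectorizer

corpus = ["I loved this movie", "I hated this movie movie"]  # toy data

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus)  # sparse matrix: rows = docs, cols = vocabulary words

print(vectorizer.get_feature_names_out())  # ['hated' 'loved' 'movie' 'this']
print(X.toarray())                         # [[0 1 1 1]
                                           #  [1 0 2 1]]  <- counts, not just 0/1
```

(get_feature_names_out assumes a recent scikit-learn; older versions use get_feature_names.)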
Thank you sir 🙏
Hello Normalized Nerd, I've got an error in y (target): y = data.as_matrix(['Review_class']) gives
AttributeError: 'DataFrame' object has no attribute 'as_matrix'.
By the way, thanks 💚, the tutorial is very clear and well explained. 👏 Bravo
Change it to data[['Review_class']].to_numpy(); as_matrix() was removed in newer pandas versions.
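A quick sketch of the fix, assuming only that the column is named 'Review_class' as in the error above (as_matrix() was deprecated in pandas 0.23 and removed in 1.0):

```python
# Hedged sketch: to_numpy() replaces the removed as_matrix().
import pandas as pd

data = pd.DataFrame({"Review_class": [1, 0, 1]})  # toy stand-in for the real dataframe

# old (removed):  y = data.as_matrix(['Review_class'])
y = data[['Review_class']].to_numpy()  # same 2-D shape that as_matrix returned
# or, for a flat 1-D target array:
# y = data['Review_class'].to_numpy()
print(y)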
Shouldn't you initialize the regular expression outside the loop?
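For context, the suggestion is something like this; the pattern and data below are illustrative, not the video's exact code:

```python
# Hedged sketch: compile the pattern once, outside the loop, so it is
# not re-parsed on every iteration of the cleaning loop.
import re

pattern = re.compile(r'[^a-zA-Z]')  # compiled once

reviews = ["Loved it!!", "Hated it :("]  # toy data
cleaned = [pattern.sub(' ', review).lower() for review in reviews]
print(cleaned)
```

In practice Python's re module caches recently compiled patterns internally, so the speedup is usually modest, but hoisting the compile out of the loop is still the idiomatic choice.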
Very informative video, thank you for uploading the NLP series. I have one query: how can we use an auto text summary generator on text in other human languages like Japanese, Chinese & Korean? Your reply would be very helpful.
Unfortunately, there's no library that can summarize every language. However, you'll find many GitHub repos where people have built text summarizers for other languages using the same method!
Well explained!!!
Thanks!!
Keep going, brother!!!
Thank you, brother!
Can you make some videos about seaborn?
Thank you so much
One suggestion here: please ZOOM IN on your screen while you are explaining the coding part. Press Ctrl and scroll up with your mouse wheel and it will zoom in. Otherwise it puts a strain on our eyes, and understanding the coding part becomes a punishment.
Point noted. Thanks for the feedback.
We should first split the data and then do preprocessing, right? Why did you perform it on the whole dataset?
Yes, ideally we should first split and then preprocess. Here, the text preprocessing remains the same for both the train and test sets, so I did them together. However, I also formed the BOW model on the whole data, which is not the correct way. We should build it only on the training set and then apply it to the test set. I did it just to make things a little easier.
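A minimal sketch of the correct order described above, assuming scikit-learn and illustrative variable names (not the video's exact code):

```python
# Hedged sketch: fit the BOW vocabulary on the training split only,
# then reuse it (transform, no refit) on the test split.
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer

corpus = ["great movie", "terrible plot", "loved it", "hated it"]  # toy data
labels = [1, 0, 1, 0]

texts_train, texts_test, y_train, y_test = train_test_split(
    corpus, labels, test_size=0.25, random_state=0)

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(texts_train)  # learn the vocabulary on train only
X_test = vectorizer.transform(texts_test)        # apply it to test, no refitting
```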
You are great.
Please make a tutorial on how NLP works on Bangla text datasets.
Sure I'll...stay tuned!
How can we go back to the original sentences from X_test? I mean, how can I see which sentences the algorithm doesn't classify correctly?
Compare y_pred and y_test. The indices where they don't match are the mistaken samples. Then use those indices to access the sentences in X_test. I hope it helps.
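A short sketch of that comparison, assuming NumPy arrays and that you kept the raw sentences aside at split time (names here are illustrative):

```python
# Hedged sketch: indices where prediction and truth disagree point to
# the misclassified sentences.
import numpy as np

y_test = np.array([1, 0, 1, 1])                    # toy ground truth
y_pred = np.array([1, 1, 1, 0])                    # toy predictions
sentences_test = ["good", "bad", "great", "fine"]  # raw sentences kept aside

wrong = np.where(y_pred != y_test)[0]
for i in wrong:
    print(f"{sentences_test[i]!r}: predicted {y_pred[i]}, actual {y_test[i]}")
```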
Is X_train the bag of words?
X_train is the feature matrix.
Great. New subscriber!
Thanks a lot :D
amazing keep posting videos :)
Will do 😁
Is 'lov' or 'love' the root?
Good question. The thing is, stemming should give us 'lov', but the Porter stemmer gives us 'love'. I guess the reason lies in the details of the Porter stemmer's implementation.
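You can check this directly with NLTK's PorterStemmer (assuming nltk is installed). One of Porter's rules restores a trailing 'e' after stripping a suffix, which is why 'loved' ends up as 'love' rather than 'lov':

```python
# Hedged sketch: verify what the Porter stemmer actually returns.
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["love", "loved", "loving"]:
    print(word, "->", stemmer.stem(word))  # all three print 'love'
# 'loved' -> strip 'ed' -> 'lov'; Porter then restores the final 'e'
# because the stem ends in a consonant-vowel-consonant pattern.
```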
Why do we use '\t' as the delimiter?
In the .txt file, the values are separated by tabs ('\t'), just like the values in a .csv file are separated by commas. In pandas we have the function read_csv (it reads comma-separated files by default). We need to pass the delimiter parameter to read tab-separated files.
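A sketch of that call; the file name below is a placeholder for the one linked in the video description:

```python
# Hedged sketch: read_csv assumes commas, so a tab-separated file needs
# an explicit delimiter (sep='\t' also works).
import pandas as pd

data = pd.read_csv('reviews.txt', delimiter='\t')  # placeholder file name
print(data.head())
```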
PowerPoint could have been used for a nicer presentation.
I would have loved it if it was good. 😀😅 Is this positive?
Such complex shit ;__;
How can we access the text files used?
I've provided the link in the video description.
Brother, I'm also Bengali! ❤❤❤
It was really nice to see your comment. Please do share the channel with people you know. ❤️