Check out our premium machine learning course with 2 Industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced
Sir You are amazing, an experience of 25 years is really brilliant, Thanks for Guiding us
The Gaussian model is more accurate. As mentioned in the video, the Gaussian model works better when the features have continuous values, which is the case for the Wine dataset.
yep , you are right, GaussianNB gave me 100% score.
Excellent channel to start learning the ML concepts... way better than almost all the paid courses out there
Solved the exercise, got these answers:
Using Gaussian : 1.0
Using Multinominal : 0.889
Can you please send me the code and dataset vikas.kulshreshtha@gmail.com
what is .values in X_train.values in fit_transform
I got
100% accuracy with Gaussian NB
96% accuracy with Multinomial NB
Thanks for explaining in a very easy and convenient way :)
Thanks a lot for this playlist of such amazing tutorials.
at test_size=0.2, GaussianNB: 97.2% and MultinomialNB: 77.3%
For comparing the models I used Cross Validation (CV = 4) as you explained in the previous videos.
Average Gaussian Score = 0.9722222222222222
Average Multinomial score = 0.8333333333333333
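For anyone who wants to reproduce a comparison like this, here is a minimal sketch, assuming the built-in scikit-learn wine dataset (the exercise data may differ slightly):

```python
# Compare GaussianNB and MultinomialNB with 4-fold cross validation
# on the wine dataset, as described in the comment above.
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB, MultinomialNB

X, y = load_wine(return_X_y=True)

# cv=4 averages the accuracy over 4 train/validation folds
gaussian_avg = cross_val_score(GaussianNB(), X, y, cv=4).mean()
multinomial_avg = cross_val_score(MultinomialNB(), X, y, cv=4).mean()

print(f"Average Gaussian score    = {gaussian_avg:.4f}")
print(f"Average Multinomial score = {multinomial_avg:.4f}")
```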
Better approach! Thanks for your suggestion.
Exercise solution: github.com/codebasics/py/blob/master/ML/14_naive_bayes/Exercise/14_naive_bayes_exercise.ipynb
Complete machine learning tutorial playlist: ruclips.net/video/gmvvaobm7eQ/видео.html
X_train_count = v.fit_transform(X_train.values) is giving me an error here:
AttributeError: 'NoneType' object has no attribute 'lower'
I can just say that you are a perfect teacher. Thank you very much. This is the best channel to learn all about data science!!!
GaussianNaiveBayes 0.972 / MultinomialNaiveBayes 0.94, with MinMaxScaler on the train dataset. This series of tutorials is strongly recommended; it helped me a lot.
All your ML videos are wonderful. Good job. Difficult things explained easily. Thanks
Thank you for this wonderful tutorial
Exercise scores
GaussianNB score - 94.5%
MultinomialNB score - 84.5%
Good job gajanan, that’s a pretty good score. Thanks for working on the exercise
Kindly, sir, help me find malicious emails using AI. Any links etc.?
outstanding video series! greetings from Turkey, I learn too much from this channel. It's now my primary go-to resource to learn machine learning from scratch
I don't know why some people have disliked this video. How beautifully he explains the ML algorithms!
My scores are : Multinomial NB = 0.84, Gaussian NB = 0.97. Thank you so much for these videos :)
Great job and great score. ☺️👍
You, Sir, are our hero!!!
I think you might be the most valuable resource online for ML beginners.
Gaussian: 100%
Multinomial: 86.1%
You are one of the best teachers I have ever seen.
Keep rocking!
By the way, I don't know what you are suffering from.
Get well soon, buddy.
Take care of yourself. 👍
I was suffering from Ulcerative colitis. I am doing well now.
@@codebasics thanks for your reply, sir
May I know where you are from?
@@karthikc8992 He is in US
@@muhammedrajab2301 I learned it before , by the way thank u for your reply
Wonderful explanation sir, thanks for that. Here is my result after execution:
GaussianNB : 96.2%
MultinomialNB: 88.8%
Siddu, good job indeed. That's a pretty good score.
Gaussian: 1.0
Multinomial: 0.833
Keep up the good work you're doing
I always recommend your playlist to others, it's really helpful and thanks for this effort.
Amazing tutorial, you teach far better than university professors. Following many of your playlist thoroughly !!! Thank you very much
Thank you for sharing your knowledge. These ML classes are gold ! 👏🏼👏🏼👏🏼
Thank you very much for that tutorial!
My results were:
GaussianNB score - 97.2%
MultinomialNB score - 86.1%
Good job Alikhan, that’s a pretty good score. Thanks for working on the exercise
Thank you Sir, for these well informed videos on ML.
Exercise answer:
Gaussian : 1.0
MultinomialNB : 0.889
Sir, you used random_state in your solution. Thank you sir, I learned something new.
Thanks a lot for the tutorial. Your series is the best because it contains exercises.
My exercise result: GaussianNB = 0.96, MultinomialNB = 0.84. I also applied cross-validation (cv=5).
I must say, I am getting premium lectures from you, sir.
You are one of the best teachers in my life.
thanks Bhavya
Amazing!!! Just amazing 🔥 The best ML tutorial on RUclips...
Glad it was helpful!
Thanks for making such great content, free of cost. I'm enjoying it.
I have never found such an informative course as this one... really great job!!!
Thank you for your amazing explanation. I have learned a lot.
Gaussian NB: 100%
Multinomial: 91.11%
From where did you get the dataset?
@@himakshipahuja3015 Check the exercise file and you will see the data set. Please tell me if you can't find it and I will send it to you
Thank you very much, @Stephen Ngumbi Kiilu. I found the dataset.
Very nice explanation. Thank you so much, sir, for putting so much effort into making the videos and exercises.
Really great videos sir, explained very well.
About the exercise:-
for GaussianNB :- 1.0
for MultinomialNB:- 0.944
with random_state= 7 and test_size=0.2
Great score. Good job 👍👏
I solved the exercise and I got the following score:
used train_test_split with test_size=0.2 and random_state=123
These parameters gave me the following results:
GaussianNB score: 1.0(100%)
MultinomialNB score: 0.888888888888888(88%)
dataset shape : (178,13)[Dataset is pretty small!]
Great job muhammed. Good score indeed
I don't have any words for your work. Thanks a lot.
Thanks a lot for the videos!!! 81% for MultinomialNB and 96% for GaussianNB.
Perfect, Samad. You are really a good student, as you are working on all my exercises 😊👌 Keep it up 👍
Very good demonstration, sir. Keep inspiring us with your great videos.
Thanks Prakash.
Sir, very nice teaching; it's really very easy to understand.
GaussianNB : 97.22
MultinomialNB: 86.11
thank you for this video
Solved the exercise with the help of the cross_val_score method,
where I found that Gaussian performed better than Multinomial.
From the list of their scores:
max value of Gaussian = 0.97222222
max value of Multinomial = 0.91428571
Sir, your tutorials are helping me a lot because your teaching technique is quite familiar and easy for me.
Thanks a lot, sir!
Good job ashutosh, that’s a pretty good score. Thanks for working on the exercise
Wonderful, sir. This really cleared up the concepts of pipelines and the vectorization method.
Well, I've just finished the exercise. It's well-prepared; thanks for your commitment.
Wonderful explanation sir, thanks for that. Here is my result after execution:
GaussianNB : 97.77%
MultinomialNB: 73.33%
BernoulliNB: 44.44%
with test size = 25%
Sir, your videos are great, continue doing your job. For the exercise question I got an accuracy of 97.22 with GNB and 86.122 with MNB.
The GaussianNB score is 1, whereas MultinomialNB scores 0.866... for the WINE dataset. Hence, GNB is performing better than MNB.
Your course is great for serving the practical needs of getting started doing ML in Python. For this video, some more explanation of pipelines would help. I understand what they are accomplishing, but not entirely how. Do the .fit methods refer to the underlying functions in the pipeline, or is .fit its own method of the pipeline? How does the pipeline know to use the right transformation method? That didn't seem to be explicitly specified.
Again, thanks so much for this and the other videos.
John
Just love the tutorial, sir!
Hats off to you!!
Your teaching is great sir
Thank you so much, sir, from Pakistan.
Very helpful, appreciate all your content!
Clean and clear explanation... thank you, sir.
Awesome tutorial
Thanks for the garble-free explanation, sir. My scores are:
GaussianNB: 97.7%
MultinomialNB: 80%
BernoulliNB: 48.8%
I hope the above scores are good. Please comment if a better score can be achieved in any other way.
Thank you very much for the great explanation. My results are:
GaussianNB =96.3%
MultinomialNB=83.33%
Sir, you did not call the fit_transform method in the pipeline. You only gave CountVectorizer(), but it automatically did the fit_transform step. How did it do that?
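For what it's worth, here is a small sketch of the behaviour being asked about, using toy data (not the video's dataset): Pipeline calls fit_transform on every intermediate step automatically, and only the final estimator gets a plain fit.

```python
# Sketch: how a Pipeline wires CountVectorizer to the classifier.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

clf = Pipeline([
    ('vectorizer', CountVectorizer()),  # Pipeline calls fit_transform here
    ('nb', MultinomialNB()),            # ...and plain fit on the final step
])

# toy data, 1 = spam, 0 = ham
emails = ["free prize money", "meeting at noon", "win cash now", "lunch today?"]
labels = [1, 0, 1, 0]

clf.fit(emails, labels)            # vectorizer.fit_transform -> nb.fit
print(clf.predict(["free cash"]))  # vectorizer.transform -> nb.predict
```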
I also did a regression analysis, which has an r² value of 0.89.
Great video, very well explained.. I'm gonna try doing the exercise soon
Awesome and so clean..
Glad you like it!
GaussianNB is the one for this dataset; it scores approx. 97%.
MultinomialNB had a score of approx. 92%.
RandomForestClassifier had a score of approx. 97%.
That’s the way to go raj, good job working on that exercise
Sir, is it possible to list the vocabulary words that the Naive Bayes algorithm found to have a high probability of indicating spam?
GaussianNB: 1
MultinomialNB: 0.85
Thank you very much, guru ji...
Thanks a lot for this course! As a beautiful and clever student I always do your exercises ^) I don't know what would make your course better. Maybe more exercises.
Glad you like them!
wonderful explanation sir.
fantastic
Thank you so much!
You explained this complex concept so easily..
👍
Excellent tutorial
Thanks for the video sir
My results are below
Gaussian score : 97.777%
Multinomial score: 88.888%
Good job surya, that’s a pretty good score. Thanks for working on the exercise
The Gaussian model was the most accurate for me, resulting in 97% accuracy, while Bernoulli was the least accurate at only 19%, which is to be expected since the training dataset had continuous variables and the Bernoulli model works better for binary variables.
Very helpful videos buddy !!!
Thank you for your efforts. These videos are really helpful.
Glad you like them!
85% for Multinomial and 96% for Gaussian, using the mean of 10-fold cross validation.
Awesome Anup. you are so fast. Good job :)
@@codebasics All thanks to you for such a nice explanation:)
For the exercise there is no need for a pipeline or count vectorizer; directly apply train_test_split and the fit method. Gaussian gives 97% while Multinomial gave 80%.
Thank you for such videos. I got Multinomial as 83% and Gaussian as 100%. But my question is, why have different participants got different results?
This varies because of the train/test dataset split. It's a random split, so even if you execute train_test_split multiple times, you will get different results. :)
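A small sketch of that point, assuming the scikit-learn wine dataset: different random_state values give different splits and therefore (usually) different scores, while a fixed seed is reproducible.

```python
# Show how the train/test split seed changes the measured accuracy.
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_wine(return_X_y=True)

scores = []
for seed in (0, 1, 2):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed)
    scores.append(GaussianNB().fit(X_train, y_train).score(X_test, y_test))

print(scores)  # three accuracies, typically slightly different
```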
Really good one to start
Rashid, I am glad you liked it
your videos are great! good luck.
Thank you, melika.
I have a question: how does it find the probability of continuous variables? Can you give me a link to explore?
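Roughly speaking, GaussianNB handles continuous features by fitting a mean and variance per class and feature, then evaluating the normal density for a new value. A toy sketch (the alcohol values below are made up):

```python
# Sketch of the per-feature likelihood GaussianNB computes for a class.
import math

def gaussian_pdf(x, mean, var):
    """Density of x under a normal distribution N(mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# e.g. alcohol readings seen for one wine class during training
samples = [13.2, 13.9, 13.5, 14.1]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)

# likelihood that a new alcohol value of 13.6 belongs to this class
print(gaussian_pdf(13.6, mean, var))
```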
I used the MinMax scaler while preprocessing the data, as the features had very different ranges (e.g., proline, alcohol, malic_acid).
Got 0.97 with Gaussian and 0.81 with Multinomial.
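A sketch of that preprocessing step, assuming the scikit-learn wine data; note that MinMaxScaler also keeps values non-negative, which MultinomialNB requires:

```python
# Squeeze each wine feature into [0, 1] so the ranges are comparable.
from sklearn.datasets import load_wine
from sklearn.preprocessing import MinMaxScaler

X, y = load_wine(return_X_y=True)
X_scaled = MinMaxScaler().fit_transform(X)

print(X.min(), X.max())                # very different raw ranges
print(X_scaled.min(), X_scaled.max())  # everything in [0, 1] after scaling
```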
good content
Glad you liked it!
Great video! One doubt though: why did we use X_train_count.toarray()[:3]? I did not understand the 3. Thank you in advance.
It is just for visualization purposes. Printing X_train_count.toarray() alone would have printed all the data points, which number in the thousands I guess, so sir just used the slicing syntax "[:3]", which means only 3 data points will be shown, so we can look at the output properly. Get yourself familiarized with pandas slicing and methods like df.iloc[] and df.loc[]; it will be useful.
@@swapnshah3234 Yes, I have used iloc quite often, but I felt we were just converting to an array here and not printing it, and that the 3 somehow had significance for data cleaning in this specific dataset. Thank you for your reply!
@@swapnshah3234 What does .values do in X_train.values inside fit_transform?
I got 100% accuracy. My train/test split is as below:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(df, target, random_state=20, test_size=0.05)
Thank you for the tutorial. I like the way that you teach. GaussianNB works better, but I do not know why! Also, the MultinomialNB score for me came out as 0.8444444444444444.
Thank you so much sir. Your videos are really useful
Glad to hear that
Worked on that wine.csv and got the following results:
with Gaussian: 0.97
with Multinomial: 0.83
Great score. Good job 👌👏
With min-max scaling and X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2, random_state=43), I got with Multinomial: 1.0
Using Gaussian: 1.0
You are awesome... Can you make videos on deep learning and NLP?
I already have 3 videos on deep learning, just check the ML playlist. Also you read my mind in a sense that my next target is NLP series. Stay tuned :)
Thank you!
Welcome!
At 1:45, can we use mapping instead of a lambda function?
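Yes, .map with a dict works as an alternative to apply with a lambda; a small sketch (the column name 'Category' is an assumption based on the video, not verified):

```python
# Two equivalent ways to build a 0/1 spam label column in pandas.
import pandas as pd

df = pd.DataFrame({'Category': ['ham', 'spam', 'ham']})

df['spam_lambda'] = df['Category'].apply(lambda x: 1 if x == 'spam' else 0)
df['spam_map'] = df['Category'].map({'spam': 1, 'ham': 0})

print(df['spam_lambda'].tolist())  # [0, 1, 0]
print(df['spam_map'].tolist())     # [0, 1, 0]
```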
Awesome!
How do I apply the count vectorizer to more than one text column?
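One common approach (an assumption, not something covered in the video) is to give each text column its own CountVectorizer via ColumnTransformer:

```python
# Vectorize two text columns separately and stack the results.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer

df = pd.DataFrame({
    'subject': ['win a prize', 'meeting notes'],
    'body': ['claim your cash now', 'agenda attached'],
})

# CountVectorizer expects a 1-D iterable of strings, so each column is
# selected by name (a string selector), not as a list of columns.
ct = ColumnTransformer([
    ('subject_vec', CountVectorizer(), 'subject'),
    ('body_vec', CountVectorizer(), 'body'),
])

X = ct.fit_transform(df)
print(X.shape)  # (rows, combined vocabulary size of both columns)
```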
Hi, I am getting a lowercase error when evaluating the test data with CountVectorizer, even though no integer value appears to be present. How can I resolve it?
AttributeError: 'int' object has no attribute 'lower'
Have you solved it?
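That error usually means a non-string value (an int, or NaN) is present somewhere in the text column; casting the column to str before vectorizing is a common fix. A sketch with made-up data:

```python
# Reproduce and fix the 'int' object has no attribute 'lower' error.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

df = pd.DataFrame({'Message': ['hello there', 12345, 'free prize']})

v = CountVectorizer()
# v.fit_transform(df['Message'])  # raises: 'int' object has no attribute 'lower'
X = v.fit_transform(df['Message'].astype(str))  # works after the cast
print(X.shape)
```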
Sir, GaussianNaiveBayes works better; it gives 97.8% accuracy, whereas MultinomialNB gives 86.7%.
Nice one sir. Thank you so much...
Dhananjay, I am glad you liked it
Thank you sir. Results: GaussianNB is 1.0, and MultinomialNB is 0.88.
great score
Nice vid bro.
Well demonstrated, sir!!
Prakhar, I am glad you liked it
Thank you sir for your tutorial. I was confused about the CountVectorizer at 4:06; it would have been much better if you had explained it in more detail, like what datatype x_train and x_train_count are, what kind of data is stored in x_train_count, and so on. I figured it out from the shape and type of the numpy array, but an explanation would have saved time. Also, why do you first use fit_transform and later just transform for the emails? Can anybody please help me?
Not sure about the first problem, but I can help you with the second one. Let's first understand the fit(), transform(), and fit_transform() methods.
fit() - The fit method calculates the model's parameters from the training data. When we call model.fit(x_train, y_train), it computes the internal parameters and adjusts them for our predictions.
transform() - The transform method applies the calculated parameters to our dataset.
fit_transform() - The fit_transform() method applies both fit(), for calculating the parameters, and transform(), for transforming our dataset, in one step.
In the first case we use fit_transform(x_train) to calculate the parameters and transform the training set; for the test data we apply the parameters learned from fit_transform(x_train), so we use transform(x_test). I hope that clears up your doubt.
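A tiny demonstration of that point: fit_transform learns the vocabulary from the training text, while transform reuses it, so unseen test words are simply dropped.

```python
# fit_transform on training text vs. transform on test text.
from sklearn.feature_extraction.text import CountVectorizer

train = ["free money now", "meeting tomorrow"]
test = ["free pizza tomorrow"]  # "pizza" was never seen in training

v = CountVectorizer()
X_train = v.fit_transform(train)  # learns the vocabulary AND transforms
X_test = v.transform(test)        # only transforms, reusing that vocabulary

print(sorted(v.vocabulary_))  # no 'pizza' here
print(X_test.toarray())       # counts only for words known from training
```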