To install ggbiplot, the code is now (as of 17 Jan 2020):
library(devtools)
install_github("vqv/ggbiplot")
source: github.com/vqv/ggbiplot
Excellent video; the concepts are well explained. Thanks.
Thanks for the update!
This is the best PCA explanation I have seen anywhere so far. Thank you for sharing your knowledge.
Thanks for the feedback!
Thanks a lot Sir for your nice presentation. You saved my time. Earlier I used your R codes on Kohonen NN and now for PCA for my training lectures. Your explanation is so lucid. I appreciate your noble service of sharing knowledge
You are most welcome!
I revisited your video for interpretation of biplots in PCA. Many thanks.
You are welcome!
The biplot was explained very clearly, thank you Dr. Rai!
You are welcome!
Awesome video. Every R enthusiast needs to keep an eye on your channel. Thank you and keep up with great work!
+Model Michael thanks👍
Sir,
Can we get code file ?
This video is worth its weight in gold
Thank you!!Best explanation on Biplot on RUclips .
Glad it was helpful!
You are too good sir. An absolute treat for ML enthusiasts.
Thanks for your comments!
Thank you for this extremely helpful, and easily understood tutorial, particularly the clear interpretation of the Bi-Plot. Much appreciated
You're very welcome!
R PCA IS VERY GOOD PACKAGE AND VERY HELPFULL
Yes, I agree!
This is great. I was looking for PCA and you have done it. Many many thanks to you sir.
One really good video I have found. After watching a few of your videos, they are becoming my "turn to" whenever I need one. Thanks
Glad to hear that!
One of the best PCA videos I have ever seen. Thank you, Mr. Rai.
Thanks for comments!
Fantastic session.Perfectly understood Biplot
Thanks for comments!
Great Video! Excellent walk though on PCA and how it can be useful for actual classifications. Thanks for the upload.
+theeoddname thanks for the feedback!
Many thanks for you Dr. God bless you.
You are most welcome!
Thank you for this amazing video. Better than my university lectures
Thanks for comments!
Your videos have been constant companions during the last months of my master's thesis. It seemed as if every time I had to switch to another analysis technique, you were already waiting here. So thank you a lot for your guidance and clear explanations!
The only thing I would appreciate would be if you could provide the basic R scripts. Even though copying each command step by step might help with understanding it, typing text from a tiny YouTube screen on one half of my monitor into RStudio on the other half is troublesome. Thanks!
Thanks for the feedback!
Thanks for the video! It helped me a lot doing the forecasting for future values using PCA.
Very welcome!
I really like your explanations in your videos. Keep them coming! Thanks
Thanks for the feedback!
Thank you for the material. It is very clear and actually very relevant to my current work.
As I understand, the transformed data are sums of products of the normalized predictors and the loadings.
Maybe you would have time to post a PLS regression video, please? The intriguing part is the explanation of the model itself
Thank you so much Dr. Rai. Detailed teaching
Thanks for comments!
19:12 This is only to show another way to get the principal component scores for the training data, because:
identical(pc$x, predict(pc, training)) gives TRUE, meaning that pc$x is the same as predict(pc, training).
That's correct!
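The check described in this thread can be reproduced with a short sketch. It assumes the built-in iris data and an 80/20 split; the seed and the split proportions are assumptions for illustration, not values confirmed from the video. It uses all.equal() rather than identical() so that tiny floating-point differences do not matter.

```r
# Sketch: the scores stored in pc$x match predict() on the same training data.
data(iris)
set.seed(111)  # assumed seed, only for reproducibility
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
training <- iris[ind == 1, ]

# PCA on the four numeric columns (column 5 is Species)
pc <- prcomp(training[, -5], center = TRUE, scale. = TRUE)

# predict() re-applies the stored centering, scaling and rotation
scores_pred <- predict(pc, training)
same <- isTRUE(all.equal(pc$x, scores_pred))
print(same)
```

So for the training data either object can be used; predict() only becomes necessary for data that was not passed to prcomp().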
Fabulous work in PCA ! Keep it up
Thanks for the feedback!
Awesome Explanation
make sure you run following before installing:
library(devtools)
Really really great explanation sir, Thank you so much for making it very simple
Thanks for comments!
Wonderful job explaining the material.
Thanks for your comments and finding it useful!
Hello. I dont know anything about Principal Component Analysis in R: Example with Predictive Model & Biplot Interpretation and i will never need to since thats not in my line of work. I Appreciate your Intromusic though. You are a true champ Bharatendra and enrich this world with your presence. Also that intro music fucking slaps.
Thanks for comments!
Seriously awesome explanations! Thank you again.
Thanks!
Great Explanation....
Thanks!
Thank you. Learned a lot from your channel
Thanks!
Excellent demonstration of PCA, really helpful. I just don't understand why in pc object, you use only training data instead of the entire data.
We only use training data so that we can later use test data to assess prediction model.
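A minimal sketch of that idea, assuming the iris data and an 80/20 split (seed and proportions are illustrative assumptions): the PCA is fitted on the training rows only, and the test rows are then projected with the training centering, scaling and rotation.

```r
# Sketch: fit PCA on training data, then project the held-out test data.
data(iris)
set.seed(111)  # assumed seed
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
training <- iris[ind == 1, ]
testing  <- iris[ind == 2, ]

pc <- prcomp(training[, -5], center = TRUE, scale. = TRUE)

# Test scores use the parameters learned from training, so no
# information from the test set leaks into the components.
tst <- data.frame(predict(pc, testing), Species = testing$Species)
str(tst)
```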
Great video! Thanks for sharing your knowledge.
Thanks for comments!
Thanks for the video
Please publish video on Exploratory Factor Analysis,Confirmatory Factor Analysis application in a model
Also please explain the difference from PCA
Thanks for the suggestion, I've added this to my list.
Great lecture. Thanks.
Thanks!
Thank you Dr. Bharatendra Rai for explaining PCA in detail. Can you please explain how to find weights of a variable by PCA for making a composite index? Is it rotation values that are for PC1, PC2, etc.? For example, if I have (I=w1*X+w2*Y+w3*Z) then how to find w1, w2, w3 by PCA.
For calculations you can refer to any textbook.
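As a rough illustration of the question above: one common choice (an assumption here, not something stated in the video) is to take the PC1 column of the rotation matrix as the weights w1, w2, ... and apply them to the standardized variables. The sketch below uses iris as stand-in data.

```r
# Sketch: PC1 loadings used as weights for a composite index.
data(iris)
X <- iris[, -5]
pc <- prcomp(X, center = TRUE, scale. = TRUE)

w <- pc$rotation[, 1]   # one weight per variable (PC1 loadings)
print(w)

# I = w1*X1 + w2*X2 + ... computed on the standardized variables
index <- as.numeric(scale(X) %*% w)
head(index)
```

With these weights the index is simply the PC1 score, so it equals pc$x[, 1]. Whether PC1 alone is an adequate index depends on how much variance it explains.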
Hi, most of the people watching these videos are new to data science. Please explain the parameters of each function; just typing won't help them. The more detailed and the slower the explanation, the higher the number of views. I was a trainer myself.
The difference between 1000 views and a million views is the clarity and completeness
Thanks for the feedback!
Thank you, this video will be really helpful to complete my thesis :)
Good luck!
too good!! plz make more such videos...plz!
Thanks for comments! You may find this useful too:
ruclips.net/p/PL34t5iLfZddu8M0jd7pjSVUjvjBOBdYZ1
sir, please make a session on factor analysis with prediction
Thanks for the suggestion!
Great work! Thank you
Dear Respected Sir,
I wanted to install ggbiplot using the command you provided with us. but it gives me another message. The message is (Installation failed: SSL certificate problem: self signed certificate in certificate chain
Warning message:
Username parameter is deprecated. Please use vqv/ggbiplot) I used vqv/ggbiplot as well, but no good results.
please guide me what shall I do?
Not sure what went wrong. Maybe a typo or something else. You can probably try running the commands using my R file.
great lecture..please share your thoughts on machine learning introduction too
For machine learning methods such as random forest, neural networks, support vector machines, and extreme gradient boosting, you can refer to the following:
ruclips.net/p/PL34t5iLfZddu8M0jd7pjSVUjvjBOBdYZ1
Great video, thanks for uploading.
Thanks for comments!
Awesome video! Could you plz add Partial least squares regression and principal components regression to your playlist! That would be of great help. Thanks in advance!
Thanks for suggestions!
Thank You - this was extremely useful.
Very nice channel you have here - easy sub.
Thanks for comments!
Add a video on non negative matrix factorization like intNMF
Thanks, I've added it to my list of future videos.
Very useful video sir. Could you explain me what is the need to partition the data into training and testing data?
You may review this:
ruclips.net/video/aS1O8EiGLdg/видео.html
@@bkrai thank you sir.
Thanks much appreciated..
it worked
Hi Sir,Could you take one session on SVD in R and also some theoretical explanation on it. I m finding it very difficult to understand it with most of the material available on the net.
thank you for the amazing video!
Thanks for comments!
Can you please help with combined pca and ann model?
I'm adding to the list of future videos.
Thank you
Welcome!
your videos are great :)
Thank you!
Very informative and nice presentation sir, sir can we estimate PCA for factor (for eg species) with unequal no. of observation.
And we want to see the correlations in terms of each species viz for setosa or other two, how to do it? Please explain...Thank You
Awesome explanation sir...👍👍can you make a video for independent component analysis using r in the same way sir?
Thanks, I've added it to my list.
Hi Sir, your materials are simple and wonderful. Pls do one video for xgboost. that would be great.
Thanks for the suggestion!
Bharatendra Rai Thanks a lot sir.
I agree with sathish ravi, Sir please make a video on xgboost. You are one stop solution for every problem and I will remember you all my life.
Awesome video. Thank you. As time permits can you do a video on use of caret package? thank you
Saw this today. Thanks for comments!
It was a fruitful video.Can you please share the code.
Nice presentation. when you are coding line 8 you said a sample of size 2, which size are you referring to? Thanks
For partitioning the data in to two, training and testing.
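To illustrate the reply: in a call like the one below, the first argument 2 means "draw labels from 1 and 2", not "draw two values". The exact seed and probabilities are assumptions, not confirmed from the video.

```r
# Sketch: sample(2, ...) assigns each row a label 1 (training) or 2 (testing).
data(iris)
set.seed(111)  # assumed seed
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
table(ind)     # roughly 80% ones and 20% twos

training <- iris[ind == 1, ]
testing  <- iris[ind == 2, ]
c(train = nrow(training), test = nrow(testing))
```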
brilliant sir..simple and sweet..thanks...nice music....if I have 10 DISCRETE VARIABLES, how to reduce to 2 or 3 components, please explain?
Thanks for comments! Note that this method is only for numeric variables.
Can you upload a video describing independent component analysis in R
I've added it to my list.
Interesting
thanks!
Thank you so... much!
Thanks for comments!
thanks for the video sir... helped a lot :)
Thanks for the feedback!
Sooo much love you sir.This helped me perfect
Thanks for comments!
Sir - Requesting you to kindly give a lecture on advanced R programming, for example on the H2O package.
Thanks for the suggestion, I've added this to my list.
Thank you so much for this video. Will you please make a video on Broken-line regression in R?
Thanks for the suggestion, I've added this to my list.
Thank you for this nice video Dr. Rai.
I have a doubt: why was the predict function used multiple times? After the prcomp function, all the principal component data were available in:
pc$x.
Why do we have to do:
trg <- predict(pc, training)
In R you can get same thing in multiple ways. This is just for illustration.
@@bkrai Thank you Sir. That makes it clear.
@@abhishek894 You are welcome!
sir my data is showing [ reached getOption("max.print") -- omitted 10 rows ]. the last 10 rows are omitted, how to fix this, please
That's just how much gets printed. But all data still remains intact.
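If the full output is needed on screen, the print limit can be raised. This is a small sketch; the value 10000 is an arbitrary choice.

```r
# Sketch: raise the console print limit; the data itself is never truncated.
options(max.print = 10000)   # arbitrary larger limit
getOption("max.print")

# Alternatives that avoid long printouts altogether:
data(iris)
dim(iris)    # confirms all rows are still there
tail(iris)   # shows the last rows only
```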
Cool video! Can you do a video about Multiple Correspondence Analysis (MCA) for qualitative data? It would help me a lot
Thanks, I've added this to my list.
Awesome video sir...kudos... :)
1 doubt though .... 20:48 - why are we using 2 components only? How do we know how many principal components to use?(species ~ PC1 + PC2)
2 PCs capture more than 95% of the variability in the data. Other 2 only add about 5%. So you can choose to have PCs that capture over 80% or 90% of the variability.
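The proportion of variance behind that answer can be computed directly from the standard deviations stored in the prcomp object. The sketch below uses the full iris data rather than the training split from the video, which is an assumption made to keep it short; the 90% threshold is also just an example.

```r
# Sketch: cumulative proportion of variance explained by each component.
data(iris)
pc <- prcomp(iris[, -5], center = TRUE, scale. = TRUE)

var_explained <- pc$sdev^2 / sum(pc$sdev^2)
cum_var <- cumsum(var_explained)
round(cum_var, 3)

# Smallest number of components reaching an (example) 90% threshold
n_keep <- which(cum_var >= 0.90)[1]
n_keep
```

summary(pc) reports the same proportions in its "Cumulative Proportion" row.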
Thanks for this video sir, very good class but I can´t get it. because Error ... could not find function "ggbiplot". Excuse me, which is your R version ?
Try this:
library(devtools)
install_github("vqv/ggbiplot")
Dear Sir..thanks for a wonderful video. I have some questions.
1) At 20:18, why did u choose to reorder by setosa?
2)Why did you choose to use data as trg and not training to build mymodel given that trg has predictions from training
3) Can PCA be used to choose k in kmeans. If so, how to go about it?
Thanks again.
Regards
Good evening
If you want to show the first dimension (Dim1) and the third dimension (Dim3)
What to do or if you can provide the code for that
Thanks
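For the question about plotting the first and third dimensions, a possible approach with base R is the choices argument of biplot() for prcomp objects. This sketch uses the full iris data as an assumption; the plotting package in the question is not stated.

```r
# Sketch: biplot of PC1 against PC3 using base R.
data(iris)
pc <- prcomp(iris[, -5], center = TRUE, scale. = TRUE)

# Scores for the two chosen components
scores13 <- pc$x[, c(1, 3)]
head(scores13)

# choices selects which components go on the two axes
biplot(pc, choices = c(1, 3))
```

ggbiplot appears to accept a similar choices argument, but that is worth confirming on its help page before relying on it.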
can a dataset consisting of the principal components and the target variable be used to perform machine learning techniques?
Yes, this video shows an example of doing it.
Thank you for sharing, I get an error "Error in plot_label(p = p, data = plot.data, label = label, label.label = label.label, : Unsupported class: prcomp"", when I try to run the ggbiplot. Would you please advise how to fix it?
Great video. Do you have a suggested package for running binary logistic regression? From a brief scan of nnet it appears to only have arguments for multinomial response variables. Thank you.
You can refer to this:
ruclips.net/video/AVx7Wc1CQ7Y/видео.html
@@bkrai sorry I was unclear in my message. I was hoping for a suggested package to run a binary logistic regression using PCA components as predictors - similar to what you have done here with multinomial. Any suggestions are welcome.
Yes, you can use the PCA components as predictors and run binary logistic regression as shown in the link that I sent earlier.
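A minimal sketch of that suggestion using base R's glm() with family = binomial, so no extra package is needed. Restricting iris to two species to get a binary response is an assumption made only for illustration, and the model is fitted on all rows without a train/test split.

```r
# Sketch: binary logistic regression on the first two principal components.
data(iris)
two <- droplevels(subset(iris, Species != "setosa"))  # two classes only

pc <- prcomp(two[, 1:4], center = TRUE, scale. = TRUE)

# Response coded 1 for virginica, 0 for versicolor
d <- data.frame(pc$x[, 1:2], y = as.integer(two$Species == "virginica"))

fit <- glm(y ~ PC1 + PC2, data = d, family = binomial)
p <- predict(fit, type = "response")   # fitted probabilities

table(Predicted = as.integer(p > 0.5), Actual = d$y)
```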
Dr. Rai,
Thanks for this informative video. I am having a problem getting the predict function to work with the model created on the training dataset. I am getting two errors(paraphrased): 1. NAs not allowed in subscripted assignments; 2. newdata has 1900 rows but variables found have 8100 rows. I think it is looking for the same number of rows in the test dataset. Is there something I am doing wrong? Appreciate any feedback.
NAs occur when there is missing data. For handling missing values, refer to:
ruclips.net/video/An7nPLJ0fsg/видео.html
Hi, I want to know from where can I get the iris example data ? thank you!
It's inbuilt in R itself. You can access it by running first 3 lines shown in the video.
Firstly, thank you for your helpful video. I have a problem adding ellipses to the plot. I have 30 variables; the first 29 are numeric and the last one is a factor variable. But I can't plot the ellipses in the PCA plot. How can I solve this? Please help.
Orthogonality of principal component- 10:17
Thx
Can you please show back propagation algorithm in r
Refer to this:
ruclips.net/video/-Vs9Vae2KI0/видео.html
Hi Bharatendra, nice video. I have got couple queries. If there are large no of numeric variables and through PCA we find that they are highly correlated then before going for model building
1) Do we need to remove highly correlated variables !
2) which one to remove ! Thanks
You don't need to remove if you are using the components for developing a prediction model. This video provides a similar example.
thanks
Principal components are orthogonal to each other; in other words, they are uncorrelated and can be used as is in model building.
Thanks!
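The point about uncorrelated components can be checked numerically. A short sketch on the full iris data (an assumption; the video uses a training split):

```r
# Sketch: correlations between principal component scores are (numerically) zero.
data(iris)
pc <- prcomp(iris[, -5], center = TRUE, scale. = TRUE)

round(cor(iris[, -5]), 2)   # original variables: strongly correlated
r <- cor(pc$x)
round(r, 3)                 # component scores: identity matrix

max_offdiag <- max(abs(r[upper.tri(r)]))
max_offdiag
```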
If I just use addEllipses =TRUE, what determines the size of those ellipses? Also, if I specify ellipse.type = “confidence”, what confidence level is used to generate the ellipses? I used factoextra if that helps.
Sir, I am doing PCA analysis on DJ 30 Stocks and when I view pca$loadings for 30 variables, I noticed that some were not displayed. For example, Component 1 has -0.218 for Apple but then shows none for JPM, what does this mean?
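A likely explanation, assuming the object comes from princomp() (the stock data is not available, so iris stands in here): the print method for loadings blanks out entries whose absolute value is below a cutoff, 0.1 by default. The values still exist and are just not displayed.

```r
# Sketch: blank entries in printed loadings are small values, not missing ones.
data(iris)
pca <- princomp(iris[, -5], cor = TRUE)

print(pca$loadings)               # entries with |value| < 0.1 appear blank
print(pca$loadings, cutoff = 0)   # show every entry

full <- unclass(pca$loadings)     # plain numeric matrix with all values
round(full, 3)
```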
Thank you very much for the video... I am interested in learning R programming from the basics. Please, can you teach me using some of your videos?
Here are some playlists that you can choose from based on your interest:
Machine Learning videos: goo.gl/WHHqWP
Becoming Data Scientist: goo.gl/JWyyQc
Introductory R Videos: goo.gl/NZ55SJ
Deep Learning with TensorFlow: goo.gl/5VtSuC
Image Analysis & Classification: goo.gl/Md3fMi
Text mining: goo.gl/7FJGmd
Data Visualization: goo.gl/Q7Q2A8
Playlist: goo.gl/iwbhnE
Great video.. What if we want to include factor-like "Control and Heat" for genotypes? Please suggest
It should work fine.
scatter plot & correlation coefficients 2:05
Thx
Scatter plot and correlation - 2:04
Thx
Sir why have you predicted the training and test data with respect to PC? can use trg data for making neural model and test using tst data set? and find correlation b/w act and predicted values?
When there are many variables, chances of having multicollinearity problem increases. And PCA helps to solve that problem. And yes, you can use neural network model.
@@bkrai sir can you please explain to me the significance of the lines under the heading: prediction with principal components. I am unable to understand why we are predicting twice on the test data set. Please explain sir
To avoid over-fitting where you get very good result from training data but not so from testing.
Thanks sir, why in this video use linear regression? Can i use k means to clustering from pc1 and pc2?
Which line are you referring to?
Sorry, i mean logistic regression in line 59
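On the k-means part of the question: clustering on the first two component scores is possible, for example with base R's kmeans(). This is a sketch, not taken from the video; the seed, nstart value and the choice of 3 centers are assumptions.

```r
# Sketch: k-means clustering on PC1 and PC2 scores.
data(iris)
pc <- prcomp(iris[, -5], center = TRUE, scale. = TRUE)

set.seed(123)  # assumed seed; k-means starts from random centers
km <- kmeans(pc$x[, 1:2], centers = 3, nstart = 25)

# Compare clusters with the known species (labels are not used in fitting)
table(Cluster = km$cluster, Species = iris$Species)
```

Unlike the logistic regression in the video, this is unsupervised: Species is only used afterwards to inspect the clusters.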
Hello, you put training [5] to reference the column on trg variable....
shouldn't it be training[ , 5]?
It is training[ , 5] in the video.
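The difference between the two forms can be seen on the iris data: single-bracket indexing without a comma returns a one-column data frame, while the form with a comma returns the column itself. The id column below is a hypothetical stand-in for the component scores.

```r
# Sketch: iris[5] versus iris[, 5].
data(iris)
a <- iris[5]     # one-column data frame, keeps the name "Species"
b <- iris[, 5]   # factor vector

class(a)
class(b)

# Binding the data-frame form keeps the column name automatically
d <- data.frame(id = seq_len(nrow(iris)), iris[5])
names(d)
```

Both forms carry the same values; the data-frame form is convenient inside data.frame() because the column keeps its name.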
Do you have a video on PCA for unsupervised learning via clustering and similarity ranking?
not yet.
thank you a lot for this support sir.
If you could provide further guidance it would be very helpful. I am trying to build a model for metastasis prediction using single-cell gene expression levels.
kindly let me know if it would be possible for you. thanks again
You may find this useful:
ruclips.net/video/Uil2GZa8gbg/видео.html
Great job, same as always. Can I use PCA for 2 or more categorical variables? Can I define those variables as 0 and 1 in PCA?
You can only use numeric variables. You can try using 0 and 1 and see if it works ok.
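A small sketch of the 0/1 idea from the reply, on a made-up data frame (all names and values below are hypothetical). Whether PCA on binary indicators is meaningful for a given problem is a separate question; this only shows that the recoding makes prcomp() run.

```r
# Sketch: recode two-level categorical variables as 0/1 before prcomp().
df <- data.frame(
  height = c(150, 160, 170, 180, 175, 165),           # hypothetical data
  smoker = factor(c("no", "yes", "no", "yes", "yes", "no")),
  sex    = factor(c("f", "m", "f", "m", "f", "m"))
)

num <- data.frame(
  height = df$height,
  smoker = as.integer(df$smoker == "yes"),  # 1 = yes, 0 = no
  sex    = as.integer(df$sex == "m")        # 1 = m,   0 = f
)

pc <- prcomp(num, center = TRUE, scale. = TRUE)
summary(pc)
```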
Thanks sir .....can u please tell me how start learning on R from beginning?
You can start with this playlist:
ruclips.net/p/PL34t5iLfZddv8tJkZboegN6tmyh2-zr_T
Hello, great video as always! One question I had (even though you warned that the results are hard to interpret) relates to how to interpret the coefficients. If we look at the coefficient table and read the first line (after the intercept), does that mean that with every unit increase in Sepal.Length there is an increase of 14.05 in the log odds of classifying the species as versicolor, relative to setosa? Thanks!
Your interpretation is correct.
Thank you! Keep up the good work! Your r videos are great!
Sir, ggbiplot is not installed, hence I can't work on this, though I followed the video thoroughly
Hi..good day bharatendra..I want to replace one my columns with value 1 for all its elements,what is the code in R studio..thanks for your time?
suppose you are using following data:
data(iris)
To add what you indicated to a "new" column, you can use:
iris$new <- 1
thanx for ur ans ..I do already have a column with different values,I wanna replace all values on that column with just 1
So for iris data if you want to change all values for Sepal.Length variable to 1, you can use:
iris$Sepal.Length <- 1