Principal Component Analysis in R: Example with Predictive Model & Biplot Interpretation

  • Published: 23 Jan 2025

Comments • 356

  • @abdullahmohammed8521
    @abdullahmohammed8521 4 года назад +2

    Many thanks to you, Dr. God bless you.

    • @bkrai
      @bkrai  4 года назад

      You are most welcome!

  • @ramram2utube
    @ramram2utube Год назад +2

    I revisited your video for interpretation of biplots in PCA. Many thanks.

    • @bkrai
      @bkrai  Год назад

      You are welcome!

  • @Drgautham
    @Drgautham 3 месяца назад +2

    Thank you so much Professor🙏

    • @bkrai
      @bkrai  3 месяца назад +1

      You are very welcome!

  • @ashishsangwan5925
    @ashishsangwan5925 6 лет назад +2

    Awesome Explanation

    • @bkrai
      @bkrai  6 лет назад

      Make sure you run the following before installing:
      library(devtools)

  • @shawnmckenzie8699
    @shawnmckenzie8699 5 лет назад +2

    To install ggbiplot, the code is now (17, Jan, 2020):
    library(devtools)
    install_github("vqv/ggbiplot")
    source: github.com/vqv/ggbiplot
    Excellent video and well explained these concepts. Thanks.

    • @bkrai
      @bkrai  5 лет назад +1

      Thanks for the update!

  • @babadrammeh656
    @babadrammeh656 2 года назад +2

    R PCA IS A VERY GOOD PACKAGE AND VERY HELPFUL

    • @bkrai
      @bkrai  2 года назад

      Yes, I agree!

  • @philipabraham5600
    @philipabraham5600 7 лет назад +3

    This is the best PCA explanation I have seen anywhere so far. Thank you for sharing your knowledge.

    • @bkrai
      @bkrai  7 лет назад

      Thanks for the feedback!

  • @jacklu1611
    @jacklu1611 2 года назад +2

    The biplot was explained very clearly, thank you Dr. Rai!

    • @bkrai
      @bkrai  2 года назад

      You are welcome!

  • @Dejia_Space
    @Dejia_Space 4 года назад +2

    Thank you!! Best explanation of the biplot on YouTube.

    • @bkrai
      @bkrai  4 года назад

      Glad it was helpful!

  • @ramram2utube
    @ramram2utube 2 года назад +2

    Thanks a lot Sir for your nice presentation. You saved my time. Earlier I used your R codes on Kohonen NN and now for PCA for my training lectures. Your explanation is so lucid. I appreciate your noble service of sharing knowledge

    • @bkrai
      @bkrai  2 года назад

      You are most welcome!

  • @nyatonkitnya4267
    @nyatonkitnya4267 3 года назад +2

    One really good video I have found. After watching a few of your videos, they are becoming a "turn to" when required. Thanks.

    • @bkrai
      @bkrai  3 года назад

      Glad to hear that!

  • @jonm7272
    @jonm7272 4 года назад +4

    Thank you for this extremely helpful, and easily understood tutorial, particularly the clear interpretation of the Bi-Plot. Much appreciated

    • @bkrai
      @bkrai  4 года назад

      You're very welcome!

  • @modelmichael1972
    @modelmichael1972 7 лет назад +7

    Awesome video. Every R enthusiast needs to keep an eye on your channel. Thank you and keep up with great work!

    • @bkrai
      @bkrai  7 лет назад +1

      +Model Michael thanks👍

    • @padhanewalaullu
      @padhanewalaullu 7 лет назад

      Sir,
      Can we get code file ?

  • @NIKHILESHMNAIK
    @NIKHILESHMNAIK 5 лет назад +2

    You are too good sir. An absolute treat for ML enthusiasts.

    • @bkrai
      @bkrai  5 лет назад +1

      Thanks for your comments!

  • @jinnythomas9815
    @jinnythomas9815 4 года назад +2

    Great Explanation....

    • @bkrai
      @bkrai  4 года назад

      Thanks!

  • @theeoddname
    @theeoddname 7 лет назад +3

    Great video! Excellent walkthrough of PCA and how it can be useful for actual classification. Thanks for the upload.

    • @bkrai
      @bkrai  7 лет назад

      +theeoddname thanks for the feedback!

  • @ainli4125466
    @ainli4125466 2 года назад +1

    Thank you for sharing. I get an error "Error in plot_label(p = p, data = plot.data, label = label, label.label = label.label, : Unsupported class: prcomp" when I try to run ggbiplot. Would you please advise how to fix it?

  • @flamboyantperson5936
    @flamboyantperson5936 7 лет назад +3

    This is great. I was looking for PCA and you have done it. Many many thanks to you sir.

  • @deepikachandrasekaran3554
    @deepikachandrasekaran3554 3 года назад +2

    Very useful video sir. Could you explain me what is the need to partition the data into training and testing data?

    • @bkrai
      @bkrai  3 года назад

      You may review this:
      ruclips.net/video/aS1O8EiGLdg/видео.html

    • @deepikachandrasekaran3554
      @deepikachandrasekaran3554 3 года назад

      @@bkrai thank you sir.

  • @jonimatix
    @jonimatix 7 лет назад +3

    I really like your explanations in your videos. Keep them coming! Thanks

    • @bkrai
      @bkrai  7 лет назад

      Thanks for the feedback!

  • @bucklasek1
    @bucklasek1 3 года назад +2

    Thanks for the video! It helped me a lot doing the forecasting for future values using PCA.

    • @bkrai
      @bkrai  3 года назад

      Very welcome!

  • @srujananeelam6547
    @srujananeelam6547 5 лет назад +2

    Fantastic session.Perfectly understood Biplot

    • @bkrai
      @bkrai  5 лет назад

      Thanks for comments!

  • @azzeddinereghais7494
    @azzeddinereghais7494 4 года назад +1

    Good evening
    If you want to show the first dimension (Dim1) and the third dimension (Dim3)
    What to do or if you can provide the code for that
    Thanks
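One way to look at the first and third components is sketched below in base R, using the built-in iris data rather than the plotting package from the video. The score matrix `pc$x` has one column per component, so selecting columns 1 and 3 is enough. The object names (`pc`, `scores13`) and the colour choice are assumptions made for this illustration.

```r
# Sketch: plot PC1 against PC3 using base R only (iris stands in for your data).
data(iris)
pc <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# pc$x holds the component scores; keep columns 1 and 3.
scores13 <- pc$x[, c(1, 3)]

pdf(NULL)  # discard graphics so the sketch also runs in a batch session; drop this line to see the plot
plot(scores13, col = as.integer(iris$Species), pch = 19,
     xlab = "PC1", ylab = "PC3")
legend("topright", legend = levels(iris$Species), col = 1:3, pch = 19)
invisible(dev.off())
```

Base R's `biplot()` for a `prcomp` object also takes a `choices` argument, so `biplot(pc, choices = c(1, 3))` should give the arrows as well as the points.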

  • @galk32
    @galk32 5 лет назад +2

    One of the best PCA videos i ever seen, Thank you Mr. Rai.

    • @bkrai
      @bkrai  5 лет назад

      Thanks for comments!

  • @saurabhkhodake
    @saurabhkhodake 7 лет назад +3

    This video is worth its weight in gold

  • @dioagusnofrizal9773
    @dioagusnofrizal9773 4 года назад +2

    Thanks sir, why does this video use linear regression? Can I use k-means to cluster on PC1 and PC2?

    • @bkrai
      @bkrai  4 года назад

      Which line are you referring to?

    • @dioagusnofrizal9773
      @dioagusnofrizal9773 4 года назад

      Sorry, i mean logistic regression in line 59
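On the k-means part of the question: clustering on the first two components is possible, though it answers a different question from the classifier in the video, because k-means never uses the species label. The following is a minimal sketch on the built-in iris data; the seed, `centers = 3`, and `nstart = 25` are assumptions chosen for the illustration, not values from the video.

```r
# Sketch: k-means on the first two principal components (iris as stand-in data).
data(iris)
pc <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

set.seed(123)                                   # k-means starts from random centres
km <- kmeans(pc$x[, 1:2], centers = 3, nstart = 25)

# Cross-tabulate clusters against the known species, only to inspect the result;
# the clustering itself never sees the labels.
tab <- table(cluster = km$cluster, species = iris$Species)
print(tab)
```

Cluster numbers are arbitrary, so the table has to be read row by row rather than along the diagonal.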

  • @affyy04
    @affyy04 3 года назад +2

    Thank you for this amazing video. Better than my university lectures

    • @bkrai
      @bkrai  3 года назад

      Thanks for comments!

  • @golumworks
    @golumworks 2 года назад +1

    If I just use addEllipses =TRUE, what determines the size of those ellipses? Also, if I specify ellipse.type = “confidence”, what confidence level is used to generate the ellipses? I used factoextra if that helps.

  • @soumyanayak445
    @soumyanayak445 5 лет назад +2

    Sir, why have you predicted the training and test data with respect to the PCs? Can I use the trg data to build a neural network model, test it using the tst data set, and find the correlation between actual and predicted values?

    • @bkrai
      @bkrai  5 лет назад +1

      When there are many variables, the chance of a multicollinearity problem increases, and PCA helps to solve that problem. And yes, you can use a neural network model.

    • @soumyanayak445
      @soumyanayak445 5 лет назад +1

      @@bkrai Sir, can you please explain the significance of the lines under the heading "prediction with principal components"? I am unable to understand why we are predicting twice on the test data set. Please explain, sir.

    • @bkrai
      @bkrai  4 года назад

      To avoid over-fitting where you get very good result from training data but not so from testing.
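The two `predict()` steps discussed in this thread do different jobs, which a compact sketch may make clearer: the first projects data onto the principal components, the second produces class predictions from the classifier. This is an illustration on the built-in iris data, not the video's script; the seed and the object names `trg`, `tst`, and `model` are assumptions.

```r
# Sketch of the workflow discussed in this thread, on the built-in iris data.
library(nnet)   # multinom(); nnet is one of R's recommended packages

data(iris)
set.seed(111)   # assumed seed, only to make the split reproducible
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
training <- iris[ind == 1, ]
testing  <- iris[ind == 2, ]

# Fit PCA on the training predictors only.
pc <- prcomp(training[, -5], center = TRUE, scale. = TRUE)

# First kind of predict(): project both sets onto the training components.
trg <- data.frame(predict(pc, training), Species = training$Species)
tst <- data.frame(predict(pc, testing),  Species = testing$Species)

# Multinomial logistic regression on the first two components.
model <- multinom(Species ~ PC1 + PC2, data = trg, trace = FALSE)

# Second kind of predict(): class predictions, checked on both sets.
tab_trg <- table(predicted = predict(model, trg), actual = trg$Species)
tab_tst <- table(predicted = predict(model, tst), actual = tst$Species)
acc_trg <- sum(diag(tab_trg)) / sum(tab_trg)
acc_tst <- sum(diag(tab_tst)) / sum(tab_tst)
print(c(training = acc_trg, testing = acc_tst))
```

A large gap between the two accuracies would be the sign of over-fitting mentioned in the reply.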

  • @ketanverma7839
    @ketanverma7839 3 года назад +2

    is there any other alternative package for ggbiplot ?

    • @bkrai
      @bkrai  3 года назад +1

      Try this for biplot ( I just now ran this in RStudio cloud, and it worked fine):
      library(devtools)
      install_github("fawda123/ggord")
      library(ggord)

  • @statistician2856
    @statistician2856 3 года назад +1

    sir my data is showing [ reached getOption("max.print") -- omitted 10 rows ]. the last 10 rows are omitted, how to fix this, please

    • @bkrai
      @bkrai  2 года назад +1

      That's just how much gets printed. But all data still remains intact.

  • @sonalichakrabarty1618
    @sonalichakrabarty1618 3 года назад +2

    Can you please show back propagation algorithm in r

    • @bkrai
      @bkrai  3 года назад

      Refer to this:
      ruclips.net/video/-Vs9Vae2KI0/видео.html

  • @souvikmukherjee7977
    @souvikmukherjee7977 2 года назад +2

    sir, please make a session on factor analysis with prediction

    • @bkrai
      @bkrai  2 года назад

      Thanks for the suggestion!

  • @alessandrorosati969
    @alessandrorosati969 Год назад +1

    can a dataset consisting of the principal components and the target variable be used to perform machine learning techniques?

    • @bkrai
      @bkrai  Год назад

      Yes, this video shows an example of doing it.

  • @aks1008
    @aks1008 Год назад +2

    Sir can I use boruta function instead of pca in r..

    • @bkrai
      @bkrai  Год назад +1

      Yes certainly. Here is the link:
      ruclips.net/video/VEBax2WMbEA/видео.html

    • @aks1008
      @aks1008 Год назад +1

      @@bkrai sir what do you like between r and python..i find r code more easy to understand and write..

    • @bkrai
      @bkrai  Год назад +1

      In universities, business students usually use R and computer science students mostly use Python. If you are mainly looking to apply various machine learning and statistical methodologies, R is perfect.

  • @eldrigeampong8573
    @eldrigeampong8573 5 лет назад +2

    Thank you so much Dr. Rai. Detailed teaching

    • @bkrai
      @bkrai  5 лет назад

      Thanks for comments!

  • @Rutvi_patel_1111
    @Rutvi_patel_1111 7 лет назад +3

    Fabulous work in PCA ! Keep it up

    • @bkrai
      @bkrai  7 лет назад +1

      Thanks for the feedback!

  • @abiani007
    @abiani007 4 года назад +1

    Can you upload a video describing independent component analysis in R

    • @bkrai
      @bkrai  4 года назад +1

      I've added it to my list.

  • @nyadav378
    @nyadav378 Год назад +1

    Very informative and nice presentation, sir. Can we estimate PCA for a factor (e.g. species) with unequal numbers of observations?
    And if we want to see the correlations for each species, viz. for setosa or the other two, how do we do it? Please explain. Thank you.

  • @dejunli6417
    @dejunli6417 2 года назад +1

    Hi, I want to know from where can I get the iris example data ? thank you!

    • @bkrai
      @bkrai  2 года назад

      It's inbuilt in R itself. You can access it by running first 3 lines shown in the video.

  • @johnstevenson6458
    @johnstevenson6458 2 года назад +1

    Great video. Do you have a suggested package for running binary logistic regression? From a brief scan of nnet it appears to only have arguments for multinomial response variables. Thank you.

    • @bkrai
      @bkrai  2 года назад

      You can refer to this:
      ruclips.net/video/AVx7Wc1CQ7Y/видео.html

    • @johnstevenson6458
      @johnstevenson6458 2 года назад +1

      @@bkrai sorry I was unclear in my message. I was hoping for a suggested package to run a binary logistic regression using PCA components as predictors - similar to what you have done here with multinomial. Any suggestions are welcome.

    • @bkrai
      @bkrai  2 года назад

      Yes, you can use the PCA components as predictors and run binary logistic regression as shown in the link that I sent earlier.
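For the binary case asked about here, base R's `glm()` with `family = binomial` is one option that needs no extra package. The sketch below is an illustration on the built-in iris data; the outcome `is_virginica` is a hypothetical binary variable invented for this example, and the 0.5 cut-off is an assumption.

```r
# Sketch: binary logistic regression with principal components as predictors.
data(iris)
pc <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Hypothetical binary outcome for illustration: is the flower virginica?
d <- data.frame(pc$x, is_virginica = as.integer(iris$Species == "virginica"))

fit  <- glm(is_virginica ~ PC1 + PC2, data = d, family = binomial)
prob <- predict(fit, type = "response")   # fitted probabilities
pred <- as.integer(prob > 0.5)            # assumed 0.5 cut-off
acc  <- mean(pred == d$is_virginica)
print(acc)
```

In a real analysis the components would be fitted on a training set and applied to a test set with `predict()`, as in the video.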

  • @donne4real
    @donne4real 5 лет назад +2

    Wonderful job explaining the material.

    • @bkrai
      @bkrai  5 лет назад

      Thanks for your comments and finding it useful!

  • @inesceciliacardonadevoz5072
    @inesceciliacardonadevoz5072 4 года назад +2

    Thanks for this video sir, very good class but I can´t get it. because Error ... could not find function "ggbiplot". Excuse me, which is your R version ?

    • @bkrai
      @bkrai  4 года назад

      Try this:
      library(devtools)
      install_github("vqv/ggbiplot")

  • @koparka112
    @koparka112 2 года назад +1

    Thank you for the material. It is very clear and actually very relevant to my current work.
    As I understand it, the conversion of the data consists of summing the products of the normalized predictors and the loadings.
    Maybe you would have time to post a PLS regression video, please? The intriguing part is the explanation of the model itself

  • @abhishek894
    @abhishek894 3 года назад +1

    Thank you for this nice video Dr. Rai.
    I have a doubt. Why was the predict function used multiple times? After the prcomp function, all the principal component data were available in:
    pc$x.
    Why do we have to do:
    trg <- predict(pc, training)

    • @bkrai
      @bkrai  3 года назад

      In R you can get same thing in multiple ways. This is just for illustration.

    • @abhishek894
      @abhishek894 3 года назад +1

      @@bkrai Thank you Sir. That makes it clear.

    • @bkrai
      @bkrai  3 года назад

      @@abhishek894 You are welcome!

  • @mukhtaradamuabubakar370
    @mukhtaradamuabubakar370 3 года назад +1

    Nice video and very helpful. I have challenges installing the ggbiplot and nnet packages (I am using R version 3.6.3). Any advice on how to overcome this?

    • @mukhtaradamuabubakar370
      @mukhtaradamuabubakar370 3 года назад

      OK for the nnet package it was successfully installed. but still struggling with the ggbiplot (despite using your codes). thanks

  • @upskillwithchetan
    @upskillwithchetan 5 лет назад +3

    Really really great explanation sir, Thank you so much for making it very simple

    • @bkrai
      @bkrai  5 лет назад

      Thanks for comments!

  • @katherinechau5594
    @katherinechau5594 3 года назад +2

    your videos are great :)

    • @bkrai
      @bkrai  3 года назад

      Thank you!

  • @siddharthadas86
    @siddharthadas86 7 лет назад +3

    Seriously awesome explanations! Thank you again.

    • @bkrai
      @bkrai  5 лет назад

      Thanks!

  • @sidraghayas8583
    @sidraghayas8583 5 лет назад +3

    Can you please help with combined pca and ann model?

    • @bkrai
      @bkrai  5 лет назад

      I'm adding to the list of future videos.

  • @ConeliusC33
    @ConeliusC33 7 лет назад +4

    Your videos have been constant companions during the last months of my master thesis. It seemed as if every time I had to switch to another analysis technique you were allready waiting here. So thank you a lot for your guidance and clear explanations!
    The only thing I would appreciate would be if you could provide the basic R scripts. Even though the copying process might help with understanding each command through step-by-step application, typing text from a tiny YouTube screen shown in one half of my monitor into RStudio in the other half is troublesome. Thanks!

    • @bkrai
      @bkrai  7 лет назад

      Thanks for the feedback!

  • @asiangg
    @asiangg 7 лет назад +3

    Thank you. Learned a lot from your channel

    • @bkrai
      @bkrai  7 лет назад

      Thanks!

  • @ziauddinkhan4189
    @ziauddinkhan4189 5 лет назад +1

    Why u partition the data into 80 and 20 % please answer

    • @bkrai
      @bkrai  5 лет назад +1

      It can be any other ratio too. Eg., 60:40, 70:30, 75:25 or 90:10.

    • @ziauddinkhan4189
      @ziauddinkhan4189 5 лет назад +1

      @@bkrai My question is: what is the reason behind splitting the data into testing and training parts, whether it is 80 to 20 or 60 to 40? Thanks

    • @bkrai
      @bkrai  4 года назад

      To avoid over-fitting where you get very good result from training data but not so from testing.
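The split itself is a couple of lines in base R. The sketch below draws an exact 80:20 split of the built-in iris data; the seed and the names `train_idx`, `training`, and `testing` are assumptions for this illustration, and any other ratio only changes the `0.8`.

```r
# Sketch: an exact 80:20 train/test split with base R.
data(iris)
set.seed(42)                                   # assumed seed, for reproducibility
n <- nrow(iris)
train_idx <- sample(n, size = round(0.8 * n))  # row numbers for the training set
training  <- iris[train_idx, ]
testing   <- iris[-train_idx, ]                # everything not in training

print(c(training = nrow(training), testing = nrow(testing)))
```

The model is fitted on `training` only, and `testing` is held back so that its accuracy reflects data the model has not seen.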

  • @hr_foods
    @hr_foods 4 года назад +1

    Thanks for good video. Sir I am using R 3.6.1 version unable to install devtools and ggbiplot also. If devtools install then show that usethis package is missing please solve my issue.

    • @bkrai
      @bkrai  4 года назад +1

      I would suggest upgrading R. The current version is around 4.

    • @hr_foods
      @hr_foods 4 года назад +1

      @@bkrai I upgrade it but still this problem happen

    • @bkrai
      @bkrai  4 года назад

      Try this:
      library(devtools)
      install_github("vqv/ggbiplot")

    • @hr_foods
      @hr_foods 4 года назад +1

      @@bkrai I used these codes but not install error occured

    • @bkrai
      @bkrai  4 года назад

      After installing, make sure to run library.

  • @alindonosi4
    @alindonosi4 3 года назад +1

    Where can I find the raw data of this project?

    • @bkrai
      @bkrai  3 года назад

      Data used here is available within R.

  • @mamadououattara210
    @mamadououattara210 2 года назад +1

    Hi Dr, How to I use PCA to generate a score based on several variables? Regards

  • @mohammadj.shamim9342
    @mohammadj.shamim9342 7 лет назад +2

    Dear Respected Sir,
    I wanted to install ggbiplot using the command you provided with us. but it gives me another message. The message is (Installation failed: SSL certificate problem: self signed certificate in certificate chain
    Warning message:
    Username parameter is deprecated. Please use vqv/ggbiplot) I used vqv/ggbiplot as well, but no good results.
    please guide me what shall I do?

    • @bkrai
      @bkrai  7 лет назад

      Not sure what went wrong. Maybe a typo or something else. You could try running the commands using my R file.

  • @LlamaFina
    @LlamaFina 6 лет назад +2

    Great video! Thanks for sharing your knowledge.

    • @bkrai
      @bkrai  6 лет назад

      Thanks for comments!

  • @earlymorningcodes6100
    @earlymorningcodes6100 4 года назад +1

    Orthogonality of principal component- 10:17

  • @deepthibhadran4181
    @deepthibhadran4181 4 года назад +1

    sir can u please make one video on k means clustering and classification and regression tree analysis

    • @bkrai
      @bkrai  4 года назад

      See this link:
      ruclips.net/video/5eDqRysaico/видео.html

    • @deepthibhadran4181
      @deepthibhadran4181 4 года назад +1

      @@bkrai thank you sir

    • @bkrai
      @bkrai  4 года назад

      You are welcome!

    • @deepthibhadran4181
      @deepthibhadran4181 4 года назад +1

      @@bkrai Sir do you know about WRF model

    • @bkrai
      @bkrai  4 года назад

      yes

  • @manpreetkaur7716
    @manpreetkaur7716 2 года назад +1

    Add a video on non negative matrix factorization like intNMF

    • @bkrai
      @bkrai  2 года назад

      Thanks, I've added it to my list of future videos.

  • @WahranRai
    @WahranRai 3 года назад +3

    19:12 It is only for purpose to show another way to get the principal component related to training because :
    identical(pc$x, predict(pc,training)) gives TRUE meaning that pc$x is same as predict(pc,training).

    • @bkrai
      @bkrai  3 года назад

      That's correct!
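The equivalence noted in this comment can be checked in a few lines. The sketch below uses the built-in iris data with an assumed seed and split, and compares values with `all.equal()` (numeric tolerance, attributes ignored) rather than `identical()`, which is a deliberately looser check than the one quoted above.

```r
# Sketch: scores stored in pc$x match predict(pc, training) on the same data.
data(iris)
set.seed(111)   # assumed seed
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
training <- iris[ind == 1, ]

pc <- prcomp(training[, -5], center = TRUE, scale. = TRUE)

same <- isTRUE(all.equal(pc$x, predict(pc, training), check.attributes = FALSE))
print(same)
```

`predict()` becomes necessary for the test set, which was not part of the data that `prcomp()` saw.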

  • @md.tabibulislam9740
    @md.tabibulislam9740 7 лет назад +1

    Firstly, thank you for your helpful video. I have a problem adding ellipses to the plot. I have 30 variables: the first 29 are numeric and the last one is a factor variable. But I can't plot the ellipses in the PCA plot. How can I solve this? Please help.

  • @naeem3072
    @naeem3072 5 лет назад

    > install_github("ggbiplot","vqv")
    Error in parse_repo_spec(repo) :
    Invalid git repo specification: 'ggbiplot'
    what should i do sir

    • @bkrai
      @bkrai  5 лет назад

      Check if you ran library(devtools)

  • @maf4421
    @maf4421 3 года назад +2

    Thank you Dr. Bharatendra Rai for explaining PCA in detail. Can you please explain how to find weights of a variable by PCA for making a composite index? Is it rotation values that are for PC1, PC2, etc.? For example, if I have (I=w1*X+w2*Y+w3*Z) then how to find w1, w2, w3 by PCA.

    • @bkrai
      @bkrai  3 года назад

      For calculations you can refer to any textbook.
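As a rough illustration of the question: the columns of `pc$rotation` are the loadings, and the PC1 score of each observation is the weighted sum of the standardized variables with those loadings as weights. Using PC1 loadings as index weights is one common convention and is an assumption here, not something the video states; the names `w`, `Z`, and `index` are hypothetical.

```r
# Sketch: PC1 loadings as weights of a composite index (iris as stand-in data).
data(iris)
X  <- iris[, 1:4]
pc <- prcomp(X, center = TRUE, scale. = TRUE)

w <- pc$rotation[, 1]                                # loadings (weights) of PC1
Z <- scale(X, center = pc$center, scale = pc$scale)  # standardised variables
index <- as.numeric(Z %*% w)                         # I = w1*z1 + w2*z2 + ...

print(round(w, 3))
```

Two caveats: the weights apply to the standardized variables rather than the raw ones, and the sign of a component is arbitrary, so the index may need flipping to point in the intended direction.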

  • @BbakMs
    @BbakMs 7 лет назад +1

    Sir, I am doing PCA analysis on DJ 30 Stocks and when I view pca$loadings for 30 variables, I noticed that some were not displayed. For example, Component 1 has -0.218 for Apple but then shows none for JPM, what does this mean?
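A possible explanation, assuming the object was created with `princomp()` (which is what provides `$loadings`): the print method for loadings leaves entries with a small absolute value blank by default. They are not missing, only hidden. The sketch below shows this on the built-in iris data rather than the stock data from the question.

```r
# Sketch: blanks in printed princomp loadings are small values, not missing ones.
data(iris)
pca <- princomp(iris[, 1:4], cor = TRUE)

print(pca$loadings)             # entries below the print cutoff appear blank

full <- unclass(pca$loadings)   # plain numeric matrix, nothing hidden
print(round(full, 3))
```

`print(pca$loadings, cutoff = 0)` is another way to display every entry.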

  • @hridayborah9750
    @hridayborah9750 5 лет назад +3

    ggbiplot not getting installed when tried the way in the video,please advise how to install

    • @bkrai
      @bkrai  4 года назад +1

      You can try this:
      library(devtools)
      install_github("vqv/ggbiplot")

    • @himanshusharma-uq4eh
      @himanshusharma-uq4eh 4 года назад +1

      @@bkrai
      try this
      install.packages("remotes")
      remotes::install_github("vqv/ggbiplot")
      it will help.

    • @bkrai
      @bkrai  4 года назад +1

      Thanks!

  • @rainbowdu509
    @rainbowdu509 7 лет назад +2

    Hi..good day bharatendra..I want to replace one my columns with value 1 for all its elements,what is the code in R studio..thanks for your time?

    • @bkrai
      @bkrai  7 лет назад

      suppose you are using following data:
      data(iris)
      To add what you indicated to a "new" column, you can use:
      iris$new <- 1

    • @rainbowdu509
      @rainbowdu509 7 лет назад

      thanx for ur ans ..I do already have a column with different values,I wanna replace all values on that column with just 1

    • @bkrai
      @bkrai  7 лет назад +1

      So for iris data if you want to change all values for Sepal.Length variable to 1, you can use:
      iris$Sepal.Length <- 1

  • @earlymorningcodes6100
    @earlymorningcodes6100 4 года назад +1

    Scatter plot and correlation - 2:04

  • @victorhenostroza1871
    @victorhenostroza1871 5 лет назад +3

    Dear Teacher, I can`t install ggbiplot from github, is there other way to install it?

    • @victorhenostroza1871
      @victorhenostroza1871 5 лет назад +2

      My R version is 3.6.0

    • @bkrai
      @bkrai  5 лет назад +1

      you can try this:
      library(devtools)
      install_github("vqv/ggbiplot")

    • @victorhenostroza1871
      @victorhenostroza1871 5 лет назад +1

      Thanks, I also found other way to plot the PCA:
      library(ggfortify)
      autoplot(pc, data = training_set, colour = 'Species',
      loadings = TRUE, loadings.colour = 'blue',
      loadings.label = TRUE, loadings.label.size = 3)

    • @bkrai
      @bkrai  5 лет назад

      Thanks for the update!

  • @SaiKiran-zu3xk
    @SaiKiran-zu3xk 4 года назад +2

    How to know the exact names of the variables after doing PCA like they are before

    • @bkrai
      @bkrai  4 года назад

      Each pc is a combination of all variables and all variables retain their original name.

  • @GeFaaaA
    @GeFaaaA 5 лет назад +1

    Hello, very nice video!!! I have a question. Do you know how I choose how many PCs to use and which ones?

    • @bkrai
      @bkrai  5 лет назад +1

      When you have many PCs, you can select first few that capture almost all variability contained in data.

    • @GeFaaaA
      @GeFaaaA 5 лет назад +1

      @@bkrai thank you for your response! So I have to test every possible model , right? Do you know if I can use something like a criterion ?

    • @bkrai
      @bkrai  4 года назад

      It is good to capture over 80% of the variability.

  • @indranipal8131
    @indranipal8131 4 года назад +2

    Do you have a video on PCA for unsupervised learning via clustering and similarity ranking?

    • @bkrai
      @bkrai  4 года назад

      not yet.

  • @scholars.home999
    @scholars.home999 4 года назад +1

    Sir, can you please suggest how I can perform PCA on my Panel Data? -Regards

  • @shapeletter
    @shapeletter 4 года назад +2

    What is the difference between using "scale." and "scale"? Is it in order to use z-score vs. min-max?

    • @bkrai
      @bkrai  4 года назад

      Here the code requires scale. to be used. It uses z-score.
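To see what `scale. = TRUE` does in `prcomp()`, the sketch below compares it with standardizing the data by hand using `scale()` and then running a plain PCA. It uses the built-in iris data; the object names `pc_a` and `pc_b` are assumptions for this illustration.

```r
# Sketch: scale. = TRUE in prcomp() is z-score standardisation of each variable.
data(iris)
X <- iris[, 1:4]

pc_a <- prcomp(X, center = TRUE, scale. = TRUE)  # standardise inside prcomp()
pc_b <- prcomp(scale(X))                         # standardise first, then plain PCA

same_sdev <- isTRUE(all.equal(pc_a$sdev, pc_b$sdev))
print(same_sdev)
```

The trailing dot is simply part of the argument name in `prcomp()`; `scale()` without the dot is the separate base R function that standardizes a matrix.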

    • @shapeletter
      @shapeletter 4 года назад +1

      @@bkrai Okay, thanks! I will try it out! :)

    • @bkrai
      @bkrai  4 года назад

      Welcome!

  • @ramp2011
    @ramp2011 7 лет назад +2

    Awesome video. Thank you. As time permits can you do a video on use of caret package? thank you

    • @bkrai
      @bkrai  5 лет назад

      Saw this today. Thanks for comments!

  • @keshavnemeli
    @keshavnemeli 6 лет назад +1

    @5:47 He says the Average of the variables are converted to zeroes
    @6:34 The means(Average) are non-zero
    I don't understand. Can anyone explain?

    • @bkrai
      @bkrai  6 лет назад +1

      @5.47 refers to standardizing process before principal component analysis.
      @6.34 provides means of original dataset
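Both statements can be checked side by side. In the sketch below, on the built-in iris data, `pc$center` stores the means of the original variables, while the standardized data that PCA actually works on has column means of zero. The name `Z` is an assumption for this illustration.

```r
# Sketch: pc$center keeps the original means; the standardised data has mean 0.
data(iris)
X  <- iris[, 1:4]
pc <- prcomp(X, center = TRUE, scale. = TRUE)

print(pc$center)                 # means of the original variables (non-zero)

Z <- scale(X, center = pc$center, scale = pc$scale)
print(round(colMeans(Z), 10))    # means after standardising (zero)
print(apply(Z, 2, sd))           # standard deviations after standardising (one)
```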

  • @mukeshchoudhary2842
    @mukeshchoudhary2842 4 года назад +2

    Great video.. What if we want to include factor-like "Control and Heat" for genotypes? Please suggest

    • @bkrai
      @bkrai  3 года назад

      It should work fine.

  • @ariannaschmid3805
    @ariannaschmid3805 2 года назад +1

    Why do you predict before you build the model? Shouldn't it be the other way around?

    • @bkrai
      @bkrai  2 года назад

      If you are referring to 18:34 time point, note that the predict function is using principal component 'model'.

    • @ariannaschmid3805
      @ariannaschmid3805 2 года назад +1

      @@bkrai What is it that you are trying to predict there? Compared to what you would predict using the regression model?

    • @bkrai
      @bkrai  2 года назад

      Using predict function we are generating principal components. Later, we are using these principal components for developing a classification model. This is a small dataset just to illustrate the process. And will be useful for high dimensional data where one deals with 1000s of variables.

  • @rashmisajwan1724
    @rashmisajwan1724 7 лет назад +1

    I'm using stata, are there any specific commands for principal component analysis PCA in PANEL DATA Or Just simply run PCA after standardizing variables?

    • @bkrai
      @bkrai  7 лет назад

      I've not used stata, so difficult to say what command will be correct.

  • @Jubo256
    @Jubo256 6 лет назад +2

    Hello, you put training [5] to reference the column on trg variable....
    shouldn't it be training[ , 5]?

    • @bkrai
      @bkrai  4 года назад

      It is training[ , 5] in the video.

  • @adityapatnaik7078
    @adityapatnaik7078 6 лет назад +3

    too good!! plz make more such videos...plz!

    • @bkrai
      @bkrai  6 лет назад

      Thanks for comments! You may find this useful too:
      ruclips.net/p/PL34t5iLfZddu8M0jd7pjSVUjvjBOBdYZ1

  • @andreafiore8373
    @andreafiore8373 4 года назад +1

    Thank you, this video will be really helpful to complete my thesis :)

    • @bkrai
      @bkrai  4 года назад +1

      Good luck!

  • @prithvivasireddy5564
    @prithvivasireddy5564 5 лет назад +2

    Awesome video sir...kudos... :)
    1 doubt though .... 20:48 - why are we using 2 components only? How do we know how many principal components to use?(species ~ PC1 + PC2)

    • @bkrai
      @bkrai  5 лет назад

      2 PCs capture more than 95% of the variability in the data. Other 2 only add about 5%. So you can choose to have PCs that capture over 80% or 90% of the variability.
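The share of variability captured by each component can be computed from `pc$sdev`, which is how a cut-off such as 80% can be applied. The sketch below uses the full built-in iris data rather than the training subset from the video, so the figures may differ slightly from those on screen; the 0.80 threshold and the names `var_explained`, `cum_var`, and `k` are assumptions.

```r
# Sketch: choose the number of components from cumulative variance explained.
data(iris)
pc <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

var_explained <- pc$sdev^2 / sum(pc$sdev^2)  # proportion of variance per component
cum_var <- cumsum(var_explained)             # running total

k <- which(cum_var >= 0.80)[1]               # first component count reaching 80%
print(round(cum_var, 4))
print(k)
```

`summary(pc)` reports the same proportions in its "Cumulative Proportion" row.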

  • @safezonesharing914
    @safezonesharing914 6 лет назад +2

    Thank you for your video.
    My R version is 3.5.1 and it cannot allow ggbiplot.
    Do you have any package instead of ggbiplot ?

    • @bkrai
      @bkrai  6 лет назад +1

      Try installing it by running this line:
      install_github("ggbiplot", "vqv")

    • @safezonesharing914
      @safezonesharing914 6 лет назад

      @@bkrai Thank you for your kindly replying
      When I ran it, it would shown like this.
      Error in install_github("ggbiplot", "vqv") :
      could not find function "install_github"

    • @ashishsangwan5925
      @ashishsangwan5925 6 лет назад

      @@safezonesharing914 I'm also getting the same error

    • @ashishsangwan5925
      @ashishsangwan5925 6 лет назад +1

      @@safezonesharing914 try below command. It worked for me
      library(devtools)
      install_github("vqv/ggbiplot")

    • @alexandrec.2939
      @alexandrec.2939 5 лет назад +1

      @@ashishsangwan5925 Arf, for few seconds I believed you were my saver ^^. But nope, your alternative didn't work as well

  • @ziauddinkhan4189
    @ziauddinkhan4189 5 лет назад +1

    Can u upload a tutorial on boxcox transformation for pca to remove the skewness in the data thanks

    • @bkrai
      @bkrai  5 лет назад

      thanks for the suggestion! I've added it to my list.

  • @safeeqahmed3306
    @safeeqahmed3306 6 лет назад +2

    Great video. I have one doubt. What does the stddev attribute of PC contain? Standard deviations of the variables are already in scale..so what does stddev represent? Thanks a lot

    • @bkrai
      @bkrai  6 лет назад

      At what point in time do you see this?

    • @safeeqahmed3306
      @safeeqahmed3306 6 лет назад +1

      Bharatendra Rai sorry it’s sdev attribute of pc and in 9:48 while showing the summary of pc, I would like to know what the standard deviation row denote..thanks a lot

    • @bkrai
      @bkrai  6 лет назад

      It is standard deviation related to principal components. It helps to estimate what percentage of variability is captured by each principal component.

    • @safeeqahmed3306
      @safeeqahmed3306 6 лет назад +1

      Bharatendra Rai thanks a lot. I understand this now

  • @eniolaa77
    @eniolaa77 7 лет назад

    how do i get the file

    • @bkrai
      @bkrai  7 лет назад

      here is the link:
      drive.google.com/open?id=0B5W8CO0Gb2GGYTQ2SWxkc2FCVGs

  • @siddharthabingi
    @siddharthabingi 7 лет назад +3

    Great lecture. Thanks.

    • @bkrai
      @bkrai  5 лет назад

      Thanks!

  • @banamalipanigrahi9972
    @banamalipanigrahi9972 4 года назад +1

    Thank you Sir, I understood so many things. While checking with my own dataset, I am facing a small problem with the nnet library function. The error says:
    trg$Depth

    • @bkrai
      @bkrai  4 года назад

      I'm not sure what type of variable 'Depth' is. Note that the response variable should be of factor type.

    • @banamalipanigrahi9972
      @banamalipanigrahi9972 4 года назад +1

      Thanks Sir. Yes, it might be the reason... I have three different depths (Sur, Mid, Bot) where I am measuring different chemical compounds/parameters. I want to classify the different parameters with respect to different depths and try to check how efficiently PCA is classifying.

    • @bkrai
      @bkrai  4 года назад

      you can convert Depth to factor using:
      data$Depth <- as.factor(data$Depth)

    • @banamalipanigrahi9972
      @banamalipanigrahi9972 4 года назад

      @@bkrai Sir I did it.. thanks a lot for your kind reply. It is much appreciated 🙏🙏🙏

  • @garykuleck1320
    @garykuleck1320 3 года назад +1

    Dr. Rai,
    Thanks for this informative video. I am having a problem getting the predict function to work with the model created on the training dataset. I am getting two errors(paraphrased): 1. NAs not allowed in subscripted assignments; 2. newdata has 1900 rows but variables found have 8100 rows. I think it is looking for the same number of rows in the test dataset. Is there something I am doing wrong? Appreciate any feedback.

    • @bkrai
      @bkrai  3 года назад

      NAs occur when there is missing data. For handling missing values, refer to:
      ruclips.net/video/An7nPLJ0fsg/видео.html

  • @sainandankandikattu9077
    @sainandankandikattu9077 6 лет назад +2

    Awesome video! Could you plz add Partial least squares regression and principal components regression to your playlist! That would be of great help. Thanks in advance!

    • @bkrai
      @bkrai  4 года назад

      Thanks for suggestions!

  • @samdavepollard
    @samdavepollard 7 лет назад +2

    Thank You - this was extremely useful.
    Very nice channel you have here - easy sub.

    • @bkrai
      @bkrai  5 лет назад

      Thanks for comments!

  • @parametersofstatistics2145
    @parametersofstatistics2145 4 года назад +1

    Thanks sir .....can u please tell me how start learning on R from beginning?

    • @bkrai
      @bkrai  4 года назад

      You can start with this playlist:
      ruclips.net/p/PL34t5iLfZddv8tJkZboegN6tmyh2-zr_T

  • @raghavendras5331
    @raghavendras5331 6 лет назад +1

    What is the difference between princomp() and prcomp() commands in r

    • @bkrai
      @bkrai  5 лет назад

      I saw this today. princomp() uses spectral decomposition approach and prcomp() uses singular value decomposition, prcomp() is usually preferred.
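A small comparison may help show how the two functions relate in practice. The sketch below runs both on the built-in iris data. On standardized data the component standard deviations agree, while on raw data they differ by a constant factor because `princomp()` divides by n and `prcomp()` by n - 1. Object names are assumptions for this illustration.

```r
# Sketch: prcomp() and princomp() on the same data.
data(iris)
X <- iris[, 1:4]
n <- nrow(X)

# Standardised case: both work on the correlation structure.
p1 <- prcomp(X, center = TRUE, scale. = TRUE)   # SVD of the standardised data
p2 <- princomp(X, cor = TRUE)                   # eigen-decomposition of the correlation matrix
same_sdev <- isTRUE(all.equal(p1$sdev, as.numeric(p2$sdev)))

# Raw case: the divisors differ (n - 1 versus n).
p1u <- prcomp(X)
p2u <- princomp(X)
same_after_rescale <- isTRUE(all.equal(p1u$sdev * sqrt((n - 1) / n),
                                       as.numeric(p2u$sdev)))
print(c(same_sdev, same_after_rescale))
```

Note also that the loadings live in `$rotation` for `prcomp()` and in `$loadings` for `princomp()`.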

  • @aparnakanduri1111
    @aparnakanduri1111 6 лет назад

    Hi Sir,
    How can we detect outliers in PCA

  • @saifsplaka
    @saifsplaka 7 лет назад +2

    Hi Sir,Could you take one session on SVD in R and also some theoretical explanation on it. I m finding it very difficult to understand it with most of the material available on the net.

  • @numitayogesh9280
    @numitayogesh9280 7 лет назад +3

    great lecture..please share your thoughts on machine learning introduction too

    • @bkrai
      @bkrai  7 лет назад

      For machine learning methods such as random forest, neural networks, support vector machines, and extreme gradient boosting, you can refer to the following:
      ruclips.net/p/PL34t5iLfZddu8M0jd7pjSVUjvjBOBdYZ1

  • @sebvangeli
    @sebvangeli 7 лет назад +3

    Great work! Thank you

  • @PrimoSchnevi
    @PrimoSchnevi 4 года назад +2

    Hello. I dont know anything about Principal Component Analysis in R: Example with Predictive Model & Biplot Interpretation and i will never need to since thats not in my line of work. I Appreciate your Intromusic though. You are a true champ Bharatendra and enrich this world with your presence. Also that intro music fucking slaps.

    • @bkrai
      @bkrai  4 года назад

      Thanks for comments!