Multiple Correspondence Analysis with FactoMineR

  • Published: 21 Aug 2024
  • How to perform MCA with the R software and the package FactoMineR?
    How to describe the dimensions?

Comments • 41

  • @cbartthompson1 6 years ago +1

    These videos are tremendously helpful, and well-produced. Thank you very much Drs. Husson and Houee-Bigot. Please keep up the good work!

  • @HussonFrancois 6 years ago +3

    You can read this paper:
    Husson, F. & Josse, J. (2014). Multiple Correspondence Analysis. In The Visualization and Verbalization of Data, Greenacre & Blasius (eds.), Chapman & Hall.

  • @metalliquero100 7 years ago

    Many thanks for the video. It was really illustrative and helped me carry out some analyses I need for a university assignment.

  • @JamesHarperUS 6 years ago

    Excellent video! Thank you very much!

  • @danieltoscohernandez7552 1 year ago

    Thank you very much for the video, it's quite useful! One question: in FactoMineR, does the variance correspond to the inertia? If not, is there a formula to obtain it?

    • @HussonFrancois 1 year ago +1

      Yes, the inertia is the variance. We say inertia because it is multidimensional.
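
A minimal sketch of where this inertia shows up in the output, assuming a small made-up data frame `toy` (hypothetical, not the data from the video). The `eig` element of the object returned by `MCA` lists one row per dimension, with the eigenvalue (the inertia carried by that dimension) and the matching percentages.

```r
library(FactoMineR)

# Toy data (hypothetical): 3 categorical variables observed on 60 individuals
set.seed(1)
toy <- data.frame(
  v1 = factor(sample(c("a", "b", "c"), 60, replace = TRUE)),
  v2 = factor(sample(c("yes", "no"), 60, replace = TRUE)),
  v3 = factor(sample(c("low", "mid", "high"), 60, replace = TRUE))
)

res.mca <- MCA(toy, graph = FALSE)

# One row per dimension: eigenvalue (inertia), percentage of variance,
# cumulative percentage of variance
print(res.mca$eig)

# Total inertia: the sum of the eigenvalues over all dimensions
total_inertia <- sum(res.mca$eig[, 1])
print(total_inertia)
```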

  • @YashadaKul 6 years ago +1

    Great video! Can you please tell me what you mean by "supplementary category" in your video?

    • @manpy5504 6 years ago

      How do I individually color and label points in the MCA plot?

    • @HussonFrancois 6 years ago +1

      Supplementary variables are variables that are not used to construct the dimensions of MCA, but that are used to interpret these dimensions.

    • @christiansetzkorn6241 5 years ago

      @HussonFrancois So they do not affect the plot at all? Thanks.

  • @SilvanaBuilesG 7 years ago

    Dear Francois, very useful video, thanks! I have a question. My sample is made up of 232 farmers described by 45 variables. Three of those variables are not important for the clustering I will run after the MCA, but they matter in terms of location and the nature of the producer. For example, all 232 individuals are categorised by municipality, by the agroecological zone in which that municipality is located, and by the nature of the farmer: whether he or she is a single farmer or an agricultural enterprise. How can I exclude those three variables from the MCA? Specifically, how do I state in the MCA call you are showing that I do not want these variables to be taken into account when the principal components are computed?

    • @HussonFrancois 7 years ago

      You can use these variables as supplementary variables, using the quali.sup argument.
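
A minimal sketch of such a call, assuming a made-up data frame `farmers` in which the three descriptive variables sit in columns 4 to 6 (the data, column names and positions are hypothetical). The `quali.sup` argument takes the column indices of the supplementary categorical variables.

```r
library(FactoMineR)

# Hypothetical data: 3 active variables and 3 descriptive variables
set.seed(2)
n <- 80
farmers <- data.frame(
  fert         = factor(sample(c("fert_yes", "fert_no"), n, replace = TRUE)),
  irrigation   = factor(sample(c("irr_yes", "irr_no"), n, replace = TRUE)),
  seed_type    = factor(sample(c("local", "improved", "mixed"), n, replace = TRUE)),
  municipality = factor(sample(c("M1", "M2", "M3"), n, replace = TRUE)),
  zone         = factor(sample(c("zoneA", "zoneB"), n, replace = TRUE)),
  producer     = factor(sample(c("single", "enterprise"), n, replace = TRUE))
)

# Columns 4:6 are supplementary: they do not build the dimensions
res.mca <- MCA(farmers, quali.sup = 4:6, graph = FALSE)

# Coordinates of the active categories (used to build the dimensions)
print(res.mca$var$coord)

# Coordinates of the supplementary categories (projected afterwards)
print(res.mca$quali.sup$coord)
```

The individuals' coordinates in `res.mca$ind$coord` depend only on the active columns, so a clustering run on them ignores the three supplementary variables.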

    • @SilvanaBuilesG 7 years ago

      Thanks a lot! :)

  • @Its-N 5 years ago

    If I use MCA for feature selection (variable selection), how can I tell from the results above (i.e. the MCA summary) which variables are good enough as clustering input? Please answer. Thanks.

  • @nadiak4805 4 years ago

    Hello! Thank you for the video. If we wish to do rotation, can we use var$coord?

  • @yasmine11095 2 years ago

    Hi! Thank you for the video and the package! Is it possible to analyze both continuous and categorical data with MCA?
    Thank you in advance!

    • @HussonFrancois 2 years ago +1

      In MCA, all the variables that construct the dimensions are categorical, but continuous variables can be used as supplementary variables (they do not participate in the construction, but can be used for interpretation; see the variable age in the video). If you want both continuous and categorical variables to construct the dimensions, you need to use the FAMD method (Factorial Analysis of Mixed Data; function FAMD in FactoMineR).

    • @HussonFrancois 2 years ago

      You can have continuous data as supplementary variables, so that they do not participate in the construction of the dimensions. If you want the continuous variables to contribute to the construction of the dimensions, you should use FAMD: factorial analysis of mixed data.
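
The two options can be sketched side by side, assuming a made-up data frame `survey` with three categorical columns and a continuous `age` in column 4 (data and column names are hypothetical). The `quanti.sup` argument of `MCA` takes the column indices of the supplementary continuous variables, and `FAMD` treats every column as active.

```r
library(FactoMineR)

# Hypothetical mixed data: 3 categorical variables and 1 continuous variable
set.seed(3)
n <- 80
survey <- data.frame(
  hobby = factor(sample(c("reading", "sport", "music"), n, replace = TRUE)),
  sex   = factor(sample(c("F", "M"), n, replace = TRUE)),
  job   = factor(sample(c("employee", "student", "retired"), n, replace = TRUE)),
  age   = round(runif(n, min = 18, max = 80))
)

# Option 1: MCA on the categorical variables, age as supplementary
res.mca <- MCA(survey, quanti.sup = 4, graph = FALSE)
print(res.mca$quanti.sup$coord)  # link between age and each dimension

# Option 2: FAMD, where age also contributes to building the dimensions
res.famd <- FAMD(survey, graph = FALSE)
print(res.famd$eig)
```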

  • @sabrinahabich-sobiegalla6039 4 years ago

    Thank you so much for this very helpful video. I have been trying to include survey weights through the row.w option. Can you please give me some advice or an example of how to do that? I have been trying to refer to my "weight" column in different ways, but I always get the error that length of 'dimnames' is not equal to array extent. My weights are between ~0.9 and ~1.4. Trying to turn them into integers changes them to 0 and 1, which doesn't help either. Any hint is much appreciated.

    • @HussonFrancois 4 years ago

      Hi,
      The argument row.w must be a vector of weights. For instance, if you have 150 individuals in your data set and you give a weight of 1 to the first 100 and a weight of 0.5 to the last 50, use:
      MCA(MyData, row.w=c(rep(1,100), rep(0.5, 50)))
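
When the weights are stored in a column of the data frame, as in the question, one way to apply this is sketched below. It assumes a made-up data frame `survey` with a numeric column named `weight` (both hypothetical): the column is extracted as a plain numeric vector and removed from the data passed to `MCA`, so that it is not treated as a variable of the analysis.

```r
library(FactoMineR)

# Hypothetical survey: 3 categorical variables plus non-integer weights
set.seed(4)
n <- 150
survey <- data.frame(
  q1     = factor(sample(c("agree", "neutral", "disagree"), n, replace = TRUE)),
  q2     = factor(sample(c("often", "rarely"), n, replace = TRUE)),
  q3     = factor(sample(c("urban", "rural"), n, replace = TRUE)),
  weight = runif(n, min = 0.9, max = 1.4)
)

# Weights as a numeric vector, one value per individual
w <- survey$weight

# Data without the weight column
active <- survey[, setdiff(names(survey), "weight")]

res.mca <- MCA(active, row.w = w, graph = FALSE)
print(res.mca$eig)
```

The weights do not need to be integers, so there is no need to round them.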

    • @sabrinahabich-sobiegalla6039 4 years ago

      @HussonFrancois Thank you so much for your quick reply and help!

  • @gabeavakianorona2551 6 years ago +1

    Hi, what can we use for factor/dimension scores for individuals?

    • @HussonFrancois 6 years ago +1

      You should use the object $ind$coord.
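
A minimal sketch of extracting these scores, assuming a made-up data frame `toy` (hypothetical). The `ncp` argument of `MCA` sets how many dimensions are kept in the result, and `ind$coord` holds one row per individual and one column per kept dimension.

```r
library(FactoMineR)

# Toy data (hypothetical): 3 categorical variables observed on 60 individuals
set.seed(5)
toy <- data.frame(
  v1 = factor(sample(c("a", "b", "c"), 60, replace = TRUE)),
  v2 = factor(sample(c("yes", "no"), 60, replace = TRUE)),
  v3 = factor(sample(c("low", "mid", "high"), 60, replace = TRUE))
)

res.mca <- MCA(toy, ncp = 5, graph = FALSE)

# Scores of the individuals on the kept dimensions
scores <- res.mca$ind$coord
print(head(scores))

# With uniform weights the coordinates are centered, so each column
# has a mean of (numerically) zero
print(colMeans(scores))
```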

    • @gabeavakianorona2551 6 years ago

      Thank you very much. However, I still have another question:
      When I take the mean of each of the five dimensions, each one equals zero. How can I use the coordinates in a regression if all the means are zero? Is there a paper you are aware of that I can read to understand how to interpret the coordinates and whether it is proper to use them in further analysis?

  • @deepakkannan7409 6 years ago

    Hi
    I am getting this error
    Error in gene[gene >= 6] = 6] - 1 :
    NAs are not allowed in subscripted assignments
    >
    Please help me with this, as I could not find anything related to it on the internet.

    • @HussonFrancois 6 years ago

      Can you tell me more about your error? What are your lines of code? Can you check that your data are read as you intend (that the quantitative and qualitative variables are the ones you expect; check with summary)?

  • @abderrahmanoujalal1243 3 years ago

    Where can I find the script file?
    This video is very useful!

    • @HussonFrancois 3 years ago

      All the material is here: husson.github.io/MOOC.html#AnaDOGB

  • @helloWorldPlus 4 years ago

    Hello, where can I find the data set? Thanks

    • @HussonFrancois 4 years ago +2

      All the data sets and course materials are here: husson.github.io/MOOC.html#AnaDo

  • @agassinarcadius9767 3 years ago

    Great, but where can I get the script?

    • @HussonFrancois 3 years ago

      All the material is here: husson.github.io/MOOC.html#MCAcourse

  • @SilvanaBuilesG 7 years ago

    Francois, hi! When I run the MCA, I obtain a lot of dimensions, more than 30. I already eliminated some variables that may have caused noise (I checked for their correlation with the dimensions) and ran it once again, and even though the number of dimensions has decreased, I still get around 60 dimensions. What should I do in this case? Maybe I should not be using MCA? How do you compute correlation between categorical variables in R?

    • @HussonFrancois 7 years ago +1

      The number of dimensions is equal to the total number of categories minus the number of variables. So if you have many categorical variables with many categories, there will be a lot of dimensions. But MCA allows you to summarize the information on the first dimensions, and this is why it is useful.
      When you explore a dataset, you are interested in the whole dataset, so you have to keep all the variables as active. And then you interpret the first dimensions. So don't remove variables just to increase the percentage of variance explained by the first dimensions.
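
The rule for the number of dimensions can be checked with a few lines of base R. This is a small sketch in which the helper `count_mca_dims` and the data frame `toy` are hypothetical names, and every column is assumed to be a factor whose levels are all observed.

```r
# Number of MCA dimensions: total number of categories minus number of variables
count_mca_dims <- function(df) {
  n_categories <- sum(vapply(df, nlevels, integer(1)))
  n_categories - ncol(df)
}

# Toy data (hypothetical): 3 + 2 + 3 = 8 categories over 3 variables
toy <- data.frame(
  v1 = factor(c("a", "b", "c", "a", "b", "c")),
  v2 = factor(c("yes", "no", "yes", "no", "yes", "no")),
  v3 = factor(c("low", "mid", "high", "high", "mid", "low"))
)

n_dims <- count_mca_dims(toy)
print(n_dims)  # 8 categories - 3 variables
```

With 45 variables, for example, an average of just over two categories per variable is already enough to exceed 50 dimensions.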

    • @SilvanaBuilesG 7 years ago

      Francois, thanks for your answer :).
      Still, I have a question. Normally, we decide to run a principal component method after obtaining correlations between the variables we're examining. If the variables are significantly correlated, it is suggested to do PCA or MCA. So my first question is: how do you check the correlation between categorical variables in R before doing the MCA? I know (a priori) that some of my variables will not explain a lot of the variation in the data. Take for example my variable FERT, which records whether or not farmers apply fertilizer to the crop I'm studying: 70% of them said they do apply something. Based on that 70%, would you still consider that variable in the analysis? (I have some 6 variables with similar percentages.)

    • @HussonFrancois 7 years ago +1

      No, the objective of MCA or PCA is to describe a dataset, i.e. the proximities between individuals (taking into account all the variables) as well as the links between the variables. The principal component methods tell you which variables are linked, but you do not have to remove any variables beforehand.

    • @SilvanaBuilesG 7 years ago

      Thank you, Francois!

    • @SilvanaBuilesG 7 years ago

      Francois, hi! I am writing my paper on the analysis I did some months ago. I would like to ask how to use dimdesc to display all the dimensions, with the categories and variables that explain each of them, and not only the three that the program displays by default.
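
One possible way to do this is sketched below on a made-up data frame `toy` (hypothetical). `dimdesc` has an `axes` argument whose default is `1:3`, so a wider range can be requested. The sketch assumes that the requested dimensions must have been kept in the MCA result, which is why `ncp` is set explicitly in the `MCA` call.

```r
library(FactoMineR)

# Toy data (hypothetical): 3 categorical variables observed on 100 individuals
set.seed(6)
toy <- data.frame(
  v1 = factor(sample(c("a", "b", "c"), 100, replace = TRUE)),
  v2 = factor(sample(c("yes", "no"), 100, replace = TRUE)),
  v3 = factor(sample(c("low", "mid", "high"), 100, replace = TRUE))
)

# Keep 5 dimensions in the result, then describe all 5 instead of the first 3
res.mca <- MCA(toy, ncp = 5, graph = FALSE)
desc <- dimdesc(res.mca, axes = 1:5, proba = 0.05)

# One element per described dimension
print(names(desc))
print(desc[[1]])
```

The `proba` argument is the significance threshold used to decide which variables and categories are listed for each dimension.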

  • @1982Dibya 7 years ago +1

    Helpful video, but I couldn't understand what she says. Is she speaking English?

    • @MrGbruges 5 years ago +1

      Yes, French accent :)