K-Means Cluster Analysis in SPSS (SPSS Tutorial Video #30)

  • Published: Oct 3, 2024

Comments • 71

  • @xeniavlasenko9830
    @xeniavlasenko9830 3 years ago +2

    This is the 5th video I've watched on K-Means and it FINALLY made sense. Thank you so much!

    • @DataDemystified
      @DataDemystified 3 years ago

      I'm so glad to hear that! Is there something in particular that made the content here more understandable? I ask so that I can make sure to incorporate that type of teaching in my other videos. Thanks!

    • @xeniavlasenko9830
      @xeniavlasenko9830 3 years ago +1

      @DataDemystified I guess commenting along the way on how to interpret the results, and how all these program steps and numbers in the tables are part of the "story", was particularly helpful :)

    • @DataDemystified
      @DataDemystified 3 years ago +1

      @xeniavlasenko9830 Thank you for the feedback! I will make sure to incorporate it into new tutorial videos!

  • @bernardoluca6613
    @bernardoluca6613 3 years ago +3

    Fantastic explanation! Nothing like all those other videos out there! Keep going like this!

  • @divyajaiswal4330
    @divyajaiswal4330 1 year ago +2

    Can k-means clustering data be represented graphically? If yes, how?

  • @erikailles9598
    @erikailles9598 3 years ago +1

    You are a hero!

  • @dsavkay
    @dsavkay 6 months ago

    Great advanced info, subscribed!

  • @GenuineReciprocity
    @GenuineReciprocity 2 years ago +4

    Your videos are so easy to understand and it's so amazing how many people your kindness has been helping! I have a small question and was wondering if you could share your insight about it if you have time available. A study that I am trying to replicate has categorized individuals based on whether they score above or below the mean on two variables (i.e., high high, high low, low high, low low - 4 categories). I was advised that this technique was crude and that I should instead use a cluster analysis to categorize the groups. Why would cluster analysis be a better statistical analysis than what the original authors did in categorizing the variables? Sorry to trouble you! I look forward to more of your incredibly helpful videos!

  • @naren1705
    @naren1705 1 month ago

    Thank you for the excellent videos on clustering. I have a dataset in which the majority of the variables are categorical. Which clustering method would be best for categorical variables? If I convert the categorical variable to have values '1', '2', '3' and so on, could I then use this 'numerised' dataset in hierarchical or k-means clustering? I would greatly appreciate your thoughts on these questions.

  • @elhamtorkashvand
    @elhamtorkashvand 1 year ago

    Thank you for the great video. Would you please explain how to apply the elbow method to find the number of clusters?

  • @zahraalinam62
    @zahraalinam62 5 months ago

    Which method, hierarchical or K-means, is more appropriate for dichotomous variables with binary coding (0, 1) showing the presence or absence of a variable?

  • @vindaflyfox
    @vindaflyfox 4 months ago

    Hello, I want to follow this process by doing a hierarchical cluster analysis to determine the k for my k-means analysis. My question is: my variables are not all on the same scale, so in the hierarchical cluster analysis I will need to convert them into z-scores or something similar so they are comparable. How does this impact the k-means cluster analysis? Do I need to do an extra step here, or will my variables already be converted and able to be used again after the hierarchical analysis?

  • @joycethegreat9259
    @joycethegreat9259 5 months ago

    During my conjoint analysis, I get no importance values or utilities because SPSS states "no analysis is performed because there are no valid cases". How do I solve this? I did a cluster analysis to get the utilities and std. error of each cluster, but after running conjoint on one cluster, conjoint won't show results. Please help. I have no missing values and no duplicates whatsoever.

  • @roshikaranjan
    @roshikaranjan 17 days ago

    Can K-means cluster analysis be performed with all categorical variables?

  • @mahdifareghi3916
    @mahdifareghi3916 11 months ago

    Hello, is there any video that analyzes k-means results in more depth?

  • @rabeeyafarooq2788
    @rabeeyafarooq2788 6 months ago

    How do we define the names as to what is increasing and what is not?

  • @zahraalinam62
    @zahraalinam62 5 months ago

    If the Sig. value for some variables is greater than .001, what should we do? Should we screen and remove them and do the cluster analysis again?

  • @lingkan1984
    @lingkan1984 10 months ago

    For a cluster analysis of multimorbidity, is there any special format for arranging the data?

  • @transitionperf_MPO
    @transitionperf_MPO 3 years ago +1

    Thank you for a great explanation! I was wondering how to view demographic characteristics between each established cluster. For example, viewing percentage breakdowns of age, gender, etc. in each cluster. Thanks!

  • @mehmettolgataner8878
    @mehmettolgataner8878 5 months ago

    Is it the same in SPSS 29?

  • @siddharthgowda4056
    @siddharthgowda4056 13 days ago

    the final conclusion of that data

  • @tacs3
    @tacs3 10 months ago

    How can we plot this data in SPSS the way R does? Is there a way?

  • @ezeugochukukere1538
    @ezeugochukukere1538 2 years ago

    This is very helpful. Oddly enough, the reason I came across this video was because I was searching for how to calculate the initial cluster centers in SPSS.
    I need them for my R script to perfectly replicate the k-means cluster analysis I run in SPSS... inputting the initial cluster centers calculated in SPSS provides the exact same results for the final cluster solution in R.
    ...It was the first thing you said we don't need, but I am pretty desperate in my search to find out how those initial cluster centers are calculated. Any help you could provide would be huge.

  • @sachikogaming1137
    @sachikogaming1137 2 years ago

    Is it necessary to correlate the variables first before proceeding to clustering? Is it important to select only variables that are correlated for the analysis?

    • @DataDemystified
      @DataDemystified 2 years ago

      Nope. Clustering does not require variables to be correlated.

  • @deborahhaile4191
    @deborahhaile4191 2 years ago

    How can I run the k-means clustering algorithm for 40 samples with four variables to group the sample into two clusters?

  • @musiknation7218
    @musiknation7218 2 years ago

    I need to do an assignment comparing k-means and improved k-means cluster analysis. Can you please tell me how to do that?

  • @katiesharp8080
    @katiesharp8080 3 years ago +4

    Hi, I love your videos, they're really helping me analyse my dissertation data :) I was wondering if you had any videos that touched on how to identify the characteristics of your clusters, i.e. age, gender, those sorts of things?

    • @DataDemystified
      @DataDemystified 3 years ago +2

      I don’t, but basically you’re just going to run either t-tests/ANOVA or cross tabs. You’d use the cluster number as the independent variable and your demographic as the dependent variable. I have a bunch of videos on those techniques in the SPSS playlist on this channel. Good luck!
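
The cross-tab idea in the reply above can be sketched outside SPSS. This is only an illustration in plain Python, not the channel's method: the function name `crosstab_percent` and the data are hypothetical, and the saved cluster number is assumed to be available as one value per case.

```python
from collections import Counter, defaultdict

def crosstab_percent(clusters, categories):
    """Row percentages: share of each category within each cluster."""
    counts = defaultdict(Counter)
    for cluster, category in zip(clusters, categories):
        counts[cluster][category] += 1
    table = {}
    for cluster, counter in counts.items():
        total = sum(counter.values())
        table[cluster] = {cat: 100.0 * n / total for cat, n in counter.items()}
    return table

# Hypothetical data: saved cluster membership and one demographic per case.
cluster_id = [1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
gender = ["F", "F", "F", "M", "M", "M", "M", "F", "M", "M"]

profile = crosstab_percent(cluster_id, gender)
for cluster in sorted(profile):
    print(cluster, profile[cluster])
```

The table only describes the clusters. Whether the differences are statistically reliable is a separate question, which is where the t-test, ANOVA, or chi-square test mentioned in the reply comes in.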

  • @GhadeerShm
    @GhadeerShm 1 year ago

    Hi, can I find references for the way you selected the variables? Or what is it called?

  • @StevenWang82
    @StevenWang82 1 year ago

    Thank you very much, this video is very easy to understand!!

  • @musiknation7218
    @musiknation7218 2 years ago

    How do I do improved k-means cluster analysis?

  • @mariabecker1803
    @mariabecker1803 3 years ago +1

    Dear Jeff, I was wondering if I could ask you one more question. As I am working with z-scores and trying to compare the means (of z-scores) at the end of the cluster analysis in order to show the difference of variables within and between the clusters, I encountered very high means of z-scores ranging up to 4 or 5. Could this be an indication of outliers? Would you suggest that I remove all the outliers before the analysis, or would this change the dataset too much and you would just report it as it is? Thank you!!

    • @DataDemystified
      @DataDemystified 3 years ago

      4-5 on a z-score is pretty high. We typically consider statistical outliers as being more than 3 standard deviations from the mean (which translates to a z-score of 3 or more in absolute value). The choice to remove data based on outliers, however, is a lot more complex. Did you pre-specify that you would do so? Are you doing it because your results, inclusive of the outliers, don't "look good"? The point is to make sure that your exclusion isn't going to artificially inflate Type 1 error (p-hacking). Good luck!
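
For readers who want to see the 3-standard-deviation rule from the reply above spelled out, here is a minimal sketch in plain Python. The function names and the data are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption about how the z-scores are computed.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values using the sample mean and sample SD (n - 1)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

def flag_outliers(values, threshold=3.0):
    """Return the positions of cases more than `threshold` SDs from the mean."""
    return [i for i, z in enumerate(z_scores(values)) if abs(z) > threshold]

# Hypothetical scores: 19 ordinary cases and one extreme case (120).
scores = [48, 52, 50, 49, 51, 47, 53, 50, 49, 51,
          50, 52, 48, 50, 51, 49, 50, 52, 48, 120]

print(flag_outliers(scores))  # positions of flagged cases
```

Flagging a case is not the same as deciding to drop it. As the reply says, that decision depends on what was pre-specified.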

    • @mariabecker1803
      @mariabecker1803 3 years ago +1

      @DataDemystified I did not pre-specify that I would do that. It's just that, compared to other cluster analyses with other data and their results (mean z-scores), mine are very high, so I thought that I might have made a mistake and that it would be best to remove the outliers. However, I do not want to manipulate my data. Maybe it is enough to just mention the high z-scores but leave them in the data? Thank you!

    • @DataDemystified
      @DataDemystified 3 years ago

      @mariabecker1803 I don't know what context you're reporting in (academic paper, school assignment, etc...) but transparency is always a good thing. At minimum, add a footnote with the explanation. Better yet is a robustness check that is explicitly exploratory: see what happens when you drop those outliers. Do the results meaningfully change? If they do, report that and speculate as to why. If they don't, report that as well with a note about how your results are robust to their removal.
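
The robustness check suggested above boils down to running the same summary twice, once on all cases and once with the outliers dropped, and reporting both. A rough sketch in plain Python follows; the names `cluster_means` and `robustness_check`, the data, and the choice to define outliers on the whole sample (rather than within each cluster) are all assumptions made for illustration.

```python
from statistics import mean, stdev

def cluster_means(records):
    """Mean score per cluster for (cluster, score) records."""
    by_cluster = {}
    for cluster, score in records:
        by_cluster.setdefault(cluster, []).append(score)
    return {cluster: mean(scores) for cluster, scores in by_cluster.items()}

def robustness_check(records, threshold=3.0):
    """Compare cluster means with and without cases beyond `threshold` SDs."""
    scores = [score for _, score in records]
    m, s = mean(scores), stdev(scores)
    kept = [(c, x) for c, x in records if abs((x - m) / s) <= threshold]
    return {
        "all_cases": cluster_means(records),
        "outliers_dropped": cluster_means(kept),
        "n_dropped": len(records) - len(kept),
    }

# Hypothetical data: two clusters of ten cases, one extreme score in cluster B.
records = [("A", v) for v in [48, 52, 50, 49, 51, 47, 53, 50, 49, 51]]
records += [("B", v) for v in [50, 52, 48, 50, 51, 49, 50, 52, 48, 120]]

report = robustness_check(records)
print(report)
```

In this made-up example the apparent difference between the clusters comes entirely from one case, which is exactly the kind of result worth reporting alongside the main analysis.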

    • @mariabecker1803
      @mariabecker1803 3 years ago

      @DataDemystified Dear Jeff, it's part of my dissertation so I really want to do a thorough job. I will definitely do a robustness check and am curious to see what will change. So thank you for your advice!

  • @tracyquetzal9477
    @tracyquetzal9477 2 years ago

    Hi Professor, very good presentation. I would like to know how you interpret your clusters in order to label them. What patterns do you look for to classify your clusters?

  • @aarinwood4522
    @aarinwood4522 2 years ago

    Great series of videos -- thank you! I do have one follow up question: What are the sample size requirements for Cluster Analysis? Thank you!

  • @aviralbhatt1664
    @aviralbhatt1664 2 years ago

    Hello, I have a doubt and I would really appreciate it if you could clarify it. So do we use Hierarchical Cluster Analysis to identify the potential clusters and then K-Means to understand how those clusters are different from each other?

    • @DataDemystified
      @DataDemystified 2 years ago +1

      We use Hierarchical Cluster analysis to identify the most likely # of clusters. We then use k-means to actually create those clusters and explore them. Hope that helps!
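
To make the second half of that workflow concrete, here is a minimal sketch of the standard k-means loop (Lloyd's algorithm) in plain Python. It is not SPSS's implementation: the number of clusters (2) is assumed to have already been read off a hierarchical analysis, and the data and starting centers are hypothetical.

```python
import math

def assign(points, centers):
    """Label each point with the index of its nearest center (Euclidean)."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def kmeans(points, centers, max_iter=100):
    """Alternate assignment and center updates until the centers stop moving."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New center = mean of each variable over the cluster's members.
                new_centers.append(tuple(sum(col) / len(members)
                                         for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # leave an empty cluster in place
        if new_centers == centers:
            break
        centers = new_centers
    return assign(points, centers), centers

# Hypothetical two-variable data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centers = kmeans(points, centers=[points[0], points[3]])
print(labels)
print(centers)
```

The final centers play the same role as the "Final Cluster Centers" table: they are what you read to describe and label each cluster.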

    • @aviralbhatt1664
      @aviralbhatt1664 2 years ago

      @DataDemystified Yes it does, thanks a lot 🙌

  • @anass2243
    @anass2243 1 year ago

    Thank you so much for this great series of videos. They have been so useful in my research.

  • @tacs3
    @tacs3 10 months ago

    Thank you so much for this one and the hierarchical one!

  • @mariabecker1803
    @mariabecker1803 3 years ago +1

    Hi, I was wondering how to read in cluster centers from an external file (after having done the hierarchical clustering) as SPSS always shows error messages (not correct format or one variable name is incorrect). Do you have a video for that? or any solution to my problem?

    • @DataDemystified
      @DataDemystified 3 years ago

      Sorry you're having trouble with that. I don't have a video on the topic and don't often import cluster centers from an external file. Is there a reason you are doing it that way rather than natively running the analysis on the data?

    • @mariabecker1803
      @mariabecker1803 3 years ago

      @DataDemystified Yes, I am using k-means clustering in order to validate the cluster centers/numbers of clusters that I have calculated with hierarchical clustering. Therefore, I want to use the cluster centers that I have (from the hierarchical clustering) as a starting point and see what changes when I do the k-means clustering. However, no matter what I do (even when I do everything according to the literature) I get error messages and SPSS has trouble reading in the cluster centers from an external file. Would you know what I could do to avoid the error messages and get my results?

    • @DataDemystified
      @DataDemystified 3 years ago

      @mariabecker1803 Got it. One option is to just re-run your hierarchical clustering with the original data and then, in the same data file, run the k-means clustering. Save the cluster membership for both analyses, and then do your comparison. If that's not possible and the import isn't working, you can always do it manually. As in, sort the data by some identifier and copy and paste the column of data from your original data (where the hierarchical analysis is) into the new data file (where you plan to run k-means). I hope that helps!
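
Once both memberships are saved side by side, the comparison itself is a cross-tabulation of the two columns. The sketch below is in plain Python with hypothetical names and data. Cluster numbers are arbitrary labels, so it looks at which cells are heavily populated rather than whether the numbers match.

```python
from collections import Counter

def crosstab(labels_a, labels_b):
    """Count cases for every (cluster in A, cluster in B) combination."""
    return Counter(zip(labels_a, labels_b))

def overlap_share(labels_a, labels_b):
    """Share of cases sitting in the largest cell of each A-cluster's row."""
    best = {}
    for (a, _b), n in crosstab(labels_a, labels_b).items():
        best[a] = max(best.get(a, 0), n)
    return sum(best.values()) / len(labels_a)

# Hypothetical saved memberships for the same eight cases.
hierarchical = [1, 1, 1, 2, 2, 2, 3, 3]
k_means = [2, 2, 2, 1, 1, 3, 3, 3]

table = crosstab(hierarchical, k_means)
for cell in sorted(table):
    print(cell, table[cell])
print(overlap_share(hierarchical, k_means))
```

A share near 1 means the two solutions group cases in nearly the same way even if the cluster numbers differ. A low share means the memberships genuinely disagree.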

    • @mariabecker1803
      @mariabecker1803 3 years ago

      @DataDemystified Thank you! I have tried that already and it works to compare the two in the same data file. This is not the problem. However, I saw that the cluster memberships are completely different (hierarchical and k-means), therefore I wanted to do the k-means clustering with the same cluster centers as I discovered in the hierarchical analysis in order to see where the difference is when both have the same starting point, if that makes sense? It is just that there seems to be no way to put in the starting points (cluster centers) manually, only via the read-in, I guess? Which in my case is not working. Therefore, I do not know how to proceed.

    • @DataDemystified
      @DataDemystified 3 years ago

      @mariabecker1803 My only suggestion at this point is to make sure you are using Ward's Method in your hierarchical clustering. That tends to give results closest to k-means. Good luck!

  • @jessicamartin1446
    @jessicamartin1446 2 years ago

    Great! I was able to complete my entire assignment, using only this video

  • @LXiao33
    @LXiao33 3 years ago +1

    Brilliant! Thank you for uploading this video!

    • @DataDemystified
      @DataDemystified 3 years ago

      My pleasure!

    • @LXiao33
      @LXiao33 3 years ago

      @DataDemystified I wonder whether I should choose cluster analysis in SPSS or perform latent class analysis using Mplus to identify the underlying groups in my data. I am still a bit confused. Can you kindly provide some advice? Thank you.

    • @DataDemystified
      @DataDemystified 3 years ago

      @LXiao33 That entirely depends on your research question. Without knowing that, I really can't answer your question. Sorry!

  • @abdullahisani9746
    @abdullahisani9746 1 year ago

    Thanks for the demonstration

  • @nawilliam2754
    @nawilliam2754 2 years ago

    After a long search, finally something easy to understand

  • @martinpeikert6746
    @martinpeikert6746 1 year ago

    So clear, thank you so much!

  • @lydialim1993
    @lydialim1993 3 years ago +1

    Wonderful series! Keep it up!

    • @DataDemystified
      @DataDemystified 3 years ago

      Thank you! Any topics you'd specifically like to see covered?

    • @lydialim1993
      @lydialim1993 3 years ago

      @DataDemystified Any chance you'll do one on Structural Equation Modelling? Like I know it's a bunch of regressions under the hood, but it would be nice to see a proper demo of how to use one in real life.

    • @DataDemystified
      @DataDemystified 3 years ago +1

      @lydialim1993 Great idea, but I don't know if that'll happen any time soon. The challenge is that you need the AMOS package for SPSS, which most people don't have (including me, at the moment). That said, I'll look into how much demand there is for something like this! Thanks for the suggestion!

  • @miakirk7010
    @miakirk7010 2 years ago

    Very clear explanations. Thank you.

  • @Netsi-ed6ee
    @Netsi-ed6ee 6 months ago

    Prof, I need your support. Can you help me?