Identifying Multivariate Outliers with Mahalanobis Distance in SPSS

  • Published: Jan 2, 2025

Comments • 135

  • @scarlettthorn9060
    @scarlettthorn9060 3 years ago +27

    Honestly at this point I want to acknowledge you in my thesis thank you notes. Thank you Dr Grande, you are a gem.

  • @Swityie
    @Swityie 2 years ago +1

    Dr Todd, you've saved my life! I was dying with the Mahalanobis!!! Was crying at midnight while stuck on this.
    Thank you again!

  • @mimimcgee5512
    @mimimcgee5512 3 years ago +4

    Thank you for another helpful video. I am just a month or so away from receiving my doctorate and your videos have greatly assisted me in that! I'm brushing up in prep for my final defense and appreciate all of your videos. Thank you!

  • @yacinehajji1784
    @yacinehajji1784 9 years ago +1

    I would like to thank you for speaking loudly and slowly, which is very useful for a non-native English speaker like me.

  • @shapsgh
    @shapsgh 3 years ago

    Just realized that the values of MD and the chi-square test exactly match the output of AMOS's outlier table. Thanks Dr. Grande

  • @thomasbarnes5703
    @thomasbarnes5703 3 years ago

    Thank you Dr. Grande. I have no background in statistics... yet had to take a course as a portion of my degree requirements. Your videos have really helped me understand this very difficult subject!!

  •  3 years ago

    This video was very helpful! Thanks for sharing your knowledge for free on YouTube!

  • @ThatFellowOnline
    @ThatFellowOnline 7 years ago

    Fabulous video, explained clearly, concisely. I like how you have also shown the importance of labelling data properly and presentation (decimals) etc as this is really important when keeping data organised i.e. not just focusing on having a tidy output.

    • @DrGrande
      @DrGrande  7 years ago

      I am glad you found this video useful - thanks for watching.

  • @krunal699
    @krunal699 2 years ago

    Dr Grande you are a saviour! Thank You!

  • @arnelferaer6486
    @arnelferaer6486 4 years ago +6

    Dude you're a legend. Thank you for this.

  • @naftalibendavid
    @naftalibendavid 3 years ago

    This has proven so helpful again and again! Thanks.

  • @CoreFocusCoaching
    @CoreFocusCoaching 3 years ago +1

    Amazing!! You should do a separate video for the Chi-square distribution. Nowhere on YouTube is there a second part to the explanation, and because it is not overtly flagged in the title it does not show up.
    Either way thank you so much!!

  • @efrestein
    @efrestein 5 years ago +1

    Your videos add a ton of value!

  • @oscarespinozaparra6840
    @oscarespinozaparra6840 8 years ago

    Thank you Todd Grande for this extraordinary how-to video. This was a prayer answered, and I feel so much better listening to and following your instructions. I want to express how sincerely grateful I am for the detailed analysis and steps you showed in this video.

  • @denniscraggs8393
    @denniscraggs8393 6 years ago

    I liked your presentation. SPSS has evolved from the old text script product. I am a current user of both Minitab and Matlab.
    I am studying the Mahalanobis Distance and see that it has many applications. The SAE and ZVEI published a standard where electronics were judged to be fit for use in a temperature x voltage environment defined by a potato shape. However, they never provided a method of dealing with the different unit scale distances. I am thinking the Mahalanobis Distance would be a more technically correct means of classifying a component's fitness for use in a temperature x voltage environment.
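
As a rough illustration of the point about mixed unit scales: the Mahalanobis distance rescales each variable by its variance and accounts for the correlation between variables, so a change of units (say, volts to millivolts) leaves it unchanged. The sketch below uses made-up temperature and voltage readings and a hypothetical helper, `mahalanobis_sq_2d`; it is not taken from the video or from any SAE/ZVEI standard.

```python
from statistics import mean

def mahalanobis_sq_2d(xs, ys):
    """Squared Mahalanobis distance of each (x, y) pair from the sample centroid."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    # Sample variances and covariance (n - 1 denominator).
    sxx = sum(d * d for d in dx) / (n - 1)
    syy = sum(d * d for d in dy) / (n - 1)
    sxy = sum(a * b for a, b in zip(dx, dy)) / (n - 1)
    det = sxx * syy - sxy ** 2  # zero only if the two variables are perfectly collinear
    # d' S^-1 d written out for the 2x2 covariance matrix S.
    return [(syy * a * a - 2 * sxy * a * b + sxx * b * b) / det
            for a, b in zip(dx, dy)]

# Hypothetical readings: temperature in degrees C, supply voltage in V.
temps = [25, 40, 55, 70, 85, 100]
volts = [3.1, 3.3, 3.2, 3.6, 3.4, 3.7]

d2_volts = mahalanobis_sq_2d(temps, volts)
# Same data with voltage expressed in millivolts: the distances do not change.
d2_millivolts = mahalanobis_sq_2d(temps, [v * 1000 for v in volts])
```

A plain Euclidean distance on the same readings would be dominated by whichever variable has the larger numeric range, which is the unit-scale problem raised in the comment.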

  • @zarifbaihaqi8538
    @zarifbaihaqi8538 4 years ago +1

    Thank you very much Dr Todd... you helped me a lot.

  • @lyrahazel2079
    @lyrahazel2079 4 years ago

    Omg thank you, I was so frustrated. My data wouldn't meet the multivariate normality assumption until I stumbled onto this!

  • @voltisathartori6451
    @voltisathartori6451 6 years ago

    Thank you Dr Todd, for such an awesome explanation. It was very beneficial for moving my study forward.

  • @mohammedimam3651
    @mohammedimam3651 3 years ago

    Wooooow! This is extremely useful! Thank you! 👌

  • @muhammadfaisal9918
    @muhammadfaisal9918 5 years ago +4

    Thank you Dr. Todd for your awesome work. This is a very useful video. I am wondering if you could mention the reference for this process (or a reference for the significance value - is it by Tabachnick & Fidell 2007?). Many thanks

    • @sebastiankruse4981
      @sebastiankruse4981 3 years ago

      Hair et al. 2010 also recommend this process. They suggest dividing MD by the number of predictors and then designating outliers in small samples if these values surpass 2.5 and in large samples if they surpass 4. I think the 2.5 cutoff point corresponds very closely to the .001 p-value used by Dr. Grande.
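
The rule of thumb described in this reply can be sketched as below. The function name, the `large_sample` switch, and the example values are illustrative assumptions; the cutoffs of 2.5 and 4 are the ones quoted in the reply. For comparison, with 3 predictors the .001 chi-square critical value is about 16.27, a ratio of roughly 5.4, so how closely the two rules agree depends on the number of predictors.

```python
def flag_by_ratio(md_values, n_predictors, large_sample):
    """Flag cases whose MD / number of predictors exceeds the quoted cutoff.

    The cutoffs (2.5 for small samples, 4 for large) are the ones attributed
    to Hair et al. in the comment; deciding what counts as a large sample is
    left to the caller.
    """
    cutoff = 4.0 if large_sample else 2.5
    return [md / n_predictors > cutoff for md in md_values]

# Hypothetical Mahalanobis distance values (such as a saved MAH_1 column), 3 predictors.
mah = [1.9, 6.3, 8.1, 13.0]
small_sample_flags = flag_by_ratio(mah, n_predictors=3, large_sample=False)
large_sample_flags = flag_by_ratio(mah, n_predictors=3, large_sample=True)
```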

    • @Lello991
      @Lello991 3 years ago

      @@sebastiankruse4981 Hi! Could you please provide the full reference for Hair et al 2010? Is it this one?
      Hair, J.F., Black, W.C., Babin, B.J., & Anderson, R.E. (2010). Multivariate Data Analysis. Seventh Edition. Prentice Hall, Upper Saddle River, New Jersey

    • @sebastiankruse4981
      @sebastiankruse4981 3 years ago

      @@Lello991 yes, that's the one

  • @fasamad6730
    @fasamad6730 7 years ago

    Wonderful explanation. Enjoyed the session. Thank u Todd Grande it was a great help

    • @DrGrande
      @DrGrande  7 years ago

      You're welcome, thanks for watching -

  • @herix7342
    @herix7342 3 years ago

    Great contribution! Is there any reference for the described procedure?

  • @ammaarkidwai2732
    @ammaarkidwai2732 4 years ago +1

    Hi Todd! Great video as usual. Why was the cut off for the probability_MD column .001? Is that the norm cut off or based on your data?

  • @ibrahimmkheimer5311
    @ibrahimmkheimer5311 4 years ago +1

    awesome video dr

  • @payonrayaneh
    @payonrayaneh 9 years ago

    Very useful......Thanks a lot professor Grande.

  • @Elianaco
    @Elianaco 1 year ago

    Hello, thank you for your helpful videos. Quick one: I'm running a moderation with multiple mediators. Are mediator variables independent variables? I'm trying to run the Mahalanobis distance but am unsure if I should add my mediators to the IV box. Thank you

  • @Thejubeabides24
    @Thejubeabides24 5 years ago +1

    Excellent video!

  • @wongjanice7753
    @wongjanice7753 8 years ago +1

    Thanks for sharing! I would like to ask a question: if I detected 8 outliers with Mahalanobis distance, is it necessary for me to delete all of them? Or are 8 outliers out of 200 respondents still in an acceptable range? Is there any reference that mentions this?

    • @j.a.o.5535
      @j.a.o.5535 8 years ago

      +Wong Janice According to Meade and Craig (2012, Identifying Careless Responses in Survey Data), you may have up to 20 careless responders, especially if you used web-based questionnaires, so I would eliminate those 8 outliers to improve the quality of the data, although it is not always a straightforward rule.

  • @jaynastics2
    @jaynastics2 4 years ago +1

    Very helpful video!

  • @thoshsamanthar4815
    @thoshsamanthar4815 6 years ago +1

    Dr Todd, the video helped me a lot. I have 2 questions
    1) I have an integrated framework, where analysis is done in 2 stages. Should I check MD for each stage? One of my variables will look like a mediator but it is not. It will be a DV in the first stage and subsequently an IV in the 2nd stage of the analysis. Stage 1 and stage 2 do not have any connection. I have done each test and got different Prob_MD values / outliers to be deleted.
    2) Should I include demographic questions as part of df, as the prob outliers results are different when I omit or include?

  • @felipemcse
    @felipemcse 8 years ago +1

    Thanks for the video, Todd. Do you have any references that explain why the number of degrees of freedom should be the same as the number of variables?

  • @HarerimanaAlexis
    @HarerimanaAlexis 6 years ago

    Dear Dr Todd, Thank you very much for this wonderful video. I have the same question about how you decide on the degrees of freedom, and whether .001 is the absolute rule. Thank you

  • @evannadhim6631
    @evannadhim6631 8 years ago

    Todd, thank you so much for this clear explanation, but you've done the identification of multivariate outliers with Mahalanobis distance for the cases.
    My question: is there any difference if we do it for variables?
    The variables have their own distributions, and these are affected by the outliers.

  • @GeeWhit
    @GeeWhit 8 years ago +1

    Thanks for the great video!
    Does this method expose two-tailed outliers? If not, how can this be achieved?

  • @MrFoganholo
    @MrFoganholo 9 years ago +4

    Todd, great explanation! Thanks. One question: Why did you use 3 as the degrees of freedom? Why did you use .001 as the reference? Can I use it for any sample? Thanks again.

    • @DrGrande
      @DrGrande  9 years ago +9

      +André Foganholo Three degrees of freedom were used because there were three variables in the analysis. Using the probability of .001 is a common practice when identifying multivariate outliers.
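
For readers working outside SPSS, the probability step described in this reply can be sketched in plain Python. This assumes the SPSS expression has the form `1 - CDF.CHISQ(MAH_1, 3)`; the helper `chi2_sf` and the sample distances are illustrative, and the closed-form tail used here only covers whole-number degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum_{i=0}^{df/2 - 1} h^i / i!
        return math.exp(-h) * sum(h ** i / math.factorial(i) for i in range(df // 2))
    # Odd df: erfc(sqrt(h)) + exp(-h) * sum_{i=1}^{(df-1)/2} h^(i - 1/2) / Gamma(i + 1/2)
    tail = sum(h ** (i - 0.5) / math.gamma(i + 0.5)
               for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(h)) + math.exp(-h) * tail

ALPHA = 0.001      # the conventional cutoff mentioned in the reply
N_VARIABLES = 3    # degrees of freedom = number of variables in the analysis

# Hypothetical Mahalanobis distance values, one per case.
mah = [1.2, 7.8, 16.9]
probability_md = [chi2_sf(d, N_VARIABLES) for d in mah]
outliers = [p < ALPHA for p in probability_md]  # True marks a flagged case
```

With 3 degrees of freedom, a probability below .001 corresponds to a distance above roughly 16.27, so only the last hypothetical case would be flagged.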

    • @n.einstein6088
      @n.einstein6088 8 years ago +18

      +André Foganholo as a reference for the .001 threshold I used Tabachnick, B.G., & Fidell, L.S. (2007). Using Multivariate Statistics (5th Ed.). Boston: Pearson. (p. 74). according to www-01.ibm.com/support/docview.wss?uid=swg21480128. just in case anyone needs that.

    • @wenyuanliu4602
      @wenyuanliu4602 6 years ago

      Thanks everyone!

    • @richguides10
      @richguides10 5 years ago

      He used 3 because of the number of independent variables. Thank you

  • @St0rytell3r
    @St0rytell3r 6 years ago

    Thanks for the video, very thorough.

  • @jameslebron9412
    @jameslebron9412 7 years ago

    Dear Todd, nice video clip. I have a question: in your video I think you are using 3 independent variables and 1 dependent variable, so you are actually using 4 variables in total.
    I guess the degrees of freedom in this case are 4-1 = 3, since you are measuring distance in 4 dimensions.

  • @thewaterhub
    @thewaterhub 9 years ago

    Thank you, very useful video and clear explanation.

  • @abdulmoeed4661
    @abdulmoeed4661 2 years ago

    If we have more than one independent latent variable, mediators, and a final dependent variable, how would we place them in the 'Independent' and 'Dependent' variable list boxes for this test? Thanks, waiting for a response.

    • @sinemkaraoglu1717
      @sinemkaraoglu1717 2 years ago

      Hello, did you find the answer to this question?

  • @zohalh14
    @zohalh14 3 years ago

    Thanks for the video! Can you use Mahalanobis distance if your IVs are categorical in a mixed ANOVA?

  • @kamrannawaz
    @kamrannawaz 7 years ago +1

    Thanks, very helpful... I understand why you used 3 as the DF; however, please explain what chi-square is.

  • @alibezzaa809
    @alibezzaa809 4 years ago

    I really appreciate the effort you are putting into making concepts easy to understand. Do you have a video on transforming a multivariate outlier into a dummy variable?

  • @thankyou6555
    @thankyou6555 2 years ago

    Thank you! Very helpful.

  • @omidmahdieh7882
    @omidmahdieh7882 2 years ago

    Hello Dr. Grande. Thanks for your helpful demonstration. Can items be used to calculate Mahalanobis distances? Or should I use variables. I mean composite variables.

  • @henkpiet1908
    @henkpiet1908 1 year ago

    What do I do if there's a missing value in one of the scales when I use pairwise deletion for my regression? In that case the Mahalanobis distance returns a missing value as well.

  • @karimatouati5256
    @karimatouati5256 4 years ago

    Thank you for this useful video. I have a question, please: what should I do in the case of ordinal variables when checking for these outliers? Which method is adequate, Mahalanobis distance or Cook's distance?
    Does it make sense to apply this method when my data is composed only of ordinal variables and not continuous ones?

  • @chinchinhoh7893
    @chinchinhoh7893 6 years ago

    Dr Grande, one question. Frequently, the examples of identifying and handling outliers are about independent variables. Does that mean we don't have to identify and handle outliers in dependent variables? Thank you!

  • @polomarco1256
    @polomarco1256 4 years ago

    Hi Dr. Todd. Thanks for sharing your knowledge. May I ask you something? Can I use Mahalanobis distance to identify multivariate outliers with ordinal data?

  • @moroomario4007
    @moroomario4007 2 years ago

    Sir, if I used a Likert scale, should the DV be the mean score of all the items and the IVs be the score of each item?

  • @prof.thakshilakumari7847
    @prof.thakshilakumari7847 6 years ago

    Thank you so much. I followed your video and did the test with my sample. But I have a question about the degrees of freedom: why do you consider it to be 3?

  • @khaledlahlouh6944
    @khaledlahlouh6944 4 years ago

    Dear Dr. Todd, what should we do when we have a model with many IVs, two mediators and two DVs? Should we consider the mediators as IVs?

  • @rashidsaid-ti3jz
    @rashidsaid-ti3jz 5 years ago

    Thank you Dr. Todd for these useful lessons. Could you please give a reference for the formula you wrote in Compute Variable:
    1-..chi(mahalanobis, df).
    Thanks a lot

    • @nahk-lx2tn
      @nahk-lx2tn 5 years ago

      rashid said he is not replying to actual questions. That’s sad

    • @rashidsaid-ti3jz
      @rashidsaid-ti3jz 5 years ago

      @@nahk-lx2tn hi wasim, I found the reference (Hair, 2014)

  • @marinacuk1400
    @marinacuk1400 9 years ago

    Thank you for this very helpful video. Can this method be applied to lognormal datasets? Is it necessary for the data to follow a normal distribution?

  • @cecyliaadamczak4301
    @cecyliaadamczak4301 2 years ago

    Hi Dr. Grande, can we include the outcome variable (DV) with the IV in the mahalanobis distance analysis?

  • @ljubomirpupovac2009
    @ljubomirpupovac2009 8 years ago

    Hi Todd. Thanks for the video. Just one question: is your main independent variable program? Shouldn't we compare the MAH_1 values for samples that received treatment and ones that didn't? The thing is, the main independent variable is not used in the analysis, so whatever value I put there, the results (removed cases) will be the same. Regards

  • @ravindarmadishetty736
    @ravindarmadishetty736 7 years ago

    Dear Todd good explanation. The outliers which we got are similar to Residual(Actual-Predicted) outliers to remove from the data?

  • @jongsuksong7493
    @jongsuksong7493 8 years ago

    Thank you so much for your great explanation! It really helped me a lot!

    • @DrGrande
      @DrGrande  8 years ago

      I'm glad you found the video useful. Thanks for watching.

  • @chinhankim
    @chinhankim 5 years ago

    Dr.Grande, I have two independent variables and three mediation variables of one dependent variable. Question is should I put five variables(independent plus mediation variables) to figure out outliers or should I put only two independent variables? Thanks.

  • @annabelleatkin1884
    @annabelleatkin1884 6 years ago

    Would you include control variables as predictors in the regression? And if you're testing a latent interaction in MPlus, do you simply input the observed variables into the regression in SPSS to do this test?

  • @KristinColletteScott
    @KristinColletteScott 6 years ago

    Hi Dr. Grande,
    I've got 7 constructs (3 IVs, 3 intermediary, and 1 DV), each with multiple items. How do you recommend handling these when searching for D2? I also need to test for multivariate normality using the Wald statistic on the same data set. Do you have a video on that?

  • @shafeekafadlikhzamri7068
    @shafeekafadlikhzamri7068 5 years ago

    hello Dr.Todd. Your video helped a lot and the steps are easily understood. but i seemed to have too many outliers , i would like to have your contact to ask you regarding this matter.

  • @barbaratoson6455
    @barbaratoson6455 7 years ago

    Great video. Could you recommend a method to identify outliers in an RM ANOVA set up? I am looking for something similar to INFLUENCE option in SAS MIXED procedure but for SPSS

  • @RichardMcCrory_Neph
    @RichardMcCrory_Neph 7 years ago +1

    +Todd Grande - could I check the degrees of freedom for the Chi-Square distribution is n or n-1. e.g. for 20 variables, is the d.f. 20 or 19?

  • @chriskeran4480
    @chriskeran4480 9 years ago

    Dr. Grande, thank you kindly. Awesome demonstration. The question I have relates to the number of independent variables (IVs) chosen when calculating a Mahalanobis Distance (MD). Should the particular IVs chosen be related in some way, or can you throw all of your numeric variables into the one regression when attempting to find multivariate outliers using MD?

  • @hafizahusairi
    @hafizahusairi 5 years ago +2

    Thank you!! I understand more after watching your video =)

  • @wpadilla72
    @wpadilla72 5 years ago

    Dear Dr. Grande, my variables are measured on a Likert scale... how should the Mahalanobis test be applied in this case? Thanks

  • @micahgardner7836
    @micahgardner7836 3 years ago

    What if one of your variables was excluded by SPSS when calculating Mahalanobis distance? Are the degrees of freedom the same, or would you subtract one? For example, 5 variables entered but one was excluded: would the degrees of freedom be 5 or 4?

  • @kathrinho9136
    @kathrinho9136 9 years ago

    Hi, I have one question on the method. Hope you can help me :). In your data set, you have your manipulations, described as "program", and then you said that you have your independents named "functioning, severity, motivation". Why do additional metric independents exist in your file? In my data set I have 2 independents but they are on a nominal scale. So, what do I put in the text box of the linear regression where it says "independents"? Thanks in advance!!

  • @loversloss101
    @loversloss101 5 years ago

    So what happens when you follow these instructions and every number you get for the MAH_1 is the same?

  • @patfennell
    @patfennell 8 years ago

    Great video - thanks for posting!

    • @DrGrande
      @DrGrande  8 years ago

      You're welcome - thanks for watching.

  • @maheshvykuntam2809
    @maheshvykuntam2809 7 years ago

    +Todd Grande - Thanks a lot for the great explanation. Could you please help me understand: 1. Will this process work even if we have missing values? 2. Why do we use 'n' as the DF and not n-1? Thanks a lot for the help.

  • @frajtervivien
    @frajtervivien 9 years ago +1

    Thank you so much it was a lifesaver!

  • @ninab6136
    @ninab6136 8 years ago

    So I guess Mahalanobis can't be calculated when you have missing values somewhere in the items. Is there any other way I can include those cases?

  • @94bfm
    @94bfm 6 years ago

    Great explanation! Thank you so much!

  • @HughMupfunya
    @HughMupfunya 5 years ago +1

    Awesome... Thank you very much

  • @jahanzaibalvi2010
    @jahanzaibalvi2010 8 months ago

    thats great. thank you so much sir

  • @devildman3128
    @devildman3128 9 years ago

    hi, are there any changes to be made if I find negative values for the probability_MD?

  • @desterward
    @desterward 7 years ago

    Hi. Is it possible to use it in non-linear multivariate as well? Thanks

  • @guitaqui
    @guitaqui 2 years ago

    Perfect !!! Thank you!!!

  • @rahimbehrad63
    @rahimbehrad63 8 years ago +1

    Thanks Dear Todd. great !

  • @xunzhou962
    @xunzhou962 9 years ago

    Exactly what i need! Thank you!

  • @ainannur5836
    @ainannur5836 7 years ago

    Mr Todd, I have 4 variables: AsliG, AsliB, GreenBP, and BlueBP. I want to know the value of the Mahalanobis distance between (AsliG AsliB) and (GreenBP BlueBP). Can I calculate it using Mahalanobis distance in SPSS? Why can't I enter two variables as dependent and the other two as independent in SPSS?

  • @oliviasimms3897
    @oliviasimms3897 3 years ago

    Hi, does anyone know why it won't give me output when I add two variables to the 'Independents' box? I can get output for each of them separately but cannot get one output for both.

  • @next_trip_loading
    @next_trip_loading 6 years ago

    Can we apply ANOVA to a factor with 2 levels? I have seen a lot of studies using 2 levels and testing them with ANOVA. Secondly, I don't know how they check normality when they use a single-item Likert scale. Could you please explain this concept?

  • @selamawitweldegebriel3421
    @selamawitweldegebriel3421 4 years ago

    This was very helpful, how do we contact you. Cause I have an urgent problem

  • @godnkr236
    @godnkr236 6 years ago

    thanks for this amazing video!

  • @adrianfajar323
    @adrianfajar323 5 years ago

    prof, i have 3 dependent variable and 6 independent variable, how to see mahalanobis ?

  • @madiharazzam1098
    @madiharazzam1098 7 years ago

    i have a sample of 300 and 2 predictors. what would be the Mahalanobis Distance for it???

  • @harithfarhan5535
    @harithfarhan5535 3 years ago +1

    thanks for this

  • @Oz4rmEg
    @Oz4rmEg 3 years ago

    Best vid ever

  • @moeshams4504
    @moeshams4504 4 years ago +1

    Excellent!

  • @alexandrafiedler3113
    @alexandrafiedler3113 4 years ago

    For a CLP analysis (2-wave longitudinal design), do I use the dependent variable at time 1 or time 2? Sorry, but I am confused about whether I should compute Mahalanobis d for the regression term in my CLP model with: dependent variable (t2) regressed ON --> dependent variable (t1), independent variable (t1), moderator (t1). Or won't it matter if I do the Mahalanobis for a simple regression at time 1: Y1 regressed ON --> X1, M1 (and what about my second independent variable? Should I put it into the regression for time point 1, too?)
    I would be very glad if anybody could help me with this confusion !! :D

  • @priyas8052
    @priyas8052 9 years ago

    What if you get zero as a result for one of the rows?

  • @zubairawan9088
    @zubairawan9088 1 year ago

    Why have you selected the p-value to be 0.001?

  • @jiffylimborks
    @jiffylimborks 1 month ago

    many thanks 🧡

  • @sskshats6453
    @sskshats6453 8 years ago

    Thanks a lot. May Allah bless you.

  • @sskshats6453
    @sskshats6453 4 years ago

    what if we have 5 dependent variables and just one independent ??

  • @dr.mayankpant1571
    @dr.mayankpant1571 6 years ago

    If we have 5 iv then what is the degrees of freedom

  • @nerdofilo
    @nerdofilo 8 years ago

    is this related to MCD?

  • @farhanselfatan
    @farhanselfatan 1 year ago

    Thank you dr