Honestly at this point I want to acknowledge you in my thesis thank you notes. Thank you Dr Grande, you are a gem.
did you do it lol?
Dr Todd, you've saved my life! I was dying with the Mahalanobis!!! I was crying at midnight, stuck on this.
Thank you again!
Thank you for another helpful video. I am just a month or so away from receiving my doctorate and your videos have greatly assisted me in that! I'm brushing up in prep for my final defense and appreciate all of your videos. Thank you!
I would like to thank you for speaking loudly and slowly; it is very useful for a non-native English speaker like me.
Just realized that the MD and chi-square test values exactly match the output of AMOS's outlier table. Thanks, Dr. Grande.
Thank you, Dr. Grande. I have no background in statistics... yet had to take a course as part of my degree requirements. Your videos have really helped me understand this very difficult subject!!
This video was very helpful! Thanks for sharing your knowledge for free on RUclips!
Fabulous video, explained clearly, concisely. I like how you have also shown the importance of labelling data properly and presentation (decimals) etc as this is really important when keeping data organised i.e. not just focusing on having a tidy output.
I am glad you found this video useful - thanks for watching.
Dr Grande you are a saviour! Thank You!
Dude you're a legend. Thank you for this.
This has proven so helpful again and again! Thanks.
Amazing!! You should do a separate video for the Chi-square distribution. Nowhere on RUclips is the second part to the explanation and because it is not overtly flagged in the title it does not show up.
Either way thank you so much!!
Your videos add a ton of value!
Thank you Todd Grande for this extraordinary how-to video. This was a prayer answered, and I feel so much better after listening to and following your instructions. I want to express how sincerely grateful I am for the detailed analysis and steps you showed in this video.
I liked your presentation. SPSS has evolved from the old text script product. I am a current user of both Minitab and Matlab.
I am studying the Mahalanobis Distance and see that it has many applications. The SAE and ZVEI published a standard where electronics were judged to be fit for use in a temperature x voltage environment defined by a potato shape. However, they never provided a method of dealing with the different unit scale distances. I am thinking the Mahalanobis Distance would be a more technically correct means of classifying a component's fitness for use in a temperature x voltage environment.
Thank you very much Dr Todd... you helped me a lot.
OMG, thank you, I was so frustrated. My data wouldn't meet the multivariate normality assumption until I stumbled onto this!
Thank you Dr Todd for such an awesome explanation. It really helped my study move forward.
You're welcome!
Wooooow! This is extremely useful! Thank you! 👌
Thank you Dr. Todd for your awesome work. This is a very useful video. I am wondering if you could mention the reference for this process (or a reference for the significance value - is it by Tabachnick & Fidell 2007?). Many thanks
Hair et al. (2010) also recommend this process. They suggest dividing MD by the number of predictors and then designating cases as outliers in small samples if these values surpass 2.5 and in large samples if they surpass 4. I think the 2.5 cutoff point corresponds very closely to the .001 p-value used by Dr. Grande.
@@sebastiankruse4981 Hi! Could you please provide the full reference for Hair et al 2010? Is it this one?
Hair, J.F., Black, W.C., Babin, B.J., & Anderson, R.E. (2010). Multivariate Data Analysis. Seventh Edition. Prentice Hall, Upper Saddle River, New Jersey
@@Lello991 yes, that‘s the one
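In case it helps anyone following this thread, here is a minimal Python sketch of the ratio rule as it is described in the comment above: divide each Mahalanobis distance by the number of predictors and compare it with 2.5 (small samples) or 4 (large samples). The function name `md_ratio_flags`, the `large_sample` switch, and the example distances are hypothetical, and the cutoffs are taken from the comment, not checked against the book.

```python
def md_ratio_flags(md_values, n_predictors, large_sample=False):
    """Flag cases whose MD / number-of-predictors ratio exceeds the cutoff.

    Cutoffs (2.5 for small samples, 4 for large ones) are as reported in the
    comment thread; what counts as a "large" sample is left to the analyst.
    """
    cutoff = 4.0 if large_sample else 2.5
    return [md / n_predictors > cutoff for md in md_values]


# Hypothetical Mahalanobis distances for three cases and three predictors.
example_md = [3.0, 9.0, 14.0]
small_flags = md_ratio_flags(example_md, n_predictors=3)
large_flags = md_ratio_flags(example_md, n_predictors=3, large_sample=True)
```

A case is flagged only when its ratio is strictly above the cutoff; whether this rule and the .001 chi-square rule flag the same cases will depend on the number of predictors.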
Wonderful explanation. Enjoyed the session. Thank u Todd Grande it was a great help
You're welcome, thanks for watching -
Great contribution! Is there any reference for the described procedure?
Hi Todd! Great video as usual. Why was the cut off for the probability_MD column .001? Is that the norm cut off or based on your data?
awesome video dr
Very useful......Thanks a lot professor Grande.
Hello, thank you for your helpful videos. Quick one: I'm running a moderation with multiple mediators. Are mediator variables independent variables? I'm trying to run the Mahalanobis distance but am unsure whether I should add my mediators to the IV box. Thank you
Excellent video!
Thanks for sharing! I would like to ask a question: if I detected 8 outliers with Mahalanobis distance, is it necessary to delete all of them? Or is 8 outliers out of 200 respondents still within an acceptable range? Is there any reference that mentions this?
+Wong Janice According to Mead and Craig (2012, Identifying Careless Responses in Survey Data), you may have up to 20 careless responders, especially if you used web-based questionnaires, so I would eliminate those 8 outliers to improve the quality of the data, although it is not always a straightforward rule.
Very helpful video!
Dr Todd, the video helped me a lot. I have 2 questions:
1) I have an integrated framework where the analysis is done in 2 stages. Should I check MD for each stage? One of my variables looks like a mediator but is not. It will be a DV in the first stage and subsequently an IV in the 2nd stage of the analysis. Stage 1 and stage 2 do not have any connection. I have run each test and got different Prob_MD values / outliers to be deleted.
2) Should I include demographic questions as part of the df? The outlier probabilities are different when I omit or include them.
Thanks for the video, Todd. Do you have any references that explain why the number of degrees of freedom should be the same as the number of variables?
Dear Dr Todd, thank you very much for this wonderful video. I have the same question about how you decide on the degrees of freedom, and whether .001 is the absolute rule. Thank you
Todd, thank you so much for this clear explanation, but you've done the identification of multivariate outliers with Mahalanobis distance for the cases.
My question: is there any difference if we do it for variables instead?
The variables have their own distributions, and these are affected by the outliers.
Thanks for the great video!
Does this method expose two-tailed outliers? If not, how can this be achieved?
Todd, great explanation! Thanks. One question: why did you use 3 as the degrees of freedom? Why did you use .001 as the reference? Can I use it for any sample? Thanks again.
+André Foganholo Three degrees of freedom were used because there were three variables in the analysis. Using the probability of .001 is a common practice when identifying multivariate outliers.
+André Foganholo as a reference for the .001 threshold I used Tabachnick, B.G., & Fidell, L.S. (2007). Using Multivariate Statistics (5th Ed.). Boston: Pearson. (p. 74). according to www-01.ibm.com/support/docview.wss?uid=swg21480128. just in case anyone needs that.
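For anyone who wants to check the probability column outside SPSS, here is a minimal Python sketch. It assumes the probability is the upper-tail chi-square probability of the Mahalanobis distance, with degrees of freedom equal to the number of variables, as described in the replies above. The names `chi2_upper_tail` and `flag_outliers` and the sample distances are made up for illustration; the series used for the incomplete gamma function is a standard textbook one, not something taken from the video.

```python
import math


def chi2_upper_tail(x, df):
    """Return P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z):
    # P(a, z) = z**a * exp(-z) / Gamma(a) * sum_n z**n / (a * (a+1) * ... * (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= z / (a + n)
        total += term
        n += 1
    log_lower = a * math.log(z) - z - math.lgamma(a) + math.log(total)
    return 1.0 - math.exp(log_lower)


def flag_outliers(md_values, n_variables, alpha=0.001):
    """Flag cases whose upper-tail probability falls below alpha."""
    return [chi2_upper_tail(md, n_variables) < alpha for md in md_values]


# Hypothetical Mahalanobis distances for three cases, three variables.
flags = flag_outliers([2.1, 16.5, 7.8], n_variables=3)
```

With three variables, only distances above roughly 16.27 fall below the .001 threshold, so the probability rule and a fixed critical-value rule flag the same cases.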
Thanks everyone!
He used 3 because of the number of independent variables. Thank you
Thanks for the video, very thorough.
Dear Todd, nice video clip. I have a question: in your video I think you are using 3 independent variables and 1 dependent variable, so you are actually using 4 variables in total.
I guess the degrees of freedom in this case are 4 - 1 = 3, since you are measuring distance on a 4-dimensional scale.
Thank you, very useful video and clear explanation.
If we have more than one independent latent variable, mediators, and a final dependent variable, how would we place them in the 'Independent' and 'Dependent' variable list boxes while doing this test? Thanks. Waiting for a response.
Hello, did you find the answer to this question?
Thanks for the video! Can you use Mahalanobis distance if your IVs are categorical in a mixed anova?
Thanks, very helpful... I understand why you used 3 as the DF; however, please explain what chi-square is.
He does not reply Dr. Nawaz :)
I really appreciate the effort you are putting into making concepts easy to understand. Do you have a video on transforming a multivariate outlier into a dummy variable?
Thank you! Very helpful.
Hello Dr. Grande. Thanks for your helpful demonstration. Can items be used to calculate Mahalanobis distances? Or should I use variables. I mean composite variables.
What do I do if there’s a missing value in one of the scales when I use pair wise deletion for my regression. In that case the mahalanobis distance returns a missing value as well.
Thank you for this useful video. I have a question, please: what should I do with ordinal variables when checking for these outliers? Which method is the appropriate one, Mahalanobis distance or Cook's distance?
Does it make sense to apply this method when my data consists only of ordinal variables and not continuous ones?
Dr Grande, 1 question. Frequently, the examples of identifying and handling outliers are about independent variables. Does that mean we don't have to identify and handle outliers in the dependent variables? TQ!
Hi Dr. Todd. Thanks for sharing your knowledge. May I ask you something? Can I use Mahalanobis distance to identify multivariate outliers with ordinal data?
Sir, if I used a Likert scale, should the DV be the mean score of all the items and the IVs be the scores of each item?
Thank you so much. I followed your video and did the test with my sample. But I have a question about the degrees of freedom: why did you set it to 3?
Dear Dr. Todd, what should we do when we have a model with many IVs, two mediators and two DVs? Should we treat the mediators as IVs?
Thank you Dr. Todd for these useful lessons. Could you please give the reference for the formula you wrote in Compute Variable?
1-..chi(mahalanobis, df).
Thanks alot
rashid said he is not replying to actual questions. That’s sad
@@nahk-lx2tn hi wasim, I found the reference (hair, 2014)
Thank you for this very helpful video. Can this method be applied to lognormal datasets? Is it necessary for the data to follow a normal distribution?
Hi Dr. Grande, can we include the outcome variable (DV) with the IV in the mahalanobis distance analysis?
Hi Todd. Thanks for the video. Just one question: is your main independent variable program? Shouldn't we compare the MAH_1 values for samples that received treatment and ones that didn't? The thing is, the main independent variable is not used in the analysis, so whatever value I put there, the results (removed cases) will be the same. Regards
Dear Todd, good explanation. Are the outliers we get here similar to the residual (actual minus predicted) outliers that are removed from the data?
Thank you so much for your great explanation! It really helped me a lot!
I'm glad you found the video useful. Thanks for watching.
Dr.Grande, I have two independent variables and three mediation variables of one dependent variable. Question is should I put five variables(independent plus mediation variables) to figure out outliers or should I put only two independent variables? Thanks.
Would you include control variables as predictors in the regression? And if you're testing a latent interaction in MPlus, do you simply input the observed variables into the regression in SPSS to do this test?
Hi Dr. Grande,
I've got 7 constructs (3 IVs, 3 intermediary, and 1 DV), each with multiple items. How do you recommend handling these when searching for D2? I also need to test for multivariate normality using the Wald statistic on the same data set. Do you have a video on that?
hello Dr.Todd. Your video helped a lot and the steps are easily understood. but i seemed to have too many outliers , i would like to have your contact to ask you regarding this matter.
Great video. Could you recommend a method to identify outliers in an RM ANOVA set up? I am looking for something similar to INFLUENCE option in SAS MIXED procedure but for SPSS
+Todd Grande - could I check the degrees of freedom for the Chi-Square distribution is n or n-1. e.g. for 20 variables, is the d.f. 20 or 19?
Dr. Grande, thank you kindly. Awesome demonstration. The question I have relates to the number of independent variables (IVs) chosen when calculating a Mahalanobis distance (MD). Should the particular IVs chosen be related in some way, or can you throw all of your numeric variables into the one regression when attempting to find multivariate outliers using MD?
Thank You!! I more understand after watching your video =)
Dear Dr. Grande, my variables are measured on a Likert scale... how should the Mahalanobis test be applied in this case? Thanks
What if one of your variables was excluded by SPSS when calculating Mahalanobis distance? Are the degrees of freedom the same, or would you subtract one? For example, 5 variables entered but one was excluded. Would the degrees of freedom be 5 or 4?
Hi, I have one question on the method. Hope you can help me :). In your data set, you have your manipulation, described as "program", and then you said that you have your independents named "functioning, severity, motivation". Why do additional metric independents exist in your file? In my data set I have 2 independents, but they are on a nominal scale. So, what do I put in the text box of the linear regression where it says "independents"? Thanks in advance!!
So what happens when you follow these instructions and every number you get for the MAH_1 is the same?
Great video - thanks for posting!
You're welcome - thanks for watching.
+Todd Grande - Thanks a lot for the great explanation. Could you please help me understand: 1. Will this process work even if we have missing values? 2. Why do we use 'n' as the DF and not n-1? Thanks a lot for the help.
Thank you so much it was a lifesaver!
So I guess Mahalanobis can't be calculated when you have missing values somewhere in the items. Is there any other way I can include those cases?
Great explanation! Thank you so much!
You're welcome!
Awesome... Thank you very much
thats great. thank you so much sir
hi, are there any changes to be made if I find negative values for the probability_MD?
Hi. Is it possible to use it in non-linear multivariate as well? Thanks
Perfect !!! Thank you!!!
Thanks Dear Todd. great !
Exactly what i need! Thank you!
Mr Todd, I have 4 variables: AsliG, AsliB, GreenBP, and BlueBP. I want to know the value of the Mahalanobis distance between (AsliG, AsliB) and (GreenBP, BlueBP). Can I calculate it using Mahalanobis distance in SPSS? Why can't I enter 2 variables as dependent and the other two as independent in SPSS?
Hi, does anyone know why it won't give me output when I add two variables to the 'Independents' box? I can get output for each of them separately but cannot get one output for both.
Can we apply ANOVA to a factor with 2 levels? I have seen a lot of studies using 2 levels and testing them with ANOVA. Secondly, I don't know how they check normality when they use a single-item Likert scale. Could you please explain this concept to me?
This was very helpful, how do we contact you. Cause I have an urgent problem
thanks for this amazing video!
Prof, I have 3 dependent variables and 6 independent variables. How do I get the Mahalanobis distance?
I have a sample of 300 and 2 predictors. What would the Mahalanobis distance be for it?
thanks for this
Best vid ever
Excellent!
For a CLP analysis (two-wave longitudinal design), do I use the dependent variable at time 1 or time 2? Sorry, but I am confused about whether to compute the Mahalanobis distance for the regression term in my CLP model: dependent variable (t2) regressed on dependent variable (t1), independent variable (t1), and moderator (t1). Or would it not matter if I computed it for a simple regression at time 1: Y1 regressed on X1 and M1? (And what about my second independent variable? Should I put it into the regression for time point 1, too?)
I would be very glad if anybody could help me with this confusion !! :D
What if you get zero as a result for one of the rows?
Why have you selected the p-value to be 0.001?
many thanks 🧡
Thanks Alot. May.Allah bless you
What if we have 5 dependent variables and just one independent?
If we have 5 IVs, what are the degrees of freedom?
is this related to MCD?
Thank you dr