How to Use SPSS-Replacing Missing Data Using Multiple Imputation (Regression Method)

Biostatistics Resource Channel

Просмотров 344 тыс.

Добавить в
- Мой плейлист
- Посмотреть позже
Поделиться

HTML-код

Размер видео:

Показать панель управления

Автовоспроизведение

Автоповтор

Опубликовано: 18 янв 2025

Комментарии • 133

@Flaya12 10 лет назад ⁺¹²
A simple thank you might not be appropriate for this great work you did and shared with the public and by doing this with me.
So I want to tell you, if you ever feel down and or even feel worthless, remember that somewhere in Austria you made someone really happy by doing this tutorial!!! Thanks a lot.
At first I thought it might be a bit long but it was worth every second and you did a really good job.
@stephaniesmith6047 11 лет назад ⁺²
This was a very informative video. I am currently examining some longitudinal data and of course there is a significant amount of attrition. I initially ran a regression analysis using exclude cases listwise but I didn't feel this was the best way to analyze the data. This technique definitely helps address some of those issues. Thank you so much for posting this!
@TheUgly0duckling 11 лет назад ⁺¹¹
Thank you! Saved me and my thesis.
@janecooper358 11 лет назад
Thanks so much for your reply - sorry you misunderstood me, I've got 570 participants so I'll do a EM and see how I go. Thanks again, and thanks for doing the videos - I've just started my PhD and I'm sure I'll be tuning in quite a bit!
@duallumni369 10 лет назад
The video explains the concept in such a easy to follow steps. A great video for multiple imputation technique.
@2Luhna7 11 лет назад ⁺³
hello! thank you so much for the video.
I have a question however. From what I understood you dont get one single databased with missing values replaced; you should work with the pooled results. So, my question is if there is any way to crate a new single database to import to other programs (for instance mplus or lisrel) and work on. I need to do that for CFA on my data...
@singularity00001 10 лет назад
Excellent work you did here! Thank you.
@yad-c3662 11 лет назад
Thanks for doing this! Very clear and helpful
@jessicabarton93 10 лет назад ⁺⁸
what happens if your data is missing not at random? I did the lIttle's test and it was significant. I can't figure out which MI to do in that case
@gregl4740 9 лет назад ⁺¹⁶
Thank you for the tutorial. I just ran this on my dataset successfully. However, I was wondering if there is a way to obtain pooled means and 95% CI's across iterations. For inferential analyses (e.g., correlation), I am able to obtain the pooled statistics. However, when I use Analyze -> Descriptive Statistics -> Explore, it will only give me the descriptive for the original data and each iteration *individually*. Is there a way to obtain the pooled descriptive for variables? Also, is there a way for SPSS to generate a dataset that only contains the imputed data after the final iteration?
Thanks!
@Sspecial_KK 11 лет назад ⁺¹
I saw earlier on your comment what to do on this issue, but I was not able to set min or max value. However, I found out that you can adjust the parameter in the syntax. It did worked out as I saw all the imputed value on the output. Unfortunately, on the data view tab I couldn't see any imputed variable, nor the upper right option to switch different data files. So, what went wrong?
Could you help me out? Thanks in advance!
@sameeral-abdi6870 11 лет назад
The observed discrepancy was because some cases had missing values in all the three included variables for multiple imputations. The problem resolved by adding variable(s) in which these cases had values.
@Zanthias 11 лет назад ⁺²
Hello,
Thank you very much for showing this video. My question is once you get all the five imputed values, is there any rule of thumb as to which of the five you should use for your analysis? Also, I realize that in your t-test example, the pooled values did not have standard deviations. How about if you want to report Std Deviations in your study? If you can kindly let me know, I will appreciate this. How about if I want to create composite, which one of the 5 imputed values should I use? Tx
@seanicusvideo 10 лет назад
this is helpful. the use and purpose of the extra imputation history file might be better elaborated. was very nice to include some references! thanks!
@chetanm12 11 лет назад ⁺¹
First off, thank you so much for posting this video...it was very well made and I look forward to exploring other videos you have. As a follow up question to enemenoff's question...what are the differences for MI for random vs. non-random patterns? Did I miss that part in the video? Do you have a source I could visit? Thank you in advance!
@eligardner5436 9 лет назад ⁺³
hi, thank you for very helpful video. I followed all the steps but my output after running my first ANOVA, only showed the 5 imputations, not the pooled figures. how do I get the pooled figures?
@sabrinadickey2205 11 лет назад ⁺¹
Hello, the video was very helpful. I have a question regarding the use of the iterations. I had 5 iterations and the pooled iteration was not significant p > .05, but I noticed some of the others were significant. Do you ever use one of the iterations or are you only supposed to use the pooled results?
@sameeral-abdi6870 11 лет назад
Thanks for this wonderful demonstration. I am facing a problem when I run this test. Number of missing values entered in the multiple imputation analysis was less than number of missing values across all the variables with missing data. Subsequently, completed data after imputation were less than my original data (valid plus missing). So, how can I fix this problem?
@masumarahim 11 лет назад ⁺⁵
this video was very useful; thank you. however, even when splitting the file by imputation, i cannot get pooled analyses. spss will perform the analysis for the original data and each of the five imputations but will then only give me the means and standard deviations for the pooled data, not, for example, chi-square or t-test values; nor will it give me a p-value. why might this be?
@colanfrost3518 11 лет назад
In your video you said you could only use imputed data for the analyses that have a swirl on it. Is there any possibility to use imputed data with repeated measures analyses in SPSS and how might that work?
@georginamartin9337 9 лет назад ⁺¹
this is really awesome!
@janecooper358 11 лет назад
Hi - thanks for the video - it was really informative. Just a question though... my data has a small amount of missing data - 4 variables, 32 cases, 104 values with .5% missing data overall (104 cases). MCAR was non-significant p=.052 - just! I am running a CFA using AMOS therefore I cannot have missing data. Do you recommend conducting a EM or a Mutiple Imputation or neither? Plus how can I get AMOS to look at the pooled data when conducting the CFA? Thanks !!!!
@SamanthaBalemba 11 лет назад
Will it automatically use the pooled estimates even for more advanced later techniques, like SEM? I'm using AMOS to run my SEM, but I want to make sure the MI results will automatically get used for this (seeing as how it's an addon to SPSS). I was recently informed that you can't run a proper SEM if you have ANY missing data, so I wanted to make sure I fixed that problem...
@TokenFun105 11 лет назад
Thank you very much for this tutorial. However, I notice you did not mention Little's Missing at Random test. Should this not be done prior to all imputation methods? Or is it sufficient to look at the Missing Pattern Values Graph? many thanks
@jamie10157 11 лет назад
Hi thanks for the video it's really useful! Can I just check what exactly the y axis represents when you are looking at the patterns in the diagrame with grey and red squares?
@SamanthaBalemba 11 лет назад
Is there a cut-off for using this method in terms of the percentage of cases missing for specific variables? All of my var's are missing
@shirleyanynameiwant5883 10 лет назад ⁺¹
Thanks for this great video. I found it easy to understand. I now have a data file containing multiple imputations (5 imputations). My question is when reporting the univariate statistics and normality statistics, which results should I report given I have results from the original data set and results from the 5 separate imputations? Thank you in advance.
@carilynne1 10 лет назад ⁺²
Thanks for this video--your RUclips channel is saving my life!
My question is similar to ones asked previously but I could not make sense of the reply about merging data in SPSS.
I have completed multiple imputation for missing data (went great!) but I want to move this dataset into listrel for structural equation modelling. How can I get a single data set with the pooled information, rather than having the individual datasets for the imputation displayed and then SPSS pooling them during any further analysis?
Thanks, Carilynne
@chavianddavid 11 лет назад
The raw data was a dummy variable regression so there are only 1 and 0. Also, the experimental design was such that each respondent had their own design where they saw either all or just a subset of the variables. So I am looking to fill in the coefficients for the variables they did not see.
@marina7181 10 лет назад
great video!!thank you!
@Jbalisasa 9 лет назад ⁺⁵
This is a great presentation. I really enjoyed it. Unfortunately for me, as I tried to follow it to impute my missing data, I keep receiving a warning which says that the imputation model for some variable contains more than 100 parameters. Below is an example of such warnings: "An iteration history output dataset is requested, but cannot be written.
The imputation model for SYNC2 contains more than 100 parameters. No missing values will be imputed. Reducing the number of effects in the imputation model, by merging sparse categories of categorical variables, changing the measurement level of ordinal variables to scale, removing two-way interactions, or specifying constraints on the roles of some variables, may resolve the problem. Alternatively increase the maximum number of parameters allowed on the MAXMODELPARAM keyword of the IMPUTE subcommand.
Execution of this command stops."
This is repeated for quite a number of variables. Can someone help me understand how to hand this trouble? Thank you. Juvenal Balisasa
@tacappaert 9 лет назад
Juvenal Balisasa This likely happening because there is data that SPSS cannot categorize or falls outside of the expected range that you specified. Be sure all categorical data has a coding value and be sure there are no numeric values that are out of the specific range.
@iskyrisky1969 9 лет назад
+TheRMUoHP Biostatistics Resource Channel
I have the same problem.
@iskyrisky1969 9 лет назад
+Juvenal Balisasa
Have u solved your problem. I dpo have the same problem
@AldoAguirreC 9 лет назад ⁺¹
How are degrees of freedom reported after a t-test is performed using multiple imputation? I see that the number of df for the pooled data can be in the thousands, and it does not feel right to report such a high number when the N = 50 for example. Any advice or paper or paper that discusses this issue? Thanks!
@kimconsultants 11 лет назад
So you can impute data only for the variables where > 5% of data are missing? Or, if you impute for one you must impute for all variables that have any missing data? I ask because I have many variables and SPSS doesn't seem to be able to handle all of them at once. This means I have to create multiple imputed data sets and I'm not sure how to combine them all.
@mrflowers1234 9 лет назад ⁺²
When writing a manuscript for a trial that has used multiple imputation to address missing data, what additional reporting should I include? Data pre and post imputation? Anything else?
@kimconsultants 11 лет назад ⁺¹
I have a large number of variables and SPSS does not seem to be able to do the imputation with all the variables at once. So, I did groups of variable separately. However, I get multiple imputed data files. How do you recommend combining the data files?
@sylviaherbozo5811 11 лет назад
Thanks, I was able to get it to work. But I had another question. After running my analyses (t-tests and chi-squares) with the imputed data, I noticed that the sample sizes for each variable on the output are still uneven which normally means some cases weren't used due to missing values. Are these sample sizes supposed to still be uneven? And I just report the total sample size?
@lirpatex 11 лет назад
I would like to do a MANOVA using my imputed dataset, however, when I run the analysis, there isn't any pooled output. Is it okay to report the output from the 5th imputation? Thank you for your great video and help.
@Zisis21r 11 лет назад ⁺¹
thanks very helpful! I have a question - under the Analyse-ImputeMissingData- Constraints tab on the lower "define constraints" variables, SPSS won`t allow me to set Min and Max values for my variables - and I notice the table rows are coloured blue and not white as in the tutorial - could anybody help me work out how I can define my min-max variables?
@missyp017 11 лет назад
paul, did you figure this out? i need to do the same thing...
@seanicusvideo 10 лет назад ⁺³
Question: if results and parameters are "pooled" (and not averaged) what is the specific calculation? e.g. for bivariate correlations, or linear regression outputs, for example?
@katy8791 3 года назад
Hello ! Nice video! Any idea about how to calculate the reliability of a questionnaire in spss if we have missing data in some questions? Is it using the usual cronbach alpha?? And what should be putten in the cells of the missing data in spss?
@chrislittle9839 11 лет назад
For a scale score, would you calculate the aggregated variable from the pooled imputation iterations?
@onlyificanloveyou 11 лет назад
Thank you for making this great video!
I have actually done multiple imputation in Mplus and it generated 10 imputed datasets (all were .dat files). Is there a way to read these files as imputed data sets in SPSS? I need to do matched-pair t-tests by using these values. My stats consultant suggested that I ask SPSS to read these 10 imputed datasets individually, do 10 t-tests, and then average the t-value. However, I like how SPSS pooled the datasets first. Thank you!
@TokenFun105 11 лет назад
Would you use the same process to determine mean and stand deviation of 'pooled data'. I would imagine you could use these estimates to standardise all variables and then re-run the regression on those to obtain the standardised regression coefficients (that SPSS also does not provide)?
@yaldaamirkiaie5303 11 лет назад
Hi Thanks for The video, It is very helpful! A question that I have so far by just watching the video is that when applying "Constrains" min 27:19 there are 2 other options saying "maximum case draws" and "maximum parameters draws". could you please let me know what are those?
@ruskamihkas9723 10 лет назад ⁺²
Not sure if you mentioned these, but I didn't succeed until I changed my missing value codes from 99 to blank (.) and changed ordinal variables into scales. Otherwise it wouldn't do the imputations and didn't even let me specify constrains.
I had a Likert-scale of 1-5
@tacappaert 10 лет назад ⁺¹
Glad, that worked. That is a pretty common issue.
@iskyrisky1969 9 лет назад
+TheRMUoHP Biostatistics Resource Channel
I have to change my nominal to scale
@Sspecial_KK 11 лет назад
I got the same problem, but I managed to run the multiple imputation by adjusting the maxmodelparam (in syntax), cause I was unable to change min and max values. However, I did not see the imputed variable in the data view table. Yet I did see the results of the imputed values in the output file.
How do I get to see the imputed variable in the data view. Thanks in advance
@MoCowbell 11 лет назад
I have missing item level data (from a scale with some missing items) and variable level missing data. Should I first impute the missing items so that everyone has a score on the variable with the items or should I just ignore the fact that I have items and just estimate the missing variable that is composed of the items? Thanks!
@guanlin6123 10 лет назад
Thanks for the helpful video. if we need to remove outliers, removing outliers should run before or after imputing data? if removing outliers should run after imputing data, I wonder how to do that when we have 5 inputted data.
@tacappaert 10 лет назад
Look for and remove outliers before imputation. These videos may help:
How to Use SPSS: Identifying Outliers
How to Use SPSS:Dealing with Outliers
@chavianddavid 11 лет назад
What if I have a conjoint study where I have 36 variables and 300 respondents but each respondent only saw a subset of the 36. So I now have a table where each row is a respondent with a constant and then coefficients for only 25 (or more) of the 35 variables. What would be the approach for replacing the missing values (i.e. the missing coefficients for those variables for that specific respondent)?
@barakatunnisakmohdyusof9938 9 лет назад
thank you. Great presentation. I have one question. Does the imputation can only be focused on primary outcomes?
@benjaminucr6636 11 лет назад
I have run an analysis like the one shown in SPSS 19, and it didnt provide in the output neither the pooled results nor the fraction of missing information. Under "Edition-->options-->multiple imputation" the option "results for imputed and observed sata" is choosen. Any idea about how can I make to get the pooled results and the fraction of missing information in my output?
@jackcannon3359 10 лет назад
This is brilliant! Thanks for posting.... What is the minimum total number of observations (including missing obs) that this technique will work with? I have a dataset with 18 observations from 10 cases (should have 180 points in total) and I am missing 10 data points... Would multiple imputation be appropriate for this two-way repeated measures design? Thanks.
@tacappaert 10 лет назад
That should work fine. I don't know that there is a minimum per se, but as long as your missing data is not the majority of the possible observations, it should work for you.
@rhissarobinson970 10 лет назад ⁺¹
Hello, I really appreciate you sharing this video. It has helped me tremendously to figure out how to understand and implement this method for my data. Would it be possible for you to share the syntax? For some reason, my output for percentage missing (the first output you show us) does not show the mean and standard deviation of the variables in my output. I'm sure it's a just a line I missed in the syntax. Thank you!
@sylviaherbozo5811 11 лет назад
I keep getting a warrning message such as "The imputation model for EDEQ14.1 contains more than 100 parameters. No missing values will be imputed..." Any advice on how to resolve this problem? I tried changing the measurement level but it didn't help. I wasn't sure how to do one of the other suggestions including: Reducing the number of effects in the imputation model, removing two-way interactions, or specifying constraints on the roles of some variables
@skincare2010 11 лет назад
hello, I have a few issues with my dataset: first of all, my dependent variable has a wooping 20% missing values (the question is rather sensitive, so I am considering running two models, one that uses this variable and another that uses a similar one, asked in a different way). Is this ok? Also, many of my variables are categorical or nominal (yes/no, agree/disagree etc). Can I still use this imputation method or is it just for numerical variables? Thanks.
@alipolat5393 11 лет назад
I have MNAR type data with sometime 60 percent missing. What I understand is that if my data is NOT random and if I choose automatic from imputation method tab than SPSS will take care of the non randomness problem of data. Is that correct.
@sabrinadickey2205 10 лет назад
Great video, and very easy to understand! If I wanted to remove multiple imputation from a data set is it possible and how would it be done?
Thank you!
@tacappaert 10 лет назад ⁺²
Do you mean reversing the process, so that the missing values become missing again? I don't know if that is possible but as long as you saved the original data set, you can always revert to that.
@yoox0047 11 лет назад
When you run a hierarchical regression with MI dataset, the output does not provide R, R2, adjusted R square, and F value of pooled imputation (it only provides the calculations for the original and each imputed dataset). It also doesn’t provide beta (standardized coefficients) of pooled imputation (there were only unstandardized coefficients: B and Std. Error) either. However, given these are typical calculations reported in our results, how do we obtain these information from the pooled data?
@Bardiyaz 10 лет назад
Hi thanks for this helpful video.
I have two questions:
1- do we have to include non missing variables in order to get a better prediction for the missing variables?
2- I need to do a propensity score matching after doing multiple imputation on the dataset with generated data, so I actually need a "pooled only" dataset which is the average of all as you said. is there a way to save the pooled only dataset or do I have to calculate the average for each variable and save it separately?
thank you,
@tacappaert 10 лет назад
1. Yes, you should use as many variables as you can to improve the estimation of the missing values.
2. To the best of my knowledge you will have to calculate the mean for each variable.
@ilmamufidah6272 10 лет назад ⁺¹
then, which imputed value is to be used? the fifth one? or we have to avegare all 5 inputted data? it will be exhausting right?
I also have other question.
my data are mostly ordinal (likert scale). But when I tried to run the multiple imputation, the imputed values were beyond the allowable range, some of inputted values were negative, some others were not integer. When I changed the "measure" to "scale" instead of ordinal, then I set the max and min range as well as the rounding, I could get much more beautiful values. Was my approach right?
The last question is the same with anastemi. But then I tried to solve it by specifying the role of each variable and it worked. The problem was, I actually didn't know the role of each variable. I just predicted what the role might be (it was actually my hypothesis for a model I tried to investigate). What do you think? I am afraid that my approach is wrong so the inputted value is not valid or something like that
@tacappaert 10 лет назад
There should be a pooled data value that you can use that aggregates all the imputed attempts.
In regards to the ordinal data that was the correct approach.
@ilmamufidah6272 10 лет назад
***** Is it ok if I just average it? Since I heard that calculating the average is the simplest way to get the pooled data. Where can I found the pooled data?i didn't find any of it
@tacappaert 10 лет назад
Ilma Mufidah
The pooled data should be found in the output as demonstrated in the video. Be sure the data is categorized properly in the Variable View. Be sure it is set up as "Scale/Numeric" data.
@TokenFun105 11 лет назад
Great thanks! I never trust my 'subjective judgement', so I like to rely on both :)
@sivabalaji30 9 лет назад
I have an query…..I am currently working on SPSS on a survey data…..It Contains Many Missing value’s……… Its is Not a Random sample(MNAR)…..what method should I use to replace Missing data ?
@suzanneveger7148 9 лет назад
I have a question. I have imputed the data, and I want to conduct an anova test. In order to interpret the data, how do I need to read this ANOVA table? There is the origin solution, and 5 other solutions. However, I do not find a pooled solution.
What do I need to do here?
@tacappaert 9 лет назад
+Suzanne Veger Unfortunately, not all inferential techniques don't pool the result such as we saw in the t-test example.
@efi225 10 лет назад
thank you so much for this helpful video! I run multiple imputation on my data, but I would like to ask you, where are the pooled values you mentioned? I can only see the values in each imputation.
Also, I have used many different questionnaires in my research,do you think it's better to run multiple imputation to each questionnaire separately?and some of them are multidimensional,does this affect multiple imputation?maybe I should run multiple imputation on the items of each factor separately?
@tacappaert 10 лет назад
The pooled data should be found in the output as demonstrated in the video. Be sure the data is categorized properly in the Variable View. Be sure it is set up as "Scale/Numeric" data.
@tacappaert 10 лет назад
I would run separate imputations for each questionnaire if they are measuring different constructs.
@johannahedlund3708 10 лет назад
I don't understand what values to use. I want a full table instead of the original one that has gaps where the missing data is. Can I use the New Imputation table? But what imputation do I use? Or do I fill in all the gaps with missing data with the same pooled mean from the t-test analysis?
@johannahedlund3708 10 лет назад
I also get an empty Group Statistics table when doing the t-test. The mean is put out as zero and I get a warning saying "No statistics are computed for a split file in the Independent Samples table. The split file is: Imputation number=... The Independent Samples table is not produced." What am I doing wrong?
@georginamartin9337 9 лет назад
I have a dataset that has some missing values represented only by a . and others that have it as a -1 or -9. When I do the imputation the . values are imputed but the assigned missing values remain the same. How do you rectify this?
@tacappaert 9 лет назад
+Georgina Martin If the -1 or -9 values are not actual outcome possibilities, then the values should be cleared and then you can run the imputation.
@georginamartin9337 9 лет назад
excellent thats what I did and it works!
@anastemi 10 лет назад
Hello, thank you for the video, it was very helpful. However when I ran multiple imputation on my data set, I got this message ' the imputation model for 'sex contains more than 100 parameters and no missing values will be added'. So a new data set was not created, Others have cited this as a problem, what should we do? I also have a large amount of missing data about 50%
@tacappaert 10 лет назад ⁺¹
Try this solution direct from IBM: www-304.ibm.com/support/docview.wss?uid=swg21482103
@khushbeensohi4364 10 лет назад
Can I use this method to replace missing data if my data is not normally distributed and hence, I use non-parametric methods?
@tacappaert 10 лет назад
Yes, you can.
@ertugrulsahn 10 лет назад
Do we estimate missing nominal and ordinal values too? If not what we can do for missing nominal and ordinal values (For example nominal: gender, ordinal: perceived income as categorized by low medium high)?
@tacappaert 10 лет назад ⁺¹
Yes, the procedure can estimate those values as well.
@ertugrulsahn 10 лет назад
***** thanks for your care
@annapease 10 лет назад
Hello. I am using longitudinal survey responses with a biased drop out - so there is a great big red patch at the bottom right of my missing value patterns graph! Can you tell me the best multiple imputation method to use? If I delete cases I am also biasing the dataset. I have analysed the raw data so I know what I'm comparing it with but I am struggling with the method of imputation. It's also saved across 5 different datasheets - I need to combine it into one, don't I?! Thanks!
@tacappaert 10 лет назад
I would use the regression method of imputation.
@carolineroth2710 9 лет назад
+Anna Pease I used the aggregate command to get all the pooled datasets back into one (I needed to do further imputations on my data, which I couldn't do (or couldn't figure out how to do) once I did one imputation. The thing to note with this, though, is that you won't be able to see which cases/variables have imputed data, like you can when they're not pooled. The syntax I used was this:
AGGREGATE
/OUTFILE='[location on my computer]\[newfilename.sav]'
/BREAK=[variable to break by, which for me was survey participant ID]
/[string variable 1]=FIRST([string variable 1])
/[string variable 2]=FIRST([string variable 2])
...
/[string variable x]=FIRST([string variable x])
/[imputed scale variable 1]=MEAN([imputed scale variable 1])
/[imputed scale variable 2]=MEAN([imputed scale variable 2])
...
/[imputed scale variable x]=MEAN([imputed scale variable x]).
@mandyruth9954 10 лет назад
Great video! I have run multiple imputation for 2 variables (missing categorical data for 12% of values), however, I notice after 5 iterations, I still have some missing values. Is this normal?
@tacappaert 10 лет назад
Not usually. Be sure that you designated those variables to be imputed.
@Ulli0664 10 лет назад
Thanks for the great video, helps a lot!
Ive 2 types of missing data in my dataset (working with a questionnaire which has several versions) and Ive coded 2 types of missing data:
-9 for actually missing (respondent didnt know / didnt want to give an answer)
-99 for n.a. (respondent didnt see this question and therefore couldnt answer it)
Therefore, I need to somehow exclude the -99 datapoints from the replacement.
Any idea how to do this?
Many, many thanks in advance!
@tacappaert 10 лет назад
You can exclude certain data points by using the Select Cases function and then run the analysis.
@mariakrista100 10 лет назад
What do you do after you get results from 5 imputations?
@tacappaert 10 лет назад
You use that data to replace the missing data points and then run your additional analyses (e.g. t-test).
@d3llikz 9 лет назад
Is this also possible for panel data?
@tacappaert 9 лет назад
+Morten Fjerritslev Can you explain what you mean by panel data?
@tacappaert 10 лет назад ⁺¹
+Shirley anynameIwant, you should report descriptive statistics for pre and post imputation.
@timw.5528 9 лет назад
I have a large number of variables in the imputation model (most of them are nominal) and I keep getting the same error message mentioned by Juvenal below, "...The imputation model for MODEL contains more than 100 parameters. No missing values will be imputed...." I checked all of the variables and they look fine (the nominal variables have values and the numeric variables are within the expected range. If I change one or more of the variables from nominal to scale it seems to work, but then it seems as though the imputations are not going to be accurate as they will be based on linear rather than logistic regression. Any suggestions?
@tacappaert 9 лет назад
+Tim Wadsworth Variables that are ordinal in scale should be categorized as Scale.
@nohadarwish3053 9 лет назад ⁺¹
+TheRMUoHP Biostatistics Resource Channel; Thank you so much for the helpful tutorial. I have this same Warning message every time I am trying to do multiple imputation "The imputation model for Q2_3_TO contains more than 100 parameters. No missing values will be imputed. Reducing the number of effects in the imputation model, by merging sparse categories of categorical variables, changing the measurement level of ordinal variables to scale, removing two-way interactions, or specifying constraints on the roles of some variables, may resolve the problem. Alternatively increase the maximum number of parameters allowed on the MAXMODELPARAM keyword of the IMPUTE subcommand"; although I have checked that all variables are either scale or nominal ones. I have something like 85 variables. What should I do?
@sputaccount6139 11 лет назад
is there a way to get a pooled R-squared value in multiple regression with MI data?
@fazlihaleem6603 9 лет назад
how would i know that the missing data is MR, MCR or missing systematic. if we do not have MR, MCR then in case of systemic missing do we have some solution
@tacappaert 9 лет назад
+Fazal Haleem If your data is missing systematically, then that typically means there is some response bias of some kind (e.g. questions asking for sensitive information or questions that are unclear). You should try and figure why that might be happening so you can address that as a possible validity issue. This technique that I demonstrate can be used with data that are missing systematically.
@jaishrik8691 11 лет назад
The missing variables in my data file have a value of '9'. How do I remove these dummy variables? Thank you.
@antimandril2281 10 лет назад
When I push Ok (pattern) the computer is blocked .- Do someone knows what happened ¿?
@Anika-ze1bh 10 лет назад
Hey,
thank you for the helpful tutorial. Still I have a huge problem with my imputation. After running, it imputes, values that are way to high or even negative. So I defined the range which leads to an error that says something like (mine is in german): "after 200 drawings spss can´t find the imputed value for the variable xxx with it´s defined contraints. Please check if the defined min and max is appropriate or choose a higher maximum case draws" So it stops the imputation.
Can you help me with that?
anika
@neema1506 10 лет назад
great
@tombailey4262 10 лет назад
Hi, thank you for this excellent video. My question appears to be a bit more basic than those below, but I was wondering whether there is any way to store the pooled data set in a separate file. You see I would like to use with an SPSS plug-in e.g. PROCESS, which I don't think will recognise the pooled values as SPSS did with the t-test above?
Regards
Tom
@tacappaert 10 лет назад
I don't know if that is possible. I would suggest contacting IBM SPSS technical support.
@sajatorunczyk6195 9 лет назад
Tom Bailey Tom, I am looking exactly for the same - a way to use PROCESS with data that have been imputed. Did you figure this out?
@tombailey4262 9 лет назад
saja torunczyk not very simplistically although I think you could do it in r and port it back in. One option (not as good) might be to use expectation maximisation in SPSS?
@haliltokay3689 10 лет назад
Thanks for some great videos.
I get a warning message that says after 100 draws, the imputation algorithm cannot find an imputed value under the constraints for variable [X]. This is strange, because the variable is just a 7-point likert-scale. All "I don't know" responses are coded as 999, and as missing values. So, I tried to change the MAXCASEDRAWS. After a few attempts, it accepted 1000000000. I know.
So, it ran the imputations. However, I was met with yet another Warning message. "Some missing values cannot be imputed because a factor in the model has a value that does not appear in the data used to build the model."
Does anyone have any good suggestions for how I can solve this problem?
Just FYI:
- My data is Missing Not at Random (MNAR)
- I have 55 variables
- Sample size of 317.
- Measurement scales: 7-point likert scale + 10-point evaluation scale.
I hope someone will be able to help ASAP.
Thank you.
Halil
@tacappaert 10 лет назад
I think you need to take a close look at the data codes you have used in the variables with missing data. For some reason SPSS cannot recognize those codes and cannot perform the imputation. If I read your post correctly you have coded both missing values and "I don't know" responses as "999". That might be the issue.
@haliltokay3689 10 лет назад
*****
Well, in the software I used for collecting data - all "I don't know" responses received the value 999 so they could easily be identified during data analysis. The, there are also some system-missing data which just do not have any value at all.
But you believe this may be the source of the problem? I have now tried to recode all variables so the only type of missing values is 999. After doing so, I still get the same kind of Warnings.
I have to change the MAXMODELPARAM =500 and the MAXCASEDRAWS=400000 , and still, SPSS does not want to impute the data properly. It says that 'some missing values cannot be imputed' AND 'after 800000 draws, the imputation algorithm cannot find an imputated value ...
So.. Any good ideas for how to solve this problem??
Btw. Thank you guys for such a quick response time!!!
@haliltokay3689 10 лет назад
*****
I also see that others have encountered a problem related to having ONE data-set with the pooled values.
When I run the imputations, SPSS creates a new data file with the original data, the 1-5 imputations - and then that's it. In your video, it is the same. Up on the upper-right-hand section of the screen you can choose between original and the five imputations. But there is not one called pooled data.
Is there any operations in SPSS you can do to have one data file with the pooled values WITHOUT the original data and the five imputations. I just want the pooled data for further analysis. How can I do that?
@tacappaert 10 лет назад
Halil Tokay Any missing values should have cells without a code. The cells should be empty.
@haliltokay3689 10 лет назад
*****
Thanks. That worked. Thank you so much!
Now, how do I transform my data so that I only have the pooled variables without all five imputations, and the original data set?
I just want one dataset without missing values. I do not want all the imputations. Only the pooled variables.
I need to do a PCA followed by regression analysis, so I need a dataset without missing values.
@ia1167 10 лет назад
Hello: first I would like to thank you for this awesome video! It is super clear and super well explained!
I have a question for you. Procedure: According to IBM once one runs MI, following the method of "Fully Conditional Specification" ( FCS; in the output SPSS tells you what method it used) one should verify for FCS convergence, that is, whether it was achieved or not. Problem: This is the part in which I am terribly stuck because I am getting a lot of flat lines in my chart when I test whether FCS convergence was achieved (please look at this link for more info about how to do this: pic.dhe.ibm.com/infocenter/spssstat/v22r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.cs%2Fspss%2Ftutorials%2Fmi_fcs-convergence_telco_howto.htm). So, when I looked at my iteration history, for every set of imputed data I am getting the same value no matter what is the number of the iteration. For instance: in my dataset #1 I get the same imputed value for the total score of a questionnaire from iteration 1 to 10000, and so on until the last imputed dataset (these values remain the same within datasets but are different accross datasets). Finally my question: Why do you think FCS convergence is not achieved in this case or why my values are not changing from iteration to iteration? I have been looking in internet what to do about this besides increasing the # of iterations but there is almost no info about it. Please, would you mind giving me your thoughts about this? I will be so grateful.
@tacappaert 10 лет назад
My guess is that the predicted values do not have any variance or have little ability to vary and your iterated values don't change. Generally, this is a good thing indicating that the predicted values are quite accurate being that they don't change between each iteration.
@kloveinn 10 лет назад
Thanks..good stuff but video is zooming stupidly at times...needs better vid editor
@stephaniesmith6047 11 лет назад
This was a very informative video. I am currently examining some longitudinal data and of course there is a significant amount of attrition. I initially ran a regression analysis using exclude cases listwise but I didn't feel this was the best way to compute the data. This technique definitely helps address some of those issues. Thank you so much for posting this!
@sameeral-abdi6870 11 лет назад
The observed discrepancy was because some cases had missing values in all the three included variables for multiple imputations. The problem resolved by adding variable(s) in which these cases had values.

Следующие

Автовоспроизведение