Fantastic - and brief - explanation of the essentials. Thanks Pat! Have shared this with colleagues many times.
Thank you so much for your video series! You are fantastic and explain concepts very clearly! I will be sharing these videos with my students! :)
Just finished and understood this playlist, more to go, thank you very much.
Thank you very much!! Loved your series. It was different from all the others and very interesting, and I learnt many things.
Your lecture is very beneficial and easy to understand, especially the way you explained the moment conditions. Thank you very much for the video.
thanks a lot, excellent series of videos on SEM!!!!!!
Very well explained, thank you!
Dear Patrick. Thank you very much for all your enormously helpful videos on SEM!:) I used Likert Scales to measure the latent constructs of my research. As expected, the assumption of multivariate normality did not hold. Would the DWLS (diagonally weighted least squares) estimator be the correct approach here or is the ML with robust standard errors the better variant? I get different results depending on which estimator I use. Thank you in advance.
Hi Mary -- thanks for your nice note. Briefly, there's no crystal clear response to your question. In a strict sense, a Likert scale would require the use of a nonlinear estimator -- either DWLS or FIML with discrete items. These methods are very well developed and work quite well. However, this also introduces complications into the analysis that can lead to problems in model convergence and stability. Some recent simulation findings have suggested that Likert scales with at least five levels can be used with robust ML -- I pasted an exemplar simulation on this topic below. My personal feeling is that if you have at least five levels and have a reasonable distribution across the five options (so not with, say, 90% all endorsing the first value and the remaining 10% spread out over the other four), then robust ML is a good option. If you don't have a reasonable distribution across levels or have fewer than five, then DWLS or FIML is necessary. Hope this helps -- good luck with your work -- patrick
Rhemtulla, M., Brosseau-Liard, P. É., & Savalei, V. (2012). When can categorical variables be treated as continuous? A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions. Psychological Methods, 17(3), 354-373.
@centerstat Thank you very much for the fast and detailed explanation, that helps a lot! :) Best, Mary
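The rule of thumb in the reply above (at least five levels, and responses not piled up on one option) can be screened per item before choosing an estimator. The following is only a minimal sketch: the function name `likert_screen` and the `min_levels` and `max_share` defaults are assumptions for illustration, taken loosely from the "five levels" and "90% on the first value" remarks, and are not published cutoffs. It also counts the levels actually observed in the data, which may be fewer than the levels printed on the questionnaire.

```python
from collections import Counter


def likert_screen(responses, min_levels=5, max_share=0.90):
    """Screen one Likert item against the informal rule of thumb.

    min_levels and max_share are illustrative assumptions, not fixed cutoffs.
    Returns the share of responses per observed level and a suggested estimator.
    """
    counts = Counter(responses)
    n = len(responses)
    # Proportion of respondents endorsing each observed level
    shares = {level: count / n for level, count in sorted(counts.items())}
    enough_levels = len(counts) >= min_levels
    # "Reasonable distribution": no single option absorbs max_share or more
    not_piled_up = max(shares.values()) < max_share
    if enough_levels and not_piled_up:
        suggestion = "robust ML"
    else:
        suggestion = "DWLS or categorical FIML"
    return shares, suggestion


# Hypothetical items, each with 100 respondents
balanced = [1] * 10 + [2] * 20 + [3] * 40 + [4] * 20 + [5] * 10
piled_up = [1] * 90 + [2] * 4 + [3] * 3 + [4] * 2 + [5] * 1

print(likert_screen(balanced)[1])
print(likert_screen(piled_up)[1])
```

A screen like this only flags items for a closer look; the estimator decision still depends on the full model and on how much the results differ between estimators.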
Hello! Thanks for your videos! I can see there are advantages to using latent variables compared to means. I'd like to ask whether it's important to test for common method bias in SEM.
Thanks for your question. In my opinion, it is always important to consider the potential for method bias in any SEM. The trick is having data available to actually empirically evaluate that hypothesis. If you're interested, there's been some great work done on multi-trait, multi-method models in SEM. The literature is vast, but a couple of readings are below. Good luck with your work -- patrick
Koch, T., Eid, M., & Lochner, K. (2018). Multitrait‐Multimethod‐Analysis: The Psychometric Foundation of CFA‐MTMM Models. The Wiley Handbook of Psychometric Testing: A Multidisciplinary Reference on Survey, Scale and Test Development, 781-846.
Castro-Schilo, L., Grimm, K. J., & Widaman, K. F. (2016). Augmenting the correlated trait-correlated method model for multitrait-multimethod data. Structural Equation Modeling: A Multidisciplinary Journal, 23(6), 798-818.
@centerstat Dear Patrick, thank you for your reply!
Thank you, Prof. Curran. But I still haven't figured out the difference between the full-information discrete methods you mentioned and logistic regression. Does the difference lie in the parameter estimation method? You mentioned earlier that maximum likelihood is not appropriate because the sample distribution is highly skewed. Then how do you estimate the parameters given the same link function?
Thanks for the question -- actually, full information ML and logistic regression are exactly the same if you have just a single dependent variable. So imagine you had a set of predictors and a single binary outcome -- you could equivalently estimate that using FIML as an SEM or ML as a logistic regression model; they are identical. But when you start to expand the SEM -- say with mediation or latent factors -- then you can no longer use the logistic regression model. But even in the expanded SEM, you are still using a logit link function with a binomial response distribution, the same as in logistic regression. Hope this helps -- patrick
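To make the "logit link with a binomial response distribution" concrete, here is a minimal sketch of maximum likelihood for a single binary outcome and one predictor, fitted by Newton-Raphson in plain Python. It shows only the logistic regression side of the equivalence described above; it does not run any SEM software, and the function names and toy data are assumptions for illustration.

```python
import math


def log_likelihood(b0, b1, x, y):
    """Bernoulli log-likelihood under a logit link."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit_logistic(x, y, iterations=25):
    """ML estimates (intercept, slope) via Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            # Score (gradient of the log-likelihood)
            g0 += yi - p
            g1 += (yi - p) * xi
            # Information matrix (negative Hessian)
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system information * step = score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: binary predictor, binary outcome
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(x, y)
print(b0, b1, log_likelihood(b0, b1, x, y))
```

With a binary predictor the estimates can be checked by hand: the intercept is the log-odds of the outcome when x = 0, and the slope is the difference in log-odds between the two groups.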
Thanks for your video! But can I ask a question? When building time series factor analysis (TSFA) or structural equation models (SEM) using time series data, the i.i.d. and normality conditions are obviously violated. In those cases, does the Maximum Likelihood (ML) estimator still exhibit beneficial properties, such as asymptotic unbiasedness, asymptotic efficiency, and asymptotic normality? Additionally, are Z-tests and chi-square tests appropriate for application in this context, and if so, why? Thanks again.
There are several ways to allow for non-independence. For panel data, a common approach is to represent each repeated measure as a manifest variable. Then, the correlations among the repeated measures can be directly modeled, for instance, via a latent curve model. For clustered data, there is a multilevel SEM that models both the within- and between-groups covariance matrices. And -- most relevant for you -- for time series data (many observations per unit), there are extensions to the usual SEM to allow for non-independence within the framework known as Dynamic Structural Equation Modeling. You may want to look into this approach for your situation.
@centerstat Thanks a lot
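As a small illustration of the non-independence discussed in this thread, the sketch below simulates a first-order autoregressive series and compares its lag-1 autocorrelation with that of an independent series. This is not Dynamic Structural Equation Modeling; it only shows the kind of serial dependence such models are meant to handle. The AR(1) process, the coefficient of 0.8, and the function names are all assumptions chosen for the example.

```python
import random


def simulate_ar1(n, phi, seed=0):
    """Simulate y[t] = phi * y[t-1] + e[t] with standard normal e[t]."""
    rng = random.Random(seed)
    values = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        values.append(phi * values[-1] + rng.gauss(0.0, 1.0))
    return values


def lag1_autocorrelation(series):
    """Sample correlation between each observation and the one before it."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


dependent = simulate_ar1(5000, phi=0.8)    # serially dependent observations
independent = simulate_ar1(5000, phi=0.0)  # phi = 0 gives independent draws

print(round(lag1_autocorrelation(dependent), 2))
print(round(lag1_autocorrelation(independent), 2))
```

A lag-1 autocorrelation far from zero is one simple sign that consecutive observations cannot be treated as independent draws.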
Hello,
I have a question about SEM. Can a model contain research constructs (latent variables) whose measurements come from different populations? For example, our research hypotheses state that latent variable X influences latent variable Y. However, measuring X requires a questionnaire given to population X' (students, for example), while measuring Y requires a survey of population Y' (teachers). My question is: can the model contain latent variables whose indicators are collected from different populations?
I ask because in the scientific articles I have consulted, a single questionnaire is always distributed to respondents to measure all latent variables.
Thank you.
@centerstat Thank you very much.