This will not be news to you, but is worth saying: of the several available RUclips courses on this topic, this one is by far the clearest, most informative, most helpful, most coherent, most interesting, and most rewarding. Bravo!
Your style of presentation and clarity make your lectures simple and easy to grasp.
Very good videos, as I read in another comment, it's nice when someone who really knows what he is talking about explains a topic like this.
Dr. Curran, you rock! I find your videos/lectures so clear, so easy to follow, and so informative. I wish I had come across your lectures earlier. Thank you so very much!
Thanks so much for your very kind words. I'm really glad you find them useful. My hope is to put up more episodes sometime soon. Good luck with your work....patrick
Thank you. It's so informative for first-timers in SEM.
Thank you for this great video! I am looking forward to your workshop next week:)
Thank you professor for sharing your invaluable knowledge with us. Very helpful videos.
This is life changing, thank you!
Lauren -- you're very sweet. Thanks for the kind words. Good luck with your work -- patrick
I freaking love this stuff!
Aaah, thank you for this! Excellent work and very interesting as well!
Brilliantly explained !! Thank you !!
A good explanation for a first-timer, Dr. Curran. I got lost trying to read about this in books and articles.
thank you, very informative and clear
Thank you very much for sharing! And I think if possible maybe you can list the recommended references in your video so that we can check them after watching it
Thank you very much! It helps me a lot!
Thank you Professor for the amazing video.
I have a question regarding factor scores and the subsequent path analysis. As you mentioned at 26:40, this procedure still carries the assumption of perfect measurement reliability. I didn't understand how SEM overcomes this problem. Is it by estimating these two processes simultaneously?
thanks for your kind words. You are correct: when you use multiple indicator latent factors in the full SEM, you are able to separate the true factor variance from the item-specific residual variance. As such, the latent factors are assumed to be "error free". I'm careful here in saying "assumed" because this only holds if you meet certain underlying assumptions of the model. But in many situations the latent factor is a marked improvement over simple scale scores. Thanks again for watching -- patrick
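For readers who want to see this contrast concretely, here is a minimal lavaan sketch in R. It uses lavaan's built-in HolzingerSwineford1939 data; the variable names (x1-x9), factor labels, and model are illustrative assumptions, not the lecture's own example.

```r
# A sketch, not the lecture's example: a full SEM with multiple-indicator
# latent factors, estimated in a single step.
library(lavaan)  # provides the built-in HolzingerSwineford1939 data

sem_model <- '
  # measurement model: each latent factor is defined by three indicators,
  # separating true factor variance from item-specific residual variance
  visual =~ x1 + x2 + x3
  speed  =~ x7 + x8 + x9
  # structural model: a regression among the latent (error-free) factors
  speed ~ visual
'
fit_sem <- sem(sem_model, data = HolzingerSwineford1939)
summary(fit_sem, standardized = TRUE)

# The two-step alternative: compute factor scores, then run a path model
# on the scores -- this implicitly treats the scores as perfectly reliable.
cfa_model <- 'visual =~ x1 + x2 + x3
              speed  =~ x7 + x8 + x9'
scores <- as.data.frame(lavPredict(cfa(cfa_model, data = HolzingerSwineford1939)))
summary(lm(speed ~ visual, data = scores))
```

The single-step sem() call estimates the measurement and structural models simultaneously, which is what keeps the structural regression disattenuated from item-level measurement error.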
Hi Patrick, This was a great video and you explain the concepts very well! One question though: what are your thoughts on interpreting the lambdas (the factor loadings) relating the items to the eta (the latent variable)? For example, if item 2 has a strong loading that is significant (lavaan in R gives you p-values), can I say that item 2 is a strong indicator of the latent variable? If so, I'm also wondering how you would interpret the estimate for item 1, which is often set to 1.0 (as you explained at 22:01) and therefore does not come with a significance value. Not sure if these questions make sense, but I would love to hear your thoughts as I'm just starting to learn about SEMs. Thanks!
Hi Jin -- thanks for your nice comments. There are generally three ways we can assess "importance" of an item. The first is statistical significance; the second is the communality of the item (this is the same as the item's r-squared); and the third is the magnitude of the standardized loading (which is of course directly tied to both the significance and the communality). The higher the communality (and thus the higher the standardized loading), the more strongly the item is related to the factor. Even if you set an item to 1.0 to define the scale of the factor, you still get both a standardized estimate and a communality for the item, so that's how you can assess the relative importance of that item. There are no fixed values that denote a good vs. bad item -- it depends on many things. I hope this is of some use -- good luck with your work -- patrick
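To make the three criteria concrete, here is a short R/lavaan sketch; the data set and the single-factor model are illustrative assumptions, not the lecture's own example.

```r
# A sketch of the three "importance" criteria (significance, communality,
# standardized loading) for a single-factor model.
library(lavaan)  # provides the built-in HolzingerSwineford1939 data

fit <- cfa('visual =~ x1 + x2 + x3', data = HolzingerSwineford1939)

# rsquare = TRUE prints each item's communality (its r-squared with the
# factor); standardized = TRUE prints the standardized loadings. The marker
# item (x1, fixed to 1.0) still receives both.
summary(fit, standardized = TRUE, rsquare = TRUE)

# Standardized loadings with SEs, z-values, and p-values for every item,
# including the marker whose unstandardized loading was fixed to 1.0:
standardizedSolution(fit)
```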
@@pjcurran496 Thank you so much for the detailed response. This was tremendously helpful.
Hello Dr. Curran! Thank you for the nice lecture. I have learned that there are various rotations beyond the one you mentioned (i.e., oblique rotation). I was wondering whether there is any standard guidance about which rotation to use for which type of analysis. Can you please explain further in this regard? Thank you once again.
Hi -- thanks for your nice note. It turns out there are a mind-numbing number of potential rotations from which to choose. I think the general belief is that the primary choice is between orthogonal (where the factors do not correlate) and oblique (where the factors are allowed to correlate), and the specific type of each is less important. This is not to say there aren't differences across methods, but you tend to need to be motivated by a specific goal to choose one over another. The orthogonal choice is often VARIMAX or QUARTIMAX, and for oblique there are several popular ones including PROMAX and direct oblimin, but there are many others. Any good factor analysis book will discuss these in detail. Good luck with your work -- patrick
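A minimal sketch of the orthogonal-versus-oblique choice using base R's factanal(); the two-factor model and the items drawn from lavaan's example data are illustrative assumptions.

```r
# A sketch of the orthogonal vs. oblique rotation choice via factanal().
library(lavaan)  # used only for its example data
items <- HolzingerSwineford1939[, c("x1", "x2", "x3", "x7", "x8", "x9")]

# orthogonal rotation: factors constrained to be uncorrelated
efa_varimax <- factanal(items, factors = 2, rotation = "varimax")
print(efa_varimax$loadings)

# oblique rotation: factors allowed to correlate
efa_promax <- factanal(items, factors = 2, rotation = "promax")
print(efa_promax$loadings)
```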
@@centerstat Thanks a lot for the explanation. I will get in touch with further queries. Thank you once again for the great lecture series on SEM. It is really helpful.
Thank you, dear professor.
I have a question about exploratory factor analysis, specifically about the parameter-estimation step (factor extraction).
In EFA we have two sets of parameters:
1. the pattern matrix of loadings (lambda), and
2. the diagonal matrix of uniquenesses or errors, denoted psi (or theta).
We also have the constraint that the diagonal of the empirical correlation matrix equals the diagonal of the implied correlation matrix. This implies a relation between lambda and theta, so theta is not a free parameter! The estimation procedure is then: find the lambda that minimizes the maximum likelihood fit function, and compute the diagonal matrix psi from it.
Thank you in advance.
Hi Ahmed -- thanks for your note. You raise a great question. If you are conducting a principal components analysis (PCA) then the item residual variance is not an estimated parameter in the model because it is simply the complement of 1.0 (the standardized variance of the item). But if you are doing a "true" factor analysis that allows for unreliability in the items (and thus has some value less than 1.0 on the diagonal of the correlation matrix) then the residuals are an estimated parameter. Traditionally this might be done using principal axis factoring, or more modern methods based on maximum likelihood estimation. So if it is a true factor analysis that allows for unreliability in the items, then the estimated model parameters are the factor loadings and the item residuals. Hope this helps -- good luck with your work -- patrick
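A small sketch of this distinction in R; the data set (lavaan's HolzingerSwineford1939) and the three-factor solution are illustrative assumptions.

```r
# A sketch contrasting the two: PCA factors all observed variance, while a
# "true" (here, maximum likelihood) factor analysis estimates the item
# residual variances as model parameters.
library(lavaan)  # used only for its example data
items <- HolzingerSwineford1939[, paste0("x", 1:9)]

# PCA: components are exact weighted composites of the items; no residual
# variances are estimated.
pca <- prcomp(items, scale. = TRUE)
summary(pca)

# ML factor analysis: the uniquenesses are estimated parameters.
fa <- factanal(items, factors = 3)
fa$uniquenesses      # estimated item residual variances
1 - fa$uniquenesses  # communalities: the variance available for factoring
```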
@@centerstat Thank you for your definitive answer; it helped me a lot.
But the problem for me is that, in factor analysis, the diagonal of the implied correlation matrix is always 1.
That implies a constraint between the matrix of loadings and the residuals.
So we have only one free parameter, lambda.
Why, then, do we need an iterative procedure?
Could you please send me a reference?
Ahmed Seyid, PhD student in Statistics.
@@ebnouseyid5518 Hi Ahmed -- in the exploratory factor analysis model the model-implied correlation matrix is not required to have ones on the diagonal. Indeed, the diagonal elements are almost always values less than one and represent what are called the "communalities" -- the part of the observed variance of an item that is explained by the underlying factor. In that way, these are precisely the same as a multiple r-squared in the regression model. There are many wonderful resources on these models, but one of my favorites is a book by Tim Brown -- this might be of interest in thinking about these issues.
Brown, T. A. (2015). Confirmatory factor analysis for applied research. Guilford Publications.
There is also a more technical treatment of the EFA in Tucker & MacCallum, and this is freely available online:
quantpsych.unc.edu/wp-content/uploads/sites/214/2018/12/TuckerMacCallum1997.pdf
I hope these are of some use -- patrick
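As a quick numeric check of the point about the implied diagonal, here is a short sketch in base R; the data set and factor count are illustrative assumptions.

```r
# A numeric check: in ML factor analysis the model-implied diagonal is the
# communality (1 - uniqueness), not 1.0.
library(lavaan)  # used only for its example data
items <- HolzingerSwineford1939[, paste0("x", 1:9)]

fa <- factanal(items, factors = 3)   # varimax (orthogonal) rotation by default
L <- unclass(loadings(fa))           # estimated pattern matrix (lambda)
communality <- rowSums(L^2)          # diagonal of lambda %*% t(lambda)
round(cbind(communality, uniqueness = fa$uniquenesses), 3)  # rows sum to 1.0
```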
@@pjcurran496 many thanks
Excellent!
Very good, super video!
Great videos, really informative and helpful. I was wondering about the distinction between latent variables and scale variables; this is something I haven't seen in many videos explaining SEM and CFA. What is the rationale behind the increased power of using latent factors rather than scale variables based on the mean of all items? I hope you'll see this, and thanks again for the videos!
Hi -- thanks for the nice words. Briefly, power is typically higher with latent factors because you are disattenuating your regression coefficients from the deleterious effects of measurement error. That is, the GLM assumes all variables are measured without error -- so using scale scores in a regression leads to biased regression coefficients (if error is in the IV) and inflated SEs (if error is in the DV). That's why it can be very important to consider latent factors if possible. Good luck with your work -- patrick
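A tiny simulation sketch of the attenuation point; all numbers below are arbitrary illustrative choices.

```r
# A simulation of attenuation: measurement error in the predictor biases
# the regression slope toward zero.
set.seed(1)
n      <- 10000
true_x <- rnorm(n)                 # error-free predictor
y      <- 0.5 * true_x + rnorm(n)  # true slope = 0.5
obs_x  <- true_x + rnorm(n)        # predictor measured with error (rel. = .5)

coef(lm(y ~ true_x))["true_x"]  # ~0.50, unbiased
coef(lm(y ~ obs_x))["obs_x"]    # ~0.25, attenuated by the reliability of .5
```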
@@centerstat Thank you very much for your reply! So happy to have come across your videos.
Thanks
Prof. Curran, what is the difference in underlying logic between principal component analysis and exploratory factor analysis?
Thanks for the question -- Briefly, principal components analysis is a data-reduction method based on eigenvalues and eigenvectors of the observed correlation matrix. It assumes that all observed variance is available for factoring and thus assumes there is no measurement error in the items. As such, the components are direct weighted composites of the observed items. In contrast, exploratory factor analysis (which encompasses a broad range of estimation methods, maximum likelihood being the most commonly used) assumes some value *less* than the observed variance is available for factoring and thus builds a model to account for measurement error (so-called "unique" factors). As such, the factors cannot be directly computed (as in PCA) but are inferred from the items. A few citations are below. I hope this is of some use -- patrick
MacCallum, R. C. (2009). Factor analysis. In R. E. Millsap & A. Maydeu-Olivares (Eds.), The Sage handbook of quantitative methods in psychology (pp. 123-147). Sage Publications Ltd. doi.org/10.4135/9780857020994.n6
Widaman, K. F. (1993). Common factor analysis versus principal component analysis: Differential bias in representing model parameters? Multivariate Behavioral Research, 28(3), 263-311.
Widaman, K. F. (2018). On common factor and principal component representations of data: Implications for theory and for confirmatory replications. Structural Equation Modeling: A Multidisciplinary Journal, 25(6), 829-847.
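A brief sketch of the "direct composite versus inferred factor" distinction in R; the data set is an illustrative assumption.

```r
# PCA component scores are exact weighted composites of the items, whereas
# EFA factor scores can only be estimated (inferred).
library(lavaan)  # used only for its example data
items <- HolzingerSwineford1939[, paste0("x", 1:9)]

# PCA: scores reproduce exactly as the standardized items times the weights.
pca <- prcomp(items, scale. = TRUE)
all.equal(unname(pca$x), unname(scale(items) %*% pca$rotation))  # TRUE

# ML EFA: factors are not direct composites; scores must be estimated,
# here with the regression method.
fa <- factanal(items, factors = 3, scores = "regression")
head(fa$scores)
```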
Thanks, dear professor.
I have a question about the assumptions of the exploratory factor analysis model:
In EFA, researchers in most cases assume that the measurement errors are uncorrelated with the factors and uncorrelated with each other.
Is this assumption valid in reality?
Thanks for your question -- in EFA it is nearly universally assumed that the residuals are not correlated. Further, it can be assumed that the factors themselves are uncorrelated, but this need not hold -- it is quite common to use methods of rotation that allow for correlated factors within the EFA. However, moving from the EFA to the CFA allows you access to correlated residuals if you have hypotheses about these. Indeed, in CFA you can make this a testable hypothesis (e.g., comparing model fit with and without correlated residuals). Hope this helps -- patrick
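A minimal lavaan sketch of making a correlated residual a testable hypothesis; the single-factor model and the particular residual pair (x2 with x3) are arbitrary illustrative choices, not a recommendation.

```r
# Testing a correlated residual in CFA via nested model comparison.
library(lavaan)  # provides the built-in HolzingerSwineford1939 data

m0 <- 'f =~ x1 + x2 + x3 + x4'   # all residuals uncorrelated
m1 <- 'f =~ x1 + x2 + x3 + x4
       x2 ~~ x3'                 # one correlated residual pair added

fit0 <- cfa(m0, data = HolzingerSwineford1939)
fit1 <- cfa(m1, data = HolzingerSwineford1939)
anova(fit0, fit1)  # likelihood ratio test of the correlated residual
```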
@@centerstat Many thanks, dear professor. Could you give me some pointers that would help me compare an EFA with correlated errors to an EFA with uncorrelated errors?
Thanks in advance.
@@ebnouseyid5518 Thanks for your nice words. Just so you know, Dan Bauer and I have a fully free 3-day workshop on structural equation models that goes into great detail on confirmatory factor analysis -- see centerstat.org/ for more information -- patrick
@@centerstat Many thanks