I am conducting a path model using WLSMV estimation. The test statistic and degrees of freedom in the Model Test Baseline Model output are incorrect. #408

weberhw · 2025-01-02T11:43:03Z

Model Test User Model:
                                              Standard      Scaled
  Test Statistic                               321.547     294.829
  Degrees of freedom                                33          33
  P-value (Chi-square)                           0.000       0.000
  Scaling correction factor                                  1.102
  Shift parameter                                            2.964
    simple second-order correction

Model Test Baseline Model:

  Test statistic                                54.250      51.783
  Degrees of freedom                                15          15
  P-value                                        0.000       0.000
  Scaling correction factor                                  1.067

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.000       0.000
  Tucker-Lewis Index (TLI)                      -2.342      -2.236

yrosseel · 2025-01-02T16:15:51Z

Could you please give a bit more context? For example, the syntax you used to fit the model, and which variables are treated as categorical and which as numeric?

weberhw · 2025-01-02T16:47:08Z

path_model = "
  respect  ~ unconfor
  respect  ~ harass
  gen_eq   ~ respect
  gen_eq   ~ self
  fearf    ~ self
  rap_mth  ~ gen_eq
  porn_vio ~ respect
  rape_3   ~ nutrit
  rape_3   ~ fearf
  rape_3   ~ gen_eq
  rape_3   ~ self
  rape_3   ~ unconfor
  rape_3   ~ harass
  rape_3   ~ sex
  rape_3   ~ age
"
path.m = lavaan::sem(model = path_model, data = path_da1, estimator = "WLSMV",
                     missing = "listwise", ordered = c("rape_3"))
path.s = summary(path.m, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE)


weberhw · 2025-01-02T16:51:52Z

path_da1 = na.omit(offend_lavaan)


yrosseel · 2025-01-03T17:14:53Z

Ok. So let's break this down. We have 6 dependent variables (1 is categorical), and 6 exogenous variables. I will assume rape_3 is binary.

Because lavaan is using conditional.x = TRUE by default, we have the following parameters in the unrestricted/saturated model:

6*5/2 = 15 correlation/covariances
5 variances for the non-categorical y's
1 threshold
5 means for the non-categorical y's
6*6 regression coefficients for the slope structure

Together, there are 62 parameters.

The user-specified model has (only) 29 free parameters, resulting in 62-29=33 degrees of freedom. So far so good.
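The count above can be reproduced with a few lines of base R. This is only an arithmetic sketch: the variable names are made up for illustration, and the 29 free parameters are taken from the statement above rather than computed from the model.

```r
# Counts taken from the breakdown above (conditional.x = TRUE).
n_y     <- 6              # dependent variables
n_y_cat <- 1              # of which categorical (rape_3, assumed binary)
n_x     <- 6              # exogenous variables
n_y_num <- n_y - n_y_cat  # non-categorical dependent variables

n_saturated <- n_y * (n_y - 1) / 2 +  # correlations/covariances among the y's
  n_y_num +                           # variances of the non-categorical y's
  n_y_cat +                           # one threshold for the binary y
  n_y_num +                           # means of the non-categorical y's
  n_y * n_x                           # slopes of every y on every x

n_free_user <- 29                     # as stated above, not computed here
df_user <- n_saturated - n_free_user

print(c(saturated = n_saturated, df_user = df_user))
```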

Consider now the baseline model. You can 'see' the parameter table as follows:
as.data.frame(path.m@baseline$partable)
These are the free parameters of the baseline model:

5 variances (for the non-categorical y's)
1 threshold
5 means
... AND 36 regression coefficients for the slope structure.

And therefore, we have 47 free parameters (more than the user-specified model), resulting in 62-47=15 df.
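The same bookkeeping for the baseline model, again as a plain arithmetic sketch with illustrative variable names. The last two lines show what the count would be if the 36 slopes were fixed to zero instead of freed; that variant is included only for comparison.

```r
n_saturated <- 62   # parameters in the unrestricted model (see above)

# Free parameters of the default baseline model under conditional.x = TRUE
n_variances  <- 5
n_thresholds <- 1
n_means      <- 5
n_slopes     <- 6 * 6   # every y regressed on every x

n_free_baseline <- n_variances + n_thresholds + n_means + n_slopes
df_baseline     <- n_saturated - n_free_baseline

# For comparison only: the same baseline with all slopes fixed to zero
n_free_fixed_slopes <- n_variances + n_thresholds + n_means
df_fixed_slopes     <- n_saturated - n_free_fixed_slopes

print(c(df_baseline = df_baseline, df_fixed_slopes = df_fixed_slopes))
```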

So why do we add the 36 regression coefficients as 'free parameters' in the baseline model? Because we think this is more fair for the baseline model. Otherwise, we fix them to zero, resulting in a terrible fit for the baseline model, and this results in inflated/too-optimistic CFI/TLI measures. Of course, if the user-specified model fails to capture most of these regression effects, this may result in a user-specified model that fits worse than the baseline model.
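To see how a user-specified model that fits worse than the baseline produces the reported indices, the test statistics from the output at the top of this thread can be plugged into the usual textbook definitions of CFI and TLI. This is a sketch under that assumption; the helper name cfi_tli is made up here, and lavaan's internal computation may differ in details.

```r
# Textbook CFI and TLI from chi-square statistics and degrees of freedom.
cfi_tli <- function(chisq_user, df_user, chisq_base, df_base) {
  d_user <- max(chisq_user - df_user, 0)   # excess misfit of the user model
  d_base <- max(chisq_base - df_base, 0)   # excess misfit of the baseline model
  denom  <- max(d_user, d_base)
  cfi <- if (denom > 0) 1 - d_user / denom else 1
  ratio_user <- chisq_user / df_user
  ratio_base <- chisq_base / df_base
  tli <- (ratio_base - ratio_user) / (ratio_base - 1)
  c(cfi = cfi, tli = tli)
}

# Values copied from the output in the first post
standard <- cfi_tli(321.547, 33, 54.250, 15)
scaled   <- cfi_tli(294.829, 33, 51.783, 15)

print(round(rbind(standard, scaled), 3))
```

Because the user model's chi-square per degree of freedom (about 9.7) is larger than the baseline's (about 3.6), CFI is truncated at 0 and TLI goes negative, matching the 0.000 and -2.342 / -2.236 shown in the output.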

If you don't like this behavior, you can change it by using the option baseline.conditional.x.free.slopes = FALSE. Or you can switch to conditional.x = FALSE, and then you will get df=51 for the baseline model.
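A minimal sketch of the first option, using simulated data rather than the data from this thread. The variable names, sample size, and coefficients are invented for illustration; only the sem() arguments and the fit measure name come from lavaan. It assumes a lavaan version that supports baseline.conditional.x.free.slopes.

```r
library(lavaan)

# Simulated toy data (NOT the data from this issue): two exogenous x's,
# one continuous y and one binary y.
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y1 <- 0.5 * x1 + rnorm(n)
y2 <- as.integer(0.4 * y1 + 0.3 * x2 + rnorm(n) > 0)
dat <- data.frame(x1, x2, y1, y2)

model <- "
  y1 ~ x1
  y2 ~ y1 + x2
"

# Default: slopes on the x's are free in the baseline model
fit_free <- sem(model, data = dat, estimator = "WLSMV", ordered = "y2")

# Alternative: slopes on the x's are fixed to zero in the baseline model
fit_fixed <- sem(model, data = dat, estimator = "WLSMV", ordered = "y2",
                 baseline.conditional.x.free.slopes = FALSE)

b_free  <- as.numeric(fitMeasures(fit_free,  "baseline.df"))
b_fixed <- as.numeric(fitMeasures(fit_fixed, "baseline.df"))
print(c(baseline_df_free_slopes = b_free, baseline_df_fixed_slopes = b_fixed))
```

With 2 dependent and 2 exogenous variables there are 2 * 2 = 4 slopes, so the two baseline models should differ by 4 degrees of freedom.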

weberhw · 2025-01-06T02:01:34Z

Thank you for your valuable response. When testing, adding only baseline.conditional.x.free.slopes = FALSE resulted in 51 degrees of freedom, and the CFI and TLI values were as expected.

However, adding only conditional.x = FALSE led to the following error message:

Error in eigen(VarCov, symmetric = TRUE, only.values = TRUE): 'x' contains infinite or missing values
In addition: Warning message: In sqrt(A1[[g]]): NaNs produced

Adding both baseline.conditional.x.free.slopes = FALSE and conditional.x = FALSE produced the same error and warning.

Additionally, when I switched the model to ML estimation without adding baseline.conditional.x.free.slopes = FALSE, the degrees of freedom were 51, and the CFI and TLI values were as expected. Why, then, does the same model have different degrees of freedom under WLSMV and ML estimation?


yrosseel · 2025-01-06T17:15:52Z

Hm. To understand the error you get when using conditional.x = FALSE ('x' contains infinite or missing values), I would need to see the data. Can you send it to me?

If you use conditional.x = FALSE, the baseline.conditional.x.free.slopes option has no effect, as it is only relevant for the conditional setting.

If you switch to ML, by default, lavaan uses conditional.x = FALSE. So the difference (in degrees of freedom) is only due to the switch to conditional.x = TRUE when data is categorical.
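One way to check which setting a fitted model actually used is to inspect its options. The sketch below uses simulated data (not the data from this thread), with invented variable names; the ML fit simply treats the 0/1 outcome as numeric, which is an assumption about how the switch to ML was made.

```r
library(lavaan)

# Simulated toy data (NOT the data from this issue)
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y1 <- 0.5 * x1 + rnorm(n)
y2 <- as.integer(0.4 * y1 + 0.3 * x2 + rnorm(n) > 0)
dat <- data.frame(x1, x2, y1, y2)

model <- "
  y1 ~ x1
  y2 ~ y1 + x2
"

# Categorical outcome with WLSMV versus all-numeric data with ML
fit_wlsmv <- sem(model, data = dat, estimator = "WLSMV", ordered = "y2")
fit_ml    <- sem(model, data = dat, estimator = "ML")

# The options stored in the fitted object show the resolved default
cx_wlsmv <- lavInspect(fit_wlsmv, "options")$conditional.x
cx_ml    <- lavInspect(fit_ml,    "options")$conditional.x
print(c(WLSMV = cx_wlsmv, ML = cx_ml))
```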
