I am conducting a path model using WLSMV estimation. The test statistic and degrees of freedom in the Model Test Baseline Model output are incorrect. #408

Open
weberhw opened this issue Jan 2, 2025 · 6 comments

weberhw commented Jan 2, 2025

Model Test User Model:
                                              Standard      Scaled
  Test Statistic                               321.547     294.829
  Degrees of freedom                                33          33
  P-value (Chi-square)                           0.000       0.000
  Scaling correction factor                                  1.102
  Shift parameter                                            2.964
    simple second-order correction

Model Test Baseline Model:

  Test statistic                                54.250      51.783
  Degrees of freedom                                15          15
  P-value                                        0.000       0.000
  Scaling correction factor                                  1.067

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.000       0.000
  Tucker-Lewis Index (TLI)                      -2.342      -2.236

yrosseel (Owner) commented Jan 2, 2025

Could you please give a bit more context? For example, the model syntax that you used to fit the model, and which variables are treated as categorical or numeric.

weberhw (Author) commented Jan 2, 2025 via email

weberhw (Author) commented Jan 2, 2025 via email

yrosseel (Owner) commented Jan 3, 2025

Ok. So let's break this down. We have 6 dependent variables (1 is categorical), and 6 exogenous variables. I will assume rape_3 is binary.

Because lavaan uses conditional.x = TRUE by default when the data are categorical, we have the following parameters in the unrestricted/saturated model:

  • 6*5/2 = 15 correlations/covariances
  • 5 variances for the non-categorical y's
  • 1 threshold
  • 5 means for the non-categorical y's
  • 6*6 regression coefficients for the slope structure

Together, there are 62 parameters.

The user-specified model has (only) 29 free parameters, resulting in 62-29=33 degrees of freedom. So far so good.
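
The count above can be reproduced with a few lines of arithmetic. This is only a sketch of the bookkeeping described in this comment, not lavaan's internal code; the variable names are made up for illustration, and it assumes (as stated above) that rape_3 is binary and so contributes a single threshold.

```r
# Bookkeeping for the unrestricted (saturated) model under conditional.x = TRUE.
# All names here are illustrative; the counts come from the comment above.
n_y     <- 6                 # dependent variables
n_y_cat <- 1                 # categorical ones (rape_3, assumed binary)
n_x     <- 6                 # exogenous variables
n_y_num <- n_y - n_y_cat     # non-categorical dependent variables

saturated <- c(
  covariances = n_y * (n_y - 1) / 2,  # 15 correlations/covariances
  variances   = n_y_num,              # 5 variances
  thresholds  = 1,                    # 1 threshold for the binary variable
  means       = n_y_num,              # 5 means
  slopes      = n_y * n_x             # 36 regression coefficients
)

n_saturated <- sum(saturated)         # 62
n_user_free <- 29                     # free parameters in the user model
df_user     <- n_saturated - n_user_free

print(saturated)
cat("saturated parameters:", n_saturated, " user df:", df_user, "\n")
```

The result, 62 - 29 = 33, matches the degrees of freedom reported under "Model Test User Model".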

Consider now the baseline model. You can inspect its parameter table as follows:

    as.data.frame(path.m@baseline$partable)

These are the free parameters of the baseline model:

  • 5 variances (for the non-categorical y's)
  • 1 threshold
  • 5 means
  • ... AND 36 regression coefficients for the slope structure.

And therefore, we have 47 free parameters (more than the user-specified model), resulting in 62-47=15 df.
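
The same bookkeeping for the baseline model, again as a sketch with illustrative names only. The second calculation is a hypothetical: it assumes that fixing the 36 slopes to zero changes nothing else in the baseline model.

```r
# Baseline model bookkeeping, using the counts listed above.
n_saturated <- 62

baseline_free <- c(variances = 5, thresholds = 1, means = 5, slopes = 36)
n_baseline    <- sum(baseline_free)          # 47 free parameters
df_baseline   <- n_saturated - n_baseline    # 15

# Hypothetical variant: the slopes are fixed to zero instead of freely
# estimated, and everything else in the baseline model stays the same.
baseline_fixed    <- baseline_free[names(baseline_free) != "slopes"]
df_baseline_fixed <- n_saturated - sum(baseline_fixed)   # 51

cat("baseline df (free slopes):", df_baseline, "\n")
cat("baseline df (slopes fixed to zero):", df_baseline_fixed, "\n")
```

With free slopes the baseline model uses more parameters (47) than the user-specified model (29), which is why its degrees of freedom (15) are lower than the user model's (33).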

So why do we add the 36 regression coefficients as 'free parameters' in the baseline model? Because we think this is fairer to the baseline model. If we fixed them to zero instead, the baseline model would fit terribly, and that would produce inflated, too-optimistic CFI/TLI values. Of course, if the user-specified model fails to capture most of these regression effects, it may end up fitting worse than the baseline model, which appears to be the case here (CFI = 0, negative TLI).
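
To see how a user model that fits worse than the baseline leads to CFI = 0 and a negative TLI, the reported values can be recomputed from the test statistics in the output. The sketch below uses the standard textbook formulas for CFI and TLI; the function names are made up for illustration and it is not a copy of lavaan's implementation.

```r
# Textbook CFI and TLI from user-model and baseline-model chi-square statistics.
cfi <- function(chisq_u, df_u, chisq_b, df_b) {
  num <- max(chisq_u - df_u, 0)
  den <- max(chisq_u - df_u, chisq_b - df_b, 0)
  if (den == 0) return(1)   # both models fit at least as well as expected
  1 - num / den
}

tli <- function(chisq_u, df_u, chisq_b, df_b) {
  ratio_b <- chisq_b / df_b
  ratio_u <- chisq_u / df_u
  (ratio_b - ratio_u) / (ratio_b - 1)
}

# Values taken from the output posted in this issue
cfi_std    <- cfi(321.547, 33, 54.250, 15)
tli_std    <- tli(321.547, 33, 54.250, 15)
cfi_scaled <- cfi(294.829, 33, 51.783, 15)
tli_scaled <- tli(294.829, 33, 51.783, 15)

print(round(c(cfi_std = cfi_std, tli_std = tli_std,
              cfi_scaled = cfi_scaled, tli_scaled = tli_scaled), 3))
```

Because the user model's chi-square per degree of freedom (about 9.7) is much larger than the baseline's (about 3.6), the TLI numerator is negative and the CFI is truncated at zero.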

If you don't like this behavior, you can change it by using the option baseline.conditional.x.free.slopes = FALSE. Or you can switch to conditional.x = FALSE, and then you will get df=51 for the baseline model.
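
A minimal sketch of toggling this option, assuming a lavaan version that supports baseline.conditional.x.free.slopes. The data are simulated and the model is made up: x1, x2, y1, y2 and ybin are placeholders, not the variables from this issue, so the numbers will not match the output above.

```r
library(lavaan)

# Simulated placeholder data: two exogenous x's, two continuous y's, one binary y
set.seed(408)
n    <- 500
x1   <- rnorm(n)
x2   <- rnorm(n)
y1   <- 0.5 * x1 + rnorm(n)
y2   <- 0.4 * y1 + 0.3 * x2 + rnorm(n)
ybin <- as.integer(0.6 * y2 + rnorm(n) > 0)
dat  <- data.frame(x1, x2, y1, y2, ybin)

# Hypothetical path model, for illustration only
model <- '
  y1   ~ x1
  y2   ~ y1 + x2
  ybin ~ y2
'

# Default behaviour: slopes on the x's are free in the baseline model
fit_free <- sem(model, data = dat, ordered = "ybin", estimator = "WLSMV")

# Alternative: slopes on the x's are fixed to zero in the baseline model
fit_fixed <- sem(model, data = dat, ordered = "ybin", estimator = "WLSMV",
                 baseline.conditional.x.free.slopes = FALSE)

bdf_free  <- fitMeasures(fit_free,  "baseline.df")
bdf_fixed <- fitMeasures(fit_fixed, "baseline.df")
print(c(free = unname(bdf_free), fixed = unname(bdf_fixed)))
```

Comparing fitMeasures(fit, c("cfi", "tli")) across the two fits shows how much the choice of baseline model affects the incremental fit indices.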

weberhw (Author) commented Jan 6, 2025 via email

yrosseel (Owner) commented Jan 6, 2025

Hm. To understand the error you get when using conditional.x = FALSE ('x' contains infinite or missing values), I would need to see the data. Can you send it to me?

If you use conditional.x = FALSE, the baseline.conditional.x.free.slopes option has no effect, as it is only relevant in the conditional setting.

If you switch to ML, by default, lavaan uses conditional.x = FALSE. So the difference (in degrees of freedom) is only due to the switch to conditional.x = TRUE when data is categorical.
