Factor Analysis - CFA
Last updated on 2025-06-24
Overview
Questions
- What is a confirmatory factor analysis?
- How can I run CFA in R?
- How can I specify CFA models using lavaan?
- How can I interpret the results of a CFA and EFA in R?
Objectives
- Understand the difference between exploratory and confirmatory factor analysis
- Learn how to conduct confirmatory factor analysis in R
- Learn how to interpret the results of confirmatory factor analysis in R
Confirmatory Factor Analysis (CFA)
In the previous lesson, we approached the DASS as if we had no knowledge of the underlying factor structure. However, this is not really the case: there is a specific theory proposing three interrelated factors. We can use confirmatory factor analysis to test whether this theory adequately explains the item correlations found here.
Importantly, we not only need a theory about the number of factors, but also about which items are supposed to load on which factor. This will be necessary when specifying our CFA model. Furthermore, we will test whether a one-factor model is also supported by the data and compare the two models to evaluate which one is better.
The main R package we will be using is lavaan. This package is widely used for Structural Equation Modeling (SEM) in R; CFA is a specific type of model within the general SEM framework. The lavaan homepage also provides a great tutorial on CFA (https://lavaan.ugent.be/tutorial/cfa.html). For a more detailed tutorial on SEM and the mathematics behind CFA, please refer to the lavaan homepage or this introduction.
For now, let’s walk through the steps of a CFA in our specific case. First of all, in order to do confirmatory factor analysis, we need a hypothesis to confirm: how many factors are there, and which variables are supposed to span which factor? Secondly, we need to investigate the correlation matrix for any issues (very low correlations, correlations equal to 1). Then, we need to define our model using lavaan syntax. Finally, we can run the model and interpret the results.
Model Specification
Let’s start by defining the theory and the model syntax.
The DASS has 42 items and measures three scales: depression, anxiety, and stress. These scales are said to be intercorrelated, as they share common causes (genetic, environmental, etc.). Furthermore, the DASS provides a detailed manual as to which items belong to which scale. The sequence of items and their respective scales is:
R
library(dplyr)
OUTPUT
Attaching package: 'dplyr'
OUTPUT
The following objects are masked from 'package:stats':
filter, lag
OUTPUT
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
R
dass_data <- read.csv("data/kaggle_dass/data.csv")
fa_data <- dass_data %>%
  filter(testelapse < 600) %>%
  select(starts_with("Q") & ends_with("A"))
item_scales_dass <- c(
"S", "A", "D", "A", "D", "S", "A",
"S", "A", "D", "S", "S", "D", "S",
"A", "D", "D", "S", "A", "A", "D",
"S", "A", "D", "A", "D", "S", "A",
"S", "A", "D", "S", "S", "D", "S",
"A", "D", "D", "S", "A", "A", "D"
)
item_scales_dass_data <- data.frame(
item_nr = 1:42,
scale = item_scales_dass
)
item_scales_dass_data
OUTPUT
item_nr scale
1 1 S
2 2 A
3 3 D
4 4 A
5 5 D
6 6 S
7 7 A
8 8 S
9 9 A
10 10 D
11 11 S
12 12 S
13 13 D
14 14 S
15 15 A
16 16 D
17 17 D
18 18 S
19 19 A
20 20 A
21 21 D
22 22 S
23 23 A
24 24 D
25 25 A
26 26 D
27 27 S
28 28 A
29 29 S
30 30 A
31 31 D
32 32 S
33 33 S
34 34 D
35 35 S
36 36 A
37 37 D
38 38 D
39 39 S
40 40 A
41 41 A
42 42 D
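Before specifying the model, it is worth running two quick sanity checks: each scale should contribute 14 items, and the inter-item correlations should show no obvious problems (no near-zero blocks, no correlations of exactly 1). A minimal sketch using the objects defined above:
R
# Items per scale: the DASS assigns 14 items to each of D, A, and S
table(item_scales_dass_data$scale)

# Range of inter-item correlations: look out for values of exactly 1
cor_mat <- cor(fa_data, use = "pairwise.complete.obs")
range(cor_mat[lower.tri(cor_mat)])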
Scales vs. Factors
There is an important difference between scales and factors. Again, factors represent mathematical constructs discovered or constructed during factor analysis. Scales represent collections of items put together by the authors of a test and said to measure a certain construct. These two terms cannot be used synonymously: scales refer to item collections, whereas factors refer to solutions of a factor analysis. A scale can give rise to a factor, and in fact its items should span a common factor if the scale is well constructed. But a scale is not the same thing as a factor.
Now, in order to define our model, we need to keep the distinction between manifest and latent variables in mind. Our manifest, measured variables are the items; the factors are the latent variables. We also need to define which manifest variables contribute to which latent variables. In lavaan, all of this is specified in the model code using a special syntax.
R
library(lavaan)
OUTPUT
This is lavaan 0.6-19
lavaan is FREE software! Please report any bugs.
R
model_cfa_3facs <- c(
"
# Model syntax:
# =~ means 'is measured by' (similar to ~ in a linear regression formula):
# the left-hand side is the latent factor, the right-hand side lists the
# manifest variables contributing to it.
# Define which items load on which factor:
depression =~ Q3A + Q5A + Q10A + Q13A + Q16A + Q17A + Q21A + Q24A + Q26A + Q31A + Q34A + Q37A + Q38A + Q42A
anxiety =~ Q2A + Q4A + Q7A + Q9A + Q15A + Q19A + Q20A + Q23A + Q25A + Q28A + Q30A + Q36A + Q40A + Q41A
stress =~ Q1A + Q6A + Q8A + Q11A + Q12A + Q14A + Q18A + Q22A + Q27A + Q29A + Q32A + Q33A + Q35A + Q39A
# Define correlations between factors using ~~
depression ~~ anxiety
depression ~~ stress
anxiety ~~ stress
"
)
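As an aside, the explicit ~~ lines make the factor covariances visible in the syntax; cfa() would estimate them by default anyway, and it also fixes the first loading of each factor to 1 (which is why the output below shows loadings of exactly 1.000 for Q3A, Q2A, and Q1A). If you want to see how lavaan translates the syntax into parameters, a quick sketch:
R
# Parse the model syntax into lavaan's parameter table (no data needed)
head(lavaanify(model_cfa_3facs), 10)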
Now, we can fit this model using cfa().
R
fit_cfa_3facs <- cfa(
  model = model_cfa_3facs,
  data = fa_data
)
To inspect the model results, we use summary().
R
summary(fit_cfa_3facs, fit.measures = TRUE, standardize = TRUE)
OUTPUT
lavaan 0.6-19 ended normally after 46 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 87
Number of observations 37242
Model Test User Model:
Test statistic 107309.084
Degrees of freedom 816
P-value (Chi-square) 0.000
Model Test Baseline Model:
Test statistic 1037764.006
Degrees of freedom 861
P-value 0.000
User Model versus Baseline Model:
Comparative Fit Index (CFI) 0.897
Tucker-Lewis Index (TLI) 0.892
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -1857160.022
Loglikelihood unrestricted model (H1) -1803505.480
Akaike (AIC) 3714494.044
Bayesian (BIC) 3715235.735
Sample-size adjusted Bayesian (SABIC) 3714959.249
Root Mean Square Error of Approximation:
RMSEA 0.059
90 Percent confidence interval - lower 0.059
90 Percent confidence interval - upper 0.059
P-value H_0: RMSEA <= 0.050 0.000
P-value H_0: RMSEA >= 0.080 0.000
Standardized Root Mean Square Residual:
SRMR 0.042
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
depression =~
Q3A 1.000 0.808 0.776
Q5A 0.983 0.006 155.298 0.000 0.794 0.742
Q10A 1.156 0.007 175.483 0.000 0.934 0.818
Q13A 1.077 0.006 173.188 0.000 0.870 0.810
Q16A 1.077 0.006 165.717 0.000 0.870 0.782
Q17A 1.158 0.007 172.607 0.000 0.936 0.807
Q21A 1.221 0.007 182.633 0.000 0.986 0.844
Q24A 0.987 0.006 159.408 0.000 0.797 0.758
Q26A 1.020 0.006 162.876 0.000 0.824 0.771
Q31A 0.966 0.006 156.243 0.000 0.780 0.746
Q34A 1.169 0.007 175.678 0.000 0.944 0.819
Q37A 1.118 0.007 168.220 0.000 0.903 0.791
Q38A 1.229 0.007 180.035 0.000 0.993 0.834
Q42A 0.826 0.006 131.716 0.000 0.668 0.646
anxiety =~
Q2A 1.000 0.563 0.506
Q4A 1.282 0.014 93.089 0.000 0.722 0.691
Q7A 1.303 0.014 94.229 0.000 0.734 0.707
Q9A 1.323 0.014 93.485 0.000 0.745 0.696
Q15A 1.116 0.013 89.050 0.000 0.629 0.636
Q19A 1.077 0.013 83.014 0.000 0.606 0.565
Q20A 1.473 0.015 96.526 0.000 0.830 0.742
Q23A 0.898 0.011 84.756 0.000 0.506 0.584
Q25A 1.244 0.014 89.922 0.000 0.701 0.647
Q28A 1.468 0.015 98.046 0.000 0.827 0.767
Q30A 1.260 0.014 90.684 0.000 0.710 0.657
Q36A 1.469 0.015 96.658 0.000 0.827 0.744
Q40A 1.374 0.015 93.607 0.000 0.774 0.698
Q41A 1.256 0.014 91.905 0.000 0.707 0.674
stress =~
Q1A 1.000 0.776 0.750
Q6A 0.943 0.007 137.700 0.000 0.732 0.695
Q8A 0.974 0.007 142.400 0.000 0.756 0.717
Q11A 1.030 0.007 152.388 0.000 0.799 0.761
Q12A 0.969 0.007 139.655 0.000 0.752 0.704
Q14A 0.797 0.007 111.328 0.000 0.619 0.572
Q18A 0.824 0.007 116.739 0.000 0.639 0.598
Q22A 0.947 0.007 141.490 0.000 0.735 0.713
Q27A 0.995 0.007 146.258 0.000 0.772 0.734
Q29A 1.020 0.007 149.022 0.000 0.791 0.746
Q32A 0.872 0.007 130.508 0.000 0.677 0.663
Q33A 0.970 0.007 142.051 0.000 0.753 0.715
Q35A 0.849 0.007 129.451 0.000 0.658 0.658
Q39A 0.956 0.007 144.729 0.000 0.742 0.727
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
depression ~~
anxiety 0.326 0.004 73.602 0.000 0.716 0.716
stress 0.491 0.005 93.990 0.000 0.784 0.784
anxiety ~~
stress 0.381 0.005 76.884 0.000 0.873 0.873
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Q3A 0.431 0.003 128.109 0.000 0.431 0.398
.Q5A 0.515 0.004 129.707 0.000 0.515 0.450
.Q10A 0.431 0.003 125.297 0.000 0.431 0.331
.Q13A 0.398 0.003 125.958 0.000 0.398 0.345
.Q16A 0.481 0.004 127.782 0.000 0.481 0.389
.Q17A 0.467 0.004 126.117 0.000 0.467 0.348
.Q21A 0.394 0.003 122.826 0.000 0.394 0.288
.Q24A 0.471 0.004 129.017 0.000 0.471 0.426
.Q26A 0.463 0.004 128.367 0.000 0.463 0.405
.Q31A 0.486 0.004 129.556 0.000 0.486 0.444
.Q34A 0.439 0.004 125.238 0.000 0.439 0.330
.Q37A 0.487 0.004 127.220 0.000 0.487 0.374
.Q38A 0.430 0.003 123.805 0.000 0.430 0.304
.Q42A 0.622 0.005 132.512 0.000 0.622 0.583
.Q2A 0.922 0.007 133.250 0.000 0.922 0.744
.Q4A 0.572 0.004 127.940 0.000 0.572 0.523
.Q7A 0.539 0.004 127.113 0.000 0.539 0.500
.Q9A 0.590 0.005 127.667 0.000 0.590 0.515
.Q15A 0.581 0.004 130.109 0.000 0.581 0.595
.Q19A 0.785 0.006 132.085 0.000 0.785 0.681
.Q20A 0.561 0.004 124.984 0.000 0.561 0.449
.Q23A 0.493 0.004 131.617 0.000 0.493 0.658
.Q25A 0.681 0.005 129.719 0.000 0.681 0.581
.Q28A 0.478 0.004 123.083 0.000 0.478 0.412
.Q30A 0.662 0.005 129.348 0.000 0.662 0.568
.Q36A 0.551 0.004 124.837 0.000 0.551 0.446
.Q40A 0.631 0.005 127.580 0.000 0.631 0.513
.Q41A 0.601 0.005 128.683 0.000 0.601 0.546
.Q1A 0.467 0.004 126.294 0.000 0.467 0.437
.Q6A 0.571 0.004 129.082 0.000 0.571 0.516
.Q8A 0.541 0.004 128.138 0.000 0.541 0.486
.Q11A 0.464 0.004 125.600 0.000 0.464 0.421
.Q12A 0.574 0.004 128.705 0.000 0.574 0.504
.Q14A 0.785 0.006 132.627 0.000 0.785 0.672
.Q18A 0.733 0.006 132.077 0.000 0.733 0.642
.Q22A 0.523 0.004 128.331 0.000 0.523 0.492
.Q27A 0.510 0.004 127.255 0.000 0.510 0.461
.Q29A 0.498 0.004 126.551 0.000 0.498 0.443
.Q32A 0.585 0.004 130.300 0.000 0.585 0.561
.Q33A 0.541 0.004 128.213 0.000 0.541 0.489
.Q35A 0.568 0.004 130.460 0.000 0.568 0.567
.Q39A 0.491 0.004 127.618 0.000 0.491 0.471
depression 0.653 0.007 88.452 0.000 1.000 1.000
anxiety 0.317 0.006 50.766 0.000 1.000 1.000
stress 0.602 0.007 83.798 0.000 1.000 1.000
Now, this output contains several key pieces of information:
- fit statistics for the model
- information on item loadings on the latent variables (factors)
- information on the correlations between the latent variables
- information on the variance of latent and manifest variables
Let’s walk through the output one-by-one.
Fit measures
The model fit is evaluated using three key criteria.
The first is the \(\chi^2\) statistic. It evaluates whether the model adequately reproduces the covariance matrix of the observed data or whether there is some deviation; significant results indicate that the model does not reproduce the covariance matrix exactly. However, this test is largely driven by sample size and not informative in practice: a CFA model is always a reduction of the underlying covariance matrix, and that is the whole point of a model. Therefore, do not interpret the \(\chi^2\) test and its associated p-value. Nonetheless, it is customary to report this information when presenting model fit.
The following two fit statistics are independent of sample size and widely accepted in practice.
The CFI (Comparative Fit Index) is an indicator of fit. It ranges between 0 and 1, with 0 meaning horrible fit and 1 reflecting perfect fit. Values above 0.8 are acceptable, and CFI > 0.9 is considered good. This index is independent of sample size and robust against violations of some of the assumptions CFA makes. It is, however, dependent on the number of free parameters, as it does not include a penalty for additional parameters. Therefore, models with more parameters will generally have a better CFI than parsimonious models with fewer parameters.
The RMSEA (Root Mean Square Error of Approximation) is an indicator of misfit that includes a penalty for the number of parameters: the higher the RMSEA, the worse the fit. Generally, RMSEA < 0.05 is considered good and RMSEA < 0.08 acceptable; models with RMSEA > 0.10 should not be interpreted. This fit statistic indicates whether the model fits the data well while controlling for the number of parameters specified.
It is customary to report all three of these measures when presenting the results of a CFA. Similarly, you should interpret both the CFI and the RMSEA, as a bad fit on both statistics can indicate problems in the model specification. The above model would be reported as follows.
The model with three correlated factors showed acceptable fit to the data \(\chi^2(816) = 107309.08, p < 0.001, CFI = 0.90, RMSEA = 0.06\), 90% CI = \([0.06; 0.06]\).
In a future section, we will explore how to compare and select models based on these fit statistics. There, it is important to focus on fit statistics that incorporate the number of free parameters, like the RMSEA; otherwise, more complex models will always outperform less complex ones.
In addition to the statistics presented above, model comparison is often based on information criteria, like the AIC (Akaike Information Criterion) or the BIC (Bayesian Information Criterion). Here, lower values indicate better models. These information criteria also penalize complexity and are thus well suited for model comparison. More on this in a later section.
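Rather than scanning the full summary() output for these numbers, you can extract exactly the measures needed for a report with lavaan's fitMeasures():
R
# Extract the fit statistics discussed above from the fitted model
fitMeasures(
  fit_cfa_3facs,
  c("chisq", "df", "pvalue", "cfi", "rmsea",
    "rmsea.ci.lower", "rmsea.ci.upper", "aic", "bic")
)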
Loadings on latent variables
The next section in the model summary output displays information on the latent variables, specifically how strongly the manifest variables “load” onto the latent variables they were assigned to. Here, loadings > 0.6 are considered high, and at least one manifest variable should load highly on each latent variable. If all loadings are low, this might indicate that the factor you are trying to define does not really exist, as there is little relation between the manifest variables specified to load on it.
Importantly, the output shows several columns containing loadings. The “Estimate” column shows unstandardized loadings (like regression weights in a linear model). The last two columns, “Std.lv” and “Std.all”, show two different standardizations of the loadings. For now, focus on the “Std.all” column; it can be interpreted in a similar fashion to the factor loadings in the EFA.
In our example, these loadings look very good, especially for the depression scale. Across all scales, almost all items show standardized loadings > 0.6, indicating that they reflect the underlying factor well.
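If you want to check the loadings programmatically rather than reading them off the summary, standardizedSolution() returns all standardized estimates as a data frame; a small sketch flagging any item below the 0.6 rule of thumb:
R
# All standardized parameter estimates as a data frame
std_est <- standardizedSolution(fit_cfa_3facs)

# Keep only the loadings (op "=~") and flag weak ones
loadings <- subset(std_est, op == "=~")
subset(loadings, est.std < 0.6)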
Key Difference between EFA and CFA
These loadings highlight a key difference between CFA and a factor analysis with a fixed number of factors, as conducted following an EFA.
In the EFA case, item cross-loadings are allowed: an item can load onto all factors. In CFA, this is not the case unless cross-loadings are explicitly specified; the items only load onto the factor stated in the model syntax. This is also why we obtain different loadings and factor correlations in this CFA example compared to the three-factor FA from the previous episode.
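If you did want to allow a specific cross-loading in a CFA, you would have to state it explicitly in the model syntax. A hypothetical sketch (not part of the DASS theory, shown only to illustrate the syntax):
R
# Hypothetical: Q3A is allowed to load on both depression and stress
model_crossload <- "
depression =~ Q3A + Q5A + Q10A + Q13A
stress     =~ Q1A + Q6A + Q8A + Q3A
"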
Correlations between latent variables
The next section in the output informs us about the covariances between the latent variables. The “Estimate” column again shows the unstandardized solution, while the “Std.all” column reflects the standardized solution. The “Std.all” values can be interpreted like correlations (only on the latent level).
Here, we can see high correlations (> 0.7) between our three factors; anxiety and stress even correlate at 0.87! Maybe there is one underlying general (“g”) factor in our data after all? We will investigate this further in the section on model comparison.
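The latent correlation matrix can also be extracted directly, which is handy when reporting it; lavInspect() with "cor.lv" returns the standardized values seen in the output:
R
# Correlation matrix of the latent variables
lavInspect(fit_cfa_3facs, "cor.lv")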
Variances
The last section of the summary output displays the error variances of the manifest variables and the variances of the latent variables. Here, you should make sure that all values in “Estimate” are positive and that the variances of the latent variables differ significantly from 0; the results of this test are printed in “P(>|z|)”.
For CFA, this section is not that relevant; it plays a larger role in other SEM models.
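A quick programmatic check that no variance estimate is negative, as a sketch:
R
# Variances are the "~~" parameters where both sides are the same variable;
# the result should be an empty data frame
pe <- parameterEstimates(fit_cfa_3facs)
subset(pe, op == "~~" & lhs == rhs & est < 0)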
Model Comparison
The last thing we will learn in this episode is how to leverage the fit statistics and other model information to compare different models. For example, remember that the EFA showed that the first factor had an extremely large eigenvalue, indicating that one factor alone already explained a substantial amount of variance.
A skeptic of the DASS might argue that it does not measure three distinct scales after all, but rather one unifying “I’m doing bad” construct. This skeptic might therefore propose a competing model of the DASS with only one factor on which all items load.
Let’s first examine the fit statistics of this model and then compare the two approaches.
R
model_cfa_1fac <- c(
"
# Model Syntax G Factor Model DASS
# Define only one general factor
g_dass =~ Q3A + Q5A + Q10A + Q13A + Q16A + Q17A + Q21A + Q24A + Q26A +
  Q31A + Q34A + Q37A + Q38A + Q42A + Q2A + Q4A + Q7A + Q9A + Q15A +
  Q19A + Q20A + Q23A + Q25A + Q28A + Q30A + Q36A + Q40A + Q41A + Q1A +
  Q6A + Q8A + Q11A + Q12A + Q14A + Q18A + Q22A + Q27A + Q29A + Q32A +
  Q33A + Q35A + Q39A
"
)
fit_cfa_1fac <- cfa(model_cfa_1fac, data = fa_data)
summary(fit_cfa_1fac, fit.measures = TRUE, standardize = TRUE)
OUTPUT
lavaan 0.6-19 ended normally after 25 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 84
Number of observations 37242
Model Test User Model:
Test statistic 235304.681
Degrees of freedom 819
P-value (Chi-square) 0.000
Model Test Baseline Model:
Test statistic 1037764.006
Degrees of freedom 861
P-value 0.000
User Model versus Baseline Model:
Comparative Fit Index (CFI) 0.774
Tucker-Lewis Index (TLI) 0.762
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -1921157.820
Loglikelihood unrestricted model (H1) -1803505.480
Akaike (AIC) 3842483.641
Bayesian (BIC) 3843199.757
Sample-size adjusted Bayesian (SABIC) 3842932.805
Root Mean Square Error of Approximation:
RMSEA 0.088
90 Percent confidence interval - lower 0.087
90 Percent confidence interval - upper 0.088
P-value H_0: RMSEA <= 0.050 0.000
P-value H_0: RMSEA >= 0.080 1.000
Standardized Root Mean Square Residual:
SRMR 0.067
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
g_dass =~
Q3A 1.000 0.761 0.731
Q5A 1.016 0.007 142.019 0.000 0.773 0.722
Q10A 1.100 0.008 144.449 0.000 0.837 0.733
Q13A 1.113 0.007 156.040 0.000 0.846 0.788
Q16A 1.060 0.007 142.723 0.000 0.806 0.725
Q17A 1.147 0.008 148.618 0.000 0.872 0.753
Q21A 1.168 0.008 150.256 0.000 0.889 0.760
Q24A 0.991 0.007 140.991 0.000 0.754 0.717
Q26A 1.057 0.007 148.520 0.000 0.804 0.752
Q31A 0.962 0.007 137.412 0.000 0.732 0.700
Q34A 1.156 0.008 150.634 0.000 0.879 0.762
Q37A 1.063 0.008 139.320 0.000 0.809 0.709
Q38A 1.166 0.008 147.074 0.000 0.887 0.745
Q42A 0.865 0.007 124.363 0.000 0.658 0.637
Q2A 0.663 0.008 87.349 0.000 0.504 0.453
Q4A 0.806 0.007 114.105 0.000 0.613 0.587
Q7A 0.806 0.007 114.965 0.000 0.613 0.591
Q9A 0.901 0.007 125.202 0.000 0.686 0.641
Q15A 0.748 0.007 111.994 0.000 0.569 0.576
Q19A 0.675 0.007 92.306 0.000 0.513 0.478
Q20A 1.008 0.007 134.599 0.000 0.767 0.686
Q23A 0.585 0.006 99.515 0.000 0.445 0.514
Q25A 0.787 0.007 107.302 0.000 0.599 0.553
Q28A 0.968 0.007 134.071 0.000 0.737 0.684
Q30A 0.907 0.007 124.927 0.000 0.690 0.639
Q36A 1.034 0.007 139.134 0.000 0.787 0.708
Q40A 0.935 0.007 125.352 0.000 0.711 0.641
Q41A 0.773 0.007 108.798 0.000 0.588 0.560
Q1A 0.956 0.007 138.241 0.000 0.727 0.704
Q6A 0.869 0.007 122.694 0.000 0.661 0.629
Q8A 0.961 0.007 136.064 0.000 0.731 0.693
Q11A 0.987 0.007 140.627 0.000 0.751 0.715
Q12A 0.943 0.007 131.722 0.000 0.718 0.672
Q14A 0.724 0.007 98.622 0.000 0.551 0.510
Q18A 0.784 0.007 108.318 0.000 0.596 0.558
Q22A 0.929 0.007 134.440 0.000 0.707 0.685
Q27A 0.950 0.007 134.779 0.000 0.723 0.687
Q29A 0.972 0.007 137.073 0.000 0.740 0.698
Q32A 0.826 0.007 119.966 0.000 0.628 0.615
Q33A 0.954 0.007 135.255 0.000 0.725 0.689
Q35A 0.812 0.007 120.437 0.000 0.618 0.618
Q39A 0.916 0.007 133.922 0.000 0.697 0.683
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.Q3A 0.505 0.004 132.069 0.000 0.505 0.466
.Q5A 0.550 0.004 132.300 0.000 0.550 0.479
.Q10A 0.602 0.005 132.010 0.000 0.602 0.463
.Q13A 0.439 0.003 130.205 0.000 0.439 0.380
.Q16A 0.587 0.004 132.219 0.000 0.587 0.475
.Q17A 0.582 0.004 131.451 0.000 0.582 0.433
.Q21A 0.576 0.004 131.207 0.000 0.576 0.422
.Q24A 0.539 0.004 132.416 0.000 0.539 0.486
.Q26A 0.496 0.004 131.465 0.000 0.496 0.434
.Q31A 0.559 0.004 132.788 0.000 0.559 0.511
.Q34A 0.557 0.004 131.148 0.000 0.557 0.419
.Q37A 0.648 0.005 132.595 0.000 0.648 0.498
.Q38A 0.629 0.005 131.668 0.000 0.629 0.444
.Q42A 0.635 0.005 133.849 0.000 0.635 0.595
.Q2A 0.985 0.007 135.470 0.000 0.985 0.795
.Q4A 0.717 0.005 134.451 0.000 0.717 0.656
.Q7A 0.702 0.005 134.407 0.000 0.702 0.651
.Q9A 0.675 0.005 133.792 0.000 0.675 0.589
.Q15A 0.652 0.005 134.557 0.000 0.652 0.668
.Q19A 0.889 0.007 135.325 0.000 0.889 0.771
.Q20A 0.662 0.005 133.053 0.000 0.662 0.529
.Q23A 0.551 0.004 135.083 0.000 0.551 0.736
.Q25A 0.814 0.006 134.773 0.000 0.814 0.694
.Q28A 0.619 0.005 133.100 0.000 0.619 0.533
.Q30A 0.689 0.005 133.811 0.000 0.689 0.591
.Q36A 0.616 0.005 132.614 0.000 0.616 0.499
.Q40A 0.724 0.005 133.782 0.000 0.724 0.588
.Q41A 0.755 0.006 134.707 0.000 0.755 0.686
.Q1A 0.539 0.004 132.706 0.000 0.539 0.505
.Q6A 0.669 0.005 133.958 0.000 0.669 0.605
.Q8A 0.577 0.004 132.918 0.000 0.577 0.520
.Q11A 0.539 0.004 132.455 0.000 0.539 0.489
.Q12A 0.624 0.005 133.301 0.000 0.624 0.548
.Q14A 0.864 0.006 135.115 0.000 0.864 0.740
.Q18A 0.786 0.006 134.728 0.000 0.786 0.689
.Q22A 0.564 0.004 133.067 0.000 0.564 0.530
.Q27A 0.584 0.004 133.037 0.000 0.584 0.528
.Q29A 0.576 0.004 132.822 0.000 0.576 0.513
.Q32A 0.648 0.005 134.127 0.000 0.648 0.621
.Q33A 0.581 0.004 132.993 0.000 0.581 0.525
.Q35A 0.620 0.005 134.099 0.000 0.620 0.619
.Q39A 0.556 0.004 133.113 0.000 0.556 0.534
g_dass 0.579 0.007 81.612 0.000 1.000 1.000
The one-factor model showed a poor fit to the data \(\chi^2(819) = 235304.68, p < 0.001, CFI = 0.77, RMSEA = 0.09\), 90% CI = \([0.09; 0.09]\).
This already shows that the idea of only one unifying factor in the DASS does not reflect the data well. We can formally test the difference in the \(\chi^2\) statistic using anova(), not to be confused with the ANOVA.
R
anova(fit_cfa_1fac, fit_cfa_3facs)
OUTPUT
Chi-Squared Difference Test
Df AIC BIC Chisq Chisq diff RMSEA Df diff Pr(>Chisq)
fit_cfa_3facs 816 3714494 3715236 107309
fit_cfa_1fac 819 3842484 3843200 235305 127996 1.0703 3 < 2.2e-16
fit_cfa_3facs
fit_cfa_1fac ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This output shows that the three-factor model has better information criteria (AIC and BIC) and a significantly lower (better) \(\chi^2\) statistic. Coupled with the improved RMSEA of the three-factor model, this evidence strongly suggests that the three-factor model explains the DASS data better than the one-factor model.
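To put the two models side by side, you can also tabulate the key fit measures directly; a minimal sketch:
R
# Compare fit measures of both models in one table
sapply(
  list(one_factor = fit_cfa_1fac, three_factors = fit_cfa_3facs),
  fitMeasures,
  fit.measures = c("cfi", "rmsea", "aic", "bic")
)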
Challenge 1
Read in the DASS data. Do we have a hypothesis as to what the factor structure should look like?
Challenge 2
Run a CFA with three correlated factors.
Challenge 3
Interpret the results: fit measures, loadings, and factor correlations.
Challenge 4
Run a CFA with a single general factor and interpret the results.
Challenge 5
Compare the models from Challenge 2 and Challenge 4: which one fits better?
Key Points
- Confirmatory factor analysis tests a pre-specified hypothesis about how many factors there are and which items load on them.
- In lavaan, factors are defined with the =~ operator and factor covariances with ~~; models are fitted with cfa() and inspected with summary().
- Report the \(\chi^2\) test, the CFI, and the RMSEA when presenting a CFA, and interpret the CFI and RMSEA to judge model fit.
- Competing models can be compared with anova() and with information criteria such as the AIC and BIC.