Explore | TED – Truth Effect Database (TED)

An overview of TED

The following overview and analysis are a living version of the analysis conducted in TED’s introductory paper. They are updated as new data are added and may therefore deviate from the published results. On this page, you can find an overview of the included data, a brief meta-analysis of the truth effect within TED, and additional models estimating variability in the truth effect at the subject, statement, or experiment level.

In the current version of TED, we included 56 studies from 27 publications, spanning 12002 participants contributing 778741 trials. A complete list of the included publications can be found in the Table “overview of studies included in TED”.

Sample sizes ranged from 29 to 949 participants. On average, studies included 218.04 participants (\(\mu_{age} =\) 33.17, \(\sigma_{age} =\) 7.22). An overview of rating scale usage for truth judgments and the use of a filler task across all included studies can be found in the figure below.

Overview of Study-related variables in TED

On average, studies employed 62.90 (\(SD =\) 39.97) statements per participant in the judgment session, and in 88.64 % of procedure settings exactly 50 % of statements were repeated. Of 88 judgment phases, 75.00 % were conducted on the same day as the exposure phase. When both phases were conducted on the same day, the average delay between exposure and judgment phase was 3.77 minutes. When the judgment phase was conducted at least one day after the exposure phase, the average delay was 7.45 days. An overview of additional procedure-related variables of the included studies can be found in the figure below.

Overview of Procedure-related variables in TED

Detailed information on the statements presented is available for 53 out of 56 studies. Data on the accuracy of a statement is available for 359113 (52.41 %) of trials, the exact statement text is available for 306387 (44.72 %) of trials, and response times are available for 111077 (16.21 %) of trials.

Overview of studies included in TED
publication_id	study_id	procedure_id	n_participants	student_sample	truth_rating_steps	repetition_time	n_statements
1	1	1	186	NA	2	5	56
2	2	2	138	NA	2	0	36
2	2	3	138	NA	2	10080	36
3	3	4	103	0	6	0	120
3	4	5	99	1	2	0	200
3	5	6	68	1	6	0	200
3	6	7	89	1	6	2880	200
4	7	8	380	0	2	0	40
5	8	9	283	0	6	0	40
5	9	10	271	1	6	0	40
5	10	11	200	0	6	0	40
5	11	12	299	0	6	0	40
5	12	13	291	0	6	0	40
6	13	14	113	1	6	1	40
6	13	15	113	1	6	1	40
6	13	16	113	1	6	1	40
6	13	17	113	1	6	1	40
6	13	18	113	1	6	1	40
6	14	19	430	0	6	1	40
6	14	20	430	0	6	1	40
6	14	21	430	0	6	1	40
6	14	22	430	0	6	1	40
6	14	23	430	0	6	1	40
7	15	24	371	0	11	10080	12
7	16	25	939	0	11	1	12
7	16	26	939	0	11	10080	12
7	17	27	408	0	11	10080	12
8	18	28	503	0	2	0	80
9	19	29	82	1	6	4	120
9	20	30	68	1	6	4	120
10	21	31	507	0	7	0	32
10	21	32	507	0	7	1440	32
10	21	33	507	0	7	10080	32
10	21	34	507	0	7	43200	32
11	22	35	220	1	6	5760	72
11	22	36	220	1	6	5760	72
11	23	37	282	0	6	20	72
11	23	38	282	0	6	20	72
11	23	39	282	0	6	20	72
11	24	40	405	0	6	20	72
11	24	41	405	0	6	20	72
12	25	42	240	0	101	0	16
13	26	44	60	0	6	5	80
14	27	45	526	0	6	0	105
15	28	47	54	1	6	10	88
16	29	48	139	0	5	10	20
17	30	49	267	0	5	10	20
18	32	53	66	1	6	10	88
18	32	54	66	1	6	10	88
19	33	55	65	1	6	10080	88
19	33	56	65	1	6	10080	88
19	33	57	65	1	6	10080	88
19	33	58	65	1	6	10080	88
19	34	59	202	0	6	0	80
19	34	60	202	0	6	0	80
20	35	61	73	0	6	3	56
20	36	62	79	1	2	0	56
20	36	63	79	1	2	20160	56
21	37	64	91	1	6	2	60
21	38	65	64	1	6	0	60
21	39	66	80	1	6	0	54
22	40	69	70	1	2	5	80
22	40	70	70	1	2	5	80
22	41	73	149	1	2	5	120
22	41	74	149	1	2	5	120
22	42	75	98	1	2	0	32
22	42	76	98	1	2	0	32
22	42	77	98	1	2	0	32
22	42	78	98	1	2	0	32
23	43	79	64	1	6	10080	84
23	43	80	64	1	6	10080	84
23	44	81	64	1	6	10080	84
23	45	82	65	1	2	10080	80
24	46	83	89	1	6	5760	72
25	47	84	409	0	6	2	28
25	48	85	949	0	4	1	24
25	48	86	949	0	4	1	24
25	49	87	940	0	4	2	16
25	49	88	940	0	4	2	16
25	49	89	940	0	4	10080	24
25	49	90	940	0	4	10080	24
26	50	91	29	1	2	9	120
26	51	92	41	1	101	9	120
26	52	93	42	1	101	9	120
26	53	94	37	1	101	9	80
27	54	95	132	0	6	0	56
27	55	96	102	0	6	0	48
27	56	97	104	1	6	1	56

Meta-Analysis

The following provides an illustrative meta-analysis of effect sizes derived from the Truth Effect Database (TED). It is based on trial-level data and demonstrates how a meta-analysis could be conducted. This example is not a definitive guide, and TED is neither a comprehensive nor a random sample of all studies, since it only includes studies with openly available trial-level data.

Here, we included only studies with a heterogeneous presentation criterion (“between-items criterion”; Dechêne et al., 2010). Effect sizes were calculated as Hedges’ g, derived as follows:

For each subject within a study, the mean response to repeated statements and the mean response to new statements were calculated.
Hedges’ g was computed per study using the effsize::cohen.d() function with paired = TRUE and hedges.correction = TRUE.
The variances of the effect sizes were extracted to serve as input for the meta-analysis.
The meta-analysis accounts for multiple entries per publication (as some publications contribute multiple studies).
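
To make the first three steps concrete, the following base R sketch computes a bias-corrected standardized mean difference for paired data by hand. It is only an illustration: the function name `paired_hedges_g` and the toy ratings are hypothetical, the effect is standardized by the standard deviation of the paired differences, and the variance uses a common large-sample approximation. effsize::cohen.d() may use a different standardizer, so its values need not match this sketch.

```r
# Hypothetical per-subject mean truth ratings (not TED data)
repeated_mean <- c(0.60, 0.70, 0.50, 0.80, 0.65)
new_mean      <- c(0.50, 0.60, 0.50, 0.60, 0.60)

# Bias-corrected standardized mean difference for paired observations.
# Assumption: standardization by the SD of the paired differences.
paired_hedges_g <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) > 2)
  n     <- length(x)
  diffs <- x - y
  d_z   <- mean(diffs) / sd(diffs)          # standardized mean of the differences
  J     <- 1 - 3 / (4 * (n - 1) - 1)        # approximate small-sample correction
  g     <- J * d_z
  var_g <- J^2 * (1 / n + d_z^2 / (2 * n))  # large-sample approximation
  list(estimate = g, var = var_g)
}

res <- paired_hedges_g(repeated_mean, new_mean)
res$estimate
res$var
```

The `estimate` and `var` elements play the same role as the values extracted from the effsize object in the workflow: one effect size and one sampling variance per study.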

Small deviations from the effect sizes originally reported in the individual studies may exist. However, we applied no additional exclusion criteria and, during encoding, tried to exclude all subjects who were excluded in the original studies.

First, we access the database and retrieve trial-level data:

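library(acdcquery)
library(tidyverse)  # dplyr, tidyr, and purrr are used below

# Replace with your local path
conn <- connect_to_db("path/to/ted.db")

# `arguments` holds the filter conditions for query_db().
# Assumption: a single condition that matches every study.
arguments <- list() %>% 
  add_argument(
    conn = conn,
    variable = "study_id",
    operator = "greater",
    values = 0
  )

analysis_data <- query_db(
    conn,
    arguments,
    target_vars = c("default", "study_id", "publication_id", "authors", "conducted"),
    target_table = "observation_table"
  ) %>% 
  filter(phase == "test") %>% 
  filter(!is.na(repeated), !is.na(response)) 

# Here we only use data where the test phase has both 
# repeated and new statements
has_complete_data <- analysis_data %>% 
  count(procedure_id, subject, repeated) %>%  # one row per repetition level
  count(procedure_id, subject) %>%            # n = number of repetition levels
  mutate(
    has_complete_data = ifelse(n == 2, 1, 0)
  )

analysis_data <- analysis_data %>% 
  left_join(
    has_complete_data, by = c("procedure_id", "subject")
  ) %>% 
  filter(has_complete_data == 1)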
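
The completeness filter keeps only subjects who judged both repeated and new statements within a procedure. The following toy example illustrates the same idea on made-up data. It uses base R instead of dplyr so that it runs without additional packages, and all object names (`toy`, `n_levels`, `toy_complete`) are hypothetical.

```r
# Made-up trials: subject "a" saw new and repeated statements,
# subject "b" saw only new statements
toy <- data.frame(
  procedure_id = c(1, 1, 1, 1, 1, 1),
  subject      = c("a", "a", "a", "b", "b", "b"),
  repeated     = c(0, 1, 1, 0, 0, 0)
)

# Number of distinct repetition levels per procedure and subject
n_levels <- aggregate(
  repeated ~ procedure_id + subject,
  data = toy,
  FUN = function(x) length(unique(x))
)
names(n_levels)[3] <- "n_levels"

# Keep only procedure-subject combinations with both levels
complete     <- n_levels[n_levels$n_levels == 2, c("procedure_id", "subject")]
toy_complete <- merge(toy, complete, by = c("procedure_id", "subject"))
toy_complete
```

Only the trials of subject "a" remain, because subject "b" has no repeated statements and therefore no within-subject contrast.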

Then we compute effect sizes per study using cohen.d.

eff_data <- analysis_data %>% 
  left_join(publications_overview) %>% 
  group_by(publication_id, authors, conducted, study_id, repeated, subject) %>% 
  summarize(
    mean_resp = mean(response, na.rm = TRUE)
  ) %>% 
  mutate(repeated = factor(
    ifelse(repeated > 0, "yes", "no"),
    levels = c("yes", "no"))
    ) %>% 
  pivot_wider(names_from = repeated, values_from = mean_resp) %>% 
  group_by(publication_id, authors, conducted, study_id) %>% 
  nest() %>% 
  mutate(effsize = map(
    data, 
    ~effsize::cohen.d(
      .$yes, 
      .$no, 
      data = ., 
      hedges.correction = TRUE,
      paired = TRUE
      )
    )
  ) %>% 
  mutate(
    estimate = map_dbl(effsize, ~{.$estimate}),
    var = map_dbl(effsize, ~{.$var})
  )

Joining with `by = join_by(publication_id)`
`summarise()` has grouped output by 'publication_id', 'authors', 'conducted',
'study_id', 'repeated'. You can override using the `.groups` argument.

Multi-Level Meta-Analysis

To account for non-independence of effect sizes within publications contributing multiple studies, we fitted a three-level meta-analytic model using rma.mv():

Level 1: Sampling variance of individual effect sizes

Level 2: Heterogeneity between studies within the same publication

Level 3: Heterogeneity between publications
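
In equation form, a three-level model of this kind is commonly written as follows. The notation is introduced here for illustration: \(g_{ij}\) is the effect size of study \(i\) in publication \(j\), \(v_{ij}\) is its (known) sampling variance, and \(\sigma^2_1\) and \(\sigma^2_2\) correspond to the two variance components that rma.mv() reports for publication_id and publication_id/study_id.

\[g_{ij} = \mu + u_j + w_{ij} + \varepsilon_{ij}\]
\[u_j \sim Normal(0, \sigma^2_1), \quad w_{ij} \sim Normal(0, \sigma^2_2), \quad \varepsilon_{ij} \sim Normal(0, v_{ij})\]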

eff_data <- eff_data %>% 
  mutate(label = paste0(authors, " (", conducted, ")"))

library(metafor)

res_mv <- rma.mv(yi = estimate,
                 V = var,
                 random = ~ 1 | publication_id/study_id,  # studies nested within publications
                 slab = label,
                 data = eff_data,
                 method = "REML")
summary(res_mv)


Multivariate Meta-Analysis Model (k = 49; method: REML)

  logLik  Deviance       AIC       BIC      AICc   
-17.2796   34.5593   40.5593   46.1729   41.1047   

Variance Components:

            estim    sqrt  nlvls  fixed                   factor 
sigma^2.1  0.0847  0.2911     26     no           publication_id 
sigma^2.2  0.0498  0.2231     49     no  publication_id/study_id 

Test for Heterogeneity:
Q(df = 48) = 1190.4241, p-val < .0001

Model Results:

estimate      se     zval    pval   ci.lb   ci.ub      
  0.7022  0.0702  10.0035  <.0001  0.5647  0.8398  *** 

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The results reveal a large effect of repetition (Hedges’ \(g =\) 0.70; 95 % CI = [0.56, 0.84]). This is considerably larger than the effect size of around \(d = 0.49\) that Dechêne et al. (2010) reported for the between-items criterion. Note, however, that this analysis is based only on publications with openly available data, that is, on a smaller and potentially biased sample of k = 49 effect sizes from 26 publications.
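
The confidence interval and test statistic follow directly from the pooled estimate and its standard error in the model output. A short worked check, assuming a normal-based (Wald) interval and using the rounded values printed above:

```r
estimate <- 0.7022  # pooled estimate from the model output
se       <- 0.0702  # its standard error

z_crit <- qnorm(0.975)                        # about 1.96 for a 95 % interval
ci     <- estimate + c(-1, 1) * z_crit * se   # lower and upper bound
zval   <- estimate / se                       # Wald z statistic

round(ci, 2)    # 0.56 0.84
round(zval, 1)  # 10
```

Small differences in the fourth decimal place relative to the printed output arise because the inputs here are already rounded.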

Variance Decomposition

We can calculate variance proportions and I² values to quantify the contributions of sampling error, within-publication heterogeneity, and between-publication heterogeneity. This reveals substantial variance both within a publication and between publications, supporting the use of this multi-level approach.

library(dmetar)  # provides var.comp()

i2 <- var.comp(res_mv)
i2$plot

Warning in is.na(x): is.na() applied to non-(list or vector) of type 'language'
Warning in is.na(x): is.na() applied to non-(list or vector) of type 'language'
Warning in is.na(x): is.na() applied to non-(list or vector) of type 'language'
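
The variance proportions can also be approximated by hand. The sketch below uses the two variance components from the model output together with a "typical" sampling variance computed from the inverse-variance weights. The sampling variances in `v` are made up for illustration (in the workflow they would be eff_data$var), and whether var.comp() uses exactly this formula is an assumption.

```r
sigma2_pub   <- 0.0847  # between-publication variance (sigma^2.1)
sigma2_study <- 0.0498  # within-publication variance (sigma^2.2)

# Hypothetical sampling variances of five effect sizes
v <- c(0.010, 0.020, 0.015, 0.008, 0.030)

# "Typical" sampling variance based on inverse-variance weights
typical_variance <- function(v) {
  w <- 1 / v
  k <- length(v)
  (k - 1) * sum(w) / (sum(w)^2 - sum(w^2))
}

v_typ <- typical_variance(v)
total <- v_typ + sigma2_study + sigma2_pub

# Percentage of total variance attributed to each level
i2 <- 100 * c(sampling = v_typ,
              within_publication = sigma2_study,
              between_publication = sigma2_pub) / total
round(i2, 1)
```

The three percentages sum to 100, so each one can be read as the share of total variance located at that level.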

Forest Plot

Finally, a forest plot shows the individual study effect sizes, their confidence intervals, and the overall estimate from the multi-level meta-analysis model.

# forest plot for multi-level model
forest(res_mv, 
       slab = paste0(eff_data$authors, " (", eff_data$conducted, ") | Study: ", eff_data$study_id),
       xlab = "Effect size (Hedges' g)",
       refline = 0,
       cex = 0.5
      )

Summary

This workflow demonstrates how the Truth Effect Database (TED) can be used to:

Compute study-level effect sizes from trial-level data
Fit multi-level meta-analytic models to account for clustering
Explore variance components and heterogeneity
Visualize results using a forest plot

It is intended as a tutorial example and not a definitive meta-analysis.

Hierarchical Bayesian Model

To illustrate the benefits of our large collection of trial-level data, we fitted Bayesian multilevel models predicting truth judgments, with repetition as a fixed effect and random intercepts and slopes at the subject, statement, and procedure levels.

We grouped data at the level of the procedure_table, as this table contains detailed information about each experimental setup (e.g., proportion of repeated items, presence of warnings, number of sessions) beyond what is available in the broader study_table. Each entry in the study_table corresponds to at least one entry in the procedure_table, but a single study may include several procedures that differ in these settings. For example, the same study may have multiple judgment sessions, modify the percentage of repeated stimuli, or warn some participants about the truth effect. These different procedures will then also receive different procedure identifiers, but the same study identifier.

Thus, the procedure identifier (procedure_id) uniquely captures both the study context and its specific experimental conditions. This modeling approach allows us to estimate the variance in the truth effect at three levels simultaneously: (1) variance due to common experimental manipulations and study settings (procedure level), (2) variance due to individual statements (statement level), and (3) variance due to individual differences (subject level).
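
A model with this structure can be sketched in equation form as follows. The notation is introduced here for illustration and is not taken from the original analysis: \(\eta_{ijk}\) is the linear predictor for a trial of subject \(i\) on statement \(j\) in procedure \(k\) (the expected response in a linear model, or the log-odds of a “true” judgment in a logistic model), and \(x_{ijk}\) is the repetition status. Details such as correlations between random intercepts and slopes are not specified in the text and are left open.

\[\eta_{ijk} = (\beta_0 + u_{0i} + v_{0j} + w_{0k}) + (\beta_1 + u_{1i} + v_{1j} + w_{1k}) \, x_{ijk}\]

Here \(\beta_1\) is the fixed effect of repetition, and the standard deviations of \(u_{1i}\), \(v_{1j}\), and \(w_{1k}\) quantify how much the truth effect varies across subjects, statements, and procedures.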

We analyzed the dichotomous and Likert-type response formats separately due to differences in their scale characteristics. Dichotomous responses (e.g., true/false) require logistic models, whereas Likert-type responses (e.g., 1–5 ratings) allow for linear models. All responses were maximum-normalized to the range 0–1, with 1 representing the maximum possible response indicating a “true” judgment. Repetition status was centered to aid model estimation: a new statement was coded -0.5 and a repeated statement 0.5.

We ran all models using 4 chains with 3000 iterations per chain, 1000 of which were discarded as warmup samples, yielding a total of 8000 posterior samples. There were no divergent transitions, no \(\hat{R} > 1.05\), and visual inspection confirmed that the chains mixed well. We used weakly informative priors for the intercept, the fixed effect, and the standard deviations in all models.
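
The preprocessing can be sketched in a few lines of base R. The helper names `rescale_response` and `code_repetition` are hypothetical, and the sketch assumes that normalization means rescaling from the scale's minimum and maximum to 0 and 1; the exact transformation used for TED is not spelled out here.

```r
# Rescale a rating to the range 0-1
# (assumption: min-max rescaling based on the scale endpoints)
rescale_response <- function(x, scale_min, scale_max) {
  (x - scale_min) / (scale_max - scale_min)
}

# Code repetition status as -0.5 (new) and 0.5 (repeated)
code_repetition <- function(repeated) {
  ifelse(repeated > 0, 0.5, -0.5)
}

ratings  <- c(1, 3.5, 6)  # hypothetical ratings on a 6-point scale
repeated <- c(0, 1, 1)    # 0 = new, 1 = repeated

rescale_response(ratings, scale_min = 1, scale_max = 6)
code_repetition(repeated)
```

With this coding, the intercept describes the average judgment across new and repeated statements, and the repetition coefficient is the difference between the two.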

\[\text{Intercept} \sim \text{Normal}(0.5, 0.5)\]
\[b \sim \text{Normal}(0, 1)\]
\[\sigma \sim \text{Gamma}(1, 4)\]
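
To get a feel for what these priors imply, one can draw from them directly. The sketch below assumes that Gamma(1, 4) is parameterized by shape and rate and that the second argument of Normal is a standard deviation; neither is stated explicitly here, so treat both as assumptions.

```r
set.seed(123)
n_draws <- 10000

# Draws from the three priors
intercept <- rnorm(n_draws, mean = 0.5, sd = 0.5)
b         <- rnorm(n_draws, mean = 0, sd = 1)
sigma     <- rgamma(n_draws, shape = 1, rate = 4)

# Analytic summaries of the prior on the standard deviations
prior_mean_sigma <- 1 / 4                              # shape / rate
prior_q95_sigma  <- qgamma(0.95, shape = 1, rate = 4)  # 95 % prior quantile

summary(sigma)
c(mean = prior_mean_sigma, q95 = prior_q95_sigma)
```

Under these assumptions, the prior places most of its mass for the standard deviations below 1, which is a reasonable range for responses that lie between 0 and 1.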

Dichotomous Truth Judgments

The analysis was based on 112399 trials nested within 1576 subjects, 997 statements, and 14 procedures.

The table below provides a summary of parameter estimates. As expected, the model indicated a positive fixed effect of repetition, with a credible interval that excluded an odds ratio of 1 (\(OR =\) 1.79, \(95\% \ CrI =\) [1.51, 2.12]). Notably, the standard deviation of the random slope of repetition was highest at the subject level (\(\sigma =\) 0.72, \(95\% \ CrI =\) [0.68, 0.77]), followed by the procedure level (\(\sigma =\) 0.28, \(95\% \ CrI =\) [0.18, 0.44]), and the statement level (\(\sigma =\) 0.13, \(95\% \ CrI =\) [0.03, 0.19]).

Variance in the truth effect at different levels
Effect	Grouping	Parameter	Estimate	l_95_CrI	u_95_CrI
fixed		Intercept	0.31	0.17	0.44
fixed		repeated	0.58	0.41	0.75
random	procedure	Intercept (sd)	0.20	0.11	0.32
random	procedure	repeated (sd)	0.28	0.18	0.44
random	statement	Intercept (sd)	0.88	0.84	0.93
random	statement	repeated (sd)	0.13	0.03	0.19
random	subject	Intercept (sd)	0.69	0.66	0.72
random	subject	repeated (sd)	0.72	0.68	0.77

Note. N = 112399; N Procedure = 14; N Subjects = 1576; N Statements = 997; l_95_CrI refers to the lower boundary of the 95% credible interval, u_95_CrI refers to the upper boundary
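
The table reports the fixed effects on the log-odds scale, whereas the text reports an odds ratio. The two are linked by exponentiation, as this short check with the rounded table values shows. The predicted probabilities at the end additionally assume the -0.5/0.5 coding of repetition and an average subject, statement, and procedure.

```r
# Fixed effect of repetition on the log-odds scale (from the table)
log_odds <- c(estimate = 0.58, lower = 0.41, upper = 0.75)

odds_ratio <- exp(log_odds)
round(odds_ratio, 2)  # estimate 1.79, lower 1.51, upper 2.12

# Predicted probability of a "true" judgment for new and repeated statements
intercept <- 0.31
p_true <- plogis(intercept + c(new = -0.5, repeated = 0.5) * log_odds[["estimate"]])
round(p_true, 2)
```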

Scale Truth Judgments

The analysis was based on 572775 trials nested within 8309 subjects, 2872 statements, and 65 procedures.

The table below provides a summary of parameter estimates. As expected, the model indicated a positive fixed effect of repetition, with a credible interval that excluded zero (\(b =\) 0.08, \(95\% \ CrI =\) [0.07, 0.10]). Again, the standard deviation of the random slope of repetition was highest at the subject level (\(\sigma =\) 0.10, \(95\% \ CrI =\) [0.10, 0.10]), followed by the procedure level (\(\sigma =\) 0.07, \(95\% \ CrI =\) [0.05, 0.08]), and the statement level (\(\sigma =\) 0.03, \(95\% \ CrI =\) [0.02, 0.03]).

Variance in the truth effect at different levels
Effect	Grouping	Parameter	Estimate	l_95_CrI	u_95_CrI
fixed		Intercept	0.54	0.52	0.56
fixed		repeated	0.08	0.07	0.10
random	procedure	Intercept (sd)	0.07	0.06	0.09
random	procedure	repeated (sd)	0.07	0.05	0.08
random	statement	Intercept (sd)	0.11	0.11	0.12
random	statement	repeated (sd)	0.03	0.02	0.03
random	subject	Intercept (sd)	0.10	0.09	0.10
random	subject	repeated (sd)	0.10	0.10	0.10

Note. N = 587999; N Procedure = 66; N Subjects = 8397; N Statements = 2872; l_95_CrI refers to the lower boundary of the 95% credible interval, u_95_CrI refers to the upper boundary

To further explore how the temporal delay between the exposure and judgment phases affects inter-individual variability in the repetition effect, we included an interaction between subject and temporal delay (same day vs. different day) in the random-effects structure. The model then estimates two standard deviations for the random effect of repetition at the subject level, one for each delay condition. This allows us to examine whether the subject-level standard deviation of the repetition effect differs depending on the temporal delay.

The table below provides a summary of parameter estimates. The standard deviation of the random slope of repetition at the subject level for a same-day judgment phase was \(\sigma_0 =\) 0.11 (\(95\% \ CrI =\) [0.10, 0.11]). The standard deviation of the random slope for a later-day judgment phase was \(\sigma_1 =\) 0.08 (\(95\% \ CrI =\) [0.08, 0.09]). The difference between these two standard deviations deviated reliably from zero (\(\sigma_0 - \sigma_1 =\) 0.02, \(95\% \ CrI =\) [0.02, 0.02]).

Variance in the truth effect at different levels
Effect	Grouping	Parameter	Estimate	l_95_CrI	u_95_CrI
fixed		Intercept	0.54	0.52	0.56
fixed		repeated	0.08	0.07	0.10
random	procedure	Intercept (sd)	0.07	0.06	0.09
random	procedure	repeated (sd)	0.07	0.06	0.08
random	statement	Intercept (sd)	0.11	0.11	0.12
random	statement	repeated (sd)	0.03	0.02	0.03
random	subject (same day)	Intercept (sd)	0.10	0.10	0.10
random	subject (same day)	repeated (sd)	0.11	0.10	0.11
random	subject (later)	Intercept (sd)	0.08	0.08	0.08
random	subject (later)	repeated (sd)	0.08	0.08	0.09

Note. N = 587999; N Procedure = 66; N Subjects = 9592; N Statements = 2872; l_95_CrI refers to the lower boundary of the 95% credible interval, u_95_CrI refers to the upper boundary
