List of accepted submissions

 
 
  • Open access
  • 19 Reads

Collective Fluid and Crystallised Intelligence: Is there Evidence of a Double Disadvantage in Linguistically Diverse Groups?

Understanding how intelligence operates in groups is increasingly important as teamwork becomes foundational in education and work. This exploratory analysis pools secondary data from 190 unique groups (2–5 members) across four quasi-experiments investigating performance on multi-subtest group IQ batteries. These datasets included linguistic, cultural, visuospatial, reasoning, memory, and creative tasks. Across the pooled sample, exploratory and confirmatory factor analyses consistently yielded a robust two-factor structure corresponding to collective fluid intelligence (cFluid) and collective crystallised intelligence (cCrystal). This provides one of the first systematic tests of the Cattell fluid–crystal distinction at the group level, demonstrating that distributed cognitive systems may exhibit the same psychometric differentiation long established for individuals.

A second contribution emerges from examining group language composition and its interaction with the cFluid–cCrystal factors. Groups varied substantially in their proportion of members whose main language was English, enabling tests of how conversational and task-language alignment shape collective cognition. Across studies, the proportion of English-main-language members strongly predicted cCrystal performance, particularly on language and culturally loaded reasoning items. In contrast, language proportion showed only weak associations with cFluid, where tasks emphasised pattern recognition, spatial reasoning, and novel problem-solving. Moreover, English-main-language members engaged in conversation more readily than language-minority members, suggesting that language-minority members were less involved in the communication that supports collective problem-solving. These findings indicate a double disadvantage for language-minority members in group assessments: (1) a task-language disadvantage, where culturally or linguistically loaded items reduce accessibility; and (2) a group-discourse disadvantage, where reduced conversational participation constrains the group’s ability to pool distributed knowledge.

Implications are substantial for theories of intelligence, the measurement of collective cognition, and the design of multicultural teamwork and assessment environments. Results suggest that group intelligence is not merely a reflection of member ability, but an emergent property sensitive to linguistic alignment, communicative access, and task affordances.

  • Open access
  • 15 Reads
The Cross-Temporal Stability of Sex Differences in Piaget’s Water Level Tasks: A Meta-Analytical Multiverse

The Water Level Tasks (WLTs) were originally developed by Jean Piaget to examine children’s understanding of horizontality invariance and have since been widely used as a measure of spatial perception in adults. Piaget assumed that this task should be universally mastered by around the age of nine, thus reflecting the progression of concrete operational thought. However, subsequent research has shown that even some adults have substantial difficulties in correctly solving these tasks and has indicated better performance of men compared to women. This sex effect has often been replicated, although its generality remains unclear. To address this gap, we synthesized all available evidence of WLT sex differences from 1964 to 2025 (261 independent samples comprising more than 37,000 participants) by means of a random-effects meta-analysis. To examine the generality of the observed meta-analytical summary effect across potential moderators, we conducted subgroup analyses and precision-weighted meta-regressions. Moreover, we performed specification curve and combinatorial meta-analyses, covering all reasonable as well as all possible choices of which data to meta-analyze and how to meta-analyze them. Results indicated moderately sized sex differences in WLT performance favoring men (d = 0.52) that increased with age but diminished over time. Specification curve analyses suggested stability of these sex differences across different test formats, sample characteristics, and meta-analytic model specifications, thus indicating a remarkable generality of this effect. In all, we show that WLT-based sex differences favoring men are moderate and robust but cross-temporally declining. In contrast to Piaget’s assumption of a universal understanding of horizontality invariance by late childhood as measured by the WLT, women are outperformed by men in solving this seemingly simple task.

  • Open access
  • 53 Reads
The Decline Effect Permeates Not Only Intelligence Research, But Psychology as a Whole: Meta-Meta-Analytic Evidence From 648 Meta-Analyses

The term decline effect means that reported effect sizes tend to decrease in strength as evidence accumulates over time, suggesting that early published findings in scientific research are often inflated. Conceptually, such declines have been attributed to publication bias, low study power, and questionable research practices. However, systematic empirical evidence of declining effects in psychological science has been limited. In the present meta-meta-analysis, we examined whether these systematic declines occur within intelligence research, other psychological disciplines, and psychological science in general. Across 670 meta-analyses published in six highly visible journals in psychology (k = 62,542, N > 60 million), we found that in intelligence research, declines occurred about twice as often as increases and were substantially larger in size (average misestimations of initial vs. meta-analytical summary effects Δr = .18 vs. .08). Furthermore, initial studies associated with declining effects exhibited somewhat lower average power to detect the summary effect compared to those linked to increases (Mdn power = 52.31% vs. 59.76%). When examining psychology studies in general, virtually identical results were observed. Effect declines outnumbered increases nearly two to one and were considerably larger in strength than increases (Δr = .204 vs. .122). Moreover, original studies associated with declines showed lower power to detect the observed summary effects (M = 48.7%, Mdn = 39.4%), compared to studies linked to underestimations (M = 65.7%, Mdn = 82.7%). In all, our findings show that the decline effect is not limited to a single research domain but instead represents a pervasive challenge across psychological science, rooted in the inflation of early findings and inadequate study power.

  • Open access
  • 8 Reads
Does religiosity prevent cognitive declines? Cross-sectional and longitudinal examinations of religiosity and intelligence associations in elderly Europeans

Introduction: For almost a century, a considerable number of studies reporting associations between intelligence and religiosity have accumulated. However, reported effect strengths vary substantially between primary studies. This heterogeneity has partly been attributed to measurement modalities: associations with intelligence tend to be more pronounced for religious beliefs than for religious behaviors. Another factor potentially adding to the heterogeneity is participant age. Evidence suggests a protective effect of religiosity against cognitive declines at older ages, which would imply that the strength of the link between intelligence and religiosity decreases over the lifetime. However, these protective effects have only been observed in comparatively religious countries.
Methods: Here, we examine cross-sectional correlations of religiosity and cognitive functions as well as their cross-temporal changes across cohorts over a period of 10 years in European and Israeli participants aged 50+ years (N = 30,424) using the Survey of Health, Ageing, and Retirement in Europe.
Results: We observed meaningful negative associations of all measures of cognitive function and a composite score with religious beliefs (r = -.107). However, associations with religious behavior were negligible (r = .014). Effects appeared to generalize across age, thus contradicting the idea of age-related declines in effect strength.
Conclusions: We demonstrate that intelligence is negatively associated with religiosity in elderly European samples. These associations remained stable with increasing participant age and therefore do not support previous findings suggesting protective effects of religiosity against cognitive declines.

  • Open access
  • 29 Reads
Spatial ability across nations: Measurement invariance of the Three-Dimensional Cubes test in Filipino and Austrian undergraduates

Cross-national comparisons of cognitive test scores are a subject of debate because measurement instruments are predominantly developed and validated on Western participants. Establishing between-nations measurement invariance (MI) is necessary to meaningfully interpret differences in intelligence test scores between countries. IRT-based approaches are particularly suitable for testing MI, because they allow for item-level examination of measurement properties and assessment of between-group test unidimensionality. The Three-Dimensional Cubes Test (3DC) is a Rasch-calibrated measure of visual processing that was originally developed for use in Germanophone countries. Prior studies showed (partial) MI across Austrian, Singaporean, and US samples, supporting its suitability for cross-national comparisons. In this study, we compared the spatial task performance of undergraduate students from Austria and the Philippines (N = 300+ in each country). We used a stepwise approach to establish MI by examining Rasch-homogeneity within as well as across countries. Likelihood-ratio tests indicated model fit for all but one item, enabling a comparison of the mean person parameters after excluding the misfitting item. Our results indicate a very large mean difference between the groups (Cohen’s d = 2.22), with the Austrian sample scoring higher than the Filipino sample. The present data clearly show the necessity of establishing measurement invariance for a meaningful cross-national comparison of cognitive subdomains.

  • Open access
  • 21 Reads
Cross-temporal meta-analysis of Trail Making Test performance (1953-2024): A Flynn effect for executive functioning

Cross-temporal intelligence test score changes (the Flynn effect) exhibit domain- and country-specific change trajectories. Process overlap theory suggests that positive intelligence subtest intercorrelations (the positive manifold of intelligence) may be due to executive functioning playing an important role in performance on all intelligence subdomains. This suggests that changes in executive functioning may be (partly) responsible for observed test score changes in the vein of the Flynn effect. However, change trajectories of executive functioning have so far remained largely unexplored. In a preregistered cross-temporal meta-analysis, we investigated general population changes in Trail Making Test (TMT) performance, a well-established and widely used instrument for assessing executive functioning. We identified all available records reporting mean TMT performance in the literature and predicted TMT outcome measures by data collection year in precision-weighted linear and multiple regressions. Analyzing a large dataset based on 8,000+ studies (k = 16,000+; N = 1,000,000+) published between 1946 and 2025, we show global TMT score increases. Gains in patient samples were more pronounced than gains in healthy samples. Change trajectories generalized across different TMT outcome measures but differed by country of data collection. Our findings indicate that changes in executive functioning may be linked to the Flynn effect in standard domains of cognitive task performance.

  • Open access
  • 26 Reads
Domain-specific patterns of the Flynn effect: A CHC-based meta-analysis of more than a century of IQ changes (1909–2025)

Generational shifts in intelligence test performance, commonly referred to as the Flynn effect, have been the subject of intense empirical investigation. Whilst earlier studies had generally reported systematic gains for full-scale, fluid, and crystallized IQ, more recent findings increasingly indicate heterogeneous change patterns across distinct cognitive domains. Here, we present the first formal meta-analysis of the Flynn effect following the framework of the Cattell–Horn–Carroll (CHC) theory, currently the most widely accepted model of human intelligence. Our quantitative research synthesis integrates more than 3,000 test score changes (1909–2025), is based on 2,000,000+ individuals, and includes more than 30 CHC stratum I and II domains. Cross-temporal trends were differentiated in terms of sign and magnitude according to specific stratum I domains, with annual changes ranging from –0.11 to +0.30 IQ points. These shifts further varied across countries and were systematically related to national-level macro indicators, such as socioeconomic (in-)equality, education, and healthcare access. The direction and strength of these associations differed across CHC stratum I subdomains. We further show substantial influences of measurement non-invariance and disentangle genuine cognitive change from artifacts of evolving test constructs and psychometric models. In all, we provide the first systematic evidence that the Flynn effect is domain-specific, while limited measurement invariance restricts meaningful interpretation of cross-temporal (domain-specific) trends.

  • Open access
  • 14 Reads
Relations Between Sex, Cognitive Development and Executive Function in Early Childhood: Evidence from the Wechsler Scale and Head–Toes–Knees–Shoulders Task

Executive function skills, including cognitive flexibility, inhibitory control, and working memory, are foundational for children’s learning, self-regulation, and engagement in complex tasks. Understanding how these capacities relate to broader cognitive development can shed light on the mechanisms that support school readiness. Moreover, recent research on sex differences in young children’s executive function and cognitive development is lacking. This study examined the relationships between young children’s sex, cognitive development, and executive function.

We analyzed data from 136 children aged 3–6 years who participated in two experimental studies conducted as part of a broader research program. Cognitive development was assessed using the Wechsler Preschool and Primary Scale of Intelligence—Fourth Edition (Canadian), capturing verbal reasoning, visual–spatial problem-solving, fluid reasoning, working memory, processing speed, and overall cognitive functioning. Executive function was measured with the Head–Toes–Knees–Shoulders Task (HTKS), which requires children to inhibit automatic responses and apply rule-based behaviour, thereby assessing cognitive flexibility, working memory, and inhibitory control. Parents provided demographic information and details about the home learning environment.

Spearman correlations indicated that HTKS performance was strongly associated with most Wechsler subcomponents, including verbal (ρ = .33, p < .001), visual–spatial (ρ = .45, p < .001), fluid reasoning (ρ = .41, p < .001), and processing speed (ρ = .38, p < .001), but showed a weaker association with working memory (ρ = .29, p < .001). Sex differences were minimal, with only processing speed showing a modest effect (ρ = –.22, p = .025), slightly favouring girls. Overall cognitive functioning was strongly linked to all composite scores, particularly visual–spatial reasoning (ρ = .60, p < .001) and HTKS performance (ρ = .56, p < .001). These findings suggest that (1) executive function relates broadly to cognitive development but not strongly to working memory specifically, (2) visual–spatial abilities contribute substantially to general cognitive performance, and (3) sex differences in cognitive development are minimal. Thus, standardized measures of cognitive development and HTKS may capture distinct yet complementary aspects of cognitive processes.

  • Open access
  • 19 Reads
Where are we at with national IQ databases? 12 key issues and one live experiment

Since the publication of the first estimates of country-level intelligence by Richard Lynn in 1991, systematic databases of national IQs have been compiled and regularly updated (Lynn & Vanhanen, 2002; Lynn & Becker, 2019; Becker, 2023). These databases have made their way into public discourse, often in the form of "world maps of IQ"; for many non-specialists, this is the only point of entry into the scientific literature on intelligence. The robustness and validity of the national IQ estimates provided by the databases are therefore especially critical. We will provide an overview of 12 key aspects of the national IQ databases that merit closer scrutiny (predominant use of Wechsler scales and Raven's matrices; use of PISA-like assessments as a proxy for IQ; possible contribution of non-intellectual individual differences between countries; effects of contextual differences in data collection on country scores; methods for sample selection and inclusion; representativeness of samples for their respective countries as a whole; methods for extracting IQ estimates; causal interpretation of correlations involving national IQs; interpretation of differences between countries as genetically driven; interpretation of differences between countries as evolved racial differences; policy recommendations based on national IQs; political use of national IQ estimates). We will also collect the audience's opinions through a live survey available at any point during the talk, and we will discuss the results as a conclusion.

  • Open access
  • 37 Reads
Adaptive Performance and Cognitive Regulation in Immersive VR Vocational Training

Immersive virtual reality (VR) environments offer novel opportunities to examine cognitive regulation and adaptive performance during complex, ecologically valid tasks. In vocational training contexts, learners must manage instructional support, regulate cognitive effort, and translate guidance into independent action, processes closely related to applied human intelligence. The present study examines behavioral indicators of cognitive regulation in a VR-based coffee preparation task using hand-tracking interaction.

Participants completed the same vocational procedure across two trials: an initial trial with full instructional guidance and a subsequent trial with reduced guidance. Performance was assessed using detailed temporal metrics for task segments, total completion time, interaction errors (object drops), and subjective workload and confidence ratings. Participants varied in prior VR experience and real-world coffee-machine expertise, allowing examination of expert–novice differences across both interaction and task knowledge dimensions.

Analyses focus on changes in performance and perceived workload following the removal of instructional scaffolding. Segment-level time distributions and guidance-related performance costs are examined to identify distinct performance strategies and potential bottlenecks. This study investigates how prior task expertise and VR experience relate to the regulation of cognitive effort under reduced guidance, highlighting the potential of VR-based behavioral measures for studying adaptive performance and applied intelligence in learning environments.
