Gestational Age and Neurodevelopmental Outcomes in Preterm Children at Early Preschool Age: A Longitudinal Multidomain Logistic Modeling Study
Abstract
Purpose
Preterm birth remains a leading cause of long-term neurodevelopmental impairment, yet early evaluations frequently underestimate subsequent deficits. This study examined longitudinal neurodevelopmental trajectories across gestational age groups and identified predictors of developmental delay.
Methods
A retrospective cohort of 532 preterm children, stratified by gestational age, was followed from the neonatal period to early preschool age. Neurodevelopment was assessed using the Korean version of the Bayley Scales of Infant and Toddler Development, Third Edition at 8–12 months (n=481), 13–24 months (n=118), and 25–42 months (n=100). Longitudinal trajectories were analyzed using general linear models, and predictors of developmental delay were identified through multivariable logistic regression.
Results
During the first year, motor scores differed significantly across gestational age groups, with extremely preterm infants showing the lowest values. By the third to fourth years of life, cognitive and language scores diverged markedly, with extremely preterm children exhibiting the steepest decline and additional deficits in motor and adaptive behavior domains. Lower gestational age remained an independent predictor of both cognitive and language delay at early preschool age, while no independent predictors were identified for motor, social-emotional, or adaptive behavior outcomes.
Conclusion
Neurodevelopmental outcomes in preterm children follow dynamic, domain-specific trajectories influenced by gestational age and developmental timing. Motor delays are most evident in infancy, whereas cognitive and language impairments emerge by early preschool age. Gestational age remains a consistent predictor of later delay, emphasizing the need for longitudinal, gestational age–stratified monitoring and early, targeted intervention.
Introduction
The World Health Organization classifies preterm birth into four categories: late (34–<37 weeks), moderate (32–<34 weeks), very (28–<32 weeks), and extremely preterm (<28 weeks) [1]. Despite advances in neonatal care, long-term neurodevelopmental challenges persist, especially among extremely preterm children, as early assessments often underestimate later impairments [2,3]. Children born extremely preterm or with very low birth weight remain particularly vulnerable to cognitive, language, motor, and behavioral difficulties compared with term-born peers [2-5]. Preterm birth disrupts critical brain developmental processes, heightening susceptibility to injury and abnormal maturation [3,4]. Although the incidence of severe brain lesions, such as cystic periventricular leukomalacia (PVL) and high-grade intraventricular hemorrhage (IVH), has declined, diffuse non-cystic injuries remain common and continue to contribute substantially to long-term neurodevelopmental deficits through disrupted white matter maturation and altered cortical connectivity [3,4,6,7]. Furthermore, perinatal and neonatal morbidities are strongly associated with adverse neurodevelopmental outcomes, though it remains uncertain whether they serve as independent predictors or merely reflect underlying immaturity [6,8,9]. Preterm children exhibit age-dependent vulnerability in brain networks that underlie executive and information-processing functions, with connectivity disruptions often persisting into later life. Deficits in attention, memory, language, and adaptive skills may become apparent only at school age, limiting the predictive accuracy of early testing. Comprehensive multimodal assessment and early intervention can enhance risk stratification and improve neurodevelopmental outcomes [6,7,10].
The Bayley Scales of Infant and Toddler Development, Third Edition (Bayley-III), are widely used to evaluate multiple domains of early neurodevelopment in preterm children. Among these, cognitive and language outcomes are regarded as the most representative indicators of early preschool functioning [11-14]. However, predictive validity during infancy remains limited, particularly among extremely preterm children, emphasizing the need for longitudinal follow-up into the preschool period [15-18]. This study aimed to evaluate longitudinal neurodevelopmental outcomes in preterm children up to early preschool age by gestational age and to identify predictors of domain-specific delays using multivariable logistic modeling.
Materials and Methods
1. Study design and population
This retrospective cohort study included preterm children admitted to the Neonatal Intensive Care Regional Center at Soonchunhyang University Cheonan Hospital (Republic of Korea) between December 1, 2015, and December 31, 2023. Children with major congenital anomalies, chromosomal or genetic disorders, congenital hypothyroidism, hearing loss, blindness, severe traumatic or hypoxic-ischemic brain injury (including cystic PVL or high-grade IVH), or insufficient follow-up data were excluded. A total of 532 preterm newborns met the inclusion criteria during the neonatal period: 177 late preterm (34–<37 weeks), 138 moderate preterm (32–<34 weeks), 170 very preterm (28–<32 weeks), and 47 extremely preterm (<28 weeks).
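The gestational age bands used for stratification can be expressed as a simple lookup. The following is a minimal sketch, not the study's analysis code: the function name `classify_preterm` and the example gestational ages are hypothetical, whereas the cut-offs and group sizes are those stated above.

```python
def classify_preterm(ga_weeks: float) -> str:
    """Return the gestational age category for a gestational age in weeks."""
    if ga_weeks < 28:
        return "extremely preterm"
    if ga_weeks < 32:
        return "very preterm"
    if ga_weeks < 34:
        return "moderate preterm"
    if ga_weeks < 37:
        return "late preterm"
    return "term"  # outside the preterm cohort described in the text


# Group sizes as reported in the text.
reported_counts = {
    "late preterm": 177,
    "moderate preterm": 138,
    "very preterm": 170,
    "extremely preterm": 47,
}
total_included = sum(reported_counts.values())

# Hypothetical gestational ages, for illustration only.
example_ages = [36.4, 33.0, 31.9, 27.6]
example_groups = [classify_preterm(w) for w in example_ages]
```

Summing the reported group sizes reproduces the cohort total of 532, which is a quick consistency check on the stratification.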
2. Neurodevelopmental assessment
Neurodevelopmental outcomes were evaluated using the Korean version of the Bayley Scales of Infant and Toddler Development, Third Edition (K-Bayley-III), which has been validated and shown to possess strong reliability and construct validity in Korean populations [12]. Composite scores were obtained for five domains: cognitive, language, motor, social-emotional, and adaptive behavior. Assessments were administered at three time points: 8–12 months (first year, n=481) and 13–24 months (second year, n=118) of corrected age, and 25–42 months (third to fourth years, n=100) of calendar age. Developmental delay was defined as a composite score <85 on the K-Bayley-III, a threshold selected for its greater sensitivity in detecting moderate-to-severe impairment among preterm children, given the Bayley-III’s recognized tendency to underestimate developmental deficits compared with earlier editions [13,14].
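The two definitions in this subsection, corrected age and the <85 delay cut-off, can be illustrated with a short sketch. It assumes the conventional definition of corrected age (chronological age minus the weeks of prematurity relative to a 40-week term), which the text does not spell out. The function names and the example scores are hypothetical.

```python
DELAY_CUTOFF = 85  # composite score threshold stated in the text


def corrected_age_weeks(chronological_weeks: float, ga_weeks: float,
                        term_weeks: float = 40.0) -> float:
    """Chronological age minus weeks of prematurity (assumed 40-week term)."""
    weeks_premature = max(0.0, term_weeks - ga_weeks)
    return chronological_weeks - weeks_premature


def is_delayed(composite_score: float, cutoff: float = DELAY_CUTOFF) -> bool:
    """Delay is defined as a composite score strictly below the cut-off."""
    return composite_score < cutoff


def delayed_domains(scores: dict) -> list:
    """Return the domains whose composite score falls below the cut-off."""
    return [domain for domain, score in scores.items() if is_delayed(score)]


# Hypothetical composite scores for one child, for illustration only.
example_scores = {
    "cognitive": 90,
    "language": 83,
    "motor": 85,
    "social-emotional": 100,
    "adaptive behavior": 79,
}
example_delays = delayed_domains(example_scores)
```

Under these assumptions, an infant born at 28 weeks and assessed at 52 weeks after birth has a corrected age of 40 weeks. A score of exactly 85 is not classified as delayed.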
3. Statistical analysis
All analyses were performed using SPSS version 27.0 (IBM Corp., Armonk, NY, USA) and R software versions 4.3.0 and 4.5.1 (R Foundation for Statistical Computing, Vienna, Austria). Statistical significance was set at P<0.05. Group comparisons were conducted using the Kruskal–Wallis test for continuous variables and Fisher’s exact test for categorical variables. Post hoc pairwise comparisons among groups were carried out using the Wilcoxon rank-sum test, with Holm–Bonferroni correction applied for multiple testing. To evaluate longitudinal differences in neurodevelopmental scores, a subset of 83 children with available data from both the first and third–fourth years was analyzed using a general linear model (GLM) within an analysis of variance (ANOVA) framework to assess between-group effects of gestational age. Post hoc pairwise differences derived from the GLM were corrected using the Tukey method. Models for outcomes at 3–4 years were adjusted for baseline cognitive scores from the first year.
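With four gestational age groups there are six possible pairwise comparisons, which is why a multiplicity correction is needed. The sketch below shows the Holm–Bonferroni step-down adjustment in plain Python. It is an illustration of the general procedure rather than the authors' SPSS or R code, and the group abbreviations and P values are hypothetical.

```python
from itertools import combinations


def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


groups = ["LP", "MP", "VP", "EP"]
pairs = list(combinations(groups, 2))  # six pairwise comparisons

# Hypothetical unadjusted P values, one per pair, for illustration only.
raw_p = [0.40, 0.03, 0.004, 0.20, 0.01, 0.05]
adjusted_p = holm_adjust(raw_p)
significant_pairs = [pair for pair, p in zip(pairs, adjusted_p) if p < 0.05]
```

Each adjusted value is compared with the 0.05 threshold stated above. Holm's procedure controls the family-wise error rate while being uniformly less conservative than a plain Bonferroni correction.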
Multivariable logistic regression was conducted to assess the effects of gestational age and other perinatal or neonatal factors on developmental delay at early preschool age. Separate models were constructed for each developmental domain, using binary outcomes at 3–4 years (n=100) as dependent variables. Independent variables were selected based on statistical significance and clinical relevance and were retained if they differed across gestational age groups or were supported by prior evidence, provided that no multicollinearity was present (variance inflation factor <10). Variables with low odds ratios (ORs) or minimal contribution were excluded. Model discrimination was evaluated using the area under the receiver operating characteristic curve.
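Model discrimination as measured by the area under the ROC curve has a direct interpretation: it is the probability that a randomly chosen child with delay receives a higher predicted risk than a randomly chosen child without delay. The sketch below computes that quantity by pairwise comparison. The function name, labels, and predicted probabilities are hypothetical and are not taken from the study data.

```python
def roc_auc(labels, scores):
    """AUC as the share of (delayed, non-delayed) pairs ranked correctly.

    labels: 1 for delay, 0 for no delay.
    scores: predicted risk from any model; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one case in each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes and predicted probabilities, for illustration only.
example_labels = [0, 0, 1, 1]
example_scores = [0.10, 0.40, 0.35, 0.80]
example_auc = roc_auc(example_labels, example_scores)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise form is quadratic in sample size, which is acceptable for cohorts of the size analyzed here.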
4. Standard protocol approval
The study protocol was approved by the Institutional Review Board of Soonchunhyang University Cheonan Hospital (approval number 2024-11-031). The requirement for written informed consent was waived because of the retrospective nature of the study.
Results
Descriptive statistics stratified by gestational age are presented in Table 1. Maternal demographics were largely comparable across groups, except for significantly higher rates of premature rupture of membranes and lower rates of in vitro fertilization among extremely preterm births. Clinical complications increased with decreasing gestational age. Infants born before 28 weeks had lower 1-minute and 5-minute Apgar scores, along with markedly higher incidences of respiratory distress syndrome (RDS), low-grade IVH, non-cystic PVL, necrotizing enterocolitis (NEC), and cardiovascular shunts (including patent foramen ovale, atrial septal defect, and patent ductus arteriosus). Thyroid dysfunction requiring hormone replacement was also more frequent in this group.
Table 2 summarizes neurodevelopmental outcomes across gestational age groups from infancy to early preschool age. Holm–Bonferroni–adjusted pairwise comparisons showed that, in the first year, motor scores were significantly lower in the extremely preterm group than in the late, moderate, and very preterm groups (P<0.01), while adaptive behavior scores were lower than those of late and moderate preterm peers (P<0.05). Late preterm infants also demonstrated lower language scores compared with very preterm infants (P<0.05), whereas cognitive scores did not differ significantly across groups. By the second year, cognitive performance was lower in late preterm children than in moderate preterm peers (P<0.05). By 3–4 years, disparities became more pronounced: extremely preterm children exhibited markedly lower cognitive scores than all other groups (P<0.01–0.001) and additional deficits in language, motor, and adaptive behavior domains compared with both late and moderate preterm peers (P<0.05–0.01). Very preterm children also had lower cognitive and motor scores relative to late and moderate preterm peers (P<0.01).
Table 3 presents GLM results within an ANOVA framework, demonstrating longitudinal differences in neurodevelopmental scores across groups. In the first year, no significant differences were observed in cognitive scores (F=0.56, P=0.642). By the third to fourth years, however, a distinct gestational age-dependent gradient emerged (F=15.64, P<0.001), with lower cognitive scores in the very preterm and extremely preterm groups compared with the late preterm group (P=0.001 and P<0.001, respectively). Post hoc contrasts revealed that the extremely preterm group scored significantly lower than the moderate preterm (GLM estimate=−18.88, P<0.001) and very preterm groups (GLM estimate=−11.17, P=0.002).
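The F statistics reported here compare between-group variability with within-group variability. The sketch below computes an unadjusted one-way ANOVA F statistic in plain Python to show how such a value arises. It is a simplification: the study's GLM may include covariates such as first-year scores, which this example omits, and all scores shown are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variability of the group means around the grand mean.
    ss_between = sum(
        len(g) * (mean - grand_mean) ** 2
        for g, mean in zip(groups, group_means)
    )
    # Variability of individual scores around their own group mean.
    ss_within = sum(
        (x - mean) ** 2
        for g, mean in zip(groups, group_means)
        for x in g
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical composite scores for four gestational age groups.
example_groups = [
    [102, 98, 105, 110, 95],   # late preterm
    [100, 97, 104, 99, 101],   # moderate preterm
    [92, 88, 95, 90, 94],      # very preterm
    [80, 85, 78, 83, 82],      # extremely preterm
]
f_value, df1, df2 = one_way_anova_f(example_groups)
```

A large F indicates that group means differ by more than within-group scatter would explain. The corresponding P value comes from the F distribution with the returned degrees of freedom.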
Table 3. General linear model estimates and pairwise comparisons of composite scores at infancy and early preschool according to gestational age groups (n=83)
A similar pattern was noted for language outcomes, with no significant group differences in the first year (F=0.22, P=0.882), but a significant overall effect by the third to fourth years (F=9.66, P<0.001). Both the very preterm and extremely preterm groups scored significantly lower than the late preterm group, and the extremely preterm group also performed below the moderate preterm group (GLM estimate=−12.74, P=0.037). Adjustment for first-year scores in both domains did not alter these associations (P>0.05). Fig. 1 illustrates the longitudinal trajectories of cognitive (A) and language (B) composite scores across gestational age groups.
Fig. 1. Longitudinal trajectories of cognitive (A) and language (B) scores from infancy to early preschool age by gestational age (GA) group. Values represent least-square (LS) means of composite scores. Error bars indicate 95% confidence intervals (CIs). LP, late preterm; MP, moderate preterm; VP, very preterm; EP, extremely preterm.
Group differences in motor scores were evident from the first year (F=3.63, P=0.016). Late preterm infants had slightly lower scores than moderate preterm peers (GLM estimate=6.3, P=0.091), whereas the extremely preterm group showed the lowest values, significantly below the moderate preterm group (GLM estimate=−12.63, P=0.008). By the third–fourth years, the overall effect remained significant (F=10.82, P<0.001), with very preterm and extremely preterm children scoring lower than late preterm peers, and very preterm children also performing below moderate preterm peers (GLM estimate=−7.61, P=0.012). First-year scores were strongly associated with later motor performance (P<0.001), although a consistent gestational age-dependent gradient was not sustained beyond infancy.
In the adaptive behavior domain, no significant group differences were observed during infancy (F=2.23, P=0.090), but by the third–fourth years, the overall effect became significant (F=5.33, P=0.002), with very preterm and extremely preterm children scoring lower than late preterm peers. No other group differences were detected. First-year scores significantly predicted later adaptive performance (P=0.001). Fig. 2 illustrates the longitudinal trajectories of motor and adaptive behavior composite scores.
Fig. 2. Longitudinal trajectories of motor (A) and adaptive behavior (B) scores from infancy to early preschool age by gestational age (GA) group. Values represent least-square (LS) means of composite scores. Error bars indicate confidence intervals (CIs). LP, late preterm; MP, moderate preterm; VP, very preterm; EP, extremely preterm.
In the social-emotional domain, the gestational age-dependent effect was not significant at either time point (F=0.18, P=0.913 in the first year; F=2.15, P=0.101 in the third–fourth years), although the first-year score independently predicted later outcomes (GLM estimate=0.32, P=0.003).
At early preschool age (third to fourth years, n=100), developmental delay was observed in 19% of children on the cognitive scale, 40% on the language scale, 13% on the motor scale, 19% on the social-emotional scale, and 48% on the adaptive behavior scale. These binary outcomes served as the dependent variables in the multivariable logistic regression analyses. Table 4 shows the cognitive delay model, in which lower gestational age was the only independent predictor. Each additional week of gestation was associated with approximately half the odds of cognitive delay (OR, 0.50; 95% confidence interval [CI], 0.23 to 0.85; P=0.036). Fig. 3 demonstrates excellent model discrimination (area under the curve [AUC]=0.96).
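A per-week odds ratio compounds multiplicatively across weeks because logistic regression is linear on the log-odds scale. The sketch below works through that arithmetic using the reported OR of 0.50. The function names and the baseline odds are hypothetical, and the calculation assumes the model's linear-in-weeks effect holds across the range considered.

```python
import math

REPORTED_OR_PER_WEEK = 0.50  # cognitive delay model, as reported in the text


def log_odds_coefficient(odds_ratio: float) -> float:
    """Regression coefficient (log-odds per week) implied by an odds ratio."""
    return math.log(odds_ratio)


def odds_ratio_over_weeks(or_per_week: float, weeks: float) -> float:
    """Odds ratio for a difference of several weeks, assuming linearity."""
    return or_per_week ** weeks


def probability_from_odds(odds: float) -> float:
    """Convert odds to a probability."""
    return odds / (1.0 + odds)


beta_per_week = log_odds_coefficient(REPORTED_OR_PER_WEEK)
or_two_weeks = odds_ratio_over_weeks(REPORTED_OR_PER_WEEK, 2)

# Hypothetical baseline odds of delay, for illustration only.
baseline_odds = 1.0
odds_two_weeks_later = baseline_odds * or_two_weeks
probability_two_weeks_later = probability_from_odds(odds_two_weeks_later)
```

Note that an odds ratio describes a change in odds, not in probability. The same OR translates into different absolute risk differences depending on the baseline.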
Table 4. Multivariable logistic regression analysis for predictors of cognitive developmental delay in preterm children
Fig. 3. Receiver operating characteristic curve of the multivariable logistic regression model predicting adverse cognitive outcomes in preterm children. AUC, area under the curve.
Significant predictors of language delay (Table 5) included lower gestational age (OR, 0.745; P=0.027), elevated red blood cell (RBC) count (OR, 8.24; P=0.003), and higher thyroid-stimulating hormone (TSH) levels on day 7 (OR, 1.136; P=0.040), while male sex showed borderline significance (OR, 3.58; P=0.056). The model demonstrated strong discrimination (AUC=0.88) (Fig. 4).
Table 5. Multivariable logistic regression analysis for predictors of language developmental delay in preterm children
Fig. 4. Receiver operating characteristic curve of the multivariable logistic regression model predicting adverse language outcomes in preterm children. AUC, area under the curve.
In the motor, social-emotional, and adaptive behavior delay models (Supplementary Tables 1-3), no independent predictors reached statistical significance; however, model discrimination remained strong (AUC=0.89), good (AUC=0.82), and fair (AUC=0.77), respectively.
Discussion
This study demonstrated gestational age- and domain-specific neurodevelopmental trajectories in preterm children, with multivariable logistic regression clarifying the role of gestational age and related perinatal factors as key determinants of developmental delay at early preschool age. During the first year, motor scores differed significantly across groups, with extremely preterm infants showing the lowest values—reflecting the heightened vulnerability of motor pathways, as cortical–subcortical connectivity and white matter are particularly susceptible to perinatal injury [4,19]. However, this gestational age-dependent gradient was not maintained beyond infancy, as group differences partially converged by early preschool age, suggesting developmental catch-up driven by neural maturation and rehabilitation. The strong association between first-year motor scores and later performance indicates that early motor disparities may serve as valuable predictors of subsequent neurodevelopmental delay [20,21].
Disparities in other domains were modest in infancy but became increasingly pronounced by early preschool age. Cognitive performance exhibited the most consistent gestational age-dependent trajectory; by the third to fourth years, extremely preterm children demonstrated the steepest decline, accompanied by additional deficits in language, motor, and adaptive domains—indicating that developmental risks broaden with age. GLM analyses in the longitudinal subsample confirmed that cognitive and language outcomes at early preschool age were strongly influenced by gestational age, underscoring these domains as reliable indicators of developmental vulnerability in preterm populations [12-14,22]. Modest correlations between infancy and early preschool performance highlight the limited predictive validity of early assessments and the need for continued, gestational age–stratified follow-up beyond 24 months. The delayed manifestation of these disparities supports prior findings that early developmental testing often underestimates later cognitive and language deficits, as Bayley-III scores in infancy may capture transient adaptive responses rather than stable neurocognitive capacity [13,14,18].
These findings align with prior reports showing that neuropsychological deficits in extremely preterm cohorts persist and often intensify beyond toddlerhood, leading to functional impairments during the preschool years [17,23,24] and sustained intelligence quotient (IQ) reductions into later childhood and adulthood [25-27]. Preterm birth disrupts critical phases of brain maturation—including synaptogenesis, myelination, and network integration—processes that are further modulated by the degree of prematurity, endocrine dysregulation, and susceptibility to hypoxic-ischemic and inflammatory injury [3,7,19,23,28,29]. Consequently, developmental outcomes should be interpreted with respect to both gestational age and maturational stage. Early infancy assessments often lack sensitivity to higher-order cognitive functions, as executive systems are immature and difficult to evaluate at that stage [30], and transient compensatory mechanisms during early synaptic proliferation may obscure latent vulnerabilities [31]. As associative and executive cortical regions—particularly prefrontal networks—mature and synaptic pruning refines connectivity, these deficits become more evident [17,31,32]. Exuberant synaptogenesis and widespread neural connectivity in infancy may temporarily mask subtle inefficiencies, but as pruning progresses, underlying cognitive weaknesses emerge [29,31]. This developmental pattern may explain why extremely preterm infants who appear developmentally typical in early assessments often exhibit widening cognitive disparities during the preschool and school years [17,23,24]. Moreover, because developmental testing in this study was corrected for gestational age until 24 months, subtle impairments in infancy may have been underestimated, with more pronounced delays emerging later [33].
The Bayley-III adaptive behavior and social-emotional scales are influenced by multiple biological and environmental factors, and their predictive value may become more apparent at older ages [27,34]. However, scores at 18–24 months have been shown to predict school-age cognitive and behavioral outcomes in preterm children [35]. In our study, adaptive behavior trajectories paralleled cognitive outcomes, with significant group differences emerging by early preschool age, whereas social-emotional outcomes showed only a nonsignificant trend toward vulnerability. Although gestational age–related differences were not apparent during infancy, first-year scores on both scales independently predicted later outcomes, suggesting that early variations may signal subsequent neurodevelopmental and behavioral difficulties. These findings highlight the importance of early risk screening and long-term follow-up for identifying children at increased risk.
Multivariable logistic regression identified lower gestational age as a consistent independent predictor of cognitive and language delay, reinforcing its central role in shaping long-term neurodevelopmental outcomes [3,6]. Consistent with prior studies, the extremely preterm group in our cohort had the highest burden of perinatal complications, including asphyxia, RDS, low-grade IVH, non-cystic PVL, cardiovascular shunts, major brain injuries of prematurity, NEC, and thyroid dysfunction requiring hormone replacement. These morbidities may impair brain maturation through hypoxic-ischemic injury, inflammation, or altered hormonal regulation; however, they also cluster among the most immature neonates, raising debate as to whether they serve as independent predictors or primarily reflect the intrinsic vulnerability of extreme prematurity [1,7,8,26].
In the cognitive delay model, gestational age remained the sole independent predictor of outcome. Sensitivity analyses incorporating perinatal and neonatal morbidities (Table 1), which were excluded from the final model, did not materially change the results, indicating that these conditions likely represent markers of prematurity-related vulnerability rather than independent risk factors. In the language-delay model, lower gestational age, elevated neonatal TSH levels, and higher RBC counts emerged as independent predictors, while male sex demonstrated borderline significance, supporting a multifactorial basis for language delay and underscoring the need for further investigation [22,36]. By contrast, social-emotional and adaptive behavior domains showed no significant associations with biological or perinatal variables, suggesting that their later developmental emergence may be more strongly modulated by environmental influences. Motor delay was also less associated with gestational age by early preschool age, likely reflecting its earlier manifestation and greater responsiveness to postnatal rehabilitation and environmental enrichment.
As reported in previous studies, severe white matter injury and high-grade IVH remain strong predictors of adverse motor and sensory outcomes [3,37]. Other studies have shown that most infants with low-grade IVH or PVL demonstrate normal neurodevelopmental outcomes by 2 years of corrected age [37,38]. Diffuse, non-cystic white matter injury—often undetectable on standard neuroimaging—remains common and may contribute to subtle, long-term deficits [3,4,6]. In our cohort, low-grade IVH and non-cystic PVL were more prevalent among very and extremely preterm infants but were not independent predictors of developmental delay. This likely reflects the limited sensitivity of conventional imaging, compensatory neuroplasticity, modest sample size, and attenuation of effects after adjustment for stronger predictors. Accordingly, mild neuroimaging abnormalities should be interpreted with caution, underscoring the need for multimodal assessment, advanced neuroimaging techniques, and long-term follow-up [7,38].
Although the relatively small sample size at later time points was a limitation, the multivariable logistic regression models demonstrated strong discriminative ability for early risk stratification. Future research should aim to further elucidate mechanistic pathways—including endocrine dysregulation, neuroinflammation, and structural brain alterations—and assess targeted interventions tailored to gestational age and developmental stage [28,29].
In summary, this longitudinal study demonstrated that neurodevelopmental disparities in preterm children follow a dynamic, domain-specific trajectory influenced by gestational age and the timing of assessment. Motor delays were most evident in infancy, whereas cognitive and language impairments became more pronounced by early preschool age. Extremely preterm children exhibited the steepest decline, and gestational age remained a consistent independent predictor of cognitive and language delay at early preschool age. Because developmental assessments during infancy may underestimate later-emerging impairments, longitudinal follow-up with gestational age–stratified monitoring and timely, targeted intervention is essential to promote optimal neurodevelopmental outcomes.
Supplementary material
Supplementary materials related to this article can be found online at https://doi.org/10.26815/acn.2025.01046
Supplementary Table 1. Multivariable logistic regression analysis for predictors of motor developmental delay in preterm children
Supplementary Table 2. Multivariable logistic regression analysis for predictors of social-emotional developmental delay in preterm children
Supplementary Table 3. Multivariable logistic regression analysis for predictors of adaptive behavior developmental delay in preterm children
Notes
Conflicts of interest
No potential conflict of interest relevant to this article was reported.
Author contribution
Conceptualization: JNY and SSK. Data curation: JNY, YKS, NHH, and SAK. Formal analysis: JNY and NHH. Methodology: SAK, JHS, and SSK. Project administration: JHS and SSK. Visualization: YKS and DHK. Writing - original draft: JNY and DHK. Writing - review & editing: JNY.
Acknowledgments
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors. The authors are deeply grateful to all staff members at Soonchunhyang University Cheonan Hospital who contributed to this study.
