by Franco Di Cesare, MD; Leonardo Di Cesare; and Cristiana Di Carlo, MSc 

Dr. F. Di Cesare, Mr. L. Di Cesare, and Ms. Di Carlo are with Leoben Research srl in Rome, Italy.

FUNDING: This study was supported by Leoben Research. 

DISCLOSURES: The authors are employed by Leoben Research. 

Innov Clin Neurosci. 2021;18(10–12):30–37.

ABSTRACT: Objective. The assessment of a child’s cognitive health in developing countries poses significant challenges, including the paucity of valid diagnostic tools. We report the development and initial psychometric evaluation of a new eight-item cognitive ability assessment tool (CAAT-8) for use in an African Sub-Saharan school-aged population.

Design. CAAT-8 reliability and validity were assessed in a field trial. Participants (446 children aged 5–17 years) were recruited at multiple clinical sites and schools. Methods and techniques based on Item Response Theory and Structural Equation Modeling were applied for item analysis and selection, reliability, and validity assessments.

Results. CAAT-8 includes eight cognitive tasks and provides a reliable measure of the factor of Knowledge Processing. Knowledge Processing consistently increased over age (simple regression model, R2=0.44). A poorer health status (e.g., due to a neurological or medical condition or chronic exposure to psychosocial stress and deprivation) was associated with lower Knowledge Processing.

Conclusion. CAAT-8 is a viable methodology for cognitive health assessment in a pediatric school-aged population. The results from this study warrant further research to validate its use in healthcare and clinical research settings.

Keywords: Cognitive assessment, pediatric, instrument development, Item Response Theory, eight-item cognitive ability assessment tool (CAAT-8), developing countries

Cognitive health is an important indicator of a child’s overall health status. There is a general acknowledgement that enabling pediatric cognitive assessment in developing countries can result in better health outcomes and informed health-related policy making;1,2 however, scarce availability of valid diagnostic instruments is still one of the barriers preventing a satisfactory assessment of cognitive health outcomes at individual, group, and population levels.1,2 

Notably, there are three specific areas of unmet healthcare needs in developing countries that cognitive assessment helps to address. Firstly, cognitive impairments have the potential to hinder the psychological development of children and prevent their full functioning in life. Early detection and precise definition of cognitive deficits can help identify children who are at increased risk for altered psychological development and inform clinicians on the most appropriate measures to adopt for clinical management. Secondly, more cognitive assessment tools are needed to advance the field of clinical research and better characterize the manifestations and the burden of medical (e.g., malaria, human immunodeficiency virus-acquired immunodeficiency syndrome [HIV-AIDS] infection, postinfective neurocognitive complications, malnutrition) and psychosocial conditions on children’s cognitive functioning and development. Finally, new cognitive measures are needed in pediatric clinical trial design for the purpose of diagnostic classification and evaluation of outcomes (e.g., efficacy, safety, cost-efficiency, cost-effectiveness) of intervention programs.2

Zambia is a defining example of an African Sub-Saharan developing country that would benefit from improved pediatric cognitive assessment. To date, the Panga Munthu (translated as “Make a Person”) test is the only instrument originally developed and standardized on a Zambian pediatric population.3,4 For use in healthcare and research settings, Zambia mostly relies on a handful of other instruments that were originally developed in North American and European countries.3,5–8 The issue of over-reliance on imported instruments is shared by most developing countries worldwide and reflects the general challenge these countries have been facing in producing original tests; the adoption of tests developed and standardized in a different cultural and socioeconomic context carries the risk of poor cross-cultural validity, and ultimately misrepresenting the cognitive abilities of children.9 

In this article, we describe the development of a new cognitive ability assessment tool (CAAT), report the assessment of its psychometric properties in a Zambian school-aged population, and outline the next steps of its validation process.

Materials and Methods

Development of testing methodology. A panel of experts, including Zambian healthcare professionals (three pediatricians, two psychologists, and a pediatric neurologist) and two other experts in methodology of cognitive test development designed the cognitive assessment tool. The group devised a list of cognitive tasks related to six functional domains of interest: orientation, executive, perceptual and psycho-motor, visuo-constructional, memory, and language. The selection of domains for assessment was based on a critical review of currently available pediatric cognitive screening tools and by consensus reached by expert review. A set of cognitive tasks was short-listed according to their relevance to the selected assessment domains and to Zambian children’s everyday life activities (e.g., cognitive tasks usually carried out during social interactions or demanded in an academic or educational environment). Simpler cognitive tasks, easy for administration and scoring, as well as engaging for children to take, were preferred. Eleven cognitive tasks were selected as a compromise to balance the need to assess as wide and diverse a range of cognitive abilities as possible and the constraint posed by the operational requirement to allow a limited time for assessment (less than 10 minutes for administration and scoring). Specific measures (i.e., the choice of the test name, some specific language in the instructions to the child, and the format of test administration) were taken to enhance face validity. CAAT administration guidelines and data capture forms were prepared in three different languages (English, Bemba, and Nyanja).

Field trial. A field trial was conducted to evaluate CAAT reliability and validity in a school-aged population. The field trial obtained approval from the National Health Research Authority and from the University of Zambia Biomedical Research Ethics Committee (reference number 008-08-165, approval on 2/27/2018) and was conducted in accordance with Zambian legal and regulatory requirements. Parents or legal guardians of children participating in this study signed a written informed consent, including assent from the child if aged 12 years or older. Informed consent process was carried out in the presence of an independent witness for illiterate parents or guardians. 

To be included in the study, participants had to be 5 to 17 years of age (including limits), have access to formal education, have experience and familiarity with cognitive tasks similar to the ones required by the CAAT (e.g., drawing), and be able to communicate effectively with the examiner. Exclusion criteria were presence of serious health conditions currently requiring inpatient hospitalization, significant visuo-perceptual disability, significant speech disability, hearing impairments, significant sensory or motor disabilities, current acute medical or mental health conditions, or chronic condition that prevents the execution of cognitive tests. Trial participants were enrolled from August 2018 through October 2019.

The study population was recruited at multiple clinical and school sites in two regions of Zambia: Lusaka metropolitan area and the Copperbelt Province. Schools in Lusaka and the Copperbelt Province were purposely selected to provide a heterogeneous sample representative of the wide socioeconomic, cultural, and linguistic diversity of the Zambian pediatric population across regions. English, Bemba, and Nyanja were languages most frequently spoken in the areas where children were recruited. Clinical sites were outpatient clinics at the University Teaching Hospital Department of Pediatrics and Child Health in Lusaka and tertiary outpatient clinics in the Copperbelt Province, Ndola area. University Teaching Hospital is the highest referral hospital in Zambia, delivering specialized care in pediatrics, and receives patients from all over the country. Clinical sites were selected to ensure that a heterogeneous sample representative of diverse medical and psychosocial conditions affecting health and cognitive development was drawn. Participants and sites for test administration represented a broad sample of children and settings for which CAAT use is intended. 

Sociodemographic and relevant medical history information was collected. A subsample of participants underwent clinical and neurological examination and electroencephalographic (EEG) evaluation.

CAAT was administered as a standardized, structured, performance-based, brief cognitive examination. The child completed a sequence of tasks denoted as Orientation Place, Orientation Time, 3-Item Delayed Recall, Number Sequencing Forward, Number Sequencing Backward, Object Naming, Sentence Repetition, Command Execution, Reading Comprehension, Writing, and Copy Design.  

CAAT is a multi-item instrument, in which items are scored along at least two categories, and categories are ordered such that responses in higher grade categories indicate higher cognitive functioning being measured. In this study, the performance of each cognitive task was noted according to a pre-defined scoring system. Guidelines on CAAT administration and scoring were adopted to maximize inter-rater reliability. All investigators attended a training program. 

STATA software, version 15.1, was used for all statistical analyses.10

Methodology for item calibration and selection, CAAT measurement model evaluation, and reliability assessment is outlined below. 

As a first step, a test item analysis was carried out. Descriptive statistics were obtained for each test item. An Item Response Theory graded response model (IRT-GRM) was used for item calibration and to inform on item selection. We aimed to select items with a wide range of difficulty spread across the underlying cognitive function, along with acceptable discrimination. The boundary characteristic curve was obtained and evaluated for each item; item difficulty (i.e., location of an item on the cognitive scale) and discrimination (i.e., the slope of the boundary characteristic curve) parameters were estimated. Items with lesser difficulty were characterized by negative values, and items with higher levels of difficulty were characterized by positive values. Discrimination indicated how much the probability of correct response changed with cognitive function near the item difficulty. An item with a large discrimination value had a high correlation between the underlying cognitive function and the probability of correct response on that item and would differentiate better between low and high levels of the cognitive function. Also, a highly discriminating item would differentiate better, around its difficulty value, between individuals on similar levels of the cognitive function. We set a boundary characteristic curve slope value lower than 1.0 as a cut-off to indicate a poorly discriminating item. 
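The roles of the difficulty and discrimination parameters can be illustrated with a minimal sketch of a graded response model for a single item. The logistic form below is the usual one for this model; the function names and the parameter values are hypothetical and are not CAAT estimates.

```python
import math


def boundary_probability(theta, discrimination, difficulty):
    """Probability of scoring in or above a category (boundary characteristic curve)."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def category_probabilities(theta, discrimination, difficulties):
    """Probability of each ordered score category for one graded item.

    `difficulties` lists the ordered boundary locations; an item with
    K score categories has K - 1 boundaries.
    """
    # Pad the boundary curves with P(lowest category or above) = 1
    # and P(above the highest category) = 0.
    cumulative = (
        [1.0]
        + [boundary_probability(theta, discrimination, b) for b in difficulties]
        + [0.0]
    )
    # Each category probability is the gap between adjacent boundary curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical three-category item (illustrative values only).
probs = category_probabilities(theta=0.0, discrimination=1.5, difficulties=[-1.0, 0.5])
```

When the level of cognitive function equals a boundary difficulty, the boundary probability is 0.5, and a larger discrimination value makes the curve rise more steeply around that point.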

Thereafter, we analyzed the category characteristic curves (CCCs) and the test characteristic curve (TCC). Item information functions were used to evaluate each item’s contribution to measurement precision throughout the score range. The instrument’s total information function (TIF) was estimated as the sum of the item information functions. Fit to the parametric IRT-GRM was evaluated by chi-square tests comparing the predicted and observed item distributions at various levels of the sum score of all other items.
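As an illustration of how a total information function is built from item information functions, the sketch below uses the standard information formula for a graded item under a logistic model. The item parameters are hypothetical and serve only to show that a more discriminating item contributes more information.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def item_information(theta, discrimination, difficulties):
    """Information contributed by one graded item at ability level theta."""
    # Boundary probabilities padded with 1 (lowest) and 0 (beyond highest).
    cum = [1.0] + [logistic(discrimination * (theta - b)) for b in difficulties] + [0.0]
    info = 0.0
    for k in range(len(cum) - 1):
        p_cat = cum[k] - cum[k + 1]
        # Derivative of the category probability with respect to theta.
        d_cat = discrimination * (
            cum[k] * (1.0 - cum[k]) - cum[k + 1] * (1.0 - cum[k + 1])
        )
        if p_cat > 0.0:
            info += d_cat ** 2 / p_cat
    return info


def total_information(theta, items):
    """Total information: the sum of the item information functions."""
    return sum(item_information(theta, a, bs) for a, bs in items)


# Hypothetical (discrimination, boundary difficulties) pairs, not CAAT estimates.
items = [(0.5, [-1.0, 1.0]), (2.0, [-1.0, 1.0])]
total = total_information(0.0, items)
```

With these illustrative values, the item with a slope of 2.0 supplies most of the total information at theta = 0, which mirrors the rationale for dropping poorly discriminating items.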

As a third analytic step, an exploratory factor analysis (EFA) was run to assess dimensionality and to inform on item selection. Assumptions of independent sampling, normality, linear relationship between pairs of variables, and variables being correlated at a moderate level (|r|>0.30) were checked to verify the validity of the application of factor analysis. The Kaiser-Meyer-Olkin measure of sampling adequacy (KOM-MSA) was determined with a cut-off of KOM-MSA greater than 0.80.11 Principal component analysis was the chosen method of factor extraction. Factors with eigenvalues greater than 1.0 were considered significant; the scree plot of eigenvalues was also inspected to evaluate the number of factors. We applied the Varimax orthogonal factor rotation method with Kaiser’s normalization. Items with factor loadings lower than 0.40 were assessed as of no practical significance.11

Fourth, a confirmatory factor analysis (CFA) was applied to test the measurement model under the assumption that a single-factor model should fit the data if unidimensionality holds. Structural equation modeling with a variance-covariance input matrix and the maximum likelihood estimation method was carried out to derive model estimates. Goodness-of-fit indices and their cut-offs were adopted to evaluate model fit as follows: comparative fit index (CFI) greater than 0.90; Tucker Lewis index (TLI) greater than 0.95; root mean square error of approximation (RMSEA) less than 0.06; standardized root mean squared residual (SRMSR) less than 0.08.12 Modification indices for path coefficients and covariances that were constrained or omitted in the fitted model were evaluated to test whether correlations between the errors should have been included in the model. Loading estimates were assessed; an item was considered for deletion from the model if its loading was lower than 0.50.11 The matrix of covariances of observed and standardized residuals was evaluated to test unidimensionality under the assumption of local independence, whereby all pairs of CAAT items should be uncorrelated after controlling for the latent trait.13,14 Standardized residuals greater than 4.0 were selected as a cut-off to signal an unacceptable degree of error, resulting in the deletion of an item from the model.12 Measurement invariance was assessed. Cronbach’s alpha and the average interitem correlation were calculated as other point estimates of internal consistency; we applied a cut-off value of alpha greater than 0.80 to determine the threshold for acceptability of reliability estimates for the intended clinical use of the instrument.15
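For readers less familiar with the two internal consistency estimates, the following is a minimal sketch of how Cronbach’s alpha and the average interitem correlation can be computed from a table of item scores. The data are hypothetical (five children, three items) and the helper names are illustrative; this is not the study’s analysis code.

```python
from itertools import combinations
from statistics import mean, variance


def cronbach_alpha(rows):
    """rows: one list of item scores per child, same item order in every row."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def average_interitem_correlation(rows):
    """Mean Pearson correlation over all pairs of items."""
    k = len(rows[0])
    columns = [[row[i] for row in rows] for i in range(k)]
    pairs = [pearson(columns[i], columns[j]) for i, j in combinations(range(k), 2)]
    return mean(pairs)


# Hypothetical scores, for illustration only.
rows = [[0, 1, 0], [1, 1, 1], [2, 2, 1], [2, 3, 2], [3, 3, 3]]
alpha = cronbach_alpha(rows)
avg_r = average_interitem_correlation(rows)
```

In this toy data set alpha exceeds the 0.80 cut-off; with real data the same functions would be applied to the full matrix of item scores.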

Finally, effects of sex, age, and health status on CAAT score were tested. Nonparametric and parametric tests, linear regression model, and analysis of variance (ANOVA) were used. The CAAT measurement model we intended to validate posited the CAAT total score as a measure of one underlying cognitive function (i.e., unidimensionality of construct); as dependent on age and increasing by age from childhood to reach a plateau in adolescence along the child’s process of cognitive development; and as dependent on health status, where health status is a multifactorial construct broadly defined as “a state of complete physical, mental, and social well-being, and not merely the absence of disease,”16 and a poorer health status would be associated with altered development or impairment of cognitive functioning.


Results

Test items analysis and selection. Sample characteristics are summarized in Table 1. All children underwent CAAT assessment; mean time (standard deviation [SD]) for CAAT administration was 403.7 (125.7) seconds. Summary statistics of item analysis are shown in Table 2.

Eleven CAAT items successfully fit with an IRT-GRM. The difficulty parameter of the 11 items showed a range of values from –3.8 to 2.3, indicating item difficulty as widely spread across the underlying cognitive function. Among the 11 items, discrimination, as slope value, ranged from 0.47 (Sentence Repetition) to 9.7 (Reading). For three items, the slope value was below 0.60: Object Naming, 0.54;  Sentence Repetition, 0.47; and 3-Item Delayed Recall, 0.56. These findings suggested that these three items provide much less information compared to the other eight items.  

An EFA was carried out on 446 cases. A two-factor solution emerged. Factor 1 had an eigenvalue of 3.9 and a proportion of explained variance of 35 percent; Factor 2 had an eigenvalue of 1.4 and a proportion of explained variance of 13 percent. Eight items showed high loading values on Factor 1, ranging from 0.45 to 0.83. The three remaining items, Object Naming, Sentence Repetition, and 3-Item Delayed Recall, showed high loading values (ranging from 0.49 to 0.68) on Factor 2 and very low loading values on Factor 1 (Object Naming: 0.15; Sentence Repetition: 0.02; 3-Item Delayed Recall: 0.10). The KOM-MSA was 0.90, indicating that factor analysis was valid for this data sample. Other point estimates of internal consistency (Cronbach’s alpha=0.81; average interitem correlation=0.28) suggested a good reliability.
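The reported proportions of explained variance follow directly from the eigenvalues. As a minimal sketch, assuming the principal components were extracted from the correlation matrix (so that each of the 11 standardized items contributes one unit of variance), the share of a factor is its eigenvalue divided by the number of items. The function names are illustrative.

```python
def explained_variance_share(eigenvalue, n_items):
    """Share of total variance for one component of a correlation matrix."""
    return eigenvalue / n_items


def retained_factors(eigenvalues, threshold=1.0):
    """Eigenvalue-greater-than-one rule for the number of factors to keep."""
    return [ev for ev in eigenvalues if ev > threshold]


# Eigenvalues reported for the 11-item analysis.
share_factor_1 = explained_variance_share(3.9, 11)  # about 0.35
share_factor_2 = explained_variance_share(1.4, 11)  # about 0.13
kept = retained_factors([3.9, 1.4])
```

Both reported eigenvalues exceed 1.0, which is why two factors were retained at this stage.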

We did not consider the results of these initial assessments to meet the requirements of reliability and construct validity (i.e., supporting evidence for unidimensionality) we originally set; thus, we dropped the three items (Object Naming, Sentence Repetition, and 3-Item Delayed Recall) with poorer discrimination values. Subsequently, we focused on the evaluation of the psychometric properties of a shorter CAAT version that included only eight items.

Testing unidimensionality of CAAT-8 score. Eight CAAT (CAAT-8) items successfully fit with an IRT-GRM (Table 3). 

Among the eight items, the difficulty parameter ranged from –6.2 to 2.3, indicating that item difficulty was widely spread across the underlying cognitive function. Most CAAT-8 items presented difficulty parameter estimates below zero, meaning that they were relatively easy and therefore most useful in discriminating subjects who had lower cognitive function. Discrimination, as slope value, ranged from 0.8 (Number Sequencing Forward) to 9.5 (Reading). Using the 95-percent critical values from the standard normal distribution (–1.96 and 1.96), the TCC plot showed that 95 percent of randomly selected school-aged children scored between 7.42 and 23.2. In other words, about 2.5 percent of randomly selected children were expected to score under 8, which indicates a lower level of cognitive function.

An EFA was carried out on 446 cases. The KOM-MSA was 0.90. One factor with an eigenvalue of 4.0 was extracted from the analysis, explaining 50.12 percent of total variance. All items showed high loading values (ranging from 0.48 to 0.83) on Factor 1.

We examined the content of the items, found that they fit together conceptually, and named the factor Knowledge Processing. 

As our next step, we conducted a CFA (Table 4). There was no important departure from the assumption of multivariate normality. There was no transformation of data or missing data. CFA showed that a single-factor model could fit the data. We concluded that the hypothesis of unidimensionality could not be rejected, because the goodness-of-fit indices satisfied the cut-off criteria (CFI: 0.99; TLI: 0.98; RMSEA: 0.041; SRMSR: 0.025). All items had loading estimates higher than 0.70, except Command Execution, which had a loading estimate of 0.60, suggesting a close relationship between items and construct. The evaluation of the matrix of covariances of observed and standardized residuals did not reveal covariances that would suggest an additional dimension or item redundancies. Overall, findings from the CFA provide evidence to support the validity of the measurement model (i.e., the factorial validity of CAAT-8 score and nonrejection of the assumption of test unidimensionality).

CAAT-8 measurement invariance was tested by evaluating how well the specified unidimensional construct model fit the observed data. The configural invariance by sex group was tested. Model fit indices (χ2[56]: 1349.955, p<0.001; TLI: 0.994; CFI: 0.993; SRMSR: 0.050; RMSEA: 0.025) demonstrate an equivalent solution (i.e., unidimensional construct) for both female and male participant groups.

Other point estimates of internal consistency, such as Cronbach’s alpha (0.85) and the average interitem correlation (0.41), indicate a good CAAT-8 reliability.  

In sum, we found that CAAT-8 provided a reliable measure of a speculative construct, which we named Knowledge Processing. CAAT-8 estimates of reliability were satisfactorily high to justify the use of the instrument in clinical settings.

Test scoring. A simple scoring method was applied to derive a Knowledge Processing score. Scores assigned to each of the CAAT-8 items were summed and combined in a summary score. Knowledge Processing score ranged from 0 to 24, with higher values indicating higher functioning. The simple scoring methodology approach was chosen as most practical to use compared to other approaches (e.g., methods based on weighting of individual items or IRT-based scoring methods) requiring more laborious hand computations or computerized scoring.
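The simple scoring rule can be sketched as follows. The eight item names are those retained after item selection and the 0 to 24 range is as reported; the per-item score values in the example are hypothetical, because the item-level scoring system is not reproduced in this article.

```python
CAAT8_ITEMS = (
    "Orientation Place",
    "Orientation Time",
    "Number Sequencing Forward",
    "Number Sequencing Backward",
    "Command Execution",
    "Reading Comprehension",
    "Writing",
    "Copy Design",
)
MAX_TOTAL = 24  # upper bound of the Knowledge Processing score


def knowledge_processing_score(item_scores):
    """Sum the eight item scores into a single summary score."""
    missing = [name for name in CAAT8_ITEMS if name not in item_scores]
    if missing:
        raise ValueError(f"missing item scores: {missing}")
    if any(item_scores[name] < 0 for name in CAAT8_ITEMS):
        raise ValueError("item scores cannot be negative")
    total = sum(item_scores[name] for name in CAAT8_ITEMS)
    if total > MAX_TOTAL:
        raise ValueError(f"total {total} exceeds the maximum of {MAX_TOTAL}")
    return total


# Hypothetical record: every item scored 2 (illustrative values only).
example = {name: 2 for name in CAAT8_ITEMS}
score = knowledge_processing_score(example)
```

Because the rule is a plain sum, it can be applied by hand on the data capture form without weighting tables or software.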

Effect of sex and age on Knowledge Processing. This assessment was carried out on a sample of children in apparent good health at the time of assessment, although they might have been suffering from a chronic health condition (e.g., nonrecurrent malaria) that did not represent an important risk or causative factor for cognitive impairment. They were children with access to formal education and regularly attending school, with no impairments likely to prevent the execution of CAAT cognitive tasks. Sample characteristics are summarized under the healthy control column in Table 1. Knowledge Processing mean (SD) was 16.8 (4.9). Knowledge Processing assumptions of normality were verified.

One-way ANOVA showed no statistically significant effect of sex on Knowledge Processing at the p<0.05 level for the two conditions (F [1, 335]: 0.00, p=0.99). 

The scatter plot analysis showed a strong positive linear relationship (n=337, r=0.57, p<0.001) between age and Knowledge Processing. 

Simple linear regression pointed to a significant relationship between Knowledge Processing and age (p<0.001). The slope coefficient for age was 0.96, which predicts the Knowledge Processing score will increase by 0.96 for each extra year of age. The R2 value was 0.44. This is quite high, so predictions from the regression equation are sufficiently reliable. The scatter plot of standardized predicted values versus standardized residuals showed that the data met the assumptions of homogeneity of variance and linearity and that the residuals were approximately normally distributed.  

The effect of age was evaluated between childhood (5–7 years of age), late childhood (8–11 years of age), and adolescence (12–17 years of age) age groups. One-way ANOVA confirmed a significant effect of age group on Knowledge Processing at the p<0.05 level for the three conditions (F [2, 334]: 133.9, p<0.001). Post-hoc pairwise comparisons using Scheffé’s test showed that the mean score for the childhood age group (n=78, mean=12.5, SD=4.1) was significantly lower (p<0.001) than those of the late childhood (n=98, mean=16.8, SD=4.5) and adolescence age groups (n=161, mean=20.3, SD=2.2). The mean score for the late childhood age group was also significantly lower than that of the adolescence age group (p<0.001).

Overall, these results show that age did have an effect on Knowledge Processing and functioning increased with age. No effect of sex on Knowledge Processing was found.

Effects of health status and age on Knowledge Processing. The CAAT measurement model posits that a poorer health status is associated with lower cognitive functioning. We compared Knowledge Processing of two groups of participants, poor health status (PHS) and healthy control (HC). A description of sample characteristics is summarized in Table 1. PHS group included children suffering with at least one chronic health condition that is a recognized important risk or causative factor for cognitive impairment and required medical intervention (but not hospitalization) at the time of study assessment. The diagnostic classification was based on the medical review of available health information, such as medical history, clinical and neurological examination, instrumental diagnosis, and reporting from reliable proxies. 

A two-way, between-subjects ANOVA design evaluated main and interaction effects of age group and health status on Knowledge Processing. Knowledge Processing mean, standard error (SE), and 95% confidence interval (CI) [lower limit (LL), upper limit (UL)] were HC: 17.5, 0.25, [17.0, 18.0] and PHS: 14.7, 0.49, [13.7, 15.7]. Distribution of Knowledge Processing score by health status and age group is reported in Table 5. 
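The reported interval limits can be checked with a short sketch, assuming the values labeled SE are standard errors of the mean and that a normal approximation (z=1.96) was used; the article does not state which method produced the intervals, so this is only a consistency check.

```python
def normal_ci(mean, standard_error, z=1.96):
    """Two-sided confidence interval under a normal approximation."""
    half_width = z * standard_error
    return mean - half_width, mean + half_width


# Group means and SE values as reported for HC and PHS.
hc_low, hc_high = normal_ci(17.5, 0.25)
phs_low, phs_high = normal_ci(14.7, 0.49)
```

Rounded to one decimal place, both intervals match the reported limits.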

ANOVA showed significant main effects by health status and age group. The Knowledge Processing mean score (17.5, SD: 4.6) of HC was higher than that of PHS (14.7, SD: 5.2); the health status main effect was significant with a moderate effect size (F [1,440]: 43.03, p<0.001, partial η2=0.09). The age group main effect was also significant, with a larger effect size (F [2,440]: 70.82, p<0.001, partial η2=0.24). The interaction between health status and age group approached, but did not reach, statistical significance (F [2,440]: 3.00, p=0.051, partial η2=0.01). 
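The effect sizes above can be recovered from the F statistics and their degrees of freedom with the standard conversion to partial eta squared, F × df(effect) / (F × df(effect) + df(error)). The sketch below applies it to the three reported tests; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics and degrees of freedom as reported in the text.
health_status = partial_eta_squared(43.03, 1, 440)
age_group = partial_eta_squared(70.82, 2, 440)
interaction = partial_eta_squared(3.00, 2, 440)
```

Rounded to two decimals, the three values agree with the reported effect sizes of 0.09, 0.24, and 0.01.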

In sum, Knowledge Processing depended on health status and age: poor health status was associated with lower scores, and scores increased with age. 


Discussion

CAAT-8 has been developed as a cognitive assessment tool for a school-aged pediatric population. The test development strategy centered on the validation of CAAT-8 as a culturally appropriate diagnostic instrument sensitive to the African Sub-Saharan context. 

CAAT-8 is intended for use in healthcare and clinical research settings to detect a status of mild-to-moderate cognitive impairment and is characterized by a user-friendly, simplified, and fast (7 minutes) procedure of administration and scoring. CAAT-8 provides a summary measure of cognitive functioning, termed Knowledge Processing. We define Knowledge Processing as a cognitive function underlying a multielement, organized, complex cognitive behavior of immediate response to environmental demands. Examples of a multielement, organized, cognitive behavior include, but are not limited to, the retrieval of a small cluster (up to 5 elements) of time or place specifiers, the serial recall of items from a short (up to 5 elements) sequence, sentence writing, multistep command execution, and reading comprehension of a simple command.

Results of the initial assessment of psychometric properties provide evidence of CAAT-8 reliability, validity, and potential clinical utility. 

Key findings in our research support the hypothesis of unidimensionality of Knowledge Processing. First, items belonging together in the CAAT-8 score captured differences in the same underlying construct, Knowledge Processing, as results of the CFA provide evidence of factorial validity of test score interpretation. This is consistent with the item development and selection process, which was designed to ensure that all items maximize the probability of capturing the intended construct. Second, the criterion of item local independence was met, as the applied measurement model for CAAT-8 did not show correlated errors or loadings from other latent variables, and the model testing did not reveal misfit. Furthermore, CAAT-8 fits with the IRT model, also suggesting unidimensionality. Finally, the assessment of measurement invariance (specifically, configural invariance) found that Knowledge Processing has the same pattern of free and fixed loadings across groups (e.g., male vs. female groups). Invariance at the configural level means that the basic organization of the construct (i.e., eight loadings on the Knowledge Processing latent factor) was demonstrated in both groups. 

Knowledge Processing increased with age from childhood to reach a plateau in adolescence along a child’s process of cognitive development. Regression analysis indicates that 44.5 percent of the variation in Knowledge Processing can be explained by a model containing age alone. This finding supports evidence of validity of use in evaluating a child’s cognitive development. However, more than half of the variation remains unexplained; adding other independent variables could improve the fit of the model. 

Knowledge Processing was dependent on health status, with lower scores associated with poorer health status (e.g., presence of a chronic neurologic or a medical disease or a prolonged psychosocial stress and deprivation). This finding is consistent with previous research pointing at a defining role of health conditions as important risk factors for cognitive impairment or altered cognitive development in children.17 It points to the potential utility of CAAT-8 use in evaluating the effect of health-related risk or protective factors on a child’s cognitive development.

Limitations. One of the limitations of our study is that its design only allowed for an initial and partial testing of psychometric properties. However, study results do enable and inform on next steps of the validation process. 

First, studies to test convergent and discriminant validity of Knowledge Processing score interpretation are necessary; the unidimensionality of the Knowledge Processing score does not necessarily imply that Knowledge Processing measures a unitary psychological process. Future studies should evaluate the overlap with closely related constructs (e.g., visuo-constructional ability, episodic memory, executive functioning, intelligence). A multitrait-multimethod matrix could be applied to assess convergent and discriminant validity,18 and the use of tests originally developed or cross-culturally validated in Zambia should be prioritized.8 The evaluation of Knowledge Processing in relation to intelligence, as measured by the Raven’s Progressive Matrices19 or the Panga Munthu test, is of particular interest. Studies have drawn attention to the importance attached to social intelligence in African cultures, such as the ability to deal with socially complex situations, which goes well beyond the traditional concept of intelligence as cognitive processing.20 In this respect, Knowledge Processing should be validated against Zambian measures of social competence, linguistic ability, and other socio-cultural and educational indicators.

Second, CAAT-8 retains a potential for use in hypothesis-driven or exploratory clinical and epidemiology research. CAAT-8 can contribute to better characterization of clinical manifestations and burden of widespread medical or psychosocial conditions (e.g., social deprivation) on a child’s cognitive functioning and development; CAAT-8 could be applied to clinical trial design for the purpose of diagnostic classification or evaluation of outcomes (e.g., efficacy, safety, cost-efficiency) of intervention programs. Further investigation should be undertaken to evaluate the reliability of the Knowledge Processing score when the purpose of the measurement is prediction or monitoring of medical intervention. Specifically, the stability of the measure over time should be addressed by a test-retest study design, and the responsiveness to clinical change should be evaluated by a repeated measurement design. A systematic multistep assessment of invariance of Knowledge Processing (i.e., metric, scalar, and residual) should be conducted so that the construct can be meaningfully tested or construed across groups, time points, or treatment modalities. 

Third, new research should characterize the clinical utility and diagnostic accuracy (i.e., sensitivity and specificity) of CAAT-8 in relation to specific disease or psychosocial conditions.

Another important limitation of our research relates to the characteristics of the study sample, which is not fully representative of the entire Zambian population because of its limited size and its composition. For instance, almost all the participants were from an urban area, whereas most of the Zambian pediatric population lives in rural areas. Additionally, all participants had access to education and attended school regularly. In Zambia, however, as in other Sub-Saharan countries, many children of school age attend school irregularly or not at all, owing to occasionally insufficient educational provisions and to social or environmental circumstances such as displacement by traumatic life events (e.g., following the death of a mother), poverty, or severe social deprivation or neglect. In this regard, robust evidence indicates that exposure to formal schooling is associated with improved performance on tests of cognitive functioning in specific domains, such as language, attention, memory, and phonological awareness.8

The development of normative data is a critical step to enable a valid and more efficient use of CAAT-8 in healthcare and research settings. A new field trial on a larger, representative normative sample is warranted to determine an accurate set of norm values, along with the preparation of guidelines for healthcare professionals on the interpretation of these norm values. Future work on norming samples should prioritize enrolling a representative group of children living in rural areas, as well as a group of children who do not attend school, to minimize the risk of misdiagnosis in children with limited exposure to educational settings.


The assessment of a child’s cognitive health in Sub-Saharan developing countries poses significant challenges, including the paucity of valid diagnostic tools. CAAT-8 appears to be a viable methodology for cognitive health assessment in a pediatric school-aged population. The results from the study warrant further research to validate its use in healthcare and clinical research settings.


The authors wish to thank the following people for their contributions to the project: Sr. Virginia Chanda, Dr. Ornella Ciccone, Mr. Kalima Kalima, Mrs. Prisca Kalyeyle, Dr. Syvia Mwanza Kabaghe, Dr. Nfewa Kawatu, Dr. Lisa Nkole, Mr. Aaron Phiri, Dr. Somwe Somwe, Ms. Mercy Sulu, and Mr. Owen Tembo.


  1. Stemler ES, Chamwu F, Chart H, et al. Assessing competencies in reading and mathematics in Zambian children. In: Grigorienko E, ed. Multicultural psychoeducational assessment. New York, USA: Springer; 2009:15–160.
  2. Fernald L, Prado E, Kroger P, Raikes A. Toolkit for Measuring Early Childhood Development in Low- and Middle-income Countries. Washington DC: International Bank for Reconstruction and Development/The World Bank; 2017.
  3. Ezeilo B. Validating Panga Munthu test and Porteus Maze test (wooden form) in Zambia. Int J Psychol. 1978;13(4):333–342. 
  4. Kathuria R, Serpell R. Standardisation of the Panga Munthu test. A non-verbal cognitive test development in Zambia. J Negro Educ. 1998;67(3):228–241.
  5. Fink G, Matafwali B, Moucheraud C, Zuilkowski SS. The Zambian early childhood development project 2010 assessment final report. Cambridge, MA: Harvard University; 2012.
  6. Zuilkowski SS, McCoy DC, Serpell R, et al. Dimensionality and the development of cognitive assessments for children in Sub-Saharan Africa. J Cross Cult Psychol. 2016;47(3):341–354.
  7. Irvine SH. Toward a rationale for testing attainments and abilities in Africa. Br J Educ Psychol. 1966;36(1):24–32.
  8. Mulenga K, Ahonen T, Aro M. Performance of Zambian children on the NEPSY: a pilot study. Dev Neuropsychol. 2001;20(1):375–383. 
  9. Matafwali B, Serpell R. Design and validation of assessment tests for young children in Zambia. New Dir Child Adolesc Dev. 2014;2014(146):77–96. 
  10. StataCorp. Stata: Release 15. Statistical Software. College Station, TX: StataCorp LLC; 2017.
  11. Hair JF, Black WC, Babin JB, Anderson ER. Multivariate Data Analysis, 7th Edition. Harlow, UK: Pearson; 2010:102, 116, 622.
  12. Hu L, Bentler PM. Fit indices in covariance structure modeling: sensitivity to underparameterized model misspecification. Psychol Methods. 1998;3(4):424–453.
  13. Lazarsfeld PF. Latent structure analysis. In: Koch S, ed. Psychology: A Study of a Science. New York, NY: McGraw-Hill; 1959:476–543.
  14. Ziegler M, Hagemann D. Testing the unidimensionality of items. Eur J Psychol Assess. 2015;31(4):231–237. 
  15. Cicchetti DV. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychol Assess. 1994;6(4):284–290. 
  16. World Health Organization. Basic Documents: 49th Edition (including amendments adopted up to 31 May 2019). Geneva: World Health Organization; 2020:1.
  17. Gall S, Müller I, Walter C, et al. Associations between selective attention and soil-transmitted helminth infections, socioeconomic status, and physical fitness in disadvantaged children in Port Elizabeth, South Africa: an observational study. PLoS Negl Trop Dis. 2017;11(5):e0005573.  
  18. Campbell DT, Fiske DW. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychol Bull. 1959;56:81–105. 
  19. Wicherts JM, Dolan CV, Carlson JS, van der Maas HLN. Raven’s test performance of Sub-Saharan Africans: average performance, psychometric properties, and the Flynn effect. Learn Individ Differ. 2010;20(3):135–151. 
  20. Serpell R, Simatende B. Contextual responsiveness: an enduring challenge for educational assessment in Africa. J Intell. 2016;4(1):3.