20.4.18

Guidelines for Adolescent Depression in Primary Care (GLAD-PC): Part I. Practice Preparation, Identification, Assessment, and Initial Management.

Zuckerbrot RA, Cheung A, Jensen PS, Stein REK, Laraque D; GLAD-PC STEERING GROUP.
Pediatrics. 2018 Feb 26. pii: e20174081. doi: 10.1542/peds.2017-4081. [Epub ahead
of print]

OBJECTIVES: To update clinical practice guidelines to assist primary care (PC)
clinicians in the management of adolescent depression. This part of the updated
guidelines is used to address practice preparation, identification, assessment,
and initial management of adolescent depression in PC settings.
METHODS: By using a combination of evidence- and consensus-based methodologies,
guidelines were developed by an expert steering committee in 2 phases as informed
by (1) current scientific evidence (published and unpublished) and (2) draft
revision and iteration among the steering committee, which included experts,
clinicians, and youth and families with lived experience.
RESULTS: Guidelines were updated for youth aged 10 to 21 years and correspond to 
initial phases of adolescent depression management in PC, including the
identification of at-risk youth, assessment and diagnosis, and initial
management. The strength of each recommendation and its evidence base are
summarized. The practice preparation, identification, assessment, and initial
management section of the guidelines includes recommendations for (1) the
preparation of the PC practice for improved care of adolescents with depression; 
(2) annual universal screening of youth 12 and over at health maintenance visits;
(3) the identification of depression in youth who are at high risk; (4)
systematic assessment procedures by using reliable depression scales, patient and
caregiver interviews, and Diagnostic and Statistical Manual of Mental Disorders, 
Fifth Edition criteria; (5) patient and family psychoeducation; (6) the
establishment of relevant links in the community; and (7) the establishment of a 
safety plan.
CONCLUSIONS: This part of the guidelines is intended to assist PC clinicians in
the identification and initial management of adolescents with depression in an
era of great clinical need and shortage of mental health specialists, but it
cannot replace clinical judgment; these guidelines are not meant to be the sole
source of guidance for depression management in adolescents. Additional research 
that addresses the identification and initial management of youth with depression
in PC is needed, including empirical testing of these guidelines.

26.3.18

Interventions promoting exclusive breastfeeding up to six months after birth: A systematic review and meta-analysis of randomized controlled trials.

Kim SK, Park S, Oh J, Kim J, Ahn S.


BACKGROUND: The World Health Organization (WHO) recommends that mothers practice 
exclusive breastfeeding (EBF) of their infants for 6 months. Various
breastfeeding support interventions have been developed to encourage mothers to
maintain breastfeeding practices. Research aim: This study aims to review how
effectively breastfeeding support interventions enable mothers to practice EBF
for 6 months and to suggest the best intervention strategies.
METHODS: Six databases were searched, including MEDLINE, EMBASE, Cochrane,
CINAHL, PsycINFO, and KoreaMed. The authors independently extracted data from
journals written in English or Korean and published between January 2000 and
August 2017. Randomized controlled trials (RCTs) reporting EBF until 6 months
were screened.
RESULTS: A total of 27 RCTs were reviewed, and 36,051 mothers were included. The 
effectiveness of breastfeeding support interventions to promote EBF for 6 months 
was significant (odds ratio [OR] = 2.77; 95% confidence interval [CI]:
1.81-3.76). A further subgroup analysis of intervention effects shows that a baby
friendly hospital initiative (BFHI) intervention (OR = 5.21; 95% CI: 2.15-12.61),
a combined intervention (OR = 3.56; 95% CI: 1.74-7.26), a professional provider
led intervention (OR = 2.76; 95% CI: 1.76-4.33), having a protocol available for 
the provider training program (OR = 2.87; 95% CI: 1.89-4.37) and implementation
during both the prenatal and postnatal periods (OR = 3.32; 95% CI: 1.83-6.03)
increased the rate of EBF for 6 months.
CONCLUSION: We suggest considering a multicomponent intervention as the primary
strategy and implementing BFHI interventions within hospitals. Evidence indicates
that intervention effectiveness increases when a protocol is available for
provider training, when interventions are conducted from the pre- to postnatal
period, when the hospital and community are connected, and when healthcare
professionals are involved.

23.3.18

The Proposal to Lower P Value Thresholds to .005

John P. A. Ioannidis
JAMA. Published online March 22, 2018. doi:10.1001/jama.2018.1536
P values and accompanying methods of statistical significance testing are creating challenges in biomedical science and other disciplines. The vast majority (96%) of articles that report P values in the abstract, full text, or both include some values of .05 or less [1]. However, many of the claims that these reports highlight are likely false [2]. Recognizing the major importance of the statistical significance conundrum, the American Statistical Association (ASA) published a statement on P values in 2016 [3]. The status quo is widely believed to be problematic, but how exactly to fix the problem is far more contentious. The contributors to the ASA statement also wrote 20 independent, accompanying commentaries focusing on different aspects and prioritizing different solutions. Another large coalition of 72 methodologists recently proposed a specific, simple move [4]: lowering the routine P value threshold for claiming statistical significance from .05 to .005 for new discoveries. The proposal met with strong endorsement in some circles and concerns in others.
P values are misinterpreted, overtrusted, and misused. The language of the ASA statement enables the dissection of these 3 problems. Multiple misinterpretations of P values exist, but the most common one is that they represent the “probability that the studied hypothesis is true” [3]. A P value of .02 (2%) is wrongly considered to mean that the null hypothesis (eg, the drug is as effective as placebo) is 2% likely to be true and the alternative (eg, the drug is more effective than placebo) is 98% likely to be correct. Overtrust ensues when it is forgotten that “proper inference requires full reporting and transparency” [3]. Better-looking (smaller) P values alone do not guarantee full reporting and transparency. In fact, smaller P values may hint at selective reporting and nontransparency. The most common misuse of the P value is to make “scientific conclusions and business or policy decisions” based on “whether a P value passes a specific threshold” even though “a P value, or statistical significance, does not measure the size of an effect or the importance of a result,” and “by itself, a P value does not provide a good measure of evidence” [3].
These 3 major problems mean that passing a statistical significance threshold (traditionally P = .05) is wrongly equated with a finding or an outcome (eg, an association or a treatment effect) being true, valid, and worth acting on. These misconceptions affect researchers, journals, readers, and users of research articles, and even media and the public who consume scientific information. Most claims supported with P values slightly below .05 are probably false (ie, the claimed associations and treatment effects do not exist). Even among those claims that are true, few are worth acting on in medicine and health care.
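The gap between a P value threshold and the chance that a claim is true can be illustrated with the standard positive predictive value calculation. The sketch below is only an illustration: the prior (1 true hypothesis per 11 tested) and the 80% power are hypothetical inputs, not figures from the article, and power is held fixed across thresholds for simplicity even though it would fall at a stricter threshold for the same sample size.

```python
def positive_predictive_value(prior: float, power: float, alpha: float) -> float:
    """Share of statistically significant findings that reflect a true effect.

    prior: assumed fraction of tested hypotheses that are actually true
    power: assumed probability of detecting a true effect
    alpha: significance threshold (false-positive rate under the null)
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs chosen only for illustration.
ASSUMED_PRIOR = 1 / 11   # prior odds of 1:10 that a tested effect is real
ASSUMED_POWER = 0.8

for alpha in (0.05, 0.005):
    ppv = positive_predictive_value(ASSUMED_PRIOR, ASSUMED_POWER, alpha)
    print(f"threshold {alpha}: PPV = {ppv:.2f}")
```

With lower assumed priors or lower power, the share of true findings at the .05 threshold drops further, which is why the result depends heavily on inputs that are rarely known in practice.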
Lowering the threshold for claiming statistical significance is an old idea. Several scientific fields have carefully considered how low a P value should be for a research finding to have a sufficiently high chance of being true. For example, adoption of genome-wide significance thresholds (P < 5 × 10^-8) in population genomics has made discovered associations highly replicable and these associations also appear consistently when tested in new populations. The human genome is very complex, but the extent of multiplicity of significance testing involved is known, the analyses are systematic and transparent, and a requirement for P < 5 × 10^-8 can be cogently arrived at.
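One common way the genome-wide threshold is motivated is a Bonferroni-style correction: divide the conventional .05 level by the number of independent tests. The sketch below assumes roughly one million independent tests, a conventional round figure that is not stated in the article.

```python
FAMILYWISE_ALPHA = 0.05
# Assumption: about one million effectively independent tests across the genome.
ASSUMED_INDEPENDENT_TESTS = 1_000_000


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


threshold = bonferroni_threshold(FAMILYWISE_ALPHA, ASSUMED_INDEPENDENT_TESTS)
print(f"per-test threshold: {threshold:.0e}")
```

The calculation only works because the number of tests is known, which is the point the paragraph above makes about fields where multiplicity is hidden.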
However, for most other types of biomedical research, the multiplicity involved is unclear and the analyses are nonsystematic and nontransparent. For most observational exploratory research that lacks preregistered protocols and analysis plans, it is unclear how many analyses were performed and what various analytic paths were explored. Hidden multiplicity, nonsystematic exploration, and selective reporting may affect even experimental research and randomized trials. Even though it is now more common to have a preexisting protocol and statistical analysis plan and preregistration of the trial posted on a public database, there are still substantial degrees of freedom regarding how to analyze data and outcomes and what exactly to present. In addition, many studies in contemporary clinical investigation focus on smaller benefits or risks; therefore, the risk of various biases affecting the results increases.
Moving the P value threshold from .05 to .005 will shift about one-third of the statistically significant results of past biomedical literature to the category of just “suggestive” [1]. This shift is essential for those who believe (perhaps crudely) in black and white, significant or nonsignificant categorizations. For the vast majority of past observational research, this recategorization would be welcome. For example, mendelian randomization studies show that only a few past claims from observational studies with P < .05 represent causal relationships [5]. Thus, the proposed reduction in the level for declaring statistical significance may dismiss mostly noise with relatively little loss of valuable information. For randomized trials, the proportion of true effects that emerge with P values in the window from .005 to .05 will be higher, perhaps the majority in several fields. However, most findings would not represent treatment effects that are large enough for outcomes that are serious enough to make them worthy of further action. Thus, the reduction in the P value threshold may largely do more good than harm, despite also removing an occasional true and useful treatment effect from the coveted significance zone. Regardless, the need for also focusing on the magnitude of all treatment effects and their uncertainty (such as with confidence intervals) cannot be overstated.
Lowering the threshold of statistical significance is a temporizing measure. It would work as a dam that could help gain time and prevent drowning by a flood of statistical significance, while promoting better, more-durable solutions [6]. These solutions may involve abandoning statistical significance thresholds or P values entirely. If any thresholds are to continue in use, even lower thresholds are probably preferable for most observational research. Comprehensive reviews (termed umbrella reviews) that have evaluated multiple systematic reviews of observational studies propose a P < 10^-6 threshold [5]. In addition, falsification end-point methods (ie, using such P value thresholds that almost all well-established null associations will not be able to pass them) also point to very low P values [7]. With the advent of big data, statistical significance will increasingly mean very little because extremely low P values are routinely obtained for signals that are too small to be useful even if true.
Adopting lower P value thresholds may help promote a reformed research agenda with fewer, larger, and more carefully conceived and designed studies with sufficient power to pass these more demanding thresholds. However, collateral harms may also emerge. Bias may escalate rather than decrease if researchers and other interested parties (eg, for-profit sponsors) try to find ways to make the results have lower P values. Selected study end points may become even less clinically relevant because it is easier to reach lower P values with weak surrogate end points than with hard clinical outcomes. Moreover, results that pass a lower P value threshold may be limited by greater regression to the mean and new discoveries may have even more exaggerated effect sizes than before.
Because the proposed threshold of P < .005 is imperfect, other more difficult but more durable alternative solutions should also be contemplated (Table). These solutions vary based on how quickly and easily they can be adopted. They can target the use and interpretation of the past biomedical literature accumulated to date or the design and deployment of the new literature that will accumulate in the future. The situation is dire for the past literature because there is no perfect remedy after the fact. In the long-term, the scientific workforce will need to be more properly trained in using the best fit for purpose statistical inference tools and biases will need to be addressed preemptively rather than retrospectively. However, these may continue to be largely unachievable goals.

Table. Various Proposed Solutions for Improving Statistical Inference on a Large Scale
Data are becoming more complex. If time for rigorous training in methods and statistics for researchers and for research users remains limited, subpar medical statistics and concomitant misinterpretations may continue. Nevertheless, hopefully several fields will adopt better standards for P values, will decrease their dependence on P values, and enhance the adoption of other useful inferential tools (eg, Bayesian statistics) when appropriate. The rapidity and extent of these changes are unpredictable. Low adoption in the past may cause some pessimism. However, a fresh start and a rapid acceleration of adoption of better practices is always possible. Incentives from major journals and funders as well as radical changes in training curricula may be necessary to achieve more widespread and effective shifts.
References
1. Chavalarias D, Wallach JD, Li AH, Ioannidis JP. Evolution of reporting P values in the biomedical literature, 1990-2015. JAMA. 2016;315(11):1141-1148.
2. Ioannidis JP. Why most published research findings are false. PLoS Med. 2005;2(8):e124.
3. Wasserstein RL, Lazar NA. The ASA’s statement on P-values: context, process, and purpose. Am Stat. 2016;70(2):129-133.
4. Benjamin DJ, Berger JO, Johnson VE, et al. Redefine statistical significance. Nat Hum Behav. 2018;2:6-10.
5. Li X, Meng X, Timofeeva M, et al. Serum uric acid levels and multiple health outcomes. BMJ. 2017;357:j2376.
6. Resnick B. What a nerdy debate about P values shows about science - and how to fix it. https://www.vox.com/science-and-health/2017/7/31/16021654/p-values-statistical-significance-redefine-0005. Accessed February 1, 2018.
7. Prasad V, Jena AB. Prespecified falsification end points. JAMA. 2013;309(3):241-242.

16.3.18

Do clinicians want recommendations? A multi-center study comparing evidence summaries with and without GRADE recommendations

Neumann I, Alonso-Coello P, Vandvik PO, Agoritsas T, Mas G, Akl EA, et al.
Journal of Clinical Epidemiology, Article in press.

Abstract


Background

Evidence-based clinical practice guidelines provide recommendations to assist clinicians in decision-making and to reduce the gap between best current research evidence and clinical practice. However, some argue that providing pre-appraised evidence summaries alone, rather than recommendations, is more appropriate.

Objectives

To evaluate clinicians’ preferences, understanding of the evidence and intended course of action in response to evidence summaries with and without recommendations.

Methods

We included practicing clinicians attending educational sessions across 10 countries. Clinicians were randomized to receive relevant clinical scenarios supported by research evidence of low or very-low certainty, and accompanied by either strong or weak recommendations developed with the GRADE system. Within each group, participants were further randomized to receive the recommendation plus the corresponding evidence summary or the evidence summary alone. We evaluated participants’ preferences and understanding for the presentation strategy as well as their intended course of action.

Results

189/219 (86%) and 201/248 (81%) participants preferred having recommendations accompanying evidence summaries for strong and weak recommendations, respectively. Across all scenarios, fewer than half of participants correctly interpreted information provided in the evidence summaries (e.g. estimates of effect, certainty in the research evidence). Presence of a recommendation resulted in a more appropriate intended course of action for two scenarios involving strong recommendations.

Discussion

Evidence summaries alone are not enough to impact clinicians’ course of action. Clinicians clearly prefer having recommendations accompanying evidence summaries in the context of low or very-low certainty of evidence (Trial registration NCT02006017).

3.3.18

Vision screening for correctable visual acuity deficits in school-age children and adolescents.

Evans JR, Morjaria P, Powell C.
Cochrane Database Syst Rev. 2018 Feb 15;2:CD005023.

BACKGROUND: Although the benefits of vision screening seem intuitive, the value
of such programmes in junior and senior schools has been questioned. In addition,
there exists a lack of clarity regarding the optimum age for screening and
frequency at which to carry out screening.
OBJECTIVES: To evaluate the effectiveness of vision screening programmes carried 
out in schools to reduce the prevalence of correctable visual acuity deficits due
to refractive error in school-age children.
SEARCH METHODS: We searched the Cochrane Central Register of Controlled Trials
(CENTRAL) (which contains the Cochrane Eyes and Vision Trials Register) (2017,
Issue 4); Ovid MEDLINE; Ovid Embase; the ISRCTN registry; ClinicalTrials.gov and 
the ICTRP. The date of the search was 3 May 2017.
SELECTION CRITERIA: We included randomised controlled trials (RCTs), including
cluster-randomised trials, that compared vision screening with no vision
screening, or compared interventions to improve uptake of spectacles or
efficiency of vision screening.
DATA COLLECTION AND ANALYSIS: Two review authors independently screened search
results and extracted data. Our pre-specified primary outcome was uncorrected, or
suboptimally corrected, visual acuity deficit due to refractive error six months 
after screening. Pre-specified secondary outcomes included visual acuity deficit 
due to refractive error more than six months after screening, visual acuity
deficit due to causes other than refractive error, spectacle wearing, quality of 
life, costs, and adverse effects. We graded the certainty of the evidence using
GRADE.
MAIN RESULTS: We identified seven relevant studies. Five of these studies were
conducted in China with one study in India and one in Tanzania. A total of 9858
children aged between 10 and 18 years were randomised in these studies, 8240 of
whom (84%) were followed up between one and eight months after screening. Overall
we judged the studies to be at low risk of bias. None of these studies compared
vision screening for correctable visual acuity deficits with not screening. Two
studies compared vision screening with the provision of free spectacles versus
vision screening with no provision of free spectacles (prescription only). These 
studies provide high-certainty evidence that vision screening with provision of
free spectacles results in a higher proportion of children wearing spectacles
than if vision screening is accompanied by provision of a prescription only (risk
ratio (RR) 1.60, 95% confidence interval (CI) 1.34 to 1.90; 1092 participants).
The studies suggest that if approximately 250 per 1000 children given vision
screening plus prescription only are wearing spectacles at follow-up (three to
six months) then 400 per 1000 (335 to 475) children would be wearing spectacles
after vision screening and provision of free spectacles. Low-certainty evidence
suggested better educational attainment in children in the free spectacles group 
(adjusted difference 0.11 in standardised mathematics score, 95% CI 0.01 to 0.21,
1 study, 2289 participants). Costs were reported in one study in Tanzania in 2008
and indicated a relatively low cost of screening and spectacle provision
(low-certainty evidence). There was no evidence of any important effect of
provision of free spectacles on uncorrected visual acuity (mean difference -0.02
logMAR, 95% CI adjusted for clustering -0.04 to 0.01) between the groups at
follow-up (moderate-certainty evidence). Other pre-specified outcomes of this
review were not reported. Two studies explored the effect of an educational
intervention in addition to vision screening on spectacle wear. There was
moderate-certainty evidence of little apparent effect of the education
interventions investigated in these studies in addition to vision screening,
compared to vision screening alone for spectacle wearing (RR 1.11, 95% CI 0.95 to
1.31, 1 study, 3177 participants) or related outcome spectacle purchase (odds
ratio (OR) 0.84, 95% CI 0.55 to 1.31, 1 study, 4448 participants). Other
pre-specified outcomes of this review were not reported. Three studies compared
vision screening with ready-made spectacles versus vision screening with
custom-made spectacles. These studies provide moderate-certainty evidence of no
clinically meaningful differences between the two types of spectacles. In one
study, mean logMAR acuity in better and worse eye was similar between groups
(mean difference (MD) better eye 0.03 logMAR, 95% CI 0.01 to 0.05; 414
participants; MD worse eye 0.06 logMAR, 95% CI 0.04 to 0.08; 414 participants).
There was high-certainty evidence of no important difference in spectacle wearing
(RR 0.98, 95% CI 0.91 to 1.05; 1203 participants) between the two groups and
moderate-certainty evidence of no important difference in quality of life between
the two groups (the mean quality-of-life score measured using the National Eye
Institute Refractive Error Quality of Life scale 42 was 1.42 better (1.04 worse
to 3.90 better) in children with ready-made spectacles; 1 study of 188
participants). Although none of the studies reported on costs directly,
ready-made spectacles are cheaper and may represent considerable cost-savings for
vision screening programmes in lower income settings. There was low-certainty
evidence of no important difference in adverse effects between the two groups.
Adverse effects were reported in one study and were similar between groups. These
included blurred vision, distorted vision, headache, disorientation, dizziness,
eyestrain and nausea.
AUTHORS' CONCLUSIONS: Vision screening plus provision of free spectacles improves
the number of children who have and wear the spectacles they need compared with
providing a prescription only. This may lead to better educational outcomes.
Health education interventions, as currently devised and tested, do not appear to
improve spectacle wearing in children. In lower-income settings, ready-made
spectacles may provide a useful alternative to expensive custom-made spectacles.
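
The absolute figures quoted in the main results (400 per 1000, range 335 to 475) follow from applying the risk ratio and its confidence limits to the assumed baseline of 250 per 1000. A minimal sketch of that arithmetic, with function and variable names chosen here for illustration:

```python
def expected_per_1000(baseline_per_1000: float, risk_ratio: float) -> int:
    """Apply a risk ratio to a baseline rate expressed per 1000 children."""
    return round(baseline_per_1000 * risk_ratio)


# Values taken from the abstract: baseline spectacle wear with prescription only,
# and the risk ratio with its 95% confidence limits for free spectacles.
BASELINE_WEARING = 250
RR_POINT, RR_LOWER, RR_UPPER = 1.60, 1.34, 1.90

point = expected_per_1000(BASELINE_WEARING, RR_POINT)
lower = expected_per_1000(BASELINE_WEARING, RR_LOWER)
upper = expected_per_1000(BASELINE_WEARING, RR_UPPER)
print(f"{point} per 1000 ({lower} to {upper})")
```

The same multiplication with a different baseline gives the expected absolute effect in a setting where spectacle wear without free provision is higher or lower, assuming the relative effect carries over.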

Interventions to reduce accidents in childhood: a systematic review.

Barcelos RS, Del-Ponte B, Santos IS.
J Pediatr (Rio J). 2017 Dec 30. pii: S0021-7557(17)30798-2.
OBJECTIVE: To review the literature on interventions planned to prevent the
incidence of injuries in childhood.
SOURCE OF DATA: The PubMed, Web of Science, and Bireme databases were searched by
two independent reviewers, employing the single terms accidents, accident,
injuries, injury, clinical trial, intervention, educational intervention, and
multiple interventions, and their combinations, present in the article title or
abstract, with no limits except period of publication (2006-2016) and studies in 
human subjects.
SYNTHESIS OF DATA: Initially, 11,097 titles were located. Fifteen articles were
selected for the review. Eleven were randomized trials (four carried out at the
children's households, five in pediatric healthcare services, and two at
schools), and four were non-randomized trials carried out at the children's
households. Four of the randomized trials were analyzed by intention-to-treat and
a protective effect of the intervention was observed: decrease in the number of
risk factors, decrease in the number of medical consultations due to injuries,
decrease in the prevalence of risk behaviors, and increase of the parents'
knowledge regarding injury prevention in childhood.
CONCLUSION: Traumatic injuries in childhood are amenable to primary prevention
through strategies that consider the child's age and level of development, as
well as structural aspects of the environment.

A systematic review of the effects of supervised toothbrushing on caries incidence in children and adolescents.

Dos Santos APP, de Oliveira BH, Nadanovsky P.
Int J Paediatr Dent. 2018 Jan;28(1):3-11. doi: 10.1111/ipd.12334. Epub 2017 Sep 21.
BACKGROUND: The anticaries effect of supervised toothbrushing, irrespective of
the effect of fluoride toothpaste, has not been clearly determined yet.
AIM: To assess the effects of supervised toothbrushing on caries incidence in
children and adolescents.
DESIGN: A systematic review of controlled trials was performed (CRD42014013879). 
Electronic and hand searches retrieved 2046 records, 112 of which were read in
full and independently assessed by two reviewers, who collected data regarding
characteristics of participants, interventions, outcomes, length of follow-up and
risk of bias.
RESULTS: Four trials were included and none of them had low risk of bias. They
were all carried out in schools, but there was great variation regarding
children's age, fluoride content of the toothpaste, baseline caries levels and
the way caries incidence was reported. Among the four trials, two found
statistically significant differences favouring supervised toothbrushing, but
information about the magnitude and/or the precision of the effect estimate was
lacking and in one trial clustering effect was not taken into consideration. No
meta-analysis was performed due to the clinical heterogeneity among the included 
studies and differences in the reporting of data.
CONCLUSIONS: There is no conclusive evidence regarding the effectiveness of
supervised toothbrushing on caries incidence.

16.2.18

Physical activity, diet and other behavioural interventions for improving cognition and school achievement in children and adolescents with obesity or overweight.

Martin A, Booth JN, Laird Y, et al. Cochrane Database Syst Rev. 2018 Jan 29;1:CD009728. doi: 10.1002/14651858.CD009728.pub3. (Review) PMID: 29376563

BACKGROUND: The global prevalence of childhood and adolescent obesity is high. Lifestyle changes towards a healthy diet, increased physical activity and reduced sedentary activities are recommended to prevent and treat obesity. Evidence suggests that changing these health behaviours can benefit cognitive function and school achievement in children and adolescents in general. There are various theoretical mechanisms that suggest that children and adolescents with excessive body fat may benefit particularly from these interventions.

OBJECTIVES: To assess whether lifestyle interventions (in the areas of diet, physical activity, sedentary behaviour and behavioural therapy) improve school achievement, cognitive function (e.g. executive functions) and/or future success in children and adolescents with obesity or overweight, compared with standard care, waiting-list control, no treatment, or an attention placebo control group.

SEARCH METHODS: In February 2017, we searched CENTRAL, MEDLINE and 15 other databases. We also searched two trials registries, reference lists, and handsearched one journal from inception. We also contacted researchers in the field to obtain unpublished data.

SELECTION CRITERIA: We included randomised and quasi-randomised controlled trials (RCTs) of behavioural interventions for weight management in children and adolescents with obesity or overweight. We excluded studies in children and adolescents with medical conditions known to affect weight status, school achievement and cognitive function. We also excluded self- and parent-reported outcomes.

DATA COLLECTION AND ANALYSIS: Four review authors independently selected studies for inclusion. Two review authors extracted data, assessed quality and risks of bias, and evaluated the quality of the evidence using the GRADE approach. We contacted study authors to obtain additional information. We used standard methodological procedures expected by Cochrane. Where the same outcome was assessed across different intervention types, we reported standardised effect sizes for findings from single-study and multiple-study analyses to allow comparison of intervention effects across intervention types. To ease interpretation of the effect size, we also reported the mean difference of effect sizes for single-study outcomes.

MAIN RESULTS: We included 18 studies (59 records) of 2384 children and adolescents with obesity or overweight. Eight studies delivered physical activity interventions, seven studies combined physical activity programmes with healthy lifestyle education, and three studies delivered dietary interventions. We included five RCTs and 13 cluster-RCTs. The studies took place in 10 different countries. Two were carried out in children attending preschool, 11 were conducted in primary/elementary school-aged children, four studies were aimed at adolescents attending secondary/high school and one study included primary/elementary and secondary/high school-aged children. The number of studies included for each outcome was low, with up to only three studies per outcome. The quality of evidence ranged from high to very low and 17 studies had a high risk of bias for at least one item. None of the studies reported data on additional educational support needs and adverse events.

Compared to standard practice, analyses of physical activity-only interventions suggested high-quality evidence for improved mean cognitive executive function scores. The mean difference (MD) was 5.00 scale points higher in an after-school exercise group compared to standard practice (95% confidence interval (CI) 0.68 to 9.32; scale mean 100, standard deviation 15; 116 children, 1 study). There was no statistically significant beneficial effect in favour of the intervention for mathematics, reading, or inhibition control. The standardised mean difference (SMD) for mathematics was 0.49 (95% CI -0.04 to 1.01; 2 studies, 255 children, moderate-quality evidence) and for reading was 0.10 (95% CI -0.30 to 0.49; 2 studies, 308 children, moderate-quality evidence). The MD for inhibition control was -1.55 scale points (95% CI -5.85 to 2.75; scale range 0 to 100; SMD -0.15, 95% CI -0.58 to 0.28; 1 study, 84 children, very low-quality evidence). No data were available for average achievement across subjects taught at school.

There was no evidence of a beneficial effect of physical activity interventions combined with healthy lifestyle education on average achievement across subjects taught at school, mathematics achievement, reading achievement or inhibition control. The MD for average achievement across subjects taught at school was 6.37 points lower in the intervention group compared to standard practice (95% CI -36.83 to 24.09; scale mean 500, scale SD 70; SMD -0.18, 95% CI -0.93 to 0.58; 1 study, 31 children, low-quality evidence). The effect estimate for mathematics achievement was SMD 0.02 (95% CI -0.19 to 0.22; 3 studies, 384 children, very low-quality evidence), for reading achievement SMD 0.00 (95% CI -0.24 to 0.24; 2 studies, 284 children, low-quality evidence), and for inhibition control SMD -0.67 (95% CI -1.50 to 0.16; 2 studies, 110 children, very low-quality evidence). No data were available for the effect of combined physical activity and healthy lifestyle education on cognitive executive functions.

There was a moderate difference in the average achievement across subjects taught at school favouring interventions targeting the improvement of the school food environment compared to standard practice in adolescents with obesity (SMD 0.46, 95% CI 0.25 to 0.66; 2 studies, 382 adolescents, low-quality evidence), but not with overweight. Replacing packed school lunch with a nutrient-rich diet in addition to nutrition education did not improve mathematics (MD -2.18, 95% CI -5.83 to 1.47; scale range 0 to 69; SMD -0.26, 95% CI -0.72 to 0.20; 1 study, 76 children, low-quality evidence) and reading achievement (MD 1.17, 95% CI -4.40 to 6.73; scale range 0 to 108; SMD 0.13, 95% CI -0.35 to 0.61; 1 study, 67 children, low-quality evidence).

AUTHORS' CONCLUSIONS: Despite the large number of childhood and adolescent obesity treatment trials, we were only able to partially assess the impact of obesity treatment interventions on school achievement and cognitive abilities. School and community-based physical activity interventions as part of an obesity prevention or treatment programme can benefit executive functions of children with obesity or overweight specifically. Similarly, school-based dietary interventions may benefit general school achievement in children with obesity. These findings might assist health and education practitioners to make decisions related to promoting physical activity and healthy eating in schools. Future obesity treatment and prevention studies in clinical, school and community settings should consider assessing academic and cognitive as well as physical outcomes.