Key Points

The results of this study suggest that neurodevelopmental disorders are under-recorded in the Clinical Practice Research Datalink.

Accurate recording within electronic healthcare databases should be encouraged given their increasing use in the evaluation of medication safety.

It is important to generate risk estimates from a number of sources and understand the strengths and weaknesses of the methodological types in order to aid the translation of research findings to the clinic for pre-conceptual counselling.

1 Introduction

The use of electronic healthcare databases containing anonymised, routinely collected healthcare data for therapeutic adverse event monitoring is increasing. This approach has the potential to offer several advantages over prospective observational studies, including larger cohort size and immediately available, prospectively recorded, longitudinal population-level data. The sensitivity of such routinely collected data for the detection of associations with adverse neurodevelopmental outcomes and its comparability with data generated through prospective longitudinal cohort studies is, however, unknown.

Neurobehavioural/neurodevelopmental teratogens are agents that may adversely impact the neuropsychological and behavioural functioning of an individual who was exposed to that agent in utero. However, established teratogen surveillance systems are generally designed to identify structural teratogenic effects through monitoring rates of birth defects in exposed offspring at birth. As a result, for most medicines, data on longer-term development are absent and an effect on cognition or behaviour may go undetected for many years. This is evidenced by the findings of the recent European-wide review on sodium valproate, an antiepileptic drug (AED) with a prior well-established physical embryopathy but which is only now recognised to carry a significant risk of lifelong developmental effects into adulthood [1]. The possibility of using electronic healthcare records to detect signals of neurobehavioural teratogenicity in ‘real-time’ is therefore a very attractive proposition, especially for newer AEDs and medicines such as antidepressants that are frequently prescribed during pregnancy.

Although the neurodevelopmental effects of in utero valproate exposure have been identified almost consistently, risk estimates vary, with prevalence estimates for autism ranging from 3 to 17% [2,3,4,5]. Christensen et al. [5] reported an increased prevalence of autism following exposure to valproate using Danish electronic healthcare data, but these rates were lower than those reported by an earlier UK-based prospective study by Bromley et al. [2] and in other prospective cohort studies [3, 4]. This raises questions about the utility of using electronic healthcare databases for neurobehavioural teratogen signal detection, or to quantify risk, given the subtle and varying patterns with which neurodevelopmental disorders (NDDs) may present. It is evident that not all electronic healthcare databases are equivalent, and while some offer national population data that have been systematically collected over long periods, others, such as the Clinical Practice Research Datalink (CPRD), cover largely primary care data for a proportion of the UK population only. CPRD data have previously been found to have a reduced prevalence of structural malformations in children exposed to valproate in comparison to a prospective observational study [6], which raises concerns that NDDs may also be under-recorded. It is therefore important that the reliability of routinely collected data to detect neurodevelopmental outcomes in children is investigated and that both healthcare providers and patients understand how risk estimates generated from electronic healthcare data compare with those from face-to-face reviews.

This study therefore aimed to determine whether data from the UK’s CPRD produced similar risk estimates to a prospective UK longitudinal face-to-face study, conducted by the Liverpool and Manchester Neurodevelopment Group [2], in relation to the risk of specific NDDs following prenatal exposure to AEDs.

2 Methods

2.1 Overview

Data from the CPRD were analysed to assess the prevalence of child neurodevelopmental outcomes for individual AEDs. The CPRD analysis mirrored the Liverpool and Manchester prospective UK cohort study as closely as possible to enable comparison of the findings from each dataset.

2.2 Clinical Practice Research Datalink (CPRD)

The CPRD contains anonymised, longitudinal patient medical and prescribing records routinely collected within general practice. As of July 2013 it contained data on over 11 million patients and was actively collecting data on ~6.9% of the UK population [7]. Data are entered as Read clinical codes and include information relating to pregnancy, symptoms and diagnoses, referrals and issued prescriptions. In addition to Read codes, general practitioners (GPs) can enter, alongside the coded entry, additional detail as non-coded ‘free text’. At the time of the study, anonymised free text could be requested from the database provider to help verify a Read code diagnosis, and for patients currently registered in the database it was possible to request an anonymised photocopy of their full paper medical record. However, access to the free-text service ceased in October 2013 due to changes in governance requirements.

2.3 Liverpool and Manchester Study

The prospective cohort of the Liverpool and Manchester Neurodevelopment Group [2, 8] consisted of women with epilepsy (WWE) and women without epilepsy recruited between 2000 and 2004 from 11 hospital antenatal clinics in North-West England. For each pregnant woman with epilepsy a woman without epilepsy was identified by reviewing records to identify women who had attended the antenatal clinic on the same date. Women were matched on age, parity, residential district and employment. When recruited into the study during their pregnancy, women were asked about their exposure to AEDs during pregnancy. The offspring were then followed prospectively and at the 6-year assessment parents were asked about the health and development of their child, including any diagnoses of physical or developmental difficulties such as autistic spectrum disorder (ASD), attention–deficit hyperactivity disorder (ADHD) and/or dyspraxia that had been made independently of the research team. A positive report was followed up to confirm the diagnosis with the relevant medical records (i.e. specialist letters) or the healthcare professional/school. Outcome data at 6 years were collected for 201 children born to WWE (175 AED exposed, 26 unexposed) and 214 born to women without epilepsy. Further details of the study can be found elsewhere [2, 8] and a summary is shown in Fig. 1.

Fig. 1 Summary of selection of the study cohorts. AED antiepileptic drug, CPRD Clinical Practice Research Datalink, GP general practitioner, NDD neurodevelopmental disorder

2.4 CPRD study population

All pregnancies to WWE were identified in the CPRD, where the pregnancy ended in a live delivery between 1 January 2000 and 31 December 2006. Women were required to have been followed in the CPRD throughout pregnancy and for the 6 months prior. WWE were identified based on a combination of epilepsy diagnosis codes, seizure codes and AED prescriptions (see Electronic Supplementary Material 1). The start date of each pregnancy was estimated using an algorithm that incorporated information from all pregnancy-related codes in the woman’s record [9]; where insufficient information was available (18.8% of pregnancies) a default duration of 280 days was assigned. The medical records of the mothers were linked to those of the child using an algorithm described previously [10]. Linked mother–child pairs were included if the child was still registered in the CPRD at age 6 years and 3 months; this cut-off was chosen as most children in the prospective study were assessed shortly after their sixth birthday. Each eligible WWE mother–child pair was randomly matched to six mother–child pairs where the mother did not meet the epilepsy criteria and had not been prescribed an AED at any time prior to her child turning age 6 years. Mother–child pairs were matched on maternal age, year of delivery, sex of the child and GP practice as a proxy for socioeconomic status (or the socioeconomic status of the GP practice where GP practice matching was not possible).
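The date handling described above can be sketched as follows. This is a minimal illustration and not the published algorithm [9]: the function names, the `gestation_days` argument and the month arithmetic are assumptions introduced here. Only the 280-day default and the cut-off of 6 years and 3 months come from the text.

```python
from datetime import date, timedelta
from typing import Optional

DEFAULT_GESTATION_DAYS = 280  # default duration stated in the text


def estimate_pregnancy_start(delivery: date, gestation_days: Optional[int] = None) -> date:
    """Return the estimated pregnancy start date.

    gestation_days is a hypothetical input standing in for whatever the
    pregnancy-related codes imply; when it is missing, fall back to 280 days.
    """
    days = DEFAULT_GESTATION_DAYS if gestation_days is None else gestation_days
    return delivery - timedelta(days=days)


def age_cutoff(birth: date, years: int = 6, months: int = 3) -> date:
    """Date on which the child reaches the age cut-off (6 years 3 months)."""
    total_months = birth.month - 1 + months
    year = birth.year + years + total_months // 12
    month = total_months % 12 + 1
    # Clamp the day for short months (e.g. 31 January + 3 months -> 30 April).
    for day in range(birth.day, 0, -1):
        try:
            return date(year, month, day)
        except ValueError:
            continue
    raise ValueError("unreachable: day 1 exists in every month")


def eligible(birth: date, registration_end: date) -> bool:
    """The child must still be registered at age 6 years and 3 months."""
    return registration_end >= age_cutoff(birth)
```

Under these assumptions, a child born on the last eligible delivery date (31 December 2006) reaches the cut-off on 31 March 2013.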

All AED prescriptions issued to WWE during pregnancy or the 6 months prior were identified. The duration of each prescription was calculated based on an algorithm that used information on the number of tablets dispensed and the dosage instruction. The prescriptions were then mapped, taking into account evidence of polytherapy use or drug switching. Based on the mapped prescriptions, AED exposure during the 6 months before pregnancy and between the start and end of pregnancy was determined. As with the Liverpool and Manchester study, treatment was classed as polytherapy where the woman was prescribed a second AED (including a benzodiazepine) for any length of time.
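A simplified sketch of this kind of prescription mapping is given below. It is not the study's algorithm: the `Prescription` class, the assumption that the dosage instruction has already been parsed into a number of tablets per day, and the optional `lookback_days` argument are all introduced here for illustration. The sketch also treats any two distinct AEDs within the window as polytherapy and does not attempt to separate drug switching from concurrent use.

```python
import math
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Prescription:
    drug: str
    issued: date
    tablets: int            # number of tablets dispensed
    tablets_per_day: float  # assumed already parsed from the dosage instruction

    @property
    def end(self) -> date:
        """Last day covered, assuming the tablets are taken exactly as instructed."""
        days = max(1, math.ceil(self.tablets / self.tablets_per_day))
        return self.issued + timedelta(days=days - 1)


def exposed_drugs(prescriptions, window_start: date, window_end: date) -> set:
    """Drugs with at least one covered day inside [window_start, window_end]."""
    return {
        p.drug
        for p in prescriptions
        if p.issued <= window_end and p.end >= window_start
    }


def classify_regimen(prescriptions, pregnancy_start: date, pregnancy_end: date,
                     lookback_days: int = 0) -> str:
    """Classify exposure; lookback_days > 0 widens the window before pregnancy."""
    window_start = pregnancy_start - timedelta(days=lookback_days)
    drugs = exposed_drugs(prescriptions, window_start, pregnancy_end)
    if not drugs:
        return "unexposed"
    return "monotherapy" if len(drugs) == 1 else "polytherapy"


# Hypothetical record: valproate estimated to run out about 2 months before conception.
rx = [Prescription("valproate", date(2004, 1, 1), tablets=56, tablets_per_day=2.0)]
print(classify_regimen(rx, date(2004, 3, 27), date(2005, 1, 1)))
print(classify_regimen(rx, date(2004, 3, 27), date(2005, 1, 1), lookback_days=183))
```

Because the end date is calculated from the routine daily dose, a woman who tapers her dose more slowly than the instruction implies would be classed as unexposed by such a mapping even if her supply lasted into early gestation. Widening the window (here by 183 days, an approximation of 6 months) changes her classification.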

In line with the Liverpool and Manchester study, the NDD outcomes of interest were ASD, ADHD and dyspraxia. All children in the study population with a diagnosis recorded by age 6 years and 3 months were identified based on Read codes in their electronic record. The study period was chosen to ensure all children reached the age cut-off by 30 April 2013, when the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) was introduced; this avoided any potential differences due to the changes in terminology and definitions of the outcomes of interest between the fourth (DSM-IV) and fifth editions. The NDDs identified were verified by requesting full photocopied medical records for infants still registered in the CPRD and ‘free text’ for those no longer registered or where there was no response to the photocopied record request. These were reviewed by authors RC, RB, AW and LY, who were blinded to the maternal medication exposure status of the child.

Covariate information was extracted on maternal smoking status, alcohol consumption, quintile of deprivation at a GP practice level, folic acid 5 mg and seizures during pregnancy. As with the Liverpool and Manchester study, mother–child pairs were excluded if one or both had evidence of a diagnosis likely to influence neurodevelopmental outcome (e.g. neurofibromatosis, Down’s syndrome).

2.5 Analysis

The prevalence of NDDs was determined for mother–child pairs with epilepsy stratified by AED treatment regimen during pregnancy and for those in the matched group without epilepsy. Conditional logistic regression was used to determine the likelihood of an NDD diagnosed by age 6 years and 3 months in the children of WWE exposed to different AED regimens during pregnancy compared with women who did not have epilepsy. Adjustments were made for any covariates where p < 0.2 in the univariate analysis. To compare the results of this CPRD study and the face-to-face study, differences between the proportions of NDDs identified in the two studies for each treatment regimen were calculated along with 95% confidence intervals (CIs) of these differences [11]. Primary analyses included all NDD outcomes that could be verified as well as those where there was insufficient information to verify or refute the diagnosis. A sensitivity analysis was carried out restricted to NDDs where the diagnosis could be verified. An additional sensitivity analysis was later carried out extending the AED exposure window to include the 6 months before the start of pregnancy in addition to the pregnancy itself. This was because the dosage instructions recorded in the CPRD for women who discontinued therapy did not appear to allow for a gradual reduction in dose and it is likely that women will titrate down and not discontinue AED therapy abruptly. It is therefore possible that some women categorised by the prescription mapping as being AED unexposed during pregnancy actually continued the medication they were taking prior to pregnancy for considerably longer than the dosage instructions suggested, resulting in them having been exposed during the early weeks of gestation.
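The comparison of proportions between the two studies can be illustrated with the short sketch below. The interval method of reference [11] is not described in this section, so a simple Wald (normal approximation) interval is used here as a stand-in; the function name is an assumption, and the example counts are illustrative values chosen to correspond to 2.16% of 1018 and 7.46% of 201 rather than figures taken from the study tables.

```python
import math


def proportion_difference_ci(events_a: int, n_a: int, events_b: int, n_b: int,
                             z: float = 1.96):
    """Difference in proportions (a minus b) with a Wald 95% CI.

    Stand-in method: the interval used in the study may differ.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


# Illustrative counts only (database cohort vs. face-to-face cohort).
diff, lower, upper = proportion_difference_ci(22, 1018, 15, 201)
print(f"difference = {diff:.2%}, 95% CI {lower:.2%} to {upper:.2%}")
```

A Wald interval can behave poorly when event counts are small, as they are in several of the individual AED groups, which is one reason a score-based interval might be preferred in practice.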

3 Results

Within the CPRD, 1030 eligible mother–child pairs were identified where the mother had epilepsy and these were matched to 6180 mother–child pairs without epilepsy. Seventy-one infants and one mother (1.0%) had a Read code for a condition that could influence neurodevelopmental outcome (see Electronic Supplementary Material 2) and these 72 mother–child pairs were excluded from the study. Of those excluded, 12 were in the epilepsy cohort and therefore their respective six matched mother–child pairs were also excluded. After exclusions, the final study cohort included 7066 mother–child pairs: 1018 WWE and 6048 women without epilepsy. Fifty-four percent of WWE received an AED during pregnancy: 79% monotherapy and 21% polytherapy. WWE were more likely than women without epilepsy to be smokers (p < 0.001) but were less likely to drink alcohol (p < 0.001); this mirrored the Liverpool and Manchester cohort [2]. Characteristics of the two study cohorts are shown in Table 1.
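The cohort sizes reported above can be checked with simple arithmetic. The sketch assumes that none of the 60 excluded comparison pairs belonged to the matched sets of the 12 excluded WWE pairs; the reported totals are consistent with this, but the text does not state it. Variable names are illustrative.

```python
MATCH_RATIO = 6                       # comparison pairs per WWE pair

wwe_identified = 1030
comparison_identified = wwe_identified * MATCH_RATIO        # 6180

excluded_total = 72                   # pairs with a condition affecting neurodevelopment
excluded_wwe = 12
excluded_comparison = excluded_total - excluded_wwe         # 60
matched_sets_removed = excluded_wwe * MATCH_RATIO           # 72 pairs matched to excluded WWE

final_wwe = wwe_identified - excluded_wwe
final_comparison = comparison_identified - excluded_comparison - matched_sets_removed
final_total = final_wwe + final_comparison

print(final_wwe, final_comparison, final_total)  # 1018 6048 7066
```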

Table 1 Characteristics of the two study populations

Eighty-seven Read code diagnoses, in 84 children, were identified for one of the NDDs of interest: 47 for ASD, 30 for ADHD, four for dyspraxia and three children had codes for both ASD and ADHD. Verification of the outcomes resulted in 63 being verified, four being refuted and for 20 there was insufficient information to confirm or refute (see Electronic Supplementary Material 3). Four of the mother–child pairs excluded from the study cohort, due to evidence of a condition that could influence neurodevelopment, had one of the outcomes of interest: two in the epilepsy cohort (one valproate polytherapy exposed and one unexposed to AEDs) and two in the non-epilepsy cohort. In addition, one child was identified as having ADHD based on a Read code and review of the free text confirmed this diagnosis and also provided evidence of an ASD diagnosis, even though no ASD Read codes were present.

In both the CPRD and the Liverpool and Manchester study, NDDs were more frequently reported in the children of WWE than children of women without epilepsy, although the risk estimate was much lower in the CPRD (2.16 vs. 0.96%, p < 0.001 and 7.46 vs. 1.87%, p = 0.0128, respectively). The prevalence of NDDs varied by specific AED exposure in both study cohorts (Tables 2, 3). In the CPRD, the prevalence of NDDs was increased amongst offspring antenatally exposed to carbamazepine, valproate (monotherapy and polytherapy) and non-valproate polytherapy combinations when compared with offspring born to women without epilepsy, but these increases did not reach statistical significance (Table 2).

Table 2 Prevalence of neurodevelopmental disorders and crude and adjusted odds ratios by antiepileptic drug group: Clinical Practice Research Datalink study
Table 3 Prevalence of neurodevelopmental disorders and crude and adjusted odds ratios by antiepileptic drug group: study by the Liverpool and Manchester Neurodevelopment Group (adapted from Bromley et al. [2], with permission)

In the CPRD cohort, a significant increase in NDD risk was observed in the offspring of WWE and no AED exposure when compared with women without epilepsy. However, the two sensitivity analyses restricting NDDs to those that could be verified and extending the AED exposure window no longer demonstrated a significant increase in NDD risk (see Electronic Supplementary Material 4 and Table 4). When extending the AED exposure window to include the period 6 months before pregnancy, manual review of the electronic prescription records for the nine NDD cases in this group revealed that the algorithm predicted that two women had discontinued valproate just prior to pregnancy (one case of valproate monotherapy and the other of valproate polytherapy). The calculated prescription durations had estimated discontinuation at ≤10 weeks prior to conception; however, such predictions were based on estimated routine daily dose information and did not take into account the likely slow tapering of the medications prior to discontinuation. If tapering did occur this could have extended the time window of exposure so that it overlapped with early gestation. Extending the AED exposure window resulted in a significant increase in risk for valproate polytherapy (adjusted odds ratio [ORadj] 7.32 [95% CI 1.65–32.57]) and increased the risk estimate for valproate monotherapy (ORadj 2.97 [95% CI 0.84–10.49]), although this still did not reach statistical significance.

Table 4 Results of the sensitivity analysis in the Clinical Practice Research Datalink study extending the window of exposure to include exposure during the 6 months before pregnancy

The prevalence of NDDs in the CPRD, with the exception of the carbamazepine and epilepsy and no AED exposure cohorts, was lower than in the Liverpool and Manchester study (Fig. 2). Analysis of the differences in proportions found that for ‘lamotrigine’, ‘valproate’ and the ‘other monotherapy’ categories, the proportion of NDDs observed in the CPRD was significantly lower than in the Liverpool and Manchester study, although the CIs were wide (Table 5). The proportions of NDDs observed for all WWE and WWE who had AED exposure were also significantly lower in the CPRD than in the face-to-face study. The exposure window sensitivity analysis did not alter these findings, although the ‘monotherapy other’ category did reduce to borderline significance. In contrast to the Liverpool and Manchester study, the CPRD data did not reproduce the significantly increased risk of NDDs at 6 years and 3 months following in utero exposure to valproate [2].

Fig. 2 Prevalence of neurodevelopmental outcomes by study and antiepileptic drug group (adapted from Bromley et al. [2], with permission). AED antiepileptic drug, CBZ carbamazepine, CPRD Clinical Practice Research Datalink, LMNDG Liverpool and Manchester Neurodevelopment Group, LTG lamotrigine, mono monotherapy, poly polytherapy, VPA valproate, WWE women with epilepsy

Table 5 Comparison of proportions of neurodevelopmental disorders between the Clinical Practice Research Datalink study and the Liverpool and Manchester study stratified by antiepileptic drug exposure

4 Discussion

This study identified a lower prevalence of NDDs in the CPRD data than in the Liverpool and Manchester cohort, both overall and for certain AED exposure groups. Using data from the CPRD it was possible to reproduce the significant increase in risk of NDDs in WWE compared with women without epilepsy, but it was not possible to reproduce the significant association for valproate monotherapy exposure, although the point estimate did fall within the corresponding CI of the Liverpool and Manchester study. Analysis of the CPRD data did identify an increased risk of NDDs in WWE who had no AED exposure during pregnancy, but this was no longer significant following sensitivity analyses. The sensitivity analysis to extend the time window of exposure to include the 6 months before pregnancy, allowing for a longer duration of exposure resulting from the likely tapering of AED dose before discontinuation, resulted in a significant increase in risk following valproate polytherapy exposure and a small increase for valproate monotherapy, although this still did not reach statistical significance.

The lower prevalence of NDDs in the CPRD, in comparison with the Liverpool and Manchester study, suggests potential under-recording of NDDs within the CPRD and this is evidenced in a number of ways. The overall prevalence of all three NDDs combined in the CPRD cohort for women without epilepsy was 0.96% and this is lower than the background population prevalence of ASD alone, which is estimated to be between 1.16 and 1.57% in the UK general population [12, 13]. The rate of NDDs within the CPRD study cohort was also lower than other prospective observational cohort studies where rates following valproate monotherapy exposure ranged from 3.8 to 8.9% [2,3,4]. Finally, the rate of NDDs in the CPRD was also lower than those reported by the only other electronic healthcare record study assessing specific neurodevelopmental outcomes for AEDs [5]. In this nationwide population-based study, Christensen and colleagues [5], using routinely collected Danish healthcare and pharmacy records, identified a prevalence for all ASDs within a cohort of monotherapy valproate exposed infants of 3.09% (12/388). Although this prevalence is still low in comparison with the Liverpool and Manchester and other observational studies [2, 3], this study, using Danish electronic healthcare records, did replicate the finding of a greater risk for the valproate-exposed cohort. The finding of a lower prevalence in the CPRD is also consistent with that observed in an earlier CPRD study in which the prevalence of major congenital malformations within valproate-exposed children was lower than that reported by the UK Epilepsy and Pregnancy Register [6].

The reason for under-recording of NDDs within the CPRD remains unclear and to our knowledge no study has reported on the sensitivity of their recording in the CPRD. It is possible that in some cases only symptoms of a condition are entered as Read codes and the actual diagnosis is either not entered at all or is only entered as free text or as a scanned letter from a specialist. This possibility is supported by the case identified in this study where the free text reported an ASD diagnosis but no ASD Read code had been entered. A recent study, using data from The Health Improvement Network database in the UK, evaluated the risk of NDDs following exposure to valproate using a broad range of Read codes describing general developmental delay and behavioural problems, including codes for speech delay and language difficulties, rather than specific NDD diagnoses [14]. This study found an almost three-fold increased risk of a child having a record of one of these codes in women prescribed valproate during pregnancy compared with women not prescribed an anticonvulsant mood stabiliser. It is possible that these results may support the theory that GPs are more likely to record symptom-related codes than specific autism, ADHD and dyspraxia codes. However, this study did not verify the Read codes identified, evaluate the sensitivity or specificity of recording or report on the proportion of children with specific NDD diagnosis codes. Under-recording could also potentially occur if there is a delay between a diagnosis being made by a specialist and information being entered into the electronic system and this could have resulted in misclassification of outcome status for children in the CPRD diagnosed close to the upper age cut-off.

The percentage of CPRD Read code diagnoses that could be verified by free text or photocopied records in our study was higher for ASD than for ADHD and dyspraxia. This may in part be explained by the age cut-off, with a recent study demonstrating that approximately 75% of incident ADHD cases in the CPRD are diagnosed beyond 6 years of age [15]. Within the free text we did find references to the fact that a child was too young for a formal diagnosis, so it is possible that GPs are entering an ADHD code as a working diagnosis but the child is not referred to receive a definitive diagnosis until they are older. This should not, however, have affected the comparison with the Liverpool and Manchester study as the cut-off for age was matched.

Differences between the CPRD and Liverpool and Manchester study populations may also explain some of the observed differences. Both studies matched on maternal age and a measure of geographical location and socioeconomic status. In the CPRD it was not possible to obtain reliable information on parity, but to our knowledge this has not been found to be associated with NDD risk. The CPRD study matched on the sex of the offspring, which was not possible in the Liverpool and Manchester study as recruitment was prenatal. ASD is more common in males and it is possible that matching may have reduced the level of association; however, as the Liverpool and Manchester study adjusted for the sex of the offspring in the statistical analysis it is unlikely that this will explain the differences observed. The sample size in the CPRD study was larger, both in terms of the number of WWE and the use of 1:6 matching, which increased the statistical power and produced more stable effect estimates with narrower CIs. However, comparing this study to the larger population study by Christensen and colleagues [5] demonstrates that this discrepancy is not purely one of sample size. Observational studies have in the past ascertained relatively small cohorts from specialist clinics and may therefore have included mother–child participants at higher risk of adverse outcome on the basis of both disease severity and higher exposure dose. The dose-related effects of valproate on the fetus have been documented in relation to major congenital malformations [16], reduced IQ [17] and in the parental ratings of autistic behaviours [4]. It is thus possible that ascertainment bias may account for some of the differences and that the prevalence rates from observational studies are higher than those generated by population-based electronic health record studies as a consequence of different methodological approaches to data collection.

Differences were also present in the method of recruitment; in the Liverpool and Manchester study women were enrolled if they reported a history of epilepsy [8], whilst in the CPRD study identification of epilepsy was dependent on a combination of Read codes and therapy records. Approximately 45% of the women identified as having epilepsy in the CPRD did not receive an AED prescription during pregnancy. It is possible that some women may have received their AED prescriptions from secondary care, resulting in exposure misclassification, although these numbers are likely to be small as often a specialist will initiate treatment and repeat prescribing will be continued by the patient’s GP. However, despite excluding women with only a single epilepsy code, there may have been some women with evidence suggestive of epilepsy who did not have a true diagnosis of epilepsy. A previous study in the UK has found that approximately 20% of individuals with a Read code for epilepsy were not listed in their general practice’s Epilepsy Register and of these only 14% were on epilepsy medication or had experienced seizures in the previous year. Unfortunately, it was not possible to compare the breakdown of AED exposures in the CPRD, as the Liverpool and Manchester study set out to recruit 50 women to each exposure category rather than WWE in general [8].

The classification of AED exposure differed between the two studies, with the Liverpool and Manchester study using data from maternal interviews and medical notes and the CPRD using data on the issue of a prescription; as with all data from electronic records, no information was available in the CPRD on adherence and whether the medicine was actually taken as instructed. The finding of an association between children born to untreated WWE and an increased risk of NDD was not significant following record verification, which was undertaken in a blinded fashion. However, the possibility of confounding by indication should always be considered when a positive association between a maternal exposure and adverse outcome in the child is observed. It is also possible that exposure misclassification may have played a role in the differences. In the CPRD, the prescription records did not appear to allow for a tapering of dose, whereas in epilepsy AEDs are not routinely discontinued abruptly and tapering can take a number of months [18]. Manual review of the electronic prescription records for the nine NDD cases of WWE and reportedly no AED exposure demonstrated that two women had been estimated to have discontinued valproate just prior to pregnancy; however, if tapering of the dose did occur rather than an abrupt stop (as predicted by the algorithm), their window of exposure could have been extended to overlap with early gestation. Such a phenomenon could also explain the findings of others where women who were predicted to have ‘stopped’ valproate in the 6 months prior to pregnancy demonstrated an increased risk of having a child with an ASD [5]. An association between NDDs and WWE and no AED exposure has not been reported before and, therefore, given the lack of its significance following sensitivity analyses, its reliability is questionable. These results do highlight the need for sensitivity analyses when using electronic healthcare data in terms of the time window of exposure; this would also help account for any misclassification in the timing of conception and subsequent exposure at conception.

This study had the strength of being able to make a direct comparison between the results of two differently collected datasets of individuals using the same healthcare system during the same time period and in the same country. Despite the large size of the CPRD, some aspects of this study were limited by small numbers of exposed offspring and this meant it was not possible to determine whether the CPRD produces accurate measures of relative risk even though the prevalence estimates are lower. Restricting the CPRD study population to children still registered in the CPRD at age 6 years and 3 months had the benefit of making it directly comparable to the Liverpool and Manchester study but did have an impact on sample size. One of the strengths of electronic healthcare databases is the longitudinal nature of the data and some individuals in the cohort will have been followed beyond 6 years of age, but any diagnoses made after this cut-off were not captured in our study.

5 Conclusion

This study identified a lower prevalence of NDDs in the CPRD than in a prospective observational study matched for calendar time and age of the child and did not produce a signal that valproate as monotherapy is a neurodevelopmental teratogen. Consideration of the factors that may account for this strongly suggests that NDD diagnoses are under-recorded in this dataset. Data availability, reliability and completeness vary between electronic healthcare databases and this may affect the generalisability of our findings to other data sources. However, the finding of a potential misclassification of exposure status by estimated exposure timeframes applies to a large number of electronic resources and researchers need to give this consideration. As the use of electronic healthcare databases for the purposes of research increases, accurate recording should be encouraged as under-recording could have significant implications for educating about the risks associated with in utero exposure. This study has also demonstrated the value and importance of access to anonymised information held within the free-text fields of the CPRD. This has been a key strength of the database and the fact that this is no longer available will reduce the validity and quality of the data.

This study highlights the need for feasibility studies and sensitivity analyses to be carried out before risk assessment studies are initiated using electronic healthcare databases, in order to ensure the results can be correctly interpreted. The difference in findings between the two methodologies demonstrates the importance of generating risk estimates from a number of sources, including direct assessment studies, and that understanding the strengths and weaknesses of the methodological types is essential in aiding the translation of this information to the clinic for pre-conceptual counselling.