Introduction

Neandertal people are documented in the European and western Asian fossil record between 300 000 and 25 000 years ago (Mellars, 1992). They coexisted for 1000–6000 years, depending on the geographical area, with anatomically modern humans of Cro-Magnoid morphology who are first documented in Europe around 40 000 years ago (Mellars, 2006a). Cultural interactions are apparent between these groups (Mellars, 1999, 2006b; D'Errico, 2003), but all known Neandertal mitochondrial DNA (mtDNA) sequences are clearly distinct from both the almost contemporary Cro-Magnoid sequences studied (Caramelli et al., 2003; Serre et al., 2004) and modern European sequences (Krings et al., 1997, 2000). Therefore, the available mtDNA evidence, including Bayesian phylogenetic comparison of four Neandertal sequences with modern European sequences (Hebsgaard et al., 2007), does not seem easy to reconcile with the notion that Neandertals were part of the modern Europeans' genealogy, at least along the female lines. However, in principle, genetic drift could have eliminated the Neandertal lineages from the modern European mitochondrial pool even if interbreeding did occur (Nordborg, 1998; Relethford, 2001, 2008; Gutierrez et al., 2002). This could potentially explain why we now observe two distinct monophyletic clades, one leading to modern humans, and the other leading to Neandertals (Hebsgaard et al., 2007).

The recent sequencing of 1 million base pairs in one Neandertal's nuclear DNA has not made things much clearer. Noonan et al. (2006) simulated a single event of admixture 40 000 years ago and concluded that the maximum likelihood corresponded to no contribution of Neandertals to the modern human nuclear genome. By contrast, comparisons of polymorphic nuclear sites in the same Neandertal individual with those in modern humans and chimpanzees showed that this Neandertal individual had the derived allele in about 30% of cases. This finding was interpreted as incompatible with a simple model of population split followed by divergence between anatomically archaic (Neandertal) and modern humans (Green et al., 2006), a split that Hublin (personal communication to Green et al., 2006) would place around 400 000 years ago. Green et al. (2006) therefore concluded that non-negligible interbreeding is most likely to have occurred between the two human forms.

To better understand the genealogical relationships among the people of different morphologies who inhabited Europe, we developed a simulation approach on the basis of the coalescent theory. We explicitly modelled population sizes, ancient population structure and demographic changes in the Neandertal, Cro-Magnoid and modern European populations, thus testing hypotheses on the possibility that Neandertals made a contribution to the mitochondrial gene pool of modern Europeans.

Materials and methods

The observed data

Six Neandertal sequences spanning 360 base pairs of the hypervariable region 1 of mtDNA were available at the time we began the analysis. Two come from Feldhofer in Germany (Krings et al., 1997; Schmitz et al., 2002), two from the Vindija Cave in Croatia (Krings et al., 2000), one from Mezmaiskaya in the Russian Caucasus (Ovchinnikov et al., 2000)and one from Italy (Caramelli et al., 2006). There are two comparable sequences of Cro-Magnoids, both from southern Italy (Caramelli et al., 2003). Finally, 558 sequences of modern humans were chosen from geographically matching localities, namely Germany (Richards et al., 1996; Lutz et al., 1998), the Croatian island of Hvar (Tolk et al., 2001), the Caucasus (Nasidze and Stoneking, 2001) and central Italy (Francalacci et al., 1996); these regions were selected to sample comparable geographical ranges for Neandertals and modern Europeans.

The simulations

Mitochondrial genealogies of samples collected at different moments in time were simulated using a serial coalescent algorithm, Serial Simcoal (Anderson et al., 2005). Suppose that one has samples of sizes n0, n1, n2, …nk of populations studied t0, t1, t2,… tk generations ago. The program generates genealogies proceeding backwards in time, starting with n0 samples in the present (t0) and adding n1, n2, … nk samples at the appropriate moments in the past. For the sake of simplicity, in most simulations, all the Neandertal (n=6) and Cro-Magnoid (n=2) samples were placed, respectively, 42 500 and 24 000 years ago, which corresponds to their average age. However, preliminary tests gave essentially identical results when each sequence was assigned its exact age.

Once the genealogy had coalesced to the most recent common ancestor, mutations were randomly distributed onto the tree, generating variation in simulated sequences spanning 360 base pairs. We chose a finite-site mutation model with two potential allelic states for each site. The mutation rate was equal to either 0.05 per million years per nucleotide (as estimated from phylogenies; Pakendorf and Stoneking, 2005) or 0.5 per million years per nucleotide (as estimated from pedigrees; Howell et al., 2003). We chose a transition bias of 0.9375 and a rate-heterogeneity parameter of 0.26 (Meyer et al., 1999). Generation time was 25 years, consistent with previous simulation studies (Currat and Excoffier, 2004; Liu et al., 2006; Noonan et al., 2006; Fagundes et al., 2007). In preliminary simulations, we found that using lower or higher generation times (20 or 30 years, respectively) did not substantially affect our results. For each of the 17 demographic models tested (detailed below), we generated 1000 or 10 000 genealogies and calculated several measures of genetic diversity. We refer to these statistics as simulated values, which we compared with the observed values, calculated from the data by Arlequin ver.2.000 (Excoffier et al., 2005).

The shape of the simulated genealogies depends on sample sizes and population sizes, the latter being obviously difficult to define a priori. However, implausible parameters result in large differences between simulated and observed data, and hence in rejection of such models. Therefore, in the course of the simulations, we tried various population sizes for the three groups. The only empirical estimate for the Neandertal population is around 250 000 individuals (Biraben, 2003). For the Cro-Magnoid population, in the absence of any paleodemographic data, we interpolated between the Neandertal population size and the estimated European population at the dawn of agriculture, around 10 000 years ago, which has been estimated to have been about 1 000 000 persons (Currat and Excoffier, 2004). Thus, our estimate for the Cro-Magnoid population size was around 550 000 individuals. For modern Europeans of the four areas of interest, the current censuses give an overall population size of 33 600 000 people. As the effective population size for mitochondria is one-fourth of the autosomal population size, and assuming that the effective size is close to one-third of the census size (Cavalli-Sforza et al., 1994), the maximum effective female population sizes Nf were calculated as N/12. Therefore, in the model considering the largest plausible population sizes (L1.5), Nf was 2 800 000 for modern Europeans, 46 000 for Cro-Magnoids and 21 000 for Neandertals. Lower values were used for other models, representing reductions of effective population sizes due to factors such as population subdivision and habitat fragmentation (Whitlock and Barton, 1997). Whenever implemented, population growth or decline was modelled as exponential.

Comparing observed and simulated data

Twelve statistics were calculated for each simulated data set, namely three measures of genetic diversity within each of the three populations (total number of haplotypes, haplotype diversity or heterozygosity, and average pairwise difference) and three measures of haplotype sharing between populations, expressed for each sample as the fraction of haplotypes also present in another sample.

To obtain a synthetic measure of the goodness of fit of each model, we proceeded according to the criteria described by Belle et al. (2006). For each observed measure of genetic diversity, we counted the frequency of more extreme (that is, less likely) values in the 1000 or 10 000 simulations and treated that frequency, EF, as an empirical P-value of the observed statistic, given the parameters of the simulation. In this way, EF was 1 when the observed statistic corresponded exactly to the median simulated value; when the observed value fell completely out of the range of the simulated values, we set EF=0. We then counted how many simulated parameters significantly deviate (at P<0.05) from the observed ones and called this number nsd.

Results

Table 1 shows the observed statistics and, for each model tested, the median of the 12 statistics through 1000 simulations (or 10 000 simulations for the two-population models). Among the 14 models listed, Model L1.7 was tested twice, using two different Neandertal subsamples, and Model L2.2 was tested several times with different migration rates and initial population sizes. The empirical probabilities of the observed data (EF) and the count of significant differences between observed and simulated statistics are shown in Table 2. Under models L1.1 through L1.6, the three human groups were regarded as belonging to a single mitochondrial genealogy, and the lower mutation rate was used. Under model L1.7, genealogical continuity was simulated between modern people and only one subpopulation of Neandertals, still using the lower mutation rate. Under models H1.1 through H1.3, the higher mutation rate was used. Finally, under models L2.1A through L2.1C and L2.2, Cro-Magnoid and modern Europeans on the one hand, and Neandertals on the other, were assigned to distinct genealogies, with the lower mutation rate. Although some of the models are admittedly oversimplified, we describe all of them to detail the trial-and-error procedure that was followed throughout this study (Figure 1).

Table 1 Observed and simulated mtDNA diversity statistics for Neandertal (N), Cro-Magnoid (CM) and modern European (M) samples: Medians of 1000 (models L1.1 through H1.3) or 10 000 simulations (models L2.1 through L2.2)
Table 2 Empirical probabilities (in fact, frequency of more extreme values, EF) for each summary statistics observed in 1000 or 10 000 simulations
Figure 1
figure 1

Outline of the demographic models simulated. For explanations, see text. The figures on the left refer to the number of generations from the present; Nf is the effective female population size and r is the rate of demographic change. N (and N′ in model L1.7) stand for Neandertals, CM for Cro-Magnoids and M for modern Europeans. As the coalescence process is simulated backwards, an increase in population size is obtained by using a negative r-value.

Model L1.1: a small population of constant size

This oversimplified model corresponds to a population that remained at a constant size of Nf=2500 from Neandertal times until the present. Haplotype number and haplotype diversity values were significantly lower than observed for both Neandertals and modern Europeans. In particular, under this model, one expects to find seven different haplotypes in the modern European samples, rather than 309 as observed. In view of these large differences, this model can be rejected with a high degree of confidence.

Model L1.2: a small expanding population

The number nsd of simulated statistics departing significantly from the observed ones does not decrease when a more realistic model with a larger modern population is simulated, growing from Nf=2500 in Neandertals and Cro-Magnoids to Nf=100 000 in modern humans. Haplotype number increases in Neandertals, and haplotype diversity increases in both Neandertals and moderns, but not enough for the difference between simulated and observed statistics to become insignificant. Note that although haplotype sharing between modern humans and Neandertals was less here than under Model L1.1, the difference between observed and simulated values became significant (EF=0.0270), because the simulated values have a wider distribution under Model L1.1 (95% of them falling between 0.0 and 40.0%) than under Model L1.2 (95% between 1.7 and 7.1%).

Model L1.3: an ancient founder effect and a population expansion

We next added a founder effect before an ancient population expansion. In this way, the population size increased from Nf=40, before 8000 generations, to Nf=2500 in Neandertals and Cro-Magnoids, and to Nf=100 000 in modern humans. There was no improvement, and in fact the general fit of the model further decreased, with nsd=8; for seven simulated diversity measures, the departure was downwards. Accordingly, a better fit was sought by considering parameter values that would increase diversity, such as larger population sizes or higher mutation rates.

Model L1.4: a weaker founder effect and a larger modern population

Here, we simulated a less dramatic founder effect (Nf=1000) and a much larger modern population (Nf=2 500 000). The values of internal genetic diversity within populations increased substantially compared with all previous scenarios, although all values for Neandertals and modern humans remained too low. However, much similar to that under previous models, also in the simulated data generated under Model L1.4., some degree of haplotype sharing was apparent between Neandertals and modern Europeans. The value of nsd remained a high 8, higher indeed than in the oversimplified Models L1.1 and L1.2. As a consequence, for the following model we: (i) eliminated any founder effects and (ii) simulated the largest possible effective population sizes.

Model L1.5: a large expanding population

Fagundes et al. (2007) estimated that the size of the archaic African human population was between 6000 and 20 000 individuals. When we considered a similarly large population expanding from Nf=21 000 in Neandertals to Nf=46 000 in Cro-Magnoids and finally Nf=2 800 000 in modern humans, all the values of internal genetic diversity became compatible with the observed ones for Cro-Magnoids and Neandertals, but not for modern humans, nor did we obtain an estimate of haplotype sharing compatible with the observed data: 2.1% of the modern European haplotypes (or 11.7 out of 558) should be shared with Neandertals. However, the general fit of the model appeared to improve substantially, and it was better indeed than under any previously simulated model (nsd=3).

Model L1.6: a large expanding population with a subdivided modern population

In principle, a further increase in modern haplotype number and haplotype diversity, two parameters that we were unable to fit under Model L1.5, could be sought by modifying that model so as to subdivide the modern population in 100 groups of Nf=28 000 each. However, in this way, the median simulated pairwise differences became far too high in both modern Europeans and Neandertals, and indeed also among Cro-Magnoids, although in the last-mentioned case, the departure (observed value=1, median simulated value=19) did not reach statistical significance, due to the small Cro-Magnoid sample size. As a consequence, nsd increased to 4 with respect to the previous model.

Model L1.7: a large expanding population descended from a Neandertal subpopulation

Central European sequences cluster in the Neandertal evolutionary tree, to the exclusion of two sequences from Italy and from the Caucasus (Caramelli et al., 2006). To account for some effects of such a population structuring among Neandertals, we ran simulations in which only Central European Neandertals were ancestral to modern Europeans. Most measures of genetic diversity resembled the observed ones more closely than under the previous models except L1.5 (nsd=3). To better interpret this result, we also ran simulations in which the Italian and Caucasus sequences were considered ancestral to modern sequences. The results changed only marginally (data not given), suggesting that any factor reducing the Neandertal sample size (and thus adding uncertainty) would improve the fit. However, much similar to that under Model L1.5, in this way, 1.9% of the modern European haplotypes (or 10.6 out of 558) should be shared with Neandertals, a highly unlikely result (P=0.0009, Fisher's exact test).

To summarize, under single-population models, between 3 and 8 of the 12 simulated statistics differed significantly from the observed ones. It was never possible to reproduce the 0 haplotype sharing observed between Neandertals and modern humans, nor could we obtain the high haplotype diversity observed in modern populations. The latter result could be due to gene flow in the form of immigration into Cro-Magnoids and/or modern populations, or to the mutation rate we chose, or both. We could not realistically explore possible migrational scenarios because of the large number of parameters necessary to describe gene flow in a structured population shifting from hunting–gathering to farming and then to industry. Therefore, we focused on the effects of a higher mutation rate, and in the following three models, we increased it to 0.5 per million years per nucleotide, the highest value inferred from pedigree studies (Howell et al., 2003).

Model H1.1: a population at a small constant size before expanding

Again considering a population size of Nf=2500 both for Neandertals and Cro-Magnoids and an expansion to Nf=100 000 for modern humans, none of the internal genetic diversity values for modern humans fitted the observed ones, and four simulated statistics departed significantly from the observed ones. In this way, however, we could observe for the first time no haplotype sharing between modern Europeans and Neandertals. On the other hand, owing to the high mutation rate used, the haplotype sharing also between Cro-Magnoids and modern humans went down to 0, a result that appears incompatible with the observed data.

Model H1.2: a population at a small constant size before expanding, with an initial founder effect

Adding a founder effect to the previous scenario 4000 generations ago with Nf=100 did not have a dramatic effect on most genetic diversity values, and nsd remained=4. The only improvement we could observe was a slight decrease of modern pairwise difference.

Model H1.3: a small continuously expanding population with a founder effect

In this scenario, we modelled the same founder effect and added an expansion from Neandertals (Nf=1000) to Cro-Magnoids (Nf=2500) and to modern humans (Nf=100 000). Whereas modern haplotype diversity, modern pairwise difference and haplotype sharing between modern and Cro-Magnoids departed significantly from those estimated from the observed data, under this model (and only under this model), the simulated number of haplotypes in modern humans fitted with the observed one. However, this model, as well as all other models with the higher mutation rate, predicts no haplotype sharing between modern and Cro-Magnoid people, and hence seems at odds with a feature of the observed data that we consider crucial.

Models L2

Under none of the models with the higher mutation rate did we obtain a greater resemblance between observed and simulated data than in the best-fitting models on the basis of the lower mutation rate. Therefore, the next step was to implement models with two separate genealogies, the Neandertals on the one hand, and the Cro-Magnoids and modern humans on the other, keeping the low mutation rate that seems preferable when mitochondrial data are compared over long time spans (Pakendorf and Stoneking, 2005). As there is no surviving Neandertal, the Neandertal population size was brought to zero (in fact to Nf=1 as required by the software), whereas for Cro-Magnoids and modern Europeans, we used the model showing the best performance so far, model 1.5. Actually, Model L1.7, assuming that only part of the Neandertal population was ancestral to Cro-Magnoid and modern Europeans, has the same nsd=3. But its relatively good performance seems largely due to the increased uncertainty obtained by reducing the Neandertal sample sizes, as remarked previously, rather than to any particularly realistic features of the model itself. On the basis of preliminary simulations, the common ancestral population of the two groups was given an effective size at the time of divergence equal to 10% of the overall population size. Finally, to obtain narrower confidence intervals about the relevant statistics, we iterated the simulations 10 000 times.

Model L2.1: no migration

Under this model, a process of trial and error showed that a divergence time between the Neandertal and the European population 4000 (Model 2.1A) and 40 000 generations (Model 2.1B) ago gives better results than any previously tested model, with nsd=2. Within this range of dates, the best fit was obtained for a divergence time 6000 generations or 150 000 years ago (Model L2.1C), in which case only one statistic departed from the observed ones, that is, there was a deficit of modern haplotypes.

Model L2.2: migration from Neandertals into modern humans

We then tried various migration rates from Neandertals to early modern humans (Cro-Magnoids), on the basis of model L2.1C, which so far provided the best fit. We found that any value exceeding 0.001% causes a substantial increase in the difference between simulated and observed values and that even such a low migration rate reduces the fit with respect to the model without migration. At the population sizes considered in this study, this migration rate means that less than 2.5 individuals per generation could have mixed with the modern human population at the time of the maximum expansion of Neandertals. We also investigated the effect of a smaller ancestral population by using an effective population size at the time of divergence equal to 1% instead of 10% of the overall population size, but this decreases the fit of the model (data not given).

For the scenario providing the best fit (Model 2.1C), we also estimated the age of Most Recent Common Ancestor (MRCA) of the entire mtDNA genealogy. Across 1000 simulations, we found an average MRCA around 244 200±2200 years ago. This estimate is more recent than the ones previously inferred from mitochondrial DNA variation in Neandertals, that is to say between 300 000 and 750 000 years ago (Krings et al., 1997, 1999).

Discussion

On the basis of a simple model of instant mixing, Serre et al. (2004) estimated that any Neandertal contribution to the mtDNA of modern humans in the past 30 000 years could not have exceeded 25%. Under a more sophisticated model simulating the effects of geography, but not considering ancient DNA evidence, Currat and Excoffier (2004) estimated that the absence of Neandertal mtDNA sequences in the modern gene pool is compatible with no more than 120 admixture events between Neandertals and anatomically modern Europeans, or a 0.01% rate of immigration from Neandertals. Conversely, patterns of linkage disequilibrium in a small sample of modern nuclear DNAs have been interpreted as suggesting that archaic human populations may have contributed up to 5% of the modern gene pool (Plagnol and Wall, 2006). This view seems consistent with Green et al.'s (2006) calculations, suggesting a non-negligible presence of Neandertal DNA in the modern nuclear genome, but not with Noonan et al.'s (2006) opposite conclusion. Given the limited sample size, one Neandertal nuclear genome, this disagreement is not surprising.

This study differs from previous studies, by being the first to consider the DNA sequences of 24 000-year-old anatomically modern Europeans, whose authenticity has recently been confirmed (Caramelli et al., 2008). Its purpose was to explore various models describing the genetic relationships among anatomically archaic (that is, Neandertal) and anatomically modern inhabitants of Europe (including Cro-Magnoids) and possibly to reject some of them. In future analyses, we will proceed to estimate parameters such as effective population sizes and gene flow rates between groups, on the basis of the models that proved compatible with the data.

Our first finding was that it is possible to define models that can reproduce many, but so far not all, features of genetic diversity in the European samples collected in a 50 000-years time-window. The second main finding is that there is a much higher probability of reproducing the data if anatomically archaic (that is, Neandertals) and anatomically modern individuals are assigned to two distinct genealogies. We found at least three significant differences between observed and simulated statistics under all 10 models on the basis of a single genealogy, versus not more than 2 in the three models with two distinct genealogies. In addition, the best fit was observed for a model (L2.1C) where gene flow between the Neandertals and the other groups was 0. Any degree of genetic exchange resulted in an increased difference between observed and simulated data, and the maximum Neandertal contribution to the modern gene pool compatible with the data was around 0.001% per generation.

Despite the evident differences between the mtDNA sequences of Neandertals on the one hand, and Cro-Magnoid and modern humans on the other, it is possible, although complicated, to fit some Neandertal sequences in the modern human mtDNA genealogy. Indeed, in the data generated simulating a subdivided Neandertal population and descent of Cro-Magnoids from a Neandertal subpopulation (Model L1.7) only three of the simulated statistics depart significantly from the observed ones. However, this result mostly reflects the uncertainty associated with small sample sizes, and hence the large standard errors about the estimates of past genetic diversity. Indeed, it does not matter which subgroup of Neandertals, central or southern/eastern, is considered ancestral to modern Europeans in model L1.7. What makes this model more likely than other one-genealogy models is the smaller sample size of the Neandertals in the modern human genealogy, 2 or 4 instead of 6. Although a simple count of the parameters that we were unable to fit does not lead to formal rejection of a model, these simulations show that the fit increases dramatically by envisaging two independent mtDNA genealogies for Neandertals and modern humans.

Ancient DNA data suffer from serious limitations, mostly deriving from the generally poor preservation of ancient biomolecules in fossil remains. Under the current standards developed to minimize the risk of sequencing contaminating DNA, if a Neandertal individual actually carried an mtDNA sequence identical to modern ones, that sequence would be considered a contamination and would not be published. Unless technical progress will offer new solutions, a high degree of uncertainty seems thus impossible to eliminate. Within these limitations, our simulation study—the first one comparing the available Neandertal, Cro-Magnoid and modern European data with simulated gene genealogies—shows that under the best-fitting model (L2.1C), Neandertals on the one hand and Cro-Magnoid and modern Europeans on the other are part of two monophyletic clades unconnected by gene flow after their separation.

Even under the best-fitting models, there was a deficit of simulated haplotypes in modern Europeans, except when the mutation rate was so high (Model H1.3) that many other simulated statistics became incompatible with the observed data. The simplest explanation we can envisage for this finding is that immigration from other geographical regions, both within and outside the continent (see, for example, Torroni et al., 2006), has contributed substantially to mtDNA diversity in Europe, increasing the number of different haplotypes in the modern populations. However, to realistically simulate geographic scenarios in which anatomically modern populations exchange migrants across millennia, one should greatly increase the number of unknown parameters in the simulation, which did not appear feasible for serial coalescent simulations at this time.

Under the best-fitting model, the population split between Neandertals and the ancestors of modern Europeans, that is, the moment after which interbreeding ceased between the two human forms, can be placed around 150 000 years ago, and the most recent mitochondrial common ancestor of the two groups is estimated to have lived some 244 000 years ago. The spread of simulated values is large for these dates, and in fact our simulations are consistent with divergence between Neandertals and the ancestors of modern humans having occurred any time between 100 000 and 1 000 000 years ago. However, the split date that appeared most likely in our simulations is comparatively recent, more recent indeed than the dates estimated from archaeological evidence, between 250 000 and 400 000 years ago (Foley and Lahr, 1997; Rightmire, 2001), from genetic evidence, between 290 000 and 440 000 years ago (Noonan et al., 2006), or using a combination of morphological and molecular clocks (Weaver et al., 2008), on the basis of the observation of a parallelism between cranial and genetic variation (Manica et al., 2007).

Green et al. (2006) simultaneously estimated population size and time since the split. They assumed that the split occurred 400 000 years ago, obtaining an estimate of population size close to 3000 individuals. In our study, comparably small Neandertal population sizes (Nf=2500) proved unable to generate the observed haplotype number, haplotype diversity and nucleotide diversity (Models L1.1–L1.4) and hence could be rejected with a high degree of confidence. As Green et al.'s approach imposes a necessarily negative correlation between effective population sizes and times since the split, our results may mean that the Neandertal population was larger, and split more recently from the ancestors of modern Europeans, than estimated from the analysis of nuclear DNA. More sophisticated simulations, including the evolution of both mitochondrial and nuclear loci under various scenarios, should be developed to gain a more precise understanding of the Neandertal's demography.