Bouckaert, R., Lemey, P., Dunn, M., Greenhill, S. J., Alekseyenko, A. V., Drummond, A. J., Gray, R. D., Suchard, M. A., & Atkinson, Q. D.* (2012). Mapping the origins and expansion of the Indo-European language family. Science, 337:957–960.
* Corresponding author: firstname.lastname@example.org
In this paper we identify the homeland of the Indo-European language family by adapting ‘phylogeographic’ methods initially developed by epidemiologists to trace the origins of virus outbreaks. Instead of comparing viruses, we compare languages, and instead of comparing DNA, we look for shared cognates – words that have a common origin, such as “mother,” “mutter” and “madre” – across various Indo-European languages. We use the cognates to infer a family tree of the languages and, together with information about the location of each language, we trace back through time along that tree to infer the location at its root – the origin of Indo-European.
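To make the idea of comparing languages by shared cognates concrete, here is a minimal Python sketch of one common way to encode cognate judgements as a presence/absence matrix. The three-meaning data set, the set labels "A", "B", "C" and the function names are illustrative assumptions, not taken from the paper or its database, and the tree inference itself is not reproduced here.

```python
# Toy, hand-made cognate judgements (illustrative only, NOT from the paper's data).
# For each meaning, languages with the same label belong to the same cognate set.
COGNATE_SETS = {
    "mother": {"English": "A", "German": "A", "Spanish": "A"},  # mother / Mutter / madre
    "hand": {"English": "A", "German": "A", "Spanish": "B"},    # hand / Hand / mano
    "dog": {"English": "A", "German": "B", "Spanish": "C"},     # dog / Hund / perro
}


def binary_matrix(cognate_sets):
    """Return (columns, rows): one 0/1 column per (meaning, cognate set) pair."""
    columns = sorted(
        {(meaning, label)
         for meaning, langs in cognate_sets.items()
         for label in langs.values()}
    )
    languages = sorted({lang for langs in cognate_sets.values() for lang in langs})
    rows = {
        lang: [1 if cognate_sets[meaning].get(lang) == label else 0
               for meaning, label in columns]
        for lang in languages
    }
    return columns, rows


def shared_cognates(row_a, row_b):
    """Count the cognate sets that are present in both languages."""
    return sum(a & b for a, b in zip(row_a, row_b))


columns, rows = binary_matrix(COGNATE_SETS)
print(shared_cognates(rows["English"], rows["German"]))   # shared sets: mother, hand
print(shared_cognates(rows["English"], rows["Spanish"]))  # shared set: mother
```

In this toy example English and German share more cognate sets with each other than either does with Spanish, which is the kind of signal a tree-building method can use to group languages.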
Watch the Indo-European expansion unfold…
This movie shows how our model reconstructs the expansion of the Indo-European languages through time. Contours on the map represent the 95% highest posterior density distribution for the range of Indo-European.
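For readers unfamiliar with the term, a 95% highest posterior density (HPD) region is the smallest region that contains 95% of the posterior probability. The contours on the map are two-dimensional; the sketch below shows the simpler one-dimensional case, estimating an HPD interval from a list of posterior samples. The function name and the sample values are hypothetical, and this is not the code used to produce the movie.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples (1-D sketch)."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")
    # Number of consecutive sorted samples the interval must cover.
    k = max(1, math.ceil(mass * n))
    # Slide a window of k samples over the sorted list and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


# Hypothetical samples with one outlier; the 90% interval excludes it.
print(hpd_interval([0, 1, 2, 3, 4, 5, 6, 7, 8, 100], mass=0.9))
```

Unlike an equal-tailed interval, the HPD interval always favours the densest part of the distribution, which is why it is a natural choice for summarising where most of the inferred range lies.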
Language expansion in time and space…
Figure showing the expansion of the language family across Europe through time. Estimates for the location of ancestral nodes on the tree (representing ancestral languages) are plotted as opaque points whose color indicates their age. Older nodes are shown in the foreground to better show the expansion. (NB: This figure needs to be interpreted with the caveat that we can only represent the geographic extent corresponding to language divergence events, and only between those languages that are in our sample. The rapid expansion of a single language, and nodes associated with branches not represented in our sample, will not be reflected in this figure. For example, the lack of Continental Celtic variants in our sample means we miss the Celtic incursion into Iberia and instead infer a later arrival into the Iberian Peninsula associated with the break-up of the Romance languages (and not the initial rapid expansion of Latin). The timing represented here therefore offers a minimum age for expansion into a given area.)