Our Complex Origins

We typically think of technological advance as moving us into the future. But technology also allows us to see our past more clearly. The Hubble and James Webb telescopes (and the lesser known COBE, WMAP, and Planck) allowed us to see deep into the universe’s past, closer to the Big Bang than we’d ever seen before.

DNA sequencing technology allows us to do something similar: we can now see the ancient past of humans more clearly than ever before. Svante Pääbo, winner of the 2022 Nobel Prize in Physiology or Medicine, and David Reich, author of the book Who We Are and How We Got Here, are among the geneticists who have pioneered this science.

Below is a quick overview of the science and some of the more fascinating insights I took from the book (and David Reich’s appearances on various podcasts, linked below).

The science and the technology

  • Human nuclear DNA sequences are about 3 billion base pairs long

  • Only 1 to 2 percent of that sequence serves a direct, clear purpose: coding proteins; the rest is (arguably) useless, non-coding sections of DNA called “junk” DNA

  • Mutations occur regularly from one human generation to the next—about 30 base pair mutations (these come mostly from the father)

  • Because so little of the genome codes for proteins, the vast majority of these mutations occur in the non-coding sections of DNA

  • These mutations serve as a sort of population signature: populations of humans that breed with one another will have more similarities in these stretches of DNA than those that do not

  • By applying statistical analysis tools to the DNA of various present-day human populations, as well as to DNA from ancient human remains, geneticists can create a “map” of how populations have mixed and diverged over time
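
As a rough back-of-the-envelope check on the figures above, the sketch below splits the roughly 30 new mutations per generation between coding and non-coding DNA. It assumes mutations land uniformly across the genome, which is a simplification of mine and not a claim from the book; the function name is also just illustrative.

```python
# Figures taken from the bullets above. Treating mutations as uniformly
# spread across the genome is a simplifying assumption.
NEW_MUTATIONS_PER_GENERATION = 30


def expected_mutations(coding_fraction, total=NEW_MUTATIONS_PER_GENERATION):
    """Split the expected new mutations into (coding, non_coding) counts."""
    coding = total * coding_fraction
    return coding, total - coding


for fraction in (0.01, 0.02):
    coding, non_coding = expected_mutations(fraction)
    print(f"coding fraction {fraction:.0%}: "
          f"{coding:.1f} coding, {non_coding:.1f} non-coding")
```

Under that assumption, less than one new mutation per generation would be expected to fall in protein-coding DNA, with the remaining 29 or so landing in non-coding stretches.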

Some key ancient human milestones

mya = million years ago
kya = thousand years ago

  • 7 mya - 5 mya: human ancestors split from chimpanzees

  • 1.8 mya: humans (genus Homo) appeared outside of Africa

  • 770 kya - 550 kya: ancestors of modern humans split from Neanderthals

  • 320 kya: most recent shared ancestor of all present-day humans

  • 300 kya - 200 kya: modern humans appear in Africa

  • 50 kya: modern humans spread out of Africa

Humans were much more mobile than thought

  • People used to believe that the present-day people of a region descended directly from the ancient people who first migrated to that region

  • The reality is far more complex

  • There have been layers and layers of mass population movement and replacement

  • This means that the people in a region today are descended only a little, if at all, from the people who lived there 10 to 20 kya

Ancient North Eurasians (ANEs)

  • ANEs are a ghost population: a population that doesn’t exist today but must have existed, based on the statistical traces it left in the genes of present-day populations

  • David Reich and his team used a test for population mixture (Three Population Test), which compares populations to see how much they vary along 600,000 positions in DNA

  • They compared each population in their dataset to all the other populations to find which ones gave the strongest signal of mixture

  • The results were surprising; e.g., they found that the French were most closely related to Sardinians and Native Americans

  • In fact, Siberians and East Asians were not related much at all to the French

  • So the team theorized that Native Americans didn’t migrate across the Atlantic into Europe; rather, there was a population that existed in the past in northern Eurasia and sometime before 15 kya moved through Siberia and contributed to the population that became the Native Americans—and that same population moved into Europe and contributed to the population that became the French

  • This was the ghost population the team called the Ancient North Eurasians (ANEs)

  • The team predicted this in 2012 and, amazingly, in late 2013, a separate team of scientists sequenced DNA from the remains of a human who lived 24 kya; the DNA matched that of the predicted ghost population
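
To make the Three Population Test a little more concrete, here is a minimal sketch of the idea behind it, usually written f3(target; A, B): at each DNA position, multiply the target population’s allele-frequency difference from population A by its difference from population B, then average over positions. A clearly negative average is a sign that the target is a mixture of populations related to A and B. The frequencies below are made up for illustration, and the real test adds a sample-size correction and a significance estimate that this sketch leaves out.

```python
def f3(target, source_a, source_b):
    """Simplified f3(target; source_a, source_b) from allele frequencies.

    Each argument is a list of allele frequencies (0.0 to 1.0), one per
    DNA position, in the same order. Real analyses also correct for
    sample size and estimate statistical uncertainty; this sketch does not.
    """
    if not target or not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("need equal-length, non-empty frequency lists")
    products = [(c - a) * (c - b)
                for c, a, b in zip(target, source_a, source_b)]
    return sum(products) / len(products)


# Made-up frequencies at three positions, purely for illustration.
pop_a = [0.1, 0.9, 0.2]
pop_b = [0.9, 0.1, 0.8]
mixed = [0.5 * a + 0.5 * b for a, b in zip(pop_a, pop_b)]  # 50/50 mixture
drifted = [0.0, 1.0, 0.1]  # drifted away from pop_a, with no mixture

print(f3(mixed, pop_a, pop_b))    # negative: consistent with mixture
print(f3(drifted, pop_a, pop_b))  # positive: no evidence of mixture
```

The mixed population sits between its two sources at every position, so the two differences have opposite signs and their product is negative. The real dataset does the same thing across roughly 600,000 positions.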

The Yamnaya

  • The Yamnaya are an archaeological culture defined as such based on their tools and artifacts

  • They started in the steppe region north of the Black and Caspian Seas sometime a little before 5 kya

  • Prior to this, there were a lot of isolated populations in the area, but after the emergence of the Yamnaya, these groups were replaced by the Yamnaya

  • The Yamnaya spread over a vast area: from Hungary in the west to the Altai mountains of central Siberia in the east; their descendants reached India

  • The Yamnaya lasted hundreds of years and then faded out into groups that had a similar lifestyle

  • They left a strong genetic record and language traces: the Indo-European languages

  • The cultures that preceded the Yamnaya lived in villages; but with the rise of the Yamnaya, the villages disappeared

  • The Yamnaya innovated with horses, wheels, and carriages, allowing them to cover vast areas (they were mobile)

  • Within 500 to 800 years, their descendants spread even further (one of these descendant groups is the Corded Ware culture)

  • The Yamnaya and their descendants were remarkably effective: they replaced 70 percent of the population in what is now Germany and 90 percent in what is now Britain

Signs of power

  • There are groups of Y chromosomes in the world that indicate a common male ancestor

  • One example is in Ireland, where a good fraction of the population carries a Y chromosome type that traces back to a common ancestor about 1,500 years ago; this indicates one powerful male who had preferential access to women and had many offspring (as did his male descendants)

  • Another example is from the Mongol Empire about 800 years ago in East Asia, where one Y chromosome cluster is dominant, presumably descended from Genghis Khan

  • There are other, deeper implications of these Y chromosome star clusters: there are clusters in Europe, Asia, and South, where each region shares common regional ancestors (different from those of other regions) from 5 to 8 kya; this isn’t the case before that period, indicating that it was the first time people accumulated great wealth and power

  • All the Indo-European languages (except for ancient Hittite) share common vocabulary for wheel, axle, and horses, so the language must have spread after the development of those technologies

  • The language tree shows a very peculiar pattern, which the genetic pattern explains: Indo-Iranian languages (e.g., Iranian and Indian languages) have clear relationships with Balto-Slavic languages (e.g., Lithuanian), which is strange considering the distance between these regions; but the genetic record shows that, between 4,500 and 3,500 years ago, a chain of archaeological cultures moved from the steppe to Europe to Eastern Europe back to Europe and then east to India

Yamnaya and India

  • In India, there is diligent practice of something called endogamy

  • There are a minimum of 5,000 or so well-defined endogamous groups, which are groups of people that will only marry individuals within the group

  • India is not one large population like the Han Chinese; rather, it is made up of many small populations

  • These groups descend from a relatively small group of founders

  • This is remarkable because some groups, such as the Vysya, have kept their genetics clearly differentiated over thousands of years, despite being in close geographic proximity with other groups (similar to the Jews in Europe)

Mixing with Neanderthals

  • It had previously been believed that all modern humans spread out from Africa about 50 kya

  • But the reality of migrations (and separations between populations) was, in fact, more complicated

  • In 2006, work began on sequencing the Neanderthal genome; a draft of the genome was published in 2010

  • Using statistical tests, David Reich and his team compared the Neanderthal genome to African and non-African genomes

  • If African and non-African genomes differed equally from the Neanderthal genome, then that would have been consistent with the theory that Africans and non-Africans descended from a common ancestor that separated earlier from Neanderthals

  • However, the data showed something different: non-African genomes matched the Neanderthal genome more closely than African genomes did (2-4 percent more)

  • The implication was that Neanderthals and modern humans interbred with one another

  • The group that interbred would have been the ancestors of East Asians, Europeans, South Asians, and New Guineans (all of whom carry this ancestry)

  • There is a mystery: the Neanderthal ancestry is diminishing (something is selecting against it)
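
One common way to run the kind of comparison described above is the ABBA-BABA test, also called the D statistic. I’m assuming here that it is close to what the team used; the notes above only say “statistical tests.” The sketch looks at DNA positions where the Neanderthal carries a new (derived) allele, using an outgroup such as the chimpanzee to decide which allele is ancestral, and counts whether the African or the non-African genome shares that derived allele. The site counts are made up for illustration.

```python
def d_statistic(sites):
    """ABBA-BABA D statistic.

    `sites` is a list of 4-character strings, one per DNA position, giving
    the allele carried by (African, non-African, Neanderthal, outgroup).
    'A' is the ancestral allele (the outgroup's) and 'B' is the derived one.
    """
    abba = sum(1 for s in sites if s == "ABBA")  # non-African matches Neanderthal
    baba = sum(1 for s in sites if s == "BABA")  # African matches Neanderthal
    if abba + baba == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / (abba + baba)


# Made-up counts: 53 ABBA sites, 47 BABA sites, 200 uninformative sites.
toy_sites = ["ABBA"] * 53 + ["BABA"] * 47 + ["BBAA"] * 200
print(d_statistic(toy_sites))  # positive: non-Africans share more with Neanderthals
```

If Africans and non-Africans were equally distant from Neanderthals, the two counts would be roughly equal and D would be close to zero. A consistently positive D is the signal of extra sharing between non-Africans and Neanderthals.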

Denisovans

  • There was another distinct archaic population: Denisovans

  • They were discovered in 2010 in Siberia (Denisova Cave)

  • They also interbred with ancestors of modern humans

  • Denisovan ancestry is found in large amounts in people from New Guinea and in indigenous people from the Philippines (and broadly, in small proportions, in East Asians)

South Asia

  • South Asia’s history is, unsurprisingly, complex

  • While Reich admits that there is much detail yet to be discovered (e.g., the genetic origins of the Indus Valley Civilization), the genetic tools have illuminated a rich history

  • Until 2016, much of the focus was on two distinct populations that had existed in South Asia: the Ancestral North Indians (ANIs) and the Ancestral South Indians (ASIs):

    • Before mixing, these two groups were as different from one another as Europeans and East Asians are today

    • The ANI are related to Europeans, Central Asians, Near Easterners, and the people of the Caucasus

    • The ASI descend from a population not related to any present-day population outside India

    • The people of India today are mixtures of these two populations, albeit in different proportions

  • But in 2016, some laboratories published the genomes of the world’s earliest farmers: people who lived in present-day Israel, Jordan, Anatolia (the Asian peninsula of Turkey), and Iran

  • After much study, analysis, and discussion, a much more complex picture emerged:

    • South Asia and Europe have parallel genetic histories

    • About 9 kya, there was a first wave of migration and mixing that originated from the earliest farmers in the Near East and mixed with the local hunter-gatherer populations of Europe and South Asia: from Anatolia to Europe and from Iran to South Asia (1 on the map below)

    • About 5 kya, there was a second wave that brought the Yamnaya pastoralists from the steppe, who spoke an early form of the Indo-European languages, to mix with the local farmers in the northern regions of Europe and South Asia (2 on the map below)

    • The second wave’s focus on the northern regions of Europe and South Asia caused a gradient of ancestry that is common to both (the European and Indian clines shown on the map)

    • More specifically, the ANI were a mix of about 50 percent Yamnaya steppe pastoralist ancestry and 50 percent Iranian farmer-related ancestry, and the ASI were a mix of about 25 percent Iranian farmer-related ancestry and 75 percent local hunter-gatherer ancestry
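
The proportions above can be turned into a small worked example: if a present-day group is some fraction ANI and the rest ASI, its deeper ancestry follows by simple weighting. The sketch below is just that arithmetic; the function name and the example ANI fractions are my own illustrative choices, not figures from the book.

```python
# ANI and ASI compositions, taken from the proportions given above.
ANI = {"steppe": 0.50, "iranian_farmer": 0.50, "hunter_gatherer": 0.00}
ASI = {"steppe": 0.00, "iranian_farmer": 0.25, "hunter_gatherer": 0.75}


def ancestry_mix(ani_fraction):
    """Deep-ancestry proportions for a group that is `ani_fraction` ANI
    and (1 - ani_fraction) ASI."""
    if not 0.0 <= ani_fraction <= 1.0:
        raise ValueError("ani_fraction must be between 0 and 1")
    return {
        source: ani_fraction * ANI[source] + (1 - ani_fraction) * ASI[source]
        for source in ANI
    }


# Hypothetical ANI fractions along the Indian cline.
for ani_fraction in (0.2, 0.5, 0.8):
    mix = ancestry_mix(ani_fraction)
    summary = ", ".join(f"{source} {share:.1%}" for source, share in mix.items())
    print(f"ANI {ani_fraction:.0%}: {summary}")
```

A group that is half ANI and half ASI, for instance, works out to 25 percent steppe, 37.5 percent Iranian farmer-related, and 37.5 percent hunter-gatherer ancestry. In this simple two-way model, steppe and Iranian farmer-related ancestry always rise together.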

Origins of caste in India

  • There was an anomaly in the South Asian genetic data, however

  • The model of the Indian cline was based on a simple mixing of ANI and ASI populations

  • Six groups did not fit the model; they had a higher ratio of steppe-related ancestry to Iranian farmer-related ancestry than the model predicted

  • All six of these groups were Brahmins, with a traditional role as priests and custodians of the sacred texts written in Sanskrit, an Indo-European language

  • The theory was that the populations did not mix evenly; rather, there were sub-populations that were socially distinct

  • The people who were custodians of the Indo-European language and culture were the ones with relatively higher steppe ancestry, and because they married one another preferentially (if not exclusively) their ancient ANI genetic structure is still intact after thousands of years

Controversy about human diversity

  • David Reich wrote an op-ed in The New York Times: “How Genetics Is Changing Our Understanding of ‘Race’”

  • The piece re-ignited longstanding controversy about science related to race

  • David Reich’s perspective is that racial categories as we discuss them are social categorizations (e.g., the notion of black has varied over time and varies by geography), and Reich claims that scientists don’t use the word race (in the op-ed piece he puts the word race in quotation marks)

  • Rather, geneticists deal with groupings based on various objective definitions, and what has emerged from the genome revolution is the understanding that the human species contains within it lineages that have been largely separated from each other for many tens of thousands of years, and in some cases hundreds of thousands of years, which is enough time for natural selection and evolution to cause shifts in the average frequencies of mutations that matter to traits

  • The implication is that the human species is quite varied; and the correlation to races as described socially is also quite varied (i.e., not perfect)

  • The orthodoxy has been that there are “no meaningful differences between human populations”

  • But this isn’t quite accurate, as it gives the impression that there is no space for there to be average biological differences between groups of people (which is contradicted by the science)

  • We know, for example, that different genetic populations have different susceptibility to disease

  • What is clear is that, from a purely statistical perspective, the genetic record shows that human genomes today can be grouped into five categories: West Eurasians, Africans, East Asians, Native Americans, and New Guineans

  • The difference on average between populations is about a sixth of the average difference between individuals

  • Jim Watson (one of the people who discovered the structure of DNA) and Nicholas Wade, author of the panned book A Troublesome Inheritance, have argued that these biological differences between populations correspond to the old racial stereotypes; Reich doesn’t agree with this (“there is no evidence in favor of that…[and] those are racist statements to make”)

Podcasts: