

Tracing African-American Roots Through DNA

Ancestry.com recently expanded its coverage of western Africa, enabling African-Americans to trace back to these areas, and to find cousins anywhere in the world. I had the pleasure of sitting down with three of their eight scientists at the American Society of Human Genetics meeting last week in Boston. The eight are among 1200+ employees of Ancestry.com, where the DNA division debuted in May 2012.

I wish genetic ancestry had been around when I was in grad school. We had to choose among molecular, developmental, or population genetics, and I elected to avoid the dreaded math of population genetics, which at the time seemed to lead only to careers sorting out groups of fruit flies. Today, it means connecting people. I get e-mails every week or so of mitochondrial DNA matches from Family Tree DNA and nearly daily about distant cousins from Ancestry.com, for whom I spit into a tube a few months ago. Ancestry.com is well known for helping people trace their roots through documents, and the website is mesmerizing. Adding DNA was a natural extension.

African-Americans of course face a huge challenge in finding their origins due to the slave trade. I have a small taste of that, being 98% Ashkenazi.

My grandfather, Sam Aaronson, was born in the U.S., lived in three centuries, and reached the age of 103.

I know very little about my grandparents, because when you’re a kid, you don’t realize you should be peppering them with questions about where they came from – at least I didn’t. I know the most about my maternal grandmother, who came here as a small child at the turn of the twentieth century to escape pogroms in Russia. Just as African-Americans often bore the names of slaveholders, Jewish people and other immigrants with hard-to-pronounce or foreign-sounding names often were renamed at Ellis Island or Castle Garden, in New York.


Interest in African origins surged in 1976 with publication of Alex Haley’s historical fiction “Roots: The Saga of an American Family” and the TV series a year later. Decades after the Roots phenomenon I devoured President Obama’s Dreams from My Father: A Story of Race and Inheritance, still wondering why he isn’t considered half-Irish as often as he’s considered half-African-American.

Using DNA to trace origins and find cousins grew out of the HapMap project, which identified single nucleotide polymorphisms (SNPs). SNPs are single base sites in the genome that differ among a certain percentage of a population, set to reflect how deep into variation a researcher wants to delve. 1% and 5% variability are commonly used cut-offs.
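To make the cut-off idea concrete, here is a minimal sketch in Python. It assumes a site is summarized as a list of the bases seen across sampled chromosome copies, and it treats the frequency of the second most common base as the "variability" being compared with the 1% or 5% threshold. The function names and the sample data are invented for illustration and are not taken from HapMap or from any company's pipeline.

```python
from collections import Counter


def minor_allele_frequency(alleles):
    """Frequency of the second most common base at one genome site."""
    counts = Counter(alleles).most_common()
    if len(counts) < 2:
        # Everyone carries the same base (or no data): no variation here.
        return 0.0
    return counts[1][1] / len(alleles)


def is_snp(alleles, cutoff=0.01):
    """True if the site varies in at least `cutoff` of the sampled copies."""
    return minor_allele_frequency(alleles) >= cutoff


# Made-up sample: 100 chromosome copies, 97 carry A and 3 carry G.
site = ["A"] * 97 + ["G"] * 3

print(is_snp(site, cutoff=0.01))  # True: 3% clears the 1% cut-off
print(is_snp(site, cutoff=0.05))  # False: 3% falls short of the 5% cut-off
```

The same site counts as a SNP under one threshold and not under the other, which is why the choice of cut-off sets how deep into variation a study goes.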

We know of so many SNPs today – millions – that we can pretty much define our genomes by them, like signs posted in aisles of a supermarket to ease navigation. SNPs clustered in blocks on their chromosome, called haplotypes (or haplogroups if they’re really extensive), tend to be transmitted together to the next generation, a phenomenon called linkage disequilibrium.

Family lore is also important in tracing ancestry. These horse-drawn fire wagons were used to fight the Triangle Shirtwaist Factory fire in 1911.

I love how classical genetics impacts the latest biotechnologies: genetic linkage was discovered in 1911. That was the year that my grandmother stayed home one fateful day from her job at the Triangle Shirtwaist Factory in Manhattan, due to a cold. That was the March day when the factory burned, sending workers jumping out windows. Had she gone in, I might not be here.

Although SNPs travel in linked blocks, cross-overs do happen, when matching chromosomes exchange parts, and that can mix up the SNP sequences – which we can detect. If we know how often crossing over occurs, we can deduce when linkage blocks swapped, which in turn reflects when parents from different ancestries presumably met and mixed their DNA.
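The arithmetic behind that deduction can be sketched with a deliberately simplified model. Assume, for illustration only, that crossovers fall at about one per Morgan (100 centimorgans) per generation, so that after g generations the unbroken ancestry blocks average roughly 100/g centimorgans. Turning that around gives a rough estimate of g from the average block length. The function name and the block lengths below are hypothetical, and real methods account for much more (ancestry proportions, uneven recombination rates, measurement error).

```python
def generations_since_mixing(block_lengths_cm):
    """Rough estimate of generations since two ancestries mixed.

    Assumes blocks average about 100 / g centimorgans after g
    generations, which is a simplification made for this sketch.
    """
    mean_cm = sum(block_lengths_cm) / len(block_lengths_cm)
    return 100.0 / mean_cm


# Made-up block lengths, in centimorgans, for one person's genome.
blocks = [12.0, 8.0, 10.0]

print(generations_since_mixing(blocks))  # 10.0
```

Longer blocks point to more recent mixing, shorter blocks to mixing further back, because each generation gives crossing over another chance to cut the blocks.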

A few years ago, Ancestry.com acquired a huge DNA collection, along with pedigrees, from the non-profit Sorenson Molecular Genealogy Foundation, thanks to philanthropist James LeVoy Sorenson. The key was to document that the people providing the samples had lived in a particular geographic area for hundreds of years, because DNA tracks with geography, not skin color or other physical traits.

“The mission was not only to give good samples of genetic variants from human populations but to get out the message that we are a lot more alike than we are dissimilar,” Cathy Ball, PhD, vice president of genomics and bioinformatics at AncestryDNA, told me last week. “So got panels from all over the world. Some of the people are mixed up American mutts like me, but some are from expeditions to Nigeria, Mongolia, small villages in rural Mexico.” The DNA is from saliva samples, which was sometimes a problem to collect when people wouldn’t use the mouthwash for fear of alcohol, she added.

Slave routes (

Results from Ancestry.com are two-tiered: deep ancestry and cousins.

The SNP roster now exceeds 700,000, and African communities have been sampled, so deep ancestry testing can tell whether one’s family came, for example, from Ghana, Mali, Senegal, Cameroon, Nigeria, or other sources of slaves.

Africans have by far the most variable genomes, because we all, ultimately, came from there, but sampling is important. “Capturing genetic structure of a human population requires care in how you get the samples, from before people traveled. We have all this wonderful genealogical information that goes along with each sample,” Jake Byrnes, PhD, a bioinformaticist at the company, told me last week.

“People didn’t come just from the ports. People were enslaved from far inland and marched to the coast and put on ships. It’s complicated because populations moved around a lot. But we can see a stable genetic structure and that can give a sense of who a customer is most related to in a modern population in the area,” Dr. Byrnes said.

But once African slaves came here, their identities were obscured, if not erased. “It’s very challenging for African-American families to trace their family history back before 1860. If you had an ancestor who was enslaved they were enumerated on documents but not named. So paper records are phenomenally difficult to find. Almost anybody of African heritage will bump into a bunch of brick walls. It’s as if the ancestor just appeared in the world when emancipated. Our hope is that through DNA testing they can get a little bit of interesting information about where their ancestors came from. The truth is all over the entire coasts of western Africa,” said Dr. Ball, who is white but whose own DNA revealed an African-American first cousin of her mother’s. She smiled. “You can find living relatives rather than dead ancestors.”

This 1930 U.S. census includes my mother and her parents and siblings. They came from Russia, likely Minsk. (

The website offers help in trying to fill in more recent blanks with a treasure trove of documents. Just a few minutes of searching turned up a cemetery index, World War 1 pension records, census data, the St. Croix US Virgin Islands Slave and Free People Records 1733-1930, and the Marion County Indiana public library death index. The site also includes emancipation records, slave ship manifests, military records, property and probate records, and wills that mention slaves. The US census didn’t begin to include African-Americans until 1870, although 175,000 blacks fought for the Union.

When you sign up with Ancestry.com, you’re encouraged to figure out and post a family tree, listing as many surnames and birthplaces as possible. “That’s where to start digging. Our algorithm plots shared birth locations, perhaps finding a town where you and your matches both have ancestors. The algorithm combs through both trees and finds common ancestors,” Dr. Byrnes explained. By “matches” he means pairs of people who share a certain percentage of their haplotype blocks.
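The company's actual algorithm isn't described beyond that quote, but the basic idea of combing two trees for overlap can be sketched in a few lines of Python. This toy version assumes each tree is just a list of (surname, birthplace) pairs and looks for surnames and places that appear in both. The function name, the data layout, and the sample entries are all invented for illustration.

```python
def shared_clues(tree_a, tree_b):
    """Return surnames and birthplaces that appear in both trees.

    Each tree is a list of (surname, birthplace) pairs.
    """
    surnames = {name for name, _ in tree_a} & {name for name, _ in tree_b}
    places = {place for _, place in tree_a} & {place for _, place in tree_b}
    return sorted(surnames), sorted(places)


# Made-up trees for two people flagged as a DNA match.
my_tree = [("Smith", "Boston"), ("Jones", "Minsk")]
match_tree = [("Smith", "Chicago"), ("Brown", "Minsk")]

print(shared_clues(my_tree, match_tree))  # (['Smith'], ['Minsk'])
```

A shared surname or town doesn't prove a common ancestor, but it narrows down where in the two trees to start digging.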

“The way we do it differently is that we have a lot of customers that have large pedigrees. Once we do a genetic analysis we have something to compare to. We go back and use the pedigrees and DNA and fine tune our algorithms in a way that no one else does,” Ken Chahine, PhD, JD, senior vice president and general manager at Ancestry.com, told me several months ago when I had my DNA tested.

It seemed, though, that I’ve been hearing from quite a few fifth cousins. But that’s because we each, assuming no inbreeding, have 4,688 of them. So ironically, in the face of the complex algorithms and SNP maps, finding a cousin often comes down to knowing a small fact about a particular place and time.

People seek ancestry testing for many reasons, Dr. Ball said. “Every customer has her or his own personal story. People who are adopted are looking for birth parents, or trying to nail down whether a particular person is Jewish or Italian. Some people want to confirm a single relation to a distant cousin.” Said Dr. Chahine, “We have stories of adoptees that find their first or second cousins after 20 to 30 years of trying to figure out who their biological parents are. With 700,000 markers, that’s a slam dunk.”

One of the newest members of the team is Julie Granka, whose mother came to the U.S. from Italy in the 1950s. As an undergrad she liked biology in general and evolution in particular, and ended up getting her PhD working with dog DNA. “It’s exciting and scary to have real people see the results of your work and talk about it. Population genetics is no longer an abstract exercise.”

(Jane Ades, NHGRI)

I can only imagine the giant leap forward understanding our ancestries will take once the databases embrace complete genome sequences. Just as efforts such as the Personal Genome Project are getting people to place their genome sequences in the public domain for the general good (solving health problems), more widespread ancestry testing will uncover and strengthen the genetic links that bind us all. DNA will ultimately tell us how we are all connected.

  1. Those African Americans who were “enslaved” are not named on the US Census; however, those who were “free” are enumerated by name on the US Census beginning in 1790. By 1860 there were approximately 4 million enslaved African Americans and 500,000 “free” African Americans.

  2. Just a quick note on terminology: the concept of haplogroup applies to chromosomes that don’t recombine, the Y chromosome and mitochondrial DNA. These chromosomes track only the straight paternal or straight maternal line, so they give a very deep but narrow slice of one’s total ancestry.

  3. The people who donated DNA samples and pedigrees to the Sorenson Molecular Genealogy Foundation did so on the understanding that they would be contributing to research by a non-profit organisation. Ancestry.com is a commercial venture. I’d be interested to know what you think of the ethics of a commercial testing company using research assets in this way. Do you know if Ancestry have obtained informed consent from the people who contributed the samples to re-use them for commercial purposes? Are AncestryDNA going to share some of the profits from their DNA testing business with the communities in Africa and elsewhere who kindly donated their samples free of charge?

  4. “FREED” are persons of color who were enslaved and rarely (in 1850 & 1860) listed by name in the federal census of 1790-1860. “FREE/OTHER PERSONS OF COLOR (FPOC)” were identified in federal census 1790-1860 as non-whites being blacks, colored, freed man/woman, mulattos, and indians who were free (not enslaved) with FPOCs dating to colonial times in America.

  5. Ken Chahine of Ancestry.com answers the query about people donating DNA without being aware of future uses for profit:

    AncestryDNA values the users’ choice to participate. We use the Sorenson Molecular Genealogy Foundation (SMGF) consistent with the consent and mission – to create a large molecular genetic database to help determine how individuals and populations are genetically related, and to bring people together for the purpose of promoting genealogical and family information exchange.

  6. Thank you for taking the trouble to get Ken Chahine to respond. However, he still hasn’t addressed the issues.

    There are actually two concerns here because there were two strands of testing: the public participation via SMGF and the indigenous samples obtained by SMGF researchers for their non-profit organisation.

    The SMGF consent form clearly states that people were donating their samples to a “not-for-profit organization”:

    The change from a “not-for-profit” status to a “for-profit” status following the acquisition of the SMGF research assets by AncestryDNA surely invalidates the terms of the original consent form.

    Yet, from what I understand, AncestryDNA have made no attempt to contact the genealogists who donated their pedigrees and samples to SMGF to ask for their consent for re-testing for commercial purposes. Similarly Ancestry do not appear to have made any attempt to contact the people who provided the indigenous samples to obtain their consent.

    Using the DNA results from indigenous communities to inform the results of Americans taking the AncestryDNA test provides no benefit to the communities who donated their DNA samples for non-profit research. Similarly people from outside the US who contributed to the SMGF database are also deriving no benefit from the re-testing as the test is not on sale outside the US.

    Whether or not AncestryDNA are legally entitled to use those samples having purchased the SMGF assets the whole practice seems to me to be highly unethical.
