
PLOS BLOGS DNA Science

NextCODE Health Mines deCODE’s Data, and More, to Catalyze Clinical Diagnosis

Like the mythical Phoenix bird reborn from its own ashes, deCODE Genetics, or at least its data, resurfaced in October in the form of a private company, NextCODE Health. The metaphor isn’t perfect, for deCODE still exists as a wholly owned subsidiary of Amgen, which paid $415 million for it at the end of 2012.

deCODE Genetics flew into the headlines in 1996 with its Icelandic Health Sector Database, an ambitious project to meld genealogical, genetic, and health records of the entire population of Iceland, with an “opt-out” model of presumed consent. deCODE kept the bioethics journals busy for years. And as people all over the world debated informed versus presumed consent, deCODE published a series of discoveries, in top journals, about genes that increase risk for kidney disease, cancer, lupus, vascular disease, schizophrenia, and osteoporosis, and about a gene variant that protects against Alzheimer’s disease.

TONS OF DATA
With genealogical records dating back to 874 AD and abundant genetic markers, deCODE developed a robust way to discover genes by comparing symptoms to the proportion of the genome that people related in a certain way share. A person shares on average half his or her genome with full siblings, parents, and children, but one-eighth with a first cousin, and a lot less with his or her 5,000 or so fourth cousins, and so on.
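The arithmetic behind those fractions is simple enough to sketch. The snippet below is my own illustration, not deCODE's method: it assumes the textbook rule that each generation halves the expected shared DNA and that full cousins descend from two common ancestors. The function name `expected_shared_fraction` is hypothetical.

```python
from fractions import Fraction


def expected_shared_fraction(cousin_degree: int) -> Fraction:
    """Expected fraction of the genome shared by full n-th cousins.

    Assumption (textbook model, not deCODE's pipeline): two common
    ancestors, each linked to the pair by 2 * (n + 1) meioses, so the
    expectation is 2 * (1/2) ** (2 * (n + 1)) = (1/2) ** (2 * n + 1).
    Degree 0 stands in for full siblings.
    """
    if cousin_degree < 0:
        raise ValueError("cousin_degree must be 0 or greater")
    return Fraction(1, 2 ** (2 * cousin_degree + 1))


# Full siblings, first cousins, and fourth cousins, as exact fractions.
for degree in (0, 1, 4):
    print(degree, expected_shared_fraction(degree))
```

Under this model a fourth cousin shares 1/512 of the genome on average, about 0.2%, which is why it takes so many people and markers to see a pattern.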

Take 100,000 or so people, figure out the degree to which they’re related, and search for patterns in the genome that only the people with a specific diagnosis have in common. Or look at the problem the other way. For example, one study found that 750 women with endometriosis shared significantly more genome regions than did 750 matched controls without endometriosis. Either approach highlights places in the genome to look for a causative or risk gene. The more specific the diagnosis and the more people and markers, the stronger the associations.

NextCODE invited me to a luncheon towards the end of the American Society of Human Genetics meeting a few weeks ago in Boston. I didn’t expect to learn much, after four days of being bludgeoned with tales of annotating genomes and comparing exomes to solve diagnostic mysteries, but I was impressed.

NextCODE, operating independently of deCODE, has a five-year exclusive license for the use of the largest proprietary database of sequence diversity, along with sequence analysis tools developed by deCODE, for use in clinical sequencing. They’ve got whole genome sequencing information on more than 300,000 individuals, 3,000 of them sequenced deeply, and 40 million variants. The company already has service agreements with Boston Children’s Hospital, Newcastle University, Saitama University, and Queensland University. CEO Hannes Smarason blogs about the coming projects here.

“We know what percentage of the genome any two people share, down to 0.01% allele frequencies for rare diseases and greater than 1% for more common conditions,” Jeff Gulcher, MD, PhD, co-founder of deCODE and president and CSO of the new company, told the crowd. Their clinical sequencing services tap into an annotation pipeline that uses public domain databases plus the voluminous Icelandic information, all forming the Genomic Ordered Relational (GOR) database. Besides nailing down specific gene locations, all that data yields common-sense conclusions, such as the impact of a gene variant on lifespan. “We catalog age-specific allele frequencies. If a gene variant is in old Icelanders, it is unlikely to kill kids,” Dr. Gulcher said.

A CASE STUDY
The numbers were intriguing, but it was the case study that got my attention: sisters, ages 3 and 5, who had progressive blindness and deafness. The GOR database, fed the right information (complete genome sequences), nailed the diagnosis in 5 minutes.

The parents had spent years going from doctor to doctor, with no answers. Tests for single-gene retinal disorders, of which there are many, were negative, and they also didn’t have known conditions that impair hearing as well as sight, such as Usher syndrome. The next step: whole genome sequencing for parents and daughters.

Exome sequencing finally led to a diagnosis for Gavin Stevens’ Leber congenital amaurosis (LCA), a rare form of hereditary blindness. (Jennifer Stevens)

The many-years-to-a-diagnosis story is one I know well, for it was a recurring theme in my book about gene therapy and I’ve blogged about diagnostic journeys, such as that of five-year-old Gavin Stevens. It took years to diagnose his Leber congenital amaurosis because he had a mutation in a gene that hadn’t yet been discovered.

Dr. Gulcher took us quickly through the narrowing down to reach the diagnosis for the two little girls. Like my husband choosing seat warmers, XM radio, and a kayak rack for his new metallic blue Honda, Dr. Gulcher entered into the “Clinical Sequence Analyzer” the probable mode of inheritance (autosomal recessive), the symptoms, and frequency cut-off for candidate gene variants. The search included known syndromes and different types of mutations, such as single base substitutions, truncations, or repeats. “We looked at 423 exonic variants at <1% frequency, screening for autosomal recessives, and got 6 hits,” Dr. Gulcher explained.
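For readers who think in code, the narrowing-down can be pictured as a filter. This is only a rough sketch under my own assumptions, not the Clinical Sequence Analyzer itself: the `Variant` class, the `recessive_candidates` function, the gene labels, and the toy genotypes are all hypothetical, and the model covers only the simplest recessive pattern (affected children carry two copies, each parent carries one), ignoring compound heterozygotes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Variant:
    gene: str                  # placeholder label, not a real finding
    is_exonic: bool            # falls within a protein-coding exon
    allele_freq: float         # population frequency of the variant allele
    genotypes: Dict[str, int]  # variant-allele count (0, 1, 2) per person


def recessive_candidates(
    variants: List[Variant],
    affected: List[str],
    parents: List[str],
    max_freq: float = 0.01,
) -> List[Variant]:
    """Keep rare exonic variants that fit a simple recessive pattern."""
    hits = []
    for v in variants:
        # Frequency cut-off: drop non-exonic or too-common variants.
        if not v.is_exonic or v.allele_freq >= max_freq:
            continue
        # Simplified model: affected children homozygous, parents carriers.
        children_hom = all(v.genotypes.get(p) == 2 for p in affected)
        parents_het = all(v.genotypes.get(p) == 1 for p in parents)
        if children_hom and parents_het:
            hits.append(v)
    return hits


# Toy family: two affected sisters and their unaffected parents.
family = {"mother": 1, "father": 1, "sister1": 2, "sister2": 2}
toy_variants = [
    Variant("GENE_A", True, 0.0005, family),                     # fits
    Variant("GENE_B", True, 0.20, family),                       # too common
    Variant("GENE_C", False, 0.0005, family),                    # not exonic
    Variant("GENE_D", True, 0.0005, {**family, "sister2": 1}),   # one sister is only a carrier
]
hits = recessive_candidates(
    toy_variants, ["sister1", "sister2"], ["mother", "father"]
)
print([v.gene for v in hits])
```

The real tool also weighs symptoms, known syndromes, and mutation types, which this sketch leaves out.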

We all watched shifting color bars on the image on the screen, which soon spit back “retinal dystrophy.” Not surprising, but it also identified a mutation in a gene called SLC52A2. It is a tweaked copy (duplication) of SLC52A3, known to cause Brown Vialetto Van Laere syndrome. Only a few dozen cases have ever been reported. The gene, on chromosome 20, encodes a protein that transports riboflavin into cells, where the vitamin is used to produce key molecules in energy metabolism.

The known form of the condition typically begins with deafness and does not include blindness, but sadly it is like amyotrophic lateral sclerosis, impairing neurons until death comes in adolescence. The NextCODE researchers identified two other families with SLC52A2 mutations, diagnosing teenage children posthumously from parental DNA.

NextCODE’s “sequence-based clinical diagnosis” is “extremely affordable,” but company reps wouldn’t be specific. And they’ll have competition from institutions diagnosing patients with familial exome sequencing. But I don’t know if anyone else can match the power of the deeply-rooted Icelandic database. Which brings up the looming matter of participation of the public in sequencing projects.

PRIVACY vs THE NEED FOR MEGADATA

(NHGRI)

The question of informed consent still echoes around uses of population level databases to refine family level diagnoses, especially since grad student Melissa Gymrek led a clever effort earlier this year that uncovered the identities of research participants using Google and a genealogy database. And a comment on the most recent DNA Science post about tracing African-American roots through DNA testing questioned ancestry.com’s use of a database of DNA sequences from samples originally taken from indigenous peoples by a not-for-profit organization. Did those people give permission for future use of their personal information? Even if it’s de-identified?

Apparently the ruckus over use of deCODE’s data hasn’t died down. On May 28, 2013 Iceland’s Data Protection Authority denied the company’s request to impute genotypes of 280,000 people using data from relatives who had consented to use of their genetic information. Unfortunately, genetics can complicate informed consent because relatives share genes in predictable patterns and proportions.

Did the Icelanders or the indigenous peoples give permission for their DNA information to be used to help diagnose two little girls years later, or to help African-Americans deduce where their ancestors came from? Use of DNA data from past collections is certainly a contentious area. I learned that this summer when my two blog posts challenging the classic case of DNA misuse involving the Havasupai Indians led to personal threats that grew so vicious that the blog was shut down.

I think that, with time, proprietary feelings surrounding personal genome information are going to fade away, for two reasons: The novelty of genome sequencing will diminish, and people will realize that those huge databases of A, T, C and G sequences are essential to interpreting personal genomes. Eventually, use of DNA sequences in a database, tied to symptoms but not personal identities, will become an accepted, if not expected, act of kindness.
