(photo credit for Robot B-9 from the TV series Lost in Space (Creative Commons, Maker Faire 2008, San Mateo) I have a…
AI Enhances Human Genetics
(NHGRI www.genome.gov)
Genetics is a field rich in numbers and patterns, reaching back to Gregor Mendel’s crosses of pea plants with distinguishing characteristics that revealed the two basic laws of inheritance.
At a microscopic level, genetics is informational, a series of languages: a DNA sequence is transcribed into an RNA sequence, which is then translated into a sequence of amino acids comprising a protein molecule. The suite of proteins, with abundances ebbing and flowing as patterns of gene expression change in response to the environment, provides our traits, our abilities, and the myriad metabolic reactions that keep us going.
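To make the "series of languages" concrete, here is a minimal Python sketch of the two steps: transcription (DNA to RNA) and translation (RNA to amino acids). It is a toy, not a bioinformatics tool. The codon table holds only a handful of the 64 real codons, the input sequence is invented for illustration, and the function names are my own.

```python
# Toy illustration of the two "languages": DNA -> RNA -> protein.
# Only a few codons are listed here; the real genetic code has 64.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala",
    "UGG": "Trp", "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna: str) -> list[str]:
    """Read codons three letters at a time until a stop codon appears."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        # "Xaa" marks a codon that is missing from this toy table.
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "Xaa")
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein

mrna = transcribe("ATGTTTGGCTGGTAA")  # invented example sequence
peptide = translate(mrna)
print(mrna)     # AUGUUUGGCUGGUAA
print(peptide)  # ['Met', 'Phe', 'Gly', 'Trp']
```

Real genes add complications that this sketch ignores, such as introns, reading frames, and the template versus coding strand.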
Because genetics is so highly informational, it is a natural target for artificial intelligence. AI can speed, enhance, and extend what we know about the meanings in our genes, transcending what we deduce from far simpler data. It digests (trains on) massive amounts of data, stores and analyzes them, then makes connections and provides insights beyond what a human mind could do.
A Very Brief History of AI
The dawn of AI is credited to British mathematician Alan Turing. In 1950 he published the idea that became known as the “Turing test” in the journal Mind, in a paper titled “Computing Machinery and Intelligence.” He asked, “Can machines think?”
Turing called the eponymous test the “imitation game.” Fittingly, Google’s AI defines it as:
“a thought experiment proposed by Alan Turing in 1950 to determine if a machine could exhibit human-level intelligence by engaging in a typed conversation with a human interrogator who cannot distinguish it from a human. If the machine’s responses are indistinguishable from a human’s, it can be considered intelligent.”
The Turing test anticipated today’s large language model (aka LLM), which is AI that generates human-like text by training on a lot of text. The LLM is a recent arrival, though. The oldest form of AI is rules-based: programs that follow instructions that people write out for them.
Machine learning (aka ML) is different. It learns and improves from data without being explicitly instructed how to handle each task. Imagine HAL from 2001: A Space Odyssey, or even the robot from Lost in Space or Robby the Robot from Forbidden Planet. I’m dating myself … but we’re not to genetics yet.
Next, deep learning (DL) is a type of ML that drills down into layers of information, using neural networks with many layers. LLMs are built with deep learning, and it is this deep learning that is ideal for the data overload of medical genetics.
Shortening Diagnostic Odysseys from Years to Minutes
Families searching for the cause of their child’s unusual combination of symptoms and traits have called the multi-year effort the diagnostic odyssey. I heard many such stories when writing The Forever Fix: Gene Therapy and the Boy Who Saved It, and many articles over the years.
At human genetics conferences, I’d watch and listen, in awe, as company reps demonstrated new tools and technologies that continually shortened the time to diagnosis. Years, months, weeks, and even days could shrink to mere minutes as algorithms and databases swiftly sorted themselves into matches between an unusual or even unheard-of combination of signs and symptoms and specific sequences of DNA. Although newborn screening of bloodspots taken from the heel to identify a few dozen rare diseases has been going on for decades, today’s AI-driven approaches are taking diagnosis of the ultra-rare to a new level. DNA Science has recently covered newborn screening: See A Genetic Crystal Ball: When Newborn Genome Sequencing Findings Explain Illnesses in Relatives and Nixing the Newborn Screening Advisory Committee is Ill Advised. I was being kind – damaging newborn screening, a hugely successful preventative medicine measure, is evil and idiotic, akin to vanquishing vaccines. I digress.
Digitizing Data and Details
AI layers levels of genetic information and searches databases for matches.
It easily zeroes in on and compares odd signs and symptoms of certain rare genetic diseases – the kinky hair of a child with giant axonal neuropathy differs from the wiry hair of a child with Menkes disease, for example.
Some signs are subtle, such as the ability to bend fingers back in people with Ehlers-Danlos syndrome, the distinctive wide smile of a child with Williams syndrome, or the tall and lanky body and long face of a person who has Marfan syndrome. AI can distinguish the gait of a person with an ataxia from the movements of someone with early Huntington’s disease.
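The matching idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the "database" is a tiny dictionary built from the examples above, the scoring is simple set overlap (Jaccard similarity) that I chose for clarity, and none of it reflects how any real diagnostic tool works or should be used clinically.

```python
# Hypothetical, non-clinical sign lists taken from the examples in the text.
KNOWN_CONDITIONS = {
    "giant axonal neuropathy": {"kinky hair"},
    "Menkes disease": {"wiry hair"},
    "Ehlers-Danlos syndrome": {"fingers bend back"},
    "Williams syndrome": {"wide smile"},
    "Marfan syndrome": {"tall and lanky body", "long face"},
}

def jaccard(a: set[str], b: set[str]) -> float:
    """Shared signs divided by all signs seen in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_conditions(observed: set[str]) -> list[tuple[str, float]]:
    """Return conditions that share at least one sign, best match first."""
    scores = [
        (name, jaccard(observed, signs))
        for name, signs in KNOWN_CONDITIONS.items()
    ]
    matches = [s for s in scores if s[1] > 0]
    return sorted(matches, key=lambda s: s[1], reverse=True)

ranked = rank_conditions({"long face", "tall and lanky body", "fingers bend back"})
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

With these made-up inputs, Marfan syndrome ranks first and Ehlers-Danlos syndrome second. A real system would weigh thousands of signs, account for how common each one is, and fold in molecular data.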
Eclectic types of data feed into AI-aided genetic diagnosis. These include signs and symptoms, findings from blood and urine tests, images from medical scans, pedigrees that depict family history, physical exam results and measurements, as well as molecular findings such as DNA and RNA sequences, chromosomal anomalies, and gene expression patterns.
Some criteria used to diagnose genetic diseases aren’t obvious: ratios of lengths of structures, such as fingers; patterns of capillaries at the back of the eyes; distances between facial features. AI sees, compares, groups, and otherwise analyzes them all. And details are important.
A person with Williams syndrome has a small chin, wide nose and mouth, puffy face, long neck, and poor mobility. The irises may be star-shaped. A person with Noonan syndrome looks very different: drooping eyelids, widely spaced and prominent eyes, low-set ears that tip backwards, broad nose, and bowed upper lip. Children with these and many other genetic conditions do not necessarily look unusual, but they strikingly resemble one another.
AI can also help design treatments. A tool called AlphaFold, for example, predicts the three-dimensional shapes of proteins from their amino acid sequences, and has done so for millions of proteins. Researchers can then search the three-dimensional nooks and crannies of those predicted structures for specific “druggable targets” in the development of novel pharmaceuticals.
Data, Data, and More Data
A deep learning algorithm used in genetics research is trained on an extremely large dataset. Consider the AI tool CHIEF, used to evaluate cancer.
CHIEF was initially trained on 15 million unlabeled images grouped by tissue type or location in a particular organ or structure. The algorithm was next trained on 60,000 more images, representing many body parts, considering exactly where a particular cell lies within the three-dimensional space of a tissue or organ. Then, CHIEF was given more than 19,400 whole-slide images from 32 independent datasets, collected from 24 hospitals and patient cohorts from all over the planet.
AI can extract diagnostic cues and clues that might not be obvious.
CHIEF, for example, analyzes and compares scans of the “tumor microenvironment,” the cellular landscape surrounding cancer cells. The tool can identify the site of tumor origin, find DNA patterns that predict treatment responses and likely outcomes, and even foresee patient survival from whether immune cells are in the vicinity (good prognosis) or not (bad prognosis).
Eye2Gene analyzes retinal scans for patterns of blood vessels, speeding diagnosis of more than 63 eye disorders. Eye2Gene can also detect hints of cardiovascular disease – telltale spots of cholesterol, the integrity of blood vessels and their branching patterns, and damage from high blood pressure.
The skeleton is the target of Bone2Gene, which identifies more than 700 conditions, corresponding to more than 500 genes, that affect bones. These include achondroplasia (a form of dwarfism), Turner (XO) syndrome (a female with only one X chromosome), Noonan syndrome, and several lysosomal storage diseases.
Most fascinating, I think, is Face2Gene, which assigns facial descriptors to photographs. Digitized data include the distances between facial features, shapes, dimensions, contours, skin patterns, and other characteristics, returning a list of possible matching genetic syndromes.
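To show what "digitizing" distances between facial features might look like, here is a small Python sketch. It is not Face2Gene's method, which I have not seen. The landmark names and pixel coordinates are invented, and the choice of ratios is mine. The one real point it illustrates is why ratios are handy: dividing one distance by another cancels out the size of the photo.

```python
import math

# Made-up (x, y) pixel coordinates for a few facial landmarks in one photo.
landmarks = {
    "left_eye": (120.0, 150.0),
    "right_eye": (200.0, 150.0),
    "nose_tip": (160.0, 210.0),
    "chin": (160.0, 300.0),
}

def facial_ratios(points):
    """Convert raw distances into ratios so photo size and zoom cancel out."""
    eye_span = math.dist(points["left_eye"], points["right_eye"])
    nose_to_chin = math.dist(points["nose_tip"], points["chin"])
    eye_to_nose = math.dist(points["left_eye"], points["nose_tip"])
    return {
        "nose_to_chin / eye_span": nose_to_chin / eye_span,
        "eye_to_nose / eye_span": eye_to_nose / eye_span,
    }

ratios = facial_ratios(landmarks)

# The same face photographed at twice the resolution gives the same ratios.
doubled = {name: (2 * x, 2 * y) for name, (x, y) in landmarks.items()}
ratios_doubled = facial_ratios(doubled)

for name, value in ratios.items():
    print(f"{name}: {value:.3f}")
```

A list of ratios like this is one simple kind of "facial descriptor" that could then be compared against reference values for known syndromes.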
CODA
My personal experience with today’s health care providers is that they rarely have an in-depth knowledge of, or even familiarity with, the current state of the field of medical genetics and related biotechnologies, such as screening and diagnostic tests and gene-based treatments. Sales reps pitching tests and drugs parrot facts about their specific products, as they’ve been trained to do, but I’m not sure if they could launch into a distinction between, say, a tumor suppressor and an oncogene, or know that there isn’t just one “mitochondrial disease.”
So I’m optimistic that AI will be increasingly able to step in and bring DNA science into health care. AI and DNA make a powerful combination, and can put the “personal” into “personalized” medicine.