PLOS BLOGS DNA Science

AI Analyzes Human Genomes and Foretells My Possible Obsolescence

(Photo credit: Robot B-9 from the TV series Lost in Space. Creative Commons, Maker Faire 2008, San Mateo.)

I have a curious relationship with AI.

A few years ago, I began to notice that when I posed a query on a topic in genetics – for a DNA Science post or to update my textbook Human Genetics: Concepts and Applications with McGraw-Hill – an answer would pop back that read curiously like my own words. Likely, they were.

So I wasn’t terribly surprised when, a few months ago, a notice appeared in The New York Times and elsewhere: “Anthropic to pay authors $1.5 billion” in a class action lawsuit brought by three authors. Fourteen of the half million “works” that the company copied, to train its chatbot Claude, were mine!

AI Saves Time, Provides Details
Right now, I’m working on the next incarnation of my textbook. Editions are outdated, I’m told, so this is a revision – not much of a difference. To update in this age of AI, I still scrutinize new research findings, and I am still led to the journal articles by old-school press releases in my email.

But AI is becoming increasingly helpful in handling the minutiae: quickly updating a statistic or other detail. What does screening the genome of an early embryo cell cost? How are cancer immunotherapies selected for a particular patient? Which single-gene diseases are amenable to correction using CRISPR?

A few weeks ago, DNA Science began delving into applications of AI in genetics, through tools with compelling names like Eye2Gene, Face2Gene, and Bone2Gene that are increasingly aiding diagnoses of exceedingly rare sets of symptoms that arise from mutations in single genes.

AlphaGenome Widens the Scope
Exploring single genes, like those behind hemophilia or cystic fibrosis, means comparing sequences of the DNA bases A, C, T, and G. The AI-based tool AlphaGenome takes a step back for a broader view, seeking meaning in the vast stretches of our genomes that do not specify the amino acid sequences of specific proteins and their controls.

Quick biology lesson. A human genome is about three billion DNA bases of information, a smidgeon of which – under two percent – contains the instructions for a human body to form, function, develop, and grow.

Throughout the 1990s, many research groups explored specific single genes and the diseases they cause when mutant. The gene-by-gene approach was painstaking and slow. AI is now aiding investigators in deciphering the possible meanings buried within the other 98 percent of the sequence that doesn’t encode protein.

A Genome is More Than a Recipe Book for Proteins
AlphaGenome is built on data from public DNA sequence repositories. The input includes protein-encoding genes and their controls, as well as the vast stretches of noncoding DNA sequences.

The AI tool can process up to a million DNA bases at a time. It discovers novel gene variants, links to databases to identify effects on physiology of specific DNA sequences, and reveals controls knitted into the strings of DNA bases that function at a distance from the genes that they oversee – all in just seconds. I remember when it took an entire PhD project, years, to sequence a gene and understand its function.
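
For a rough sense of scale, the figures mentioned so far can be combined in a few lines of Python. This is only back-of-the-envelope arithmetic, not a description of how AlphaGenome works: the function names are made up for illustration, the genome size is rounded to three billion bases, and covering a genome with non-overlapping million-base windows is an assumption made to keep the sum simple.

```python
# Rounded figures taken from the text; none of this is AlphaGenome code.
GENOME_BASES = 3_000_000_000   # roughly three billion bases in a human genome
CODING_PERCENT_MAX = 2         # "under two percent" carries protein instructions
WINDOW_BASES = 1_000_000       # "up to a million DNA bases at a time"


def coding_bases_upper_bound(genome=GENOME_BASES, percent=CODING_PERCENT_MAX):
    """Upper bound on protein-encoding bases, using integer arithmetic."""
    return genome * percent // 100


def windows_needed(genome=GENOME_BASES, window=WINDOW_BASES):
    """Non-overlapping windows needed to cover the genome (ceiling division).

    Assumption: windows are laid end to end with no overlap.
    """
    return -(-genome // window)


if __name__ == "__main__":
    print("Protein-encoding bases (upper bound):", coding_bases_upper_bound())
    print("Noncoding bases (lower bound):", GENOME_BASES - coding_bases_upper_bound())
    print("Million-base windows to span the genome:", windows_needed())
```

Under these assumptions, fewer than 60 million bases encode protein, and about 3,000 million-base windows would span the whole genome.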

Specifically, AlphaGenome reveals:
• The starts and ends of protein-encoding genes.
• Splice sites, which are signals in the DNA sequence to cut and paste the mRNA that encodes the corresponding amino acid sequence. Mutations at splice sites cause some cases of cystic fibrosis, spinal muscular atrophy, other single-gene syndromes, and cancer.
• Percent of a gene’s sequence that is transcribed into RNA (gene expression), the key step in manufacturing a specific protein.
• Areas of the DNA sequence that bind specific proteins that affect gene expression.
• Proximity of genes to each other.
• Sequences far from genes that affect their function.

In short, AI will enable us to understand the genetic instructions for a human being in ways that we never have before, and might not even have imagined.

Will AI Replace Me?
AI summaries of genetics topics or research reports are, as far as I can tell, well written, organized, and correct.

The algorithms behind AI tools have likely ingested and digested books like The Elements of Style to catch things that writers pay attention to: avoiding passive voice, alternating sentence and paragraph lengths, cutting repetition, minding grammar, and attending to less tangible qualities such as cadence and word choice. And of course being sure not to use certain terms deemed offensive, such as “abnormal,” “victim,” and various terms that apply to certain population groups (see Embracing Diversity, Equity, and Inclusion in Genetics Textbooks and Testing).

I use AI to quickly lead me to relevant articles – I know from years as a science journalist, before the offensive “content creator” entered the lexicon, which technical sources to trust. Science, Nature, The American Journal of Human Genetics, yes. Blogs and newspapers and magazines, no.

Asking a specific question of AI is important. What is the current cost of IVF or to sequence a human genome? What are the applications of CRISPR? How do gene expression patterns relate to a set of symptoms?

AI resources provide questions for the students who use my textbook, and it’s pretty amazing. The embedded AI tool doesn’t just grade an answer to a question as correct or not, but tells students why their answer to a multiple-choice question is incorrect, why other answers are also incorrect, and why the right answer is right.

But as much as I appreciate AI as a pedagogical tool, I think it cannot replicate the contributions of a human writer of textbooks: nuance, irony, humor, the sequence of topics and choices of illustrations and photos.

So I’m not too worried.

There’s more to writing a college science textbook, especially for non-science majors, than definitions, jargon, explanations, figures, descriptions, and learning tools. Creativity and delivery are important, and that’s where the non-nuanced fact-spitting of AI may fall short.

I looked through my human genetics book for examples. Would AI have:
• come up with a detailed question on Mendel’s second law as it applied to the Tribbles of Star Trek?
• described a rare blood type with an example from General Hospital?
• created beings from another planet – the Gazooks – to illustrate Mendel’s second law?
• told the story of one of the first patients to receive Gleevec to treat leukemia – a young editor at Glamour magazine in seemingly perfect health, who shared her story with me?
• quoted the parents of young children having experimental gene therapies?

I welcome the speed in researching, comparing, and analyzing that AI can provide. But it cannot replace experience with the craft of original, unaided writing.
