Scientists used AI to rewrite part of life’s alphabet
April 30, 2026
An engineered E. coli strain survived after one amino acid was designed out of many of its ribosomal proteins—an early test of whether life’s chemistry can be simplified
By Jacek Krywko, edited by Eric Sullivan
An illustration of protein production inside a bacterium. In a new study, researchers used AI to redesign some E. coli ribosomal proteins to work without the amino acid isoleucine.
BSIP/Education Images/Universal Images Group via Getty Images
Nearly all known life builds proteins from the same alphabet of 20 canonical amino acids. Strung together in different orders, those building blocks form the proteins that make cells work. In a new Science study, researchers at Columbia University, the Massachusetts Institute of Technology and Harvard University used artificial-intelligence-guided protein design to test how much of that alphabet can be pared back: they engineered an Escherichia coli strain that survived after one specific amino acid was designed out of many of its ribosomal proteins.
The team did not create a true 19-amino-acid organism. The engineered strain still uses the targeted amino acid, isoleucine, throughout most of its genome. But the result suggests that one of life’s most ancient and essential machines can tolerate at least partial simplification—and that AI may help biologists test the limits of life’s chemistry.
“The underlying question that we seek to ask is what early life looks like,” says Harris H. Wang, a professor of systems biology at the Columbia University Irving Medical Center and senior author of the study. Researchers think all life today descends from an ancient, single-celled organism that lived more than four billion years ago. But some suspect that earlier, simpler life-forms that predate even this common ancestor may have run on a leaner chemistry. Wang’s team wanted to find out whether modern cells could be engineered in that direction.
“Think about language. There are 26 letters in the English alphabet, but do you really need 26, or can you simplify that to 25 or 24?” Wang says. The team chose to remove isoleucine because it resembles the amino acids valine and leucine closely enough that, in principle, some proteins might tolerate isoleucine’s removal when it was replaced with one of them. They worked with E. coli, one of biology’s best-studied organisms, and targeted its ribosomes, the molecular machinery that builds proteins and is itself a sprawling complex of more than 50 proteins. “Like in a video game, we just pushed the ‘skip to the final boss’ button,” Wang says.
The first attempt was brute force. The researchers took 39 essential or highly expressed E. coli genes and replaced every isoleucine with valine or leucine, like a genetic find-and-replace. The engineered bacteria survived, but poorly: their fitness dropped to about 40 percent of that of wild-type E. coli. The team’s target was 90 percent. To close the gap, the researchers turned to AI.
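The find-and-replace idea can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the researchers' method: it works on one-letter protein sequences rather than on the genes the team actually edited, the function name `replace_isoleucine` and the toy sequence are hypothetical, and it applies a single substitute everywhere because the article does not say how the team chose between valine and leucine at each position.

```python
# Illustrative sketch only: swap every isoleucine (I) in a protein sequence
# for valine (V) or leucine (L), using one-letter amino acid codes.
# The function name and the toy sequence below are hypothetical.

def replace_isoleucine(protein: str, substitute: str = "V") -> str:
    """Return the sequence with every isoleucine replaced by the substitute."""
    if substitute not in ("V", "L"):
        raise ValueError("substitute must be 'V' (valine) or 'L' (leucine)")
    return protein.upper().replace("I", substitute)


toy_sequence = "MIKVLIAG"  # made-up peptide, not a real E. coli protein
print(replace_isoleucine(toy_sequence))       # isoleucine -> valine
print(replace_isoleucine(toy_sequence, "L"))  # isoleucine -> leucine
```

A blanket swap like this ignores how each isoleucine fits into the folded protein, which is one way to picture why the brute-force strain grew so poorly.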
They combined two kinds of models. First, sequence-based protein language models such as ESM2 and MSA Transformer read protein sequences and suggested evolutionarily plausible mutations that a simple swap would miss. Then structure-based AI models such as AlphaFold2 and ProteinMPNN checked that the redesigned proteins would fold into the correct shapes and fit alongside neighboring molecules.
The proposals were stranger than the team expected. “Some of these AI designs were really surprising,” Wang says. “They didn’t look like anything we would have anticipated.” In one case, while redesigning a ribosomal protein called RpsJ, the AI remodeled an alpha helix—a structural element bridging different parts of the ribosome—and introduced eight new nearby mutations to compensate for the substitution of just two isoleucines. “Maybe these machine-learning systems know some aspects of biology we can experimentally verify but we don’t yet understand,” Wang says.
“A noteworthy part of the project is the evolving contribution of AI to this work,” says Tom Ellis, a professor of synthetic genome engineering at Imperial College London, who was not involved in the study. “In the last seven years, the AI-enabled modeling of proteins and mutations in proteins has come on leaps and bounds.”
The team first tested each AI-suggested change one at a time, confirming individual edits could meet the 90 percent fitness goal. Combined, the changes killed