Columbia University scientists used gene editing to encode the phrase “hello world!” into the DNA of living bacteria, demonstrating their new framework for storing data in one of the world’s most information-dense materials.
In a paper recently published in Nature Chemical Biology, the bioengineering team describes a method for translating 0s and 1s from computers into a pattern of genetic pieces that are inserted into cell DNA and can later be decoded. The message persisted through dozens of generations of the bacteria before being altered by mutations.
DNA, which contains the genetic instructions for all living beings, is an extremely efficient data-storage medium that outperforms today’s best computer storage by several orders of magnitude. One gram of the nucleic acid can theoretically hold up to 455 exabytes of information, or 455 billion gigabytes, which is more data than all traffic across the internet in a month.
Scientists have stored messages in DNA since 1988, and breakthroughs in the past decade have led to the storage of dozens of megabytes in the form of genetic code. But in contrast to previous research, which was mostly performed in vitro, the Columbia lab used the DNA within living cells to encode their 12-character phrase using 72 bits of information.
The lab had previously used CRISPR gene editing to create a “biological tape recorder” within bacteria that fills their DNA with added nucleotide clusters, known as spacers, to record what they are exposed to. Reference spacers filled arrays in the DNA at a regular pace, while trigger spacers were added more quickly when a specific biological signal was active. The result was a DNA-based message that recorded when the signal was active.
The same mechanism was applied in their most recent paper, with bacteria being treated to accept electric pulses as the signal that adds trigger spacers. An applied electric voltage would add many trigger spacers and represent a 1, while no voltage and few spacers represented a 0.
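As a rough illustration of how a readout like this could be turned back into bits, the sketch below calls a 1 when trigger spacers make up more than a chosen fraction of the spacers counted in a population. The function name, the example counts and the 0.2 threshold are all hypothetical; the article does not describe the lab's actual analysis.

```python
def call_bit(trigger_spacers: int, total_spacers: int, threshold: float = 0.2) -> int:
    """Return 1 if the trigger-spacer fraction exceeds the threshold, else 0.

    The threshold is an illustrative assumption, not a value from the study.
    """
    if total_spacers <= 0:
        raise ValueError("total_spacers must be positive")
    return 1 if trigger_spacers / total_spacers > threshold else 0


# Hypothetical counts: many trigger spacers after a voltage pulse, few without one.
pulsed = call_bit(trigger_spacers=40, total_spacers=100)
unpulsed = call_bit(trigger_spacers=3, total_spacers=100)
print(pulsed, unpulsed)
```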
The researchers used this method to insert three bits of information into each of 24 cell populations, so that every letter, the space and the exclamation mark in “hello world!” took six bits spread across two populations. They tagged the groups with genetic “barcodes” to indicate which bits each one held, let the bacteria multiply, and later measured the CRISPR arrays in several hundred thousand cells to retrieve the “0” and “1” bits and the message they encoded.
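The bookkeeping behind that layout can be sketched in a few lines: 12 characters at six bits each give 72 bits, which split into 24 three-bit chunks, one per barcoded population. The six-bit alphabet below is a made-up mapping for illustration (the article does not give the one the researchers used), and the integer barcodes stand in for the genetic tags.

```python
import random

# Hypothetical 6-bit alphabet (room for 64 symbols); not the mapping used in the study.
ALPHABET = "abcdefghijklmnopqrstuvwxyz !"


def encode(message: str, bits_per_char: int = 6, bits_per_population: int = 3) -> dict:
    """Map each character to 6 bits, then split the bit string into barcoded 3-bit chunks."""
    bits = "".join(format(ALPHABET.index(ch), f"0{bits_per_char}b") for ch in message)
    return {
        i // bits_per_population: bits[i:i + bits_per_population]
        for i in range(0, len(bits), bits_per_population)
    }


def decode(populations: dict, bits_per_char: int = 6) -> str:
    """Reassemble the chunks in barcode order and map each 6-bit group back to a character."""
    bits = "".join(populations[barcode] for barcode in sorted(populations))
    return "".join(
        ALPHABET[int(bits[i:i + bits_per_char], 2)]
        for i in range(0, len(bits), bits_per_char)
    )


populations = encode("hello world!")
# Shuffle to mimic reading the populations back in arbitrary order.
shuffled = dict(random.sample(list(populations.items()), len(populations)))
print(len(populations), decode(shuffled))
```

Because each chunk carries its own barcode, the order in which populations are read back does not matter; sorting by barcode restores the original bit sequence.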
“It is pretty early to talk about any practical application using this technology yet, but it was a proof-of-concept study to show that you can directly turn the electrical pulses from (a) digital computer into biological information,” said Sung Yim, a postdoctoral researcher at Columbia and the paper’s lead author.
The researchers tested the persistence of the inserted messages by propagating the bacteria populations for about 100 generations over 16 days. They found that the bits were preserved with an accuracy of over 90% across 80 generations, with about a third of the populations losing the inserted information by the 100th generation as the bacteria populations quickly grew in size and experienced mutations.
Despite this limitation, more than 1 septillion (10^24) copies of the 72-bit data could be created across 80 generations, the researchers wrote, demonstrating the very low cost of copying data inscribed in DNA. These cells could be frozen to store the information until it is needed, Yim said.
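The septillion figure matches simple doubling arithmetic. As a back-of-the-envelope check, assuming every cell divides once per generation and none are lost (an idealization, not a claim from the paper):

```python
generations = 80
copies = 2 ** generations  # one starting cell, doubling each generation

print(f"{copies:.2e}")  # 1.21e+24
```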
DNA is an attractive long-term option as the need for data storage grows and today's technology eventually becomes obsolete, as the tape recorder and CD player once did, according to Yim. In addition to its high information density, DNA can last a very long time with little upkeep and will never become outdated so long as it remains the genetic language of Earth-based life.
Yim and his colleagues are working to improve their new technology to handle larger amounts of data with greater efficiency. They also want to apply the method to engineer bacteria into diagnostic tools: Yim said they could be put inside the body to sense their surroundings and record their observations in their DNA, much like how the microbes in this experiment had their genetic code altered in the presence of voltage.
“We will be interested in reading and writing DNA (for) a very, very long time,” he said. “The technology to read and write information in this medium will be always getting improved, and we're just getting started with it.”
The article, “Robust direct digital-to-biological data storage in living cells,” was published Jan. 11 in Nature Chemical Biology. The authors of the study were Sung Sun Yim, Ross McBee, Alan Song, Yiming Huang, Ravi Sheth and Harris Wang, Columbia University. The lead author was Sung Sun Yim.