As torrents of data pour in every day, some experts fear that modern hard drives may soon be too limited to store it all.

But as years of computer science have proven before, the solution to this problem may be found in the smallest and unlikeliest of places. This time, it's in a single gram of DNA.

Breakthroughs In Data Storage

The use of DNA for digital storage is appealing in theory because DNA is ultracompact enough to store, replicate, and transmit massive amounts of information.

The idea was demonstrated in 2012, when Harvard University geneticist George Church published a paper describing how he and his colleagues encoded 650 kilobytes of data into DNA strands, which contained millions of copies of Church's 52,000-word book, Regenesis.

Indeed, the proof-of-concept was a groundbreaking achievement, one that other engineers and biologists would expand upon over the years. But the methods used by Church and his team were inefficient, storing only 1.28 petabytes of data per gram of DNA.

Now, scientists from Columbia University and the New York Genome Center have created the highest-density DNA data storage reported so far, far surpassing the original results from Church and his team.

Led by Yaniv Erlich, the team successfully stored data in DNA and retrieved it at a density of 215 petabytes (215,000,000 gigabytes) per gram.

The stored data comprised six files: an old French film called The Arrival of a Train at La Ciotat Station, a 1948 scientific research paper, a computer operating system, a $50 Amazon gift card, a photo, and a computer virus.

Manipulating DNA Molecules

How did Erlich and his team do it? They took advantage of the structure of DNA molecules, which look like twisting ladders whose rungs are made of four chemical bases denoted by the letters A, C, G, and T.

The sequence of these bases normally carries the genetic instructions for living things, but if binary 0s and 1s are mapped onto the four letters, DNA molecules can encode almost anything.
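To make the idea concrete, here is a minimal sketch in Python of one way binary data could be mapped onto the four letters. The two-bits-per-base table and the function names are assumptions chosen for illustration; the article does not say which mapping Erlich's team actually used.

```python
# Hypothetical mapping: each pair of bits becomes one DNA base.
# This table is an illustrative assumption, not the scheme from the study.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_to_dna(data: bytes) -> str:
    """Turn bytes into a string of A/C/G/T, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_from_dna(sequence: str) -> bytes:
    """Reverse the mapping: bases back to bits, bits back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode_to_dna(b"Hi")
print(strand)                   # CAGACGGC
print(decode_from_dna(strand))  # b'Hi'
```

With a mapping like this, every byte becomes four bases, so any digital file can in principle be written out as a DNA sequence and read back.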

Of course, the process is not that easy because not all DNA sequences are robust enough, said Erlich. What's more, not all data stored in DNA can be retrieved successfully.

To solve these issues, Erlich and his colleagues used a fountain code to protect the data. Their DNA fountain generates an effectively unlimited number of clues about the data rather than storing the data itself.

That way, the data can still be decoded even if a few of the clues get lost.
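The "clues" idea can be illustrated with a toy fountain code in Python. This is only a sketch under stated assumptions, not the team's method: the chunk size, the simple degree choice, and all function names are hypothetical. Each clue, often called a droplet, is the XOR of a few randomly chosen chunks of the data, and the seed that picked those chunks travels with it.

```python
import random

CHUNK_SIZE = 4  # bytes per chunk; an arbitrary choice for this sketch


def split_chunks(data: bytes) -> list[bytes]:
    """Pad the data with zero bytes and cut it into equal-sized chunks."""
    padded = data + b"\x00" * (-len(data) % CHUNK_SIZE)
    return [padded[i:i + CHUNK_SIZE] for i in range(0, len(padded), CHUNK_SIZE)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def chunk_indices(seed: int, k: int) -> list[int]:
    """Re-derive from the seed which chunks a droplet combines.
    The degree choice below is ad hoc, chosen only to keep the demo simple."""
    rng = random.Random(seed)
    degree = min(k, rng.choice([1, 1, 2, 2, 3]))
    return rng.sample(range(k), degree)


def make_droplet(seed: int, chunks: list[bytes]) -> tuple[int, bytes]:
    """One clue: the seed plus the XOR of the chunks that seed selects."""
    payload = bytes(CHUNK_SIZE)
    for i in chunk_indices(seed, len(chunks)):
        payload = xor_bytes(payload, chunks[i])
    return seed, payload


def decode(droplets, k):
    """Peeling decoder: repeatedly solve droplets with one unknown chunk.
    Returns the padded data, or None if too few droplets survived."""
    known = {}
    pending = [(set(chunk_indices(seed, k)), payload) for seed, payload in droplets]
    progress = True
    while progress and len(known) < k:
        progress = False
        for indices, payload in pending:
            unknown = [i for i in indices if i not in known]
            if len(unknown) != 1:
                continue
            for i in indices:
                if i in known:
                    payload = xor_bytes(payload, known[i])
            known[unknown[0]] = payload
            progress = True
    if len(known) < k:
        return None
    return b"".join(known[i] for i in range(k))


message = b"DNA can store data"
chunks = split_chunks(message)
droplets = [make_droplet(seed, chunks) for seed in range(300)]
received = droplets[::2]  # simulate losing half of the droplets
recovered = decode(received, len(chunks))
```

The sketch deliberately generates far more droplets than a real scheme would need, so that the toy decoder reliably succeeds. The point is that no single droplet is essential: because any sufficiently large subset can rebuild the data, losing some of them does not matter.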

What It Means For The Future

If stored appropriately, DNA can last hundreds of thousands of years and preserve vast amounts of data.

"DNA won't degrade over time like cassette tapes and CDs," said Erlich. "[I]t won't become obsolete."

However, while the results of the study point to a promising future, Erlich said the use of DNA in data storage is still in its early stages.

"It's basic science," he said. "It's not that tomorrow you're going to go to Best Buy and get your DNA hard drive."

Church, who was not part of the new study, believes the most immediate use of DNA data storage is archiving. However, it is still quite expensive to archive huge amounts of data on DNA. In fact, synthesizing the DNA alone cost $7,000, while reading it cost another $2,000.

Meanwhile, Erlich is optimistic about the future of this new technology. He and his colleagues believe that as it improves, the costs will come down.

"My hope is that by focusing on better approaches, we can realize the potential of DNA storage," added Erlich.
