New method for DNA storage announced

A new method for storing data in DNA has been developed at the European Bioinformatics Institute (EMBL-EBI), based on the Wellcome Trust Genome Campus in Hinxton. The method, published in the journal 'Nature', makes it possible to store 100 million hours of high-definition video, or the equivalent, in a cup of DNA.

The amount of digital information in the world is enormous, and it is increasing rapidly. Storing this endless flow of new digital content poses a real problem in terms of space, energy and how long the storage media will last.

DNA is the body's natural storage system, and we know it is robust enough to store data for thousands of years. "We can extract [DNA] from woolly mammoth bones, which date back tens of thousands of years, and still make sense of it," said Nick Goldman, from EMBL-EBI.

DNA has already been shown to be an effective medium for storing digital information. It is not only robust but also small and dense, and it can be stored without using extra power. However, several issues stop it from being commercially viable.

First, using current methods, it is only possible to manufacture DNA in short strings. Second, both writing and reading DNA are prone to errors, particularly when the same letter of DNA is repeated again and again. Goldman and co-author Ewan Birney set out to create a code that overcomes both problems.

"We knew we needed to make a code using only short strings of DNA, and to do it in such a way that creating a run of the same letter would be impossible. So we figured, let's break up the code into lots of overlapping fragments going in both directions, with indexing information showing where each fragment belongs in the overall code, and make a coding scheme that doesn't allow repeats. That way, you would have to have the same error on four different fragments for it to fail – and that would be very rare."
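The scheme described in the quote can be illustrated with a minimal Python sketch. It is not the researchers' actual code, and it rests on several assumptions made purely for illustration: every function name is hypothetical, each byte is written as six base-3 digits, the fragment length and step are arbitrary defaults chosen so that interior positions are covered by four fragments, and the index is kept as a plain integer rather than being encoded in the DNA itself. Each base-3 digit selects one of the three letters that differ from the previous letter, so a run of the same letter cannot occur.

```python
from collections import Counter
from typing import List, Tuple

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def bytes_to_trits(data: bytes) -> List[int]:
    """Write each byte as six base-3 digits (assumption: 3**6 = 729 >= 256)."""
    trits: List[int] = []
    for byte in data:
        digits = []
        for _ in range(6):
            byte, digit = divmod(byte, 3)
            digits.append(digit)
        trits.extend(reversed(digits))  # most significant digit first
    return trits


def trits_to_bytes(trits: List[int]) -> bytes:
    """Inverse of bytes_to_trits."""
    out = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for trit in trits[i:i + 6]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)


def trits_to_dna(trits: List[int], prev: str = "A") -> str:
    """Each trit picks one of the three letters that differ from the last one."""
    letters = []
    for trit in trits:
        choices = [b for b in BASES if b != prev]
        prev = choices[trit]
        letters.append(prev)
    return "".join(letters)


def dna_to_trits(dna: str, prev: str = "A") -> List[int]:
    """Inverse of trits_to_dna."""
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def fragment(dna: str, length: int = 100, step: int = 25) -> List[Tuple[int, str]]:
    """Cut overlapping fragments; odd-numbered ones are stored reversed.

    With length == 4 * step, interior positions fall in four fragments.
    The two ends of the sequence are covered by fewer fragments.
    """
    frags = []
    for index, start in enumerate(range(0, len(dna), step)):
        piece = dna[start:start + length]
        if index % 2 == 1:
            piece = reverse_complement(piece)
        frags.append((index, piece))
    return frags


def reassemble(frags: List[Tuple[int, str]], total_len: int, step: int = 25) -> str:
    """Place fragments by index and take a majority vote at each position."""
    votes = [Counter() for _ in range(total_len)]
    for index, piece in frags:
        if index % 2 == 1:
            piece = reverse_complement(piece)
        for offset, base in enumerate(piece):
            votes[index * step + offset][base] += 1
    return "".join(v.most_common(1)[0][0] for v in votes)


if __name__ == "__main__":
    message = b"DNA storage"
    dna = trits_to_dna(bytes_to_trits(message))
    frags = fragment(dna, length=20, step=5)
    restored = reassemble(frags, len(dna), step=5)
    print(dna)
    print(trits_to_bytes(dna_to_trits(restored)))
```

Because reversing and complementing a sequence cannot create a repeated letter where there was none, every fragment inherits the no-repeat property. In this sketch, a single misread letter in one fragment is outvoted by the other fragments covering the same position.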

For this new method to work, DNA must be synthesised using the encoded information. This was done by Agilent Technologies, Inc., based in California. The team from Hinxton sent over encoded versions of an MP3, a JPG and a PDF, as well as a file that describes the encoding. Agilent were able to download the files from the web and synthesise hundreds of thousands of pieces of DNA from them. The result looked like tiny pieces of dust.

Using a DNA sequencing machine and the same code, the researchers were able to read the DNA and recover the information. The next step is to perfect the coding scheme and to explore practical aspects, working towards a commercially viable DNA storage model.