Two Harvard scientists have produced 70 billion copies of a book in DNA code, and it’s smaller than your thumbnail.
Despite the fact that there are 70 billion copies of it in existence, very few people have actually read the book Regenesis: How Synthetic Biology Will Reinvent Nature and Ourselves in DNA, by George Church and Ed Regis. The reason? It is written in the basic building blocks of life: deoxyribonucleic acid, or DNA.
Church, along with his colleague Sriram Kosuri, both molecular geneticists from the Wyss Institute for Biologically Inspired Engineering at Harvard, used the book to demonstrate a breakthrough in DNA data storage. By copying the 53,000-word book (alongside 11 JPEG images and a computer program) they’ve managed to squeeze a thousand times more data into strands of DNA than had ever previously been encoded, as reported in the August 17 issue of the journal Science. (To give you some idea of how much information we’re talking about, 70 billion copies is more than three times the total number of copies for the next 200 most popular books in the world combined.)
Part of DNA’s genius is just how conspicuously small it is: so dense and energy efficient that one gram of the stuff can hold 455 billion gigabytes. Four grams could in theory hold every scrap of data the entire world produces in a year. Couple this with a theoretical lifespan of 3.5 billion years and you have a revolution in data storage, with wide-ranging implications for the amount of information we could record and store.
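To put the density figure in more familiar units, here is a rough back-of-the-envelope sketch in Python. It takes the 455 billion gigabytes per gram quoted above at face value and assumes decimal gigabytes (10^9 bytes). The constant and function names are made up for illustration.

```python
# Back-of-the-envelope capacity arithmetic from the figures quoted in the article.
# Assumption: "gigabyte" means a decimal gigabyte (10**9 bytes).
BYTES_PER_GB = 10**9
GB_PER_GRAM = 455 * 10**9      # 455 billion gigabytes per gram, as quoted
BYTES_PER_ZETTABYTE = 10**21


def capacity_bytes(grams):
    """Theoretical storage capacity, in bytes, of a given mass of DNA."""
    return grams * GB_PER_GRAM * BYTES_PER_GB


def capacity_zettabytes(grams):
    """The same capacity expressed in zettabytes (10**21 bytes)."""
    return capacity_bytes(grams) / BYTES_PER_ZETTABYTE


if __name__ == "__main__":
    print(f"1 gram : {capacity_zettabytes(1):.3f} ZB")
    print(f"4 grams: {capacity_zettabytes(4):.2f} ZB")
```

On these assumptions, the four grams mentioned above work out to a little under two zettabytes.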
Don’t expect your library to transform from paperbacks to vials of DNA anytime soon though. “It took a decade to work out the next generation of reading and writing of DNA – I’ve been working on reading for 38 years, and writing since the 90s,” Church tells TIME.
The actual work of encoding the book into DNA and then decoding it and copying it only took a couple of weeks. “I did it with my own two hands!” says Dr. Church, “which is very rare to have that kind of time to spend doing something like this.” Church and Kosuri took a computer file of Regenesis and converted it into binary code: strings of ones and zeroes. They then translated that code into the basic building blocks of DNA. “The 1s stand for adenine (A) or cytosine (C) and the zero for guanine (G) and thymine (T),” says Kosuri. With a computer program, the translation was simple.
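For the curious, the translation described above can be sketched in a few lines of Python. This is a toy illustration, not the researchers’ actual program. It uses the mapping as quoted (a 1 becomes A or C, a 0 becomes G or T). The rule for picking between the two candidate letters, which simply avoids repeating the previous letter, is an assumption made here for illustration, as are the function names and the use of UTF-8 to turn text into bits.

```python
# Toy sketch of the bit-to-base translation described in the article.
# Mapping taken from the quote: 1 -> A or C, 0 -> G or T.
ONE_BASES = "AC"
ZERO_BASES = "GT"


def text_to_bits(text):
    """Turn text into a string of 1s and 0s (UTF-8 is an assumption here)."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_dna(bits):
    """Write each bit as one DNA letter.

    Hypothetical tie-break: of the two letters allowed for a bit, use the
    first unless it would repeat the previous letter, then use the second.
    """
    bases = []
    for bit in bits:
        options = ONE_BASES if bit == "1" else ZERO_BASES
        if bases and bases[-1] == options[0]:
            bases.append(options[1])
        else:
            bases.append(options[0])
    return "".join(bases)


def dna_to_bits(dna):
    """Read the letters back: A or C is a 1, G or T is a 0."""
    return "".join("1" if base in ONE_BASES else "0" for base in dna)


def bits_to_text(bits):
    """Regroup the bits into bytes and decode them back into text."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


if __name__ == "__main__":
    dna = bits_to_dna(text_to_bits("Regenesis"))
    print(dna)
    print(bits_to_text(dna_to_bits(dna)))
```

Because each bit has two possible letters, the writer has some freedom in choosing the sequence, while the reader only needs to know which pair a letter belongs to. Which pair stands for 1 and which for 0 is a convention. Swapping them would work just as well, as long as reading and writing agree.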
While the future implications and applications are not yet clear, the DNA storage industry is moving at an incredible speed. “Classical electronic technology is moving forward something like 1.5-fold per year,” says Dr. Church, “whereas reading and writing DNA is improving roughly tenfold per year. We’ve already had a million-fold improvement in the past few years, which is shocking.”
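To get a feel for the gap between those two rates, a short Python sketch can work out how long each would take to compound into a million-fold gain. It assumes a steady rate of improvement every year, which is a simplification of Church’s rough figures. The function name is hypothetical.

```python
import math


def years_to_reach(target_factor, yearly_factor):
    """Years of steady compounding needed to reach a total improvement.

    Solves yearly_factor ** years == target_factor for years.
    """
    return math.log(target_factor) / math.log(yearly_factor)


if __name__ == "__main__":
    # Rates as quoted: roughly tenfold per year for DNA, 1.5-fold for electronics.
    print(f"DNA:         {years_to_reach(1e6, 10):.1f} years")
    print(f"Electronics: {years_to_reach(1e6, 1.5):.1f} years")
```

At tenfold per year, a million-fold improvement takes about six years. At 1.5-fold per year, it would take roughly 34.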
Given that the genomics field has attracted its fair share of criticism — witness, for example, the firestorm that greeted biologist Craig Venter and his colleagues when they created the first synthetic cell in 2010 — there are ethical questions to address. Dr. Church and co-author Ed Regis have decided not to include a DNA insert of the book with the actual paper copy when it comes out in October because of this sensitivity.
“We’re always trying to think proactively about the ethical, social and economic implications in this line of work,” says Dr. Church. He explains that the risks are relatively small, but both he and Dr. Kosuri mention that if it is possible to encode a book using DNA, it is also theoretically possible to encode a virus, though this would be a far-fetched scenario.
“The chances that something bad will come out of this are so small,” says Dr. Kosuri. “If someone really nefarious wanted to make a virus they would have to use a much larger chunk of DNA to encode function.”
Why make 70 billion copies of the book? “Oh that was a bit of fun,” says Dr. Church. “We calculated the total copies of the top 200 books of all time, including A Tale of Two Cities and the Bible and so on, and they add up to about 20 billion. We figured we needed to go well beyond that.”