Using DNA as a high-den
The world is transmitting more data today than ever before, and that volume is likely to increase almost six times between 2020 and 2025.
The International Data Corporation predicts that global data storage demand will increase from 33 zettabytes in 2020 to 175 zettabytes by 2025, a figure far beyond the storage capacity of currently available methods. This growth, coupled with the costs and energy requirements of maintaining and transferring data, calls for novel solutions.
The prospect of using DNA as a method for storing data has
been considered since the 1950s. "DNA is the molecule that stores all the
instructions needed to create us as human beings. That's a lot of information, and
we have a copy of all that information in every single cell in our body,"
said Dr. Keith EJ Tyo, associate professor of chemical and biological
engineering at the Center for Synthetic Biology, Northwestern University.
Cells vs computers
In biological cells, DNA comprises four nucleic acid bases,
adenine (A), thymine (T), guanine (G) and cytosine (C), which are strung
together in different combinations to form genes. Genes are transcribed and translated into proteins, the functional "workhorses" of our cells.
In computers, information is stored as binary digits, or "bits",
which are 1s and 0s. When these bits are read in a particular order, they can
be used as code that tells a program what to do. The overall goal of
DNA-based data storage is to encode and decode binary data to and from
synthesized strands of DNA.
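
To make the idea of encoding and decoding concrete, here is a minimal Python sketch of one textbook-style mapping, in which every pair of bits becomes one base. The mapping table and the function names `encode` and `decode` are assumptions chosen for illustration; this is not the scheme used in the study described below, which records information through base composition rather than exact sequence.

```python
# Illustrative mapping only: two bits per base.
# The table below is an assumption for this sketch, not a standard.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each byte becomes four bases under this mapping, so a real system would also need error correction and constraints on the sequences it writes, which the sketch leaves out.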
While the field holds huge potential, current drawbacks
associated with ex-vivo synthesis of DNA have limited its large-scale
application. "State-of-the-art DNA chemical synthesis is expensive
relative to other storage technologies, such as solid-state hard drives,"
Tyo explained.
Tyo and colleagues at Northwestern have developed a novel in vitro method for recording information to DNA that relies on an enzymatic
system. The method, Time-sensitive Untemplated Recording using TdT for Local
Environmental Signals, or TURTLES, is published in the Journal of the American Chemical Society.
"We have been working for some time on engineering DNA
as a storage medium. For a long time, we studied the most common DNA
polymerases that replicate DNA. We realized that this type of replicative
polymerase can stall for a relatively long time, and that would not be
acceptable for data storage. Instead, we started looking at non-replicative DNA
polymerases and found one with exactly the right properties," he said.
Recording the environment using DNA
The DNA polymerase, called terminal deoxynucleotidyl
transferase or TdT, adds nucleotide bases to the 3'- end of single-stranded
DNA. Its selectivity for particular bases varies with the physiological
conditions and environment of a cell. Tyo explained, "Additional links are
added sequentially to one end [of the DNA strand]. When a chemical – for
example, cobalt – is present, the DNA that is added tends to have more As and
less Gs. When cobalt is not present, the opposite is true. Therefore, we record
a 1 when we allow the links to be added when cobalt is present, and a 0 when
cobalt is absent." When the DNA is later read to decode the information,
if a region has a lot of A bases present, it is recorded as a 1. If a lot of G
bases are present, it is recorded as a 0.
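
The decoding rule above can be sketched in a few lines of Python. This is a simplified illustration, not the analysis used in the paper: it assumes the strand can be cut into regions of a known, fixed length, whereas an enzyme adds bases at a variable rate and real reads are noisy. The function name `decode_regions` and the `region_length` parameter are hypothetical.

```python
def decode_regions(strand: str, region_length: int) -> list[int]:
    """Read each fixed-length region as 1 if A outnumbers G, else 0.

    Assumption for this sketch: region boundaries are known in advance.
    A tie between A and G counts is read as 0.
    """
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    bits = []
    for start in range(0, len(strand), region_length):
        region = strand[start:start + region_length]
        a_count = region.count("A")
        g_count = region.count("G")
        bits.append(1 if a_count > g_count else 0)
    return bits


# Three regions of eight bases: A-rich, G-rich, A-rich.
print(decode_regions("AAATAAGA" + "GGCGGTGA" + "AATAACAA", 8))  # [1, 0, 1]
```

Because the signal is a statistical bias rather than an exact sequence, longer regions (or many strands read together) would make each bit more reliable, at the cost of a slower write.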
In this way, as the environment changes, the composition of
the DNA strand being synthesized also changes. The average rate of nucleotide
incorporation can also be used as a "time stamp" to indicate exactly
when the environmental change occurred. "We engineer TdT to allosterically
turn off in the presence of a physiologically relevant concentration of
calcium. We use this engineered TdT in concert with a reference TdT to develop
a two-polymerase system capable of recording a single-step change in the Ca2+
signal to within one minute over a 60-minute period," the authors write in
the paper.
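
The "time stamp" idea reduces to a simple relationship: if bases are added at a roughly constant average rate, the position along the strand where the composition shifts, divided by that rate, estimates when the environment changed. The Python sketch below illustrates this for a single idealized strand. The function names, the split-point heuristic and the example rate of 2 bases per minute are all assumptions made for illustration; the published analysis is not described in this article.

```python
def find_change_position(strand: str, base: str = "A") -> int:
    """Return the split index where the frequency of `base` differs most
    between the left and right parts of the strand (0 if too short)."""
    total = strand.count(base)
    best_index, best_gap = 0, -1.0
    seen = 0
    for i in range(1, len(strand)):
        seen += strand[i - 1] == base
        left = seen / i
        right = (total - seen) / (len(strand) - i)
        gap = abs(left - right)
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index


def estimate_change_time(change_position: int, bases_per_minute: float) -> float:
    """Convert a position on the strand into elapsed minutes."""
    if bases_per_minute <= 0:
        raise ValueError("bases_per_minute must be positive")
    return change_position / bases_per_minute


strand = "G" * 10 + "A" * 10          # idealized: signal appears after 10 bases
position = find_change_position(strand)
print(position)                              # 10
print(estimate_change_time(position, 2.0))   # 5.0
```

On real, noisy strands a single read would give only a rough estimate, so an approach like this would most likely need to be averaged over many strands.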
When asked how the speed of the method compares to
chemically synthesizing DNA, Tyo said, "Chemical DNA synthesis works by
tethering DNA to a surface and requires adding chemicals sequentially and then
washing them away before adding the next chemical. This washing step creates
slowness. Our approach does not require washing steps and instead all the
reagents for DNA synthesis stay in the mixture and the properties of the DNA
polymerase are modulated reversibly."
A call for innovation and collaboration
The study is a proof-of-principle demonstration in which the
team was able to record up to 3/8 of a byte (3 bits) of information in one hour. There's
potential to scale up here, Tyo said: "A digital picture is millions of
bytes and takes a fraction of a second to read and write to your hard drive.
Parallelization to millions of strands of DNA will allow significantly more and
faster data storage, but we are going to address technical hurdles to increase
the number of bytes and shorten the record time of one DNA chain."
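
A back-of-the-envelope calculation shows what that parallelization would involve. The Python sketch below takes the reported figure of 3/8 of a byte (3 bits) per strand per hour and asks how many strands would have to record in parallel to write a given payload. It assumes perfect parallelization with no error correction or addressing overhead, and the 1,000,000-byte picture is a hypothetical example size.

```python
import math

# From the article: up to 3/8 of a byte, i.e. 3 bits, per strand per hour.
BITS_PER_STRAND_PER_HOUR = 3


def strands_needed(payload_bytes: int, hours: float = 1.0) -> int:
    """Strands required to record `payload_bytes` within `hours`,
    assuming every strand records independently at the reported rate."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    total_bits = payload_bytes * 8
    return math.ceil(total_bits / (BITS_PER_STRAND_PER_HOUR * hours))


# Hypothetical 1,000,000-byte picture written in one hour.
print(strands_needed(1_000_000))  # 2666667
```

Under these idealized assumptions, a one-megabyte image needs roughly 2.7 million strands recording for an hour, which is consistent with Tyo's reference to "millions of strands".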
“This is a really exciting proof of concept for methods that
could one day let us study the interactions between millions of cells
simultaneously. I don't think there's any previously reported direct enzyme
modulation recording system,” said Namita Bhan, co-first author and a former
postdoctoral researcher in the Tyo lab.
Tyo and colleagues hope that other research teams across the
globe will contribute to the technology, in order for it to become viable as a
commercialized concept. "Who knows, maybe if they come up with something
really good, they can send us a letter explaining what they've done on a strand
of DNA!" he concluded.
Keith EJ Tyo was speaking to Molly Campbell, Science Writer for Technology Networks.