24 April 2015

Will DNA computers replace hardware?

What will DNA computers of the future be capable of?

Alexander Makarov, Sergey Apresov
"Popular Mechanics" No. 9, 2014
Published on the website "Elements"

According to the forecast of the IDC agency, by 2020 the volume of data created and stored by mankind will reach 40,000 exabytes. That's 40 trillion gigabytes, or 5,200 gigabytes per capita. Less than 100 g of DNA would be enough to store all this information. This fact makes us sincerely believe in the prospect of the development of DNA computers.

Packing containers of equal weight, finding the shortest route between several destinations, decrypting encoded data – what can these tasks have in common? The answer is simple – they are too complex for modern computers.

A classic example is the ancient problem of the Konigsberg Bridges, in which it was asked how to walk across all seven bridges of the city without walking on any of them twice. The problem was first solved in 1736 by the great Leonhard Euler, who was born in Switzerland, but lived and worked in Russia, at the St. Petersburg Academy of Sciences, for almost half his life. Russian Russian was well known to Euler, and he published many of his works in Russian.

Euler's work laid the foundations of graph theory, which allows us to formalize such problems. The points of the route (banks) in it are called the vertices of the graph, the transitions between the vertices (bridges) are called edges. Each edge has a weight that characterizes the complexity of this transition (the distance to be traversed). Euler found out that it is impossible to walk on each bridge of Konigsberg only once. But this does not negate another, more important task: how to bypass all the bridges of the city in the shortest way (the traveling salesman's task)? The complexity of this and similar tasks lies in the fact that today there is no known algorithm for solving them, except for a complete search of options. At each subsequent vertex of the graph, the problem splits into many similar problems, and the number of possible solutions increases exponentially.

Silicon computers are based on a consistent principle of problem solving. One by one, the computer adds up possible routes, checks their compliance with the conditions of the task, calculates their length, compares the results and identifies the shortest path. To solve the problem with 30 bridges in the most straightforward way, called the lexical search method, it would take more time than the age of the universe.

Fortunately, there are algorithms that allow silicon computers to solve relatively complex combinatorial problems in an acceptable time. But there is another way – calculations with high parallelism, allowing you to analyze all possible solutions to the problem at the same time. This is exactly what future DNA computers will do.

BioautomatInterestingly, the creator of the first DNA computer, Leonard Adleman, is known primarily as an outstanding cryptographer.

In the name of the RSA encryption algorithm, without which global finance is unthinkable, the third letter denotes his last name (Rivest – Shamir – Adleman).

In 1994, Adleman repeated Euler's experience by proposing his own solution to the traveling salesman problem for a graph with seven vertices. In this he was helped not by the usual piece of silicon, but by several test tubes, each of which contained billions of billions of DNA molecules – biological nanocomputers. Back in 1953, Nobel laureates Francis Crick, James Watson and Maurice Wilkins, who deciphered the structure of DNA, compared this molecule with a Turing machine, a hypothetical forerunner of modern processors.


Deoxyribonucleic acid
The DNA double helix is nothing but a program code,
common to all living organisms on Earth.
In terms of simplicity and versatility, it stands
at the same stage with the machine
binary code.

Recall that DNA is two helices connected by pairs of nitrogenous bases. Spirals are giant macromolecules consisting of deoxyribose and phosphate groups. If we draw an analogy with a Turing machine, a spiral is a punched tape on which the program code is written.

The code consists of four letters denoting nitrogenous bases: A – adenine, T – thymine, C – cytosine and G – guanine. The nitrogenous bases of two adjacent spirals are attracted to each other, with adenine connecting only with thymine, and cytosine with guanine. Thanks to the nitrogenous bases, the two helices, which are a mirror image of each other, are connected into one DNA molecule.

DNA in a living organism is constantly replicated with the help of molecular machines – enzymes. Helicases split the double helix into two complementary helices. Polymerase, moving along a single spiral, builds its "mirror" copy. Such a copy is called the Watson-Crick complement.

An abstract computing machine, described in 1936 by Alan Turing, was an infinite tape divided into frames with incoming data, and a control device moving along the tape, reading data and changing its state according to some internal algorithm. What is not DNA and polymerase?


A schematic diagram of the work of DNA computing units.
In this form, the graph of the problem was presented in the work of Leonard Adleman.
As can be seen from the graph, for the work of a DNA computer to a scientist
I had to sequence 20 oligonucleotides,
7 of which represent vertices
and 13 – ribs.

LEGO Cube SoupTo solve the problem for a graph with seven vertices, Adleman used the simplest extensive algorithm: generate all possible routes; exclude all paths that do not pass through the specified start and end points; exclude all paths that pass more than seven vertices; exclude all paths that pass more than once through one vertex.

The DNA computer is somewhat like a LEGO constructor. Imagine that there are cubes denoting the vertices of the graph, and jumpers capable of connecting the vertices into chains of any length.

In Adleman's computer, each cube was a DNA fragment with 20 nitrogenous bases. Such short pieces of DNA are called oligonucleotides and are encoded by oligonucleotide synthesis. This process has been worked out and put on stream for a long time. There are even commercial firms that can synthesize and send oligonucleotides with the necessary codes in a matter of hours. And "20-bit" cubes are quite enough to denote the seven vertices of the graph.


Synthesis of oligonucleotides.
In modern laboratories, the process of creating short DNA fragments with a given code
fully automated. Small scientific groups that cannot afford
own synthesizer, order oligonucleotides from commercial firms.

A separate set of oligonucleotides represents the edges of the graph. The edge code is made up of the halves of the vertex codes that it connects, in a "mirror" representation. For example, for the vertices AASS and TTGG, you need to make a "jumper" CCTT and "flip" it to GGAA. Then, meeting in solution, these three oligonucleotides will combine into a single "segment of the path", forming a double helix.

If you mix billions of 20-letter "vertices" and billions of 20-letter "edges" in a test tube, they will combine into longer molecules in a variety of ways. There is a high probability that DNA encoding all possible pathways through the graph will be in the test tube at the same time. It is necessary to mix oligonucleotides under certain conditions, with the addition of ligase, an enzyme that glues breaks in DNA helices.

At the next stage, it is necessary to find the chains passing through the specified start and end points. The polymerase chain reaction (PCR) method, widely used in microbiology and medicine, will help in this. The necessary building materials for DNA (deoxyribose, phosphates, nitrogenous bases), polymerase, as well as "hook" molecules are added to the solution containing the initial molecules. The "hooks" in our case are the segments encoding the initial and final vertices.


Polymerase chain reaction
It is widely used in medical diagnostics, criminology,
to establish paternity, cloning
and genetic engineering

The solution is alternately heated and cooled. When heated, the original DNA splits into two helices, and when cooled, the helices we need are first recombined with "hooks", and then completed by polymerase to their exact copies. As a result, there are so many necessary DNA that unnecessary ones can be safely neglected.

At the third stage, it is necessary to isolate only those molecules whose length is exactly 140 bases (seven times 20). Gel electrophoresis is used for this. The DNA is placed in a gel-like solution and exposed to electricity. Molecules of different lengths move in an electric field at different speeds and line up "according to growth". Under a microscope, they can be distinguished even visually.


DNA Electrophoresis
It is used to sort DNA strands by length.
To select chains of the desired length with high accuracy, macromolecules are compared
with pre-sequenced chains whose size is precisely known

The fourth stage allows you to select chains containing all vertices. A tiny piece of metal can be attached to a piece of DNA encoding a certain vertex. This segment will easily connect to the molecule containing the corresponding vertex. With the help of a magnet, all such molecules can be separated from the rest. This operation is repeated for each vertex.

At the fifth stage, it is enough to apply the PCR method to what is left in the test tube and send the result for sequencing – the process of DNA decoding, which has become widespread in modern microbiology. If the desired path was not found during sequencing, then the problem has no solution.

Leonard Adleman came to this conclusion. After all, the problem of the Konigsberg bridges has no solution, which was proved almost three centuries ago by Leonard Euler.

And yet he's dancing!A number of researchers continued Adleman's research, in particular, proposing to encode the weights of the edges by repeatedly repeating a 20-digit code.

Thus, edges with a weight of "five" are five times longer than edges with a weight of "one", but they are connected to the same vertices.

Using this method, you can find the shortest path through a graph with a given number of vertices. It is necessary to first select the chains passing through all the vertices using magnetic sorting, and then apply gel electrophoresis to find the shortest path.

In 1994, Adleman spent seven working days on all the described operations, while the most ordinary computer would have solved the problem by brute force in a matter of seconds. The scientist himself commented on this fact as follows: "The miracle is not that the bear dances well, but that he dances at all." Adleman proved that DNA is quite capable of parallel computing, and the only question is how to make molecular computing convenient and technologically advanced.

In the end, if the number of vertices (cities, cargo units, cipher multipliers) is increased to at least fifty, then the silicon processor will save, while the DNA computer, perhaps, will only straighten its shoulders and breathe deeply.

Portal "Eternal youth" http://vechnayamolodost.ru24.04.2015

Found a typo? Select it and press ctrl + enter Print version