26 April 2011

Human genome: ENCODE will help you understand what you read

ENCODE: online encyclopedia of the human genome
Dmitry Tselikov, Compulenta 

Online publication continues of an extensive database cataloguing the functional elements of the human genome: genes, RNA transcripts, and so on.

The ENCODE project (Encyclopedia Of DNA Elements) is the first attempt at a comprehensive interpretation of the human genome, as well as a guide to using the huge amount of data it contains.

One of the project's main participants, Ross Hardison of Pennsylvania State University (USA), notes that ENCODE follows on the heels of the 13-year Human Genome Project, which aimed to identify all the genes in human DNA (there are 20-25 thousand of them). Like that project, ENCODE is built on interdisciplinary collaboration and open data exchange.

There are about 3 billion base pairs in the human genome; cataloguing and interpreting this information is a truly monumental task. "We're not just looking for genes that provide information for cells and proteins," says Mr. Hardison. "We also want to know what determines the production of proteins in certain cells at the appropriate time. The search for DNA elements that control regulated gene expression is one of ENCODE's main tasks. Decoding the human genome without interpretation is just a description of a cipher without a key, just a huge pile of letters."

In particular, ENCODE provides information about where proteins bind to DNA and where sections of DNA are modified by additional chemical markers. These proteins and chemical modifications are the key to understanding how different cells of the human body interpret the language of DNA.



For example, scientists know that DNA variants located upstream of the MYC gene are associated with several types of cancer, but until recently the mechanism behind this association remained a mystery. The ENCODE project showed that these variants can alter the binding of certain proteins, which increases expression of the MYC gene and leads to the development of cancer. Thousands of other DNA variants have been studied in a similar way.

The project staff use about twenty different assays and have 108 cell lines at their disposal. John Stamatoyannopoulos of the University of Washington (USA) notes that many of the molecular biology procedures for measuring biochemical activity that are now of fundamental importance to biology were created within ENCODE. The same is true of the computational tools for processing and interpreting large-scale functional genomic data.

Ross Hardison points out that the part of the human genome that encodes proteins makes up only 1.1%, yet even this is an abyss of information. The situation is complicated by the fact that most of the mechanisms of gene expression and regulation lie outside the coding regions of DNA, and the set of tools for studying the genome is very limited. The most common is cross-species comparison: for example, comparing a human with a chimpanzee. There are very few differences between the proteins and other DNA products of these species, but gene expression, which at a basic level determines traits such as eye color, height, and susceptibility to certain diseases, varies considerably. This is where ENCODE's help is needed.

The project participants described it in the journal PLoS Biology: A User's Guide to the Encyclopedia of DNA Elements (ENCODE).

Based on materials from ScienceDaily: Decoding Human Genes Is the Goal of a New Open-Source Encyclopedia.

Portal "Eternal youth" http://vechnayamolodost.ru 26.04.2011

