22 November 2013

A short encyclopedia of "-omics"

"Omics": the era of big biology

Victoria Korzhova, "Biomolecule"

Thanks to the sensational Human Genome Project, there are more and more words with the suffix "-ome". The appearance of many new "-omes" after the genome and the proteome reflects an important trend in modern biology: more and more large-scale studies are being conducted whose result is not a description of individual molecules but large, complexly organized data sets, as described in this article.

In 1920, the botanist Hans Winkler could not have guessed what fate awaited the term "genome", which he proposed for the set of chromosomes of an organism. Some "-omes" already existed at that time: for example, the biome (a collection of living organisms) and the rhizome (the root system of a plant). All of them are based on the Greek suffix "-ome", meaning roughly "having the nature of". But it was the popularization of the word "genome" by the Human Genome Project [1] that made "-omes" and "-omics" fashionable. Alexa McCray, a linguist and medical informatics specialist at Harvard, comments: "By using the suffix "-ome", you show that you belong to a completely new, fascinating field of science" [2].

In recent years, scientists have begun to realize the marketing potential of this inspiring suffix. Jonathan Eisen, a microbiologist at the University of California, Davis, notes: "People try to convince others that their field of research is an independent branch of science, and that it deserves special funding" [2]. And although the names of some -omics raise an eyebrow (ciliomics, for example, is the study of various outgrowths on the surface of cells), researchers are convinced that some of them really do deserve to be separate fields of research. Some -omics, such as genomics, transcriptomics, proteomics and metabolomics, have already firmly taken their place in modern biology, while the names of others still sound unusual; but they all reflect the movement towards a new "big", integrative biology. Some of these new disciplines are discussed in this article.

CLASSIC "OMES"

Genome

In our "post-genomic" era, it is not easy to find someone who has not heard of the Human Genome Project [1].

To describe it briefly: 13 years (1990–2003), three billion nucleotides, three billion dollars. Not all of the scientists' expectations have been met (the DNA sequence has been read, but it is not always clear what it encodes), yet we owe much of the last decade's technological breakthrough in genetic research to the work on the human genome. After it, the genomes of other mammals began to be actively sequenced: the mouse in 2002, the rat in 2004, the chimpanzee in 2005, the macaque in 2007 [10], and so on (at present the genome sequences of almost 30 mammals are known, and this number will only grow). In addition, the decoding of the human genome has led to specialized genomic projects whose purpose is to describe the work of particular groups of genes associated with individual organ systems or with the development of a disease.

Transcriptome

A transcriptome is the collection of all RNA molecules synthesized in a cell, organ or tissue.

Interestingly, although the transcriptome is a product of the expression of our genome, neither provides a complete description of the other. On the one hand, the genome contains a lot of so-called "junk" DNA, which does not encode anything (or at least seems not to). On the other hand, there are processes that alter RNA after transcription: for example, RNA editing, which, according to recent studies [11], is very widespread and affects more than 90% of all mRNAs. In addition, we must not forget that the transcriptome contains not only protein-coding mRNAs but also other types of RNA, ranging from tRNAs and rRNAs to various small regulatory RNAs [12].

The genome sequence is a more or less constant characteristic of an organism (although there are exceptions: for example, the sequences of some genes differ strikingly between the lymphocytes of a single person). The transcriptome, by contrast, can be characteristic of an organ, a tissue or a particular cell population, because different cell types perform different functions and express different genes; it can also depend on environmental conditions and change over time. That is why scientists have recently been studying more and more often the transcriptome of cells of a certain type (for example, embryonic stem cells) or of individual organs (for example, the transcriptome of the human brain [13]).

Proteome

Since different cells express different genes at different times, not only the set of RNAs but also the set of proteins differs throughout the body.

This consideration prompted scientists to study the human proteome: to create a complete list of the proteins present in different human cells and tissues at any given time. Scientists formed an international body, the Human Proteome Organization (HUPO), which leads the Human Proteome Project (HPP), launched in 2008 (Biomolecule has already written about this event [14]). One of the difficulties of this project is the incredible diversity of proteins in the human body, because one gene can give rise to several variants of a protein, which may later undergo additional chemical modifications. As a result, HPP split into two projects, C-HPP and B/D-HPP. In the first, different groups of scientists study the proteins encoded on a particular chromosome (chromosome 18 is studied by a group of Russian scientists at the Orekhovich Research Institute of Biomedical Chemistry in Moscow). In the second, groups of proteins are studied according to their biological role or their involvement in the development of certain diseases. The study of the human proteome is still at an early stage, in which scientific groups are looking for new approaches to protein analysis and selecting bioinformatic algorithms [15]; it is hoped, however, that the first successes of this project are not far off.

Metabolome

The word "metabolome" describes the collection of small molecules, or metabolites, that can be found in a cell, a tissue or the whole body.

Metabolites include molecules with a molecular weight of no more than 1 kDa (these include small peptides, such as some hormones, and other biologically important organic substances: antibiotics, lipids and other secondary metabolites). Currently, all the results of metabolome studies are collected in a single database, the Human Metabolome Database, which now contains data on more than 40 thousand different metabolites. A record called a MetaboCard has been created for each of these substances; it exhaustively describes the chemical properties of the metabolite and also lists the proteins or nucleic acids it can interact with and its significance in clinical practice (its connection with diseases or drugs).

Currently, metabolomics helps scientists both to investigate the physiology of the human body and to detect or treat various diseases. One broad application of metabolomic research is the search for biochemical markers of diseases, for example Parkinson's disease [16]. In such studies, scientists try to find substances whose changing concentration in the blood can help to diagnose a disease at an early stage and start treatment in time [17].

Incidentalome

The term "incidentalome" was first used by Isaac Kohane, who studies medical informatics at Children's Hospital in Boston.

Although the technologies that have made the personal genome a reality did not yet exist at the time, in his 2006 article [3] Kohane expressed concern that the increasing availability of genetic information would soon lead to a complex ethical problem in medicine.

The unusual name comes from a slang term used by doctors, "incidentaloma" (from the English incidental, meaning accidental): an asymptomatic tumor found while examining a patient for other complaints. Something similar happens when studying the human genome: unexpected information turns up that no one was looking for. Searching for the genetic causes of hearing problems in a child, for example, may reveal an increased risk of developing heart disease or cancer later in life. But is it worth informing the patient about this, and if so, when?

A study conducted in 2012 [4] showed the extent of the ethical problem. Sixteen geneticists were surveyed about a number of mutations involved in the development of 99 common genetic diseases. These mutations can be detected during whole-genome sequencing, whether or not the doctor was looking for them. For about a quarter of these diseases and the related mutations, all 16 specialists said they would inform their adult patients about the sequencing results. But only 10 would have done so for Huntington's disease [5], an incurable neurodegenerative disease, and there was even less agreement about some other complex diseases and about what to tell parents if mutations are found in their child.

The biggest problem with sequencing a personal genome is the large number of variations in the human genome whose role in human health is still unknown. One possible way to solve the incidentalome problem is to let patients choose which information about their genome they would like to know and which they would not.

"Unfortunately, I can't hire you, because we don't like the DNA sequences responsible for your character." In addition to the incidentalome problem, the ability to sequence a personal genome is fraught with other dangers. For example, how will your boss behave if he learns something about the features of your DNA?

Phenome

With the development of next-generation sequencing methods [6], "reading" the human genome is no longer such a difficult task.

What is missing is the phenome: an accurate description of the phenotype, that is, of all the physical and behavioral characteristics of a person. Researchers are most interested in the characteristics associated with disease: visible abnormalities, and when and why a diagnosis was made. Moreover, these descriptions should be in a computer-readable form so that phenotypic parameters can be linked to features of the genome.

As often happens in biology, research in the new field began with laboratory organisms. Phenome projects are already underway for mice, rats, yeast, zebrafish (Danio) and the plant Arabidopsis. The best approach in these studies is to switch off individual genes one by one and study the resulting changes in appearance, behavior and metabolism. This, of course, cannot be done in humans, but specialists hope to get the necessary information by carefully recording patients' medical histories, although many difficulties await them here.

Even for "Mendelian" diseases, which are caused by a mutation in a single gene, it is not always easy to find the causative gene. The genetic basis has been determined for fewer than half of the more than six thousand rare hereditary diseases. One of the problems in this area is finding a sufficient number of patients, since some diseases occur in one person in a million. "Perhaps we would have dealt with most of the "Mendelian" diseases if we had access to a sufficient number of well-described cases," says Michael Bamshad, a geneticist at the University of Washington in Seattle.

To do this, patient records from different countries and continents need to be processed. However, many research and medical centers have long had their own established systems of terms for describing and characterizing various abnormalities. This makes it difficult to combine sources: if one doctor describes a symptom as "stomach pain" and another describes the same symptom as "gastroenteritis", these patients cannot be combined into one group, explains Richard Cotton, a geneticist at the University of Melbourne in Australia.

In November 2012, Cotton took part in the meeting "Getting Ready for the Human Phenome Project" [7] in San Francisco (USA). The main task of the meeting was to make the exchange of phenotypic information between scientists easier and more convenient. Orphanet, a consortium for the study of rare diseases, is trying to get doctors and researchers to agree on one or two thousand standard terms. This will help to bring order to the often fragmented and confusing electronic medical records, so that computer programs can sort and process them automatically.
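The idea of agreeing on standard terms so that programs can sort records automatically can be illustrated with a small sketch. The synonym table below is invented for illustration (it reuses the article's "stomach pain" and "gastroenteritis" example and is not taken from Orphanet or any real clinical vocabulary), and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical synonym table: free-text wording -> one agreed standard term.
# Illustrative only; not a real clinical vocabulary.
STANDARD_TERMS = {
    "stomach pain": "abdominal pain",
    "gastroenteritis": "abdominal pain",
    "hearing loss": "hearing impairment",
    "deafness": "hearing impairment",
}


def normalize(description: str) -> str:
    """Map a free-text description to its standard term (unknown text is kept as-is)."""
    key = description.strip().lower()
    return STANDARD_TERMS.get(key, key)


def group_patients(records):
    """Group patient identifiers by the standard term of their recorded symptom."""
    groups = defaultdict(list)
    for patient_id, description in records:
        groups[normalize(description)].append(patient_id)
    return dict(groups)


# Invented records from two imaginary clinics that use different wording.
records = [
    ("patient-1", "Stomach pain"),
    ("patient-2", "gastroenteritis"),
    ("patient-3", "Deafness"),
]
groups = group_patients(records)
print(groups)
```

Once both descriptions map to the same term, the first two patients fall into one group and can be analyzed together, which is what a shared vocabulary is meant to achieve.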

Interactome

The central dogma of molecular biology leads us directly to the three main "-omes": the genome (DNA), the transcriptome (RNA) and the proteome (proteins).

But to understand how living organisms are built, it is not enough to describe all the components of living systems; you also need to figure out how they interact. Everything in living organisms, from the life and death of individual cells to the development of the embryo from the zygote and the work of neurons, depends on the interaction of molecules with each other. The term "interactome" comes from the English verb "to interact" and describes all possible interactions of molecules with each other. In terms of complexity, it can be called the "king" of the -omes: considering only pairwise interactions among the roughly 20 thousand known proteins, we already get about 200 million variants.
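The figure of about 200 million follows from counting unordered pairs of distinct proteins, n(n - 1)/2. A minimal check, assuming the round number of 20 thousand proteins quoted above and ignoring self-interactions (the function name is ours):

```python
from math import comb


def pairwise_interactions(n_proteins: int) -> int:
    """Number of unordered pairs of distinct proteins: n * (n - 1) / 2."""
    return comb(n_proteins, 2)


n = 20_000  # round number of known human proteins quoted in the text
print(pairwise_interactions(n))  # 199990000, i.e. about 200 million
```

Because the count grows with the square of the number of proteins, doubling the protein list roughly quadruples the number of candidate pairs.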

But some scientists are not intimidated by the scale of the task. Marc Vidal, a systems biology specialist at the Dana-Farber Cancer Institute in Boston, hopes to see a rough sketch of all the interactions encoded by the genome before he retires. "This is what we have been working on for the last 20 years, and we are already very close to our goal," he says [2].

Interactome of yeast membrane proteins. The proteins, shown as circles, are grouped by location (proteins of the endoplasmic reticulum, peroxisomes, plasma membrane, etc.). A line connecting two circles indicates an interacting pair of proteins. Image from the website of Dr. Wodak's laboratory.

To date, Vidal's team and several other laboratories have described about 10–15% of the protein-protein interactions in the human body.

To do this, they used special genetically modified cells that give a signal if the pair of proteins under study interacts. Other scientists extract proteins from artificially disrupted cells and analyze the protein pairs that can be detected. Still others study the literature and develop methods for predicting possible interactions by computer, based on the spatial structure of proteins. Importantly, scientists are now beginning to understand how to separate the wheat from the chaff and isolate genuine interactions while discarding false results. One important criterion for this selection is whether the same result is obtained with different techniques. And even with the interactome unfinished, scientists are already making increasing use of the data obtained so far.

Haiyuan Yu, a systems biologist at Cornell University, and his colleagues tested about 18 million potential protein pairs and searched all available databases to identify 20,614 interactions between 7,410 human proteins. For about a fifth of these interactions, the researchers can name the interacting regions (domains) of the proteins. They found that disease-related mutations are most often located precisely at the sites where the damaged protein contacts other proteins. For example, a blood disease, Wiskott-Aldrich syndrome, occurs when there is a mutation in the WASP protein, but only if the mutation falls within the site through which WASP interacts with the VASP protein. As Yu notes, genetic differences that explain nothing when we study gene sequences acquire a special meaning when we study protein interactions.

Vidal believes that all information about protein interactions can be divided into two levels, which together will make up a complete interactome. The base level is a description of all pairwise interactions; the level above it is a descriptive characterization of these contacts (how long each lasts, under what conditions it occurs, and which parts of the proteins interact).

In the not too distant future, Vidal believes, a doctor making a diagnosis will use not only the sequence of the patient's genome but also a careful analysis of all the consequences of changes in the patient's interactome, not to mention the impact of these changes on the phenome. The genome, after all, is static, and it is the interactome that changes under the influence of external factors.

Toxome

Thomas Hartung wants to know everything about how small molecules can harm a person.

To do this, he created the Human Toxome Project, which has now existed for more than six years. The suffix "-ome", according to Hartung, is intended to emphasize the large-scale nature of the project, whose purpose is to describe all the cellular processes associated with toxicity.

Testing the toxicity of a substance on laboratory animals costs researchers and government organizations millions of dollars, and even so, laboratory tests can predict the reaction of the human body incorrectly. Every sixth drug runs into toxic effects at the stage of clinical trials in humans. Hartung believes that the toxome could help to develop convenient and cheaper laboratory tests that are based on human cells and can replace animal studies. Understanding which cellular pathways are affected by the substance under study can help scientists to develop less toxic analogues.

To begin with, Hartung plans to expose cells to various toxic substances and monitor the changes in their metabolome and transcriptome. He hopes to discover where in metabolic or signaling cascades the disruptions occur that lead to changes in the work of hormones, poisoning of liver cells, changes in heart rate or other abnormalities in the human body. According to Hartung, the total number of such intracellular pathways will be only a couple of hundred, a number small enough to build toxicological tests around. The project is still at an early stage: scientists are trying different experimental approaches and looking for those that give the same result in different laboratories.

Of course, we must not forget that even if a substance looks safe when tested in cell culture, it can behave differently once it enters the body; for example, it can be turned into a toxin by liver enzymes. But even allowing for possible errors, the development of new toxicological tests based on the human toxome should greatly simplify the testing of drugs and food substances and save not only public money but also the lives of laboratory animals.

Integrome

The way to unravel the most complex mysteries of biology lies, according to Eugene Kolker, not in the creation of new -omes and -omics but in the unification, or integration, of those that already exist.

Welcome the integrome: information from all the -omes in one pot, which, thanks to combined analysis, can reveal a lot of new and interesting things.

Think of Google Maps: several maps showing the locations of streets, gas stations and restaurants separately would be much less useful to us than knowing that a gas station is located on a particular street next to a restaurant. But most modern -omics stop precisely at the stage of creating lists of genes, proteins or RNAs. This approach leaves out the study of interactions and misses a lot: for example, that changes in two seemingly unrelated proteins can lead to the same result because their metabolic pathways partially overlap.
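The map analogy can be made concrete with a toy sketch: separate per-gene lists from different -omes are merged into one record per gene, so that the layers can be read side by side. The gene names and values below are invented, and the function is only an illustration of the idea, not the method of any project mentioned here.

```python
def integrate(**layers):
    """Merge per-gene tables from several omics layers into one record per gene."""
    merged = {}
    for layer_name, table in layers.items():
        for gene, value in table.items():
            merged.setdefault(gene, {})[layer_name] = value
    return merged


# Invented toy data: each "-ome" on its own is just a list.
transcriptome = {"GENE_A": 12.5, "GENE_B": 0.3}  # hypothetical RNA levels
proteome = {"GENE_A": 840, "GENE_C": 15}         # hypothetical protein counts

profile = integrate(transcriptome=transcriptome, proteome=proteome)
print(profile["GENE_A"])  # both layers for the same gene, side by side
```

A gene missing from one layer simply has fewer entries in its record, which already makes gaps between the individual lists visible.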

Beautiful trees like this are obtained by visualizing the results of the Ideker laboratory's method for building an integrome by computer [8]. Circles are groups of genes formed according to a certain attribute; the size of a circle reflects the size of the group, and the color saturation reflects the degree of proximity of the gene sequences within the group.

In the laboratory of Trey Ideker, an approach has been developed that could make the creation of an integrome a reality in the near future.

Scientists have created a method for the automated analysis and combination of individual -omics: a computer program studies several databases and looks for general principles by which the function of a gene can be determined, and then uses this knowledge to classify genes that have not yet been studied (Fig. 3) [8]. Thus, instead of the traditional approach, in which a theoretically developed system of concepts is used to explain the data, a new descriptive system is built from the empirical data. The system was first tested on databases for Saccharomyces yeast, and the results filled the researchers with enthusiasm. Such computer algorithms, of course, will not be able to replace human curators, but they will be a good addition and will make their work easier.

In 2012, Michael Snyder, a geneticist at Stanford University, published his personal integrome (although he calls it an "integrative personal omics profile", and some other scientists, with a considerable degree of irony, call it a "narcissome"), combining data on his genome, transcriptome, proteome and metabolome. The DNA sequence of Snyder's genome revealed an increased risk of developing diabetes, and during the work on the project doctors did indeed find that his blood sugar level was elevated [9]. Interestingly, Snyder's integrome also revealed some other biochemical abnormalities that had not previously been associated with the development of this disease.

Integrative Biology

In recent years, large-scale research in biology has been rapidly gaining momentum: various -omes are being created not only in molecular biology but also in other areas, for example the connectome, which describes the connections between all the neurons in the brains of humans and animals, or the microbiome, which describes the communities of microorganisms living in the human body.

At the same time, a new trend is emerging – to combine existing databases to clarify the relationship between different systems in a living organism, to create projects in which scientists from different fields of biology complement each other's experiments to create a new integrative biology.

NEW GENOMICS

Genome and consciousness – cognitive genomics

Cognitive genomics specialists are interested in genes and non-coding sequences that are necessary for the development and functioning of the brain.

By comparing the genomes of different animals, they try to determine which genes underlie the characteristic features of the human nervous system, human behavior and intellectual abilities. These approaches are also used to identify genetic factors in the development of diseases of the nervous system (Down syndrome, Alzheimer's disease, etc.). For example, the Laboratory of Cognitive Genomics at the Beijing Institute of Genomics studies both human intelligence and a cognitive disorder, prosopagnosia, while the Stanley Institute for Cognitive Genomics (USA) studies various cognitive diseases, such as schizophrenia, bipolar disorder and autism.

Researchers from the University of North Carolina Psychiatric Genomics Consortium have recently obtained interesting data in this area. They carried out a genome-wide screen of single-nucleotide polymorphisms (SNPs) in patients with five different diseases: autism, attention deficit hyperactivity disorder, bipolar disorder, depression and schizophrenia [18]. They identified genome regions in which changes correlate with the development of all five diseases. It turned out that these changes affect calcium channel genes.

Calcium plays an important role in all cells of the human body, but it is especially important for nerve cells, because it is necessary for the transmission of chemical signals between neurons. On the one hand, therefore, disrupted calcium transport in various diseases of the nervous system does not surprise scientists ("calcium" hypotheses of pathogenesis have long existed for a number of neurodegenerative diseases, including Alzheimer's and Huntington's diseases [5, 19]). On the other hand, this study has shown that a much larger number of mental and neurological abnormalities may be based on the same mechanisms. In addition, according to the authors of the article, their results argue for using genetic data in diagnostic psychiatry alongside descriptions of external mental symptoms.

Genome and drugs – pharmacogenomics

Pharmacogenomics, which has been actively developing in recent years, appeared as a result of the interaction of experimental pharmacogenetics with young genomics.

The result is a field of genomics that investigates how the totality of a person's hereditary information can affect the action of the medications that person takes. Pharmacogenomics is of great importance to pharmaceutical companies that use rational drug development (drug design [20]) or are developing methods of personalized medicine, which should become widely available in the near future as the price of sequencing a personal genome falls [21, 22]. At the same time, pharmacogenomics is becoming a basis for mathematical models and computer modeling in the pharmaceutical industry.

"Mathematical modeling methods are most often used to make decisions on the approval of drugs and their labeling, with special attention paid to dosages. In addition, using these methods, FDA experts evaluate the success of clinical trials and identify the most effective doses for various groups of patients," notes Donald Stanski, Vice President of Novartis and head of its Department of Mathematical Modeling, in an interview. And while patients in clinical trials are now most often divided into groups according to a few external indicators (age, gender, weight, etc.), the selection of groups based on pharmacogenomic data is not far off.

Literature

  1. Biomolecule: "Human genome: how it was and how it will be";
  2. Baker M. (2013). Big biology: The ’omes puzzle. Nature 494, 416–419;
  3. Kohane I.S., Masys D.R., Altman R.B. (2006). The incidentalome: a threat to genomic medicine. J. Am. Med. Assoc. 296, 212–215;
  4. Green R.C. et al. (2012). Exploring concordance and discordance for return of incidental findings from clinical sequencing. Genet. Med. 14, 405–410;
  5. Biomolecule: "How to save the Thirteenth? (Prospects for the treatment of Huntington's disease)";
  6. Biomolecule: "454-sequencing (high-performance DNA pyrosequencing)";
  7. Oetting W.S. et al. (2013). Getting ready for the human phenome project: the 2012 forum of the human variome project. Hum. Mutat. 34, 661–666;
  8. Dutkowski J. et al. (2013). A gene ontology inferred from molecular networks. Nat. Biotechnol. 31, 38–45;
  9. Biomolecule: "Reproaches in narcissomics";
  10. Biomolecule: "The time of monkey research: the rhesus macaque genome has been decoded";
  11. Hayden E.C. (2011). Cells may stray from ‘central dogma’. Nature News;
  12. Biomolecule: "About all RNAs in the world, large and small";
  13. Biomolecule: "Allen Brain Atlas: Brain Transcriptome";
  14. Biomolecule: "Billion for proteomics";
  15. Marko-Varga G. et al. (2013). A First Step Toward Completion of a Genome-Wide Characterization of the Human Proteome. J. Proteome Res. 12, 1–5;
  16. Sharma S. et al. (2013). Biomarkers in Parkinson’s disease (recent update). Neurochem. Int. 63, 201–229;
  17. Biomolecule: "How to recognize cancer using biomarkers?";
  18. Cross-Disorder Group of the Psychiatric Genomics Consortium. (2013). Identification of risk loci with shared effects on five major psychiatric disorders: a genome-wide analysis. Lancet 381, 1371–1379;
  19. Bezprozvanny I.B. (2010). Calcium signaling and neurodegeneration. Acta Naturae 2010 #2, 72–82;
  20. Biomolecule: "Drug design: how new medicines are created in the modern world";
  21. Biomolecule: "Over a thousand: the third phase of human genomics";
  22. Biomolecule: "Sequencing of single cells (version – Metazoa)".

Portal "Eternal youth" http://vechnayamolodost.ru 22.11.2013
