Molecular Biology
| I. | Introduction |
Molecular Biology, branch of biology that seeks to understand the molecular basis of life. In particular, it relates the structure of specific molecules of biological importance—such as proteins, enzymes, and the nucleic acids DNA and RNA—to their functional roles in cells and organisms.
| II. | The Structure of DNA |
The field of molecular biology effectively began with the discovery of the structure of DNA in 1953. Francis Crick and James Watson published the first description of the structure using research performed by Rosalind Franklin and Maurice Wilkins. This discovery was of importance not only because DNA is the molecule that transmits hereditary information from generation to generation, but also because its structure immediately provided an insight into how this transmission is achieved.
In cells DNA is a double-stranded helical molecule in which the two single-stranded chains are joined together by bonds between the bases adenine (A), guanine (G), cytosine (C), and thymine (T). In this structure, an A in one strand always pairs with a T in the other strand, and a G always pairs with a C.
When DNA replicates, the two single strands separate and the information is precisely reproduced. Each single strand becomes double-stranded by an A being inserted in the new strand to pair with a T in the old strand, a G being inserted to pair with a C, and so on. To ensure accuracy, cells have a proofreading capacity, so that each replication yields essentially identical copies of the DNA. In this way the hereditary information, which controls the properties of the cell and of the organism, is transmitted to daughter cells when a cell divides, and to the offspring when an organism reproduces.
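The pairing rule and the copying step described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names `complement` and `replicate` are hypothetical, strands are plain strings, and the opposite chemical direction of the two strands is ignored.

```python
# Pairing rules from the text: A pairs with T, and G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Return the partner strand, base by base, in the same left-to-right order."""
    return "".join(PAIRING[base] for base in strand)


def replicate(duplex):
    """Separate the two strands and build a new partner strand for each.

    `duplex` is a pair (strand, partner). Each daughter duplex keeps one
    old strand and gains one newly built strand.
    """
    old_first, old_second = duplex
    daughter_1 = (old_first, complement(old_first))
    daughter_2 = (complement(old_second), old_second)
    return daughter_1, daughter_2


parent = ("ATGC", complement("ATGC"))
daughter_1, daughter_2 = replicate(parent)
print(parent, daughter_1, daughter_2)
```

Both daughter duplexes come out identical to the parent, which is the sense in which the information is precisely reproduced.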
| III. | DNA Makes RNA, and RNA Makes Protein |
After DNA was modeled, scientists began to investigate how hereditary information actually influences the activities of the cell. It was discovered that the DNA is copied, in a process called transcription, into a single-stranded molecule of the related nucleic acid RNA. As in the replication of DNA, the information in the bases of the DNA is precisely copied by base pairing. The copying is carried out by a family of enzymes called DNA-dependent RNA polymerases. RNA contains the base uracil (U) in place of thymine (T); that is, whenever an A is found in the DNA strand being copied into RNA, a U is inserted.
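As a rough sketch of transcription, the Python snippet below copies a DNA strand into RNA by base pairing, inserting U opposite every A. The name `transcribe` is hypothetical, and the sketch ignores strand direction and the later processing of the RNA.

```python
# Base pairing used when a DNA strand is copied into RNA.
# U (uracil) is inserted opposite A; RNA contains no T.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(dna_strand):
    """Return the RNA built by pairing a base with each base of the DNA strand."""
    return "".join(DNA_TO_RNA[base] for base in dna_strand)


rna = transcribe("TACTGG")
print(rna)
```

Here the DNA strand TACTGG gives the RNA AUGACC: each A in the DNA is matched by a U in the RNA.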
After further processing, the so-called messenger RNA (mRNA) moves to subcellular particles called ribosomes, where it is translated into protein. This translation is governed by the genetic code in which each combination of three bases, or triplet, directs the addition of a particular amino acid onto the protein chain: ACC directs addition of threonine, CCC of proline, and so on. Hence the genetic information contained in the linear array of bases in the DNA directs the production of a linear array of amino acids within a protein.
Consequently, genetic changes in the bases in the DNA result in specific changes in the protein produced. For example, an A-to-C change in an ACC triplet would lead to the addition of a proline instead of a threonine. As specific proteins have particular biological effects, changes affecting the function of the protein will lead to an alteration in the appearance or function of an organism. In this way differences in the information in the DNA are observed as inherited differences between individuals, such as eye color, or a genetic disorder such as hemophilia. The conclusion that DNA makes RNA makes protein has been referred to as the “central dogma of molecular biology.”
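Translation and the effect of a single base change can be illustrated with a short Python sketch. The codon table below is deliberately partial (only a handful of the 64 triplets, written as mRNA), and the names `translate` and `substitute_base` and the example sequence are hypothetical.

```python
# Partial genetic code, written as mRNA triplets.
# None marks a stop signal that ends the protein chain.
CODON_TABLE = {
    "AUG": "methionine",
    "ACC": "threonine",
    "CCC": "proline",
    "AAA": "lysine",
    "GGC": "glycine",
    "UAA": None,
    "UAG": None,
    "UGA": None,
}


def translate(mrna):
    """Read the mRNA three bases at a time and list the amino acids added."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid is None:  # stop triplet reached
            break
        protein.append(amino_acid)
    return protein


def substitute_base(sequence, position, new_base):
    """Return a copy of the sequence with one base replaced."""
    return sequence[:position] + new_base + sequence[position + 1:]


normal = "AUGACCAAAUAA"
mutant = substitute_base(normal, 3, "C")  # changes the ACC triplet to CCC

print(translate(normal))
print(translate(mutant))
```

The normal sequence gives methionine, threonine, lysine; the A-to-C change replaces the threonine with proline and leaves the rest of the chain unchanged.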
| IV. | Gene Cloning and Hybridization |
Although the major advances described above were made in the 1950s and 1960s, the explosion in molecular biology began in the 1970s with the development of techniques for gene cloning. These techniques allowed the isolation of large amounts of a pure DNA fragment, free from all the other DNA sequences that together constitute the organism’s genome (all the genes in the chromosomes). This process enabled a DNA fragment—perhaps representing a particular gene—to be generated and characterized.
Gene cloning was coupled with the development of hybridization procedures in which a cloned DNA molecule is labeled, traditionally with a radioactive tag although some techniques use nonradioactive labels, and then made single-stranded. The resulting molecule will bind by base pairing to any DNA or RNA that contains a matching sequence, that is, a stretch of bases complementary to the labeled strand. As a result, it can be used as a probe to locate a particular DNA sequence within a DNA sample. The procedure used to accomplish this is known as Southern blotting after its inventor, Ed Southern.
In the related technique of Northern blotting, DNA from a gene is hybridized to the RNA prepared from different tissues, which allows the RNA corresponding to the gene to be detected and quantified in different tissues. These techniques have revealed much information on gene structure and expression.
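The idea of a probe locating its target by base pairing can be sketched in Python as a text search. A single probe strand pairs with a stretch of the sample whose bases are complementary and run in the opposite direction, so the sketch searches for the probe's reverse complement. The function names and sequences are hypothetical, and real hybridization also tolerates partial matches, which this sketch does not model.

```python
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Return the strand that would pair with `strand`, read in its own direction."""
    return "".join(PAIRING[base] for base in reversed(strand))


def find_binding_sites(probe, sample):
    """Return every start position in `sample` where the probe can base-pair."""
    target = reverse_complement(probe)
    sites = []
    start = sample.find(target)
    while start != -1:
        sites.append(start)
        start = sample.find(target, start + 1)
    return sites


sites = find_binding_sites("AAGC", "TTGCTTAAGCTTCC")
print(sites)
```

In this example the probe AAGC pairs with the stretch GCTT, which occurs at positions 2 and 8 of the sample strand.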
| V. | Split Genes |
The use of Southern blotting to analyze gene structure led to the biggest surprise obtained in molecular biology studies so far. This was the finding that in plants and animals the regions of the DNA that contain the information coding for the protein, known as exons, are interrupted by other DNA sequences, known as introns. Introns are transcribed into a single RNA molecule with the exons and are then removed by the process of RNA splicing. Splicing takes place in the cell nucleus, producing the mRNA molecule in which the exons are appropriately joined together without any intervening intron sequences. This mRNA is then transported from the nucleus to the cytoplasm and translated into protein on the ribosomes.
Although the significance of introns is unclear, their existence does allow different combinations of the exons present in the initial transcript to be joined together in different cell types. This process, known as alternative splicing, results in the production of different but related proteins from the same gene.
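Splicing and alternative splicing can be pictured with a small Python sketch in which exons are given as start and end positions within the initial transcript. The transcript, the exon positions, and the function name `splice` are all hypothetical and chosen only for illustration.

```python
def splice(transcript, exons):
    """Join the listed exon segments, dropping everything between them.

    `exons` is a list of (start, end) positions, with `end` exclusive.
    """
    return "".join(transcript[start:end] for start, end in exons)


# Initial transcript: three exons separated by two introns.
transcript = "AUGACC" + "GUAAAG" + "CCCAAA" + "GUCCAG" + "GGCUAA"
exon_1, exon_2, exon_3 = (0, 6), (12, 18), (24, 30)

# One cell type joins all three exons; another skips the middle exon.
full_mrna = splice(transcript, [exon_1, exon_2, exon_3])
short_mrna = splice(transcript, [exon_1, exon_3])

print(full_mrna)
print(short_mrna)
```

The two mRNAs share their first and last exons but differ in the middle, so they would be translated into different but related proteins.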
| VI. | Transcriptional Control |
Northern blotting can be used to investigate the presence of mRNA molecules derived from different genes in extracts of whole tissues. Such studies can be complemented by in situ hybridization, which can identify the mRNA within individual cells, allowing its distribution within a tissue to be characterized. Another technique for studying the expression of genes in cells and tissues is known as RT-PCR (reverse transcription polymerase chain reaction). This technique is more sensitive than Northern blotting, and measures the relative amount of mRNA in a cell. A fourth technique, called real-time PCR, can quantify the amount of a particular mRNA in cells.
These studies lead to the conclusion that, in the vast majority of cases, the mRNA encoding a particular protein is present only in tissues and cells that express the protein. Similarly, unspliced precursor RNAs still containing introns are not detectable in tissues that do not contain the spliced mRNA or the protein.
This indicates that, in most cases, the production of different proteins by different tissues is determined by controlling which genes are transcribed in each tissue, with the subsequent stages of intron removal and translation following automatically. This has been confirmed by studies in which the transcription rate of a specific gene was directly measured in different tissues where the corresponding protein is either present or absent.
Therefore, the production of different proteins, which is central to the functional differences between tissues, is controlled at the level of gene transcription. In turn, gene transcription is regulated by proteins known as transcription factors that bind to specific DNA sequences in the regulatory regions of the gene and stimulate transcription. Such transcription factors may be present in only one tissue, producing tissue-specific transcription of the genes that they activate. Alternatively, they may be present in all cells in an inactive form, being activated by specific signals that result in their post-translational modification, for example, by the addition of phosphate residues (phosphorylation). This modification, in turn, leads to the transcription of their target genes.
Some genes, called housekeeping for their essential nature, are expressed in every cell. Actin for the cytoskeleton is one example of a housekeeping gene. Other genes, called inducible, are highly restricted in cell type or time of synthesis. Inducible genes include rhodopsin, which is only expressed in the eye, and nitric oxide synthase-2, which is made in macrophages during inflammation.
| VII. | DNA Sequencing |
In addition to these studies on gene structure and expression, it is also possible to read the linear order of the bases in the DNA using a process known as DNA sequencing. The most common method for this was described by Fred Sanger in 1977 and used in the Human Genome Project to sequence the entire human genome. Through DNA sequencing a gene can be characterized in terms of a linear sequence of AGCT bases that, in turn, can be used to predict the amino acid sequence of the corresponding protein using the genetic code.
DNA sequencing is much easier than direct sequencing of the protein itself, and protein sequences are now normally determined indirectly by sequencing the corresponding gene. Similarly, by sequencing a gene involved in a specific disease from unaffected and affected individuals, it is possible to characterize the alteration in the corresponding protein that results in the disease. This process may involve, for example, a base change that leads to a single amino acid change in the protein or a loss of a DNA segment leading to the loss of a corresponding portion of the protein.
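Comparing a gene sequence from an unaffected individual with one from an affected individual can be sketched in Python as a position-by-position comparison. The function name `point_differences` and the two short sequences are hypothetical, and the sketch covers only single base changes; detecting the loss of a DNA segment would require aligning the sequences first.

```python
def point_differences(reference, variant):
    """List (position, reference_base, variant_base) wherever the bases differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length for this comparison")
    return [
        (position, ref_base, var_base)
        for position, (ref_base, var_base) in enumerate(zip(reference, variant))
        if ref_base != var_base
    ]


unaffected = "ATGACCGGC"
affected = "ATGCCCGGC"

differences = point_differences(unaffected, affected)
print(differences)
```

The single difference found here is the A-to-C change at position 3, which turns an ACC triplet (threonine) into CCC (proline).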
| VIII. | Protein Structure and Function |
In the late 1950s the British biochemist John Kendrew determined the structure of myoglobin, a muscle protein, using purified protein and X-ray crystallography. His colleague Max Perutz subsequently determined the more complex structure of hemoglobin, which consists of four myoglobin-like units. However, in the same way that protein sequence analysis is now performed using DNA-based procedures, structural analysis is now normally carried out on protein that has been produced by artificially expressing the equivalent gene, for example, in bacteria. Using this method, the protein can be obtained in large amounts.
Moreover, specific changes can be introduced into the DNA by site-directed mutagenesis, and the altered protein can then be expressed in bacteria. Studies on the activity and structure of the altered protein can then be used to relate changes in amino acid sequence to their effect on the structure and functional activity of the resulting protein. In this way, the goal of relating structure to function can be achieved.
| IX. | From Study to Applications |
Analysis of DNA has led to the development of many sensitive diagnostic techniques, with applications in forensic medicine (crime scene evidence), diagnosis of infectious diseases, and paternity testing. Large amounts of specific proteins, generated by expressing the corresponding DNA, are also used in medicine to treat individuals who suffer from diseases caused by the non-production of a specific protein. This technique has been used successfully to produce insulin to treat diabetes, and growth hormone to treat dwarfism.
Similarly, new vaccines have been developed using molecular technology. Attempts have been made to bypass the protein production stage and treat individuals with genetic diseases by supplying the functional gene itself so that the corresponding protein is synthesized in the affected individual. This approach to treatment, known as gene therapy, is in its infancy but offers considerable hope for the future.
Molecular biology has enabled scientists to achieve much since the discovery of the structure of DNA in 1953. This progress offers hope that the complex processes underlying embryonic development and the functioning of the adult organism may one day be sufficiently understood in molecular terms to result in new treatments for human diseases.