Messenger RNA is just one of several main types of RNA. Some of the others, such as transfer RNA, are also involved in protein synthesis but are never translated into protein themselves. DNA directly encodes these RNA molecules, so the information flow in this case is simply from DNA to RNA. In the other big exception to the dogma, the flow of information is reversed. Some viruses, for example, have genes made up of RNA. When these viruses infect a cell, a viral enzyme called reverse transcriptase synthesizes DNA from the viral RNA. In this way, the information flows from RNA to DNA. Although there are exceptions, the central dogma of molecular biology captures the flow of information most important to life: DNA codes for RNA, and RNA codes for proteins.
RNA replication is the copying of one RNA to another. Many viruses replicate in this way. Enzymes that copy RNA into new RNA, called RNA-dependent RNA polymerases, are also found in many eukaryotes where they participate in RNA silencing. RNA editing, in which an RNA sequence is altered by a complex of proteins and a "guide RNA", could also be considered an RNA to RNA transfer.
The triplet code
Once biologists understood the central dogma, they understood the general pattern of information flow in the cell. The next challenge was to understand how the sequence of bases in a strand of messenger RNA encodes the sequence of amino acids in a protein. What is the genetic code? What are the rules that specify the relationship between the nucleotide sequence in DNA and the amino acid sequence in a protein? George Gamow proposed an answer based on logic: each code word contains three bases. His reasoning started from the observation that proteins are built from 20 amino acids.
Since there are only four different nucleotides in DNA but 20 amino acids, a combination of bases is required to code for each amino acid. If each amino acid were specified by a single nucleotide, the code could specify only four amino acids. By the same logic, Gamow reasoned that the code could not consist of two-nucleotide combinations, because 4 x 4 = 16, which is still fewer than 20. The code must be a three-base code (or triplet code), because it is the simplest code that allows for all 20 known amino acids: 4 x 4 x 4 = 64. A triplet code therefore provides 64 possible code words, more than enough for the 20 amino acids.
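Gamow's counting argument can be checked with a few lines of Python. This is only an illustrative sketch: the names `codeword_count` and `shortest_sufficient_length` are made up for this example, and the bases are written in their RNA form (A, C, G, U).

```python
from itertools import product

BASES = "ACGU"      # the four RNA bases
AMINO_ACIDS = 20    # number of amino acids the code must cover


def codeword_count(length):
    """Number of distinct code words of the given length over four bases."""
    return len(BASES) ** length


def shortest_sufficient_length(needed=AMINO_ACIDS):
    """Smallest code word length that yields at least `needed` code words."""
    length = 1
    while codeword_count(length) < needed:
        length += 1
    return length


for n in (1, 2, 3):
    print(n, "base(s):", codeword_count(n), "code words")

# Enumerate every possible triplet explicitly, e.g. "AAA", "AAC", ...
triplets = ["".join(p) for p in product(BASES, repeat=3)]
print("shortest sufficient length:", shortest_sufficient_length())
```

One- and two-base code words give 4 and 16 combinations, both fewer than 20, so three bases is the shortest length that works.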
Figure 3. The genetic code. The triplet code in mRNA directly specifies the order of the amino acids that make up a protein. To identify the amino acid encoded by an mRNA sequence, locate the mRNA triplet (codon); the gray box to its right shows the corresponding amino acid. For example, CCC indicates the amino acid proline (Pro).
A triplet code provides many more code words (64) than the number of amino acids (20) that we see in proteins. Thus, the code is said to be redundant, meaning that an amino acid can be encoded by more than one triplet. For example, the mRNA triplets CCU and CCC encode the same amino acid: proline. In fact, all amino acids are encoded by more than one triplet, except for methionine and tryptophan. Further investigation indicated that a specific triplet always encodes the same amino acid. In other words, the code is unambiguous. For example, the AUG triplet in mRNA always codes for methionine. Surprisingly, the code works almost exactly the same way in all living organisms, from bacteria to plants and animals! Although there are a few exceptions, the consistency of the code across such different organisms suggests that we all come from a single common ancestor. For this reason, the code is said to be universal. Lastly, the code is conservative. If two codons share the same first two bases but differ in the third, there is a high probability (but not absolute certainty) that they code for the same amino acid.
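These properties can be checked against the standard codon table. The sketch below builds that table from a compact 64-letter string of one-letter amino acid codes (with `*` marking a stop), a layout chosen here purely for brevity. The names `codons_for`, `single_codon` and `third_base_free` are illustrative.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons, in UCAG order with the
# third base varying fastest (UUU, UUC, UUA, UUG, UCU, ...). "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Unambiguous: a dict maps each codon to exactly one amino acid.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Redundant: group the codons by the amino acid they encode.
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_for[aa].append(codon)

# Amino acids that have only a single codon.
single_codon = sorted(aa for aa, codons in codons_for.items() if len(codons) == 1)

# Conservative: two-base prefixes for which the third base makes no difference.
third_base_free = [
    a + b
    for a, b in product(BASES, repeat=2)
    if len({CODON_TABLE[a + b + c] for c in BASES}) == 1
]

print("Proline (P) codons:", codons_for["P"])
print("Amino acids with one codon:", single_codon)
print(len(third_base_free), "of 16 prefixes ignore the third base")
```

Only methionine (M) and tryptophan (W) have a single codon. For half of the 16 possible two-base prefixes, the third base does not change the amino acid at all.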
The group of three bases that specifies a particular amino acid is called a codon. According to Gamow's triplet hypothesis, each codon is made up of three nucleotides. The protein-coding region of each gene is bounded by a start codon and a stop codon. The start codon is AUG, and it serves as the start codon for nearly every gene in every organism on Earth. In contrast, there are three stop codons: UAA, UAG, and UGA.
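The roles of start and stop codons can be illustrated with a small reading routine. This is a simplified sketch, not a model of how the ribosome actually finds a start site. It assumes AUG as the start codon and the three standard stop codons, uses the standard codon table in a compact layout, and the function name `translate` and the example sequence are made up for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

START_CODON = "AUG"  # assumed start codon; it also codes for methionine (M)


def translate(mrna):
    """Read codons from the first AUG until a stop codon or the end of the sequence."""
    start = mrna.find(START_CODON)
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    # Step through the sequence three bases at a time, ignoring a partial last codon.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: UAA, UAG or UGA
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical mRNA: two leading bases, then AUG CCC UGG UAA, then two trailing bases.
print(translate("GGAUGCCCUGGUAAGC"))  # MPW
```

Reading starts at the first AUG, so the two leading bases are skipped. It ends at UAA, so the trailing bases are never read.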