The Central Dogma of Molecular Biology
We begin with the Central Dogma of Molecular Biology – perhaps the simplest concept of biology, and yet one of the most profound. Simply, the central dogma describes the unidirectional flow of genetic information.
It may be easiest to first explain a more common definition of the Central Dogma – even if it is not the original definition – which describes the direction of information flux in molecular biology. According to this casual definition, genetic information transmits unidirectionally from DNA to RNA to protein. The information is stored in the language of the genetic code as a “hard copy” in the form of deoxyribonucleic acid (DNA), a very long and stable molecule. The sequential information stored in DNA is then transferred to an intermediate molecule, ribonucleic acid (RNA) – the molecular cousin of DNA – which acts more as a short-lived “soft copy” of the original DNA code. The RNA code, in turn, is transferred to proteins, the terminal and functional product of genetic information. Furthermore, the information stored in DNA can be copied into more DNA (think copy-paste on your computer). In the terminology of biologists, the process of transferring information from DNA to RNA is transcription (DNA is transcribed into RNA); the transfer of information in RNA to protein is translation (RNA is translated into protein); DNA-to-DNA copying is replication (DNA is replicated into more DNA). To summarize the general principle of the central dogma simply, genetic information is transferred from DNA to RNA to proteins.
However, in its purest and original form (Crick, 1958), the central dogma only states that genetic information can be passed from DNA to RNA, and from RNA to protein (as well as in a few other directions, to be discussed below), but genetic information cannot be passed from protein to any other form. The language of proteins is a dead-end for genetic information.
Perhaps a saucy metaphor will help here. Let’s say you’re an English-speaking romantic, and you’ve fallen hopelessly in love with a Spaniard. You write a love letter to your Spaniard in the only language you know, English, and then use an English-to-Spanish dictionary to translate your affectionate I love you's into the Te quiero's that will melt your Spanish-speaking lover's heart. But you are paranoid that your message will be intercepted by a malicious voyeur, and so you resolve to translate your letter yet again, from Spanish to Gibberish, to ensure that your message is safely encrypted. However, when your Spanish lover receives the letter, you realize your mistake: there is no Gibberish-to-Spanish dictionary (for a convenient reason that is ancillary to this metaphor), and so your love letter remains untranslated, your love unrequited.
The central dogma works in the same way, though perhaps with slightly less flair for drama; it states that, though DNA-to-RNA and RNA-to-protein dictionaries exist, there is no protein-to-anything dictionary. Once a genetic message is translated into the language of proteins, it’s stuck.
THE HISTORY OF THE DOGMA
The central dogma was originally proposed in 1958 by Francis Crick (of the famed Watson-Crick duo that in 1953 discovered the double-helix structure of DNA) in a symposium publication, On Protein Synthesis (Crick, 1958). There, Crick discusses two related points, which he calls “the sequence hypothesis” and (our featured hero) “the central dogma”.
As an aside, this was the first published use of the phrase, “the central dogma”. The only earlier use I could find was from a draft of Crick’s manuscript from two years earlier in 1956 (Crick, 1956), which excitedly ends with the exclamation:
This scheme explains the majority of the present experimental results!
The sequence hypothesis states that the information of nucleic acids (i.e. DNA and RNA) is articulated solely in their linear sequence of nucleotide bases (the A’s, T’s or U’s, C’s, and G’s of the genetic code), and that this sequence is a simple code for the amino acids that make up proteins. Naturally, the central dogma complements this basic concept. Whereas the sequence hypothesis is a positive statement (“transfers from nucleic acid to protein exist”), the central dogma is a negative statement (“transfers from protein to nucleic acid do not exist”). The light gray arrows of Fig.3 below underscore the core of Crick’s original conception of the central dogma, wherein information stored as protein cannot be transmitted to DNA, RNA, or more protein.
At the time, the Central Dogma was a contentious statement. Many scientists misapplied Crick’s original definition, more broadly equating it to the sequence hypothesis. As a result, they dramatically claimed to have “reversed the central dogma” (see Nature News and Views, 1970) by finding examples in biology where genetic information did not flow according to Fig.1. Furthermore, naming such a controversial claim a “dogma” attracted much criticism. Crick responded to the dissenters in a 1970 rebuttal (Crick, 1970), restating the central dogma in even plainer language (and with a sharp tongue!). Later, in his 1990 autobiography, What Mad Pursuit: A Personal View of Scientific Discovery, Crick confessed:
As it turned out, the use of the word dogma caused almost more trouble than it was worth.
Figures 2 and 3
From an information theory perspective, given three storage modalities (here DNA, RNA, and protein), there exist nine possible information transfer processes (see Fig.2). In this architecture, RNA can beget DNA, protein can beget protein, DNA can beget protein, and so on.
However, as we know from the restrictions imposed by Crick’s dogma, molecular biology has not been as cavalier in designing information architectures as Fig.2 may suggest. Fig.3 shows a comprehensive view of the flow of genetic information in molecular biology. The classic and general pathway of DNA-to-RNA-to-protein is shown in black, special cases such as reverse transcription of RNA-to-DNA are shown in medium gray, and disallowed fluxes such as protein-to-DNA pathways are shown in light gray. Note that these disallowed transfers are the only mechanisms addressed in the original Central Dogma.
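For readers who think better in code, here is a minimal Python sketch of the nine transfers and the three categories just described. The grouping is my reading of the text and of Crick (1970), not something taken from the figures themselves: DNA replication is counted as a general transfer, and anything that is neither general nor protein-sourced is lumped under “special”. The names MOLECULES, GENERAL, and classify are purely illustrative.

```python
from itertools import product

# The three storage modalities discussed in the text.
MOLECULES = ("DNA", "RNA", "protein")

# Assumption: the "general" transfers are replication, transcription,
# and translation, following the text and Crick (1970).
GENERAL = {("DNA", "DNA"), ("DNA", "RNA"), ("RNA", "protein")}


def classify(source, target):
    """Label one transfer as general, special, or disallowed."""
    if source == "protein":
        # The original central dogma: nothing flows out of protein.
        return "disallowed"
    if (source, target) in GENERAL:
        return "general"
    # Simplification: everything else is treated as a special case.
    return "special"


# Three sources times three targets gives nine possible transfers.
transfers = {pair: classify(*pair) for pair in product(MOLECULES, repeat=2)}

for (source, target), kind in transfers.items():
    print(f"{source:>7} -> {target:<7} {kind}")
```

Under these assumptions the nine transfers split evenly: three general, three special, and three disallowed, with every disallowed transfer starting from protein.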
But, as the fictitious Dr. Ian Malcolm of Jurassic Park will tell you, “life, uh, finds a way.” Many exceptions to the linear, unidirectional architecture suggested by Fig.1 exist in nature.
The classic exception involves RNA-to-DNA information transfer by a curious class of enzymes appropriately called reverse transcriptases. These enzymes can take the information stored in an RNA molecule and reverse transcribe it into a hard DNA copy. This mechanism is essential to the replication of retroviruses such as HIV, and also to hepatitis B virus, which is not strictly a retrovirus but likewise depends on reverse transcription. At some point in its life cycle, a retrovirus uses a reverse transcriptase to copy its RNA genome into a DNA intermediate. As it turns out, reverse transcriptases also participate in important biological processes that are less nefarious than viral replication. For example, telomerase is a human reverse transcriptase that protects chromosome ends (telomeres) from the progressive shortening that contributes to cell aging.
RNA-dependent RNA polymerases (RdRPs) also break away from the general architecture of Fig.1. These enzymes replicate RNA into more RNA, a process analogous to DNA replication. Like viral reverse transcriptases, RdRPs are a common element in viral replication.
A pervasive theme in biology also shows that RNAs themselves can serve functional roles aside from encoding proteins. These are called noncoding RNAs. Noncoding RNAs don’t necessarily “reverse” the central dogma, but they do serve as a counterexample to the concept that the functional subunits of life are protein-based.
INFORMATION THEORY AND THE CENTRAL DOGMA
What exactly do I mean by genetic information? As Fig.4 illustrates, genetic information is stored in DNA, a long linear molecule composed of two strands that form a double helix. The language of DNA is written in four nucleotide bases: adenine (A), thymine (T), cytosine (C), and guanine (G). RNA, the single-stranded molecular cousin of DNA, uses nearly the same alphabet, the sole difference being uracil (U) in place of thymine. Protein, too, is a linear molecule, one that folds into complex three-dimensional structures. These structures are dictated by its primary sequence, written in an alphabet of some twenty amino acids, such as the methionine (M), glycine (G), and proline (P) shown here.
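Because the DNA and RNA alphabets differ by a single letter, moving a message between them loses nothing. The Python sketch below illustrates this under a deliberate simplification: it treats transcription as rewriting the coding strand with U in place of T, ignoring the template strand and all of the enzymatic machinery. The function names and the example sequence are my own, chosen only for illustration.

```python
DNA_ALPHABET = set("ACGT")
RNA_ALPHABET = set("ACGU")


def transcribe(coding_strand):
    """Rewrite a DNA coding strand in the RNA alphabet (T becomes U)."""
    strand = coding_strand.upper()
    unknown = set(strand) - DNA_ALPHABET
    if unknown:
        raise ValueError(f"not a DNA base: {sorted(unknown)}")
    return strand.replace("T", "U")


def reverse_transcribe(rna):
    """Rewrite an RNA sequence in the DNA alphabet (U becomes T)."""
    strand = rna.upper()
    unknown = set(strand) - RNA_ALPHABET
    if unknown:
        raise ValueError(f"not an RNA base: {sorted(unknown)}")
    return strand.replace("U", "T")


dna = "ATGGGACCT"  # hypothetical example sequence
rna = transcribe(dna)
print(rna)  # AUGGGACCU

# The round trip recovers the original message exactly.
print(reverse_transcribe(rna) == dna)  # True
```

The point of the round trip is that a one-to-one mapping between alphabets can always be run backwards, which is why a DNA-to-RNA dictionary can have an RNA-to-DNA counterpart.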
The transfer of RNA’s A, U, C, and G sequences to protein’s twenty-letter amino acid alphabet is an irreversible process; once that information is decoded, it cannot be “untranslated”. From an information theory perspective, one reason this transfer cannot be reversed is that multiple RNA sequences can code for a single amino acid. For example, the three-base codons GGA, GGU, GGC, and GGG all encode the amino acid glycine. Given a glycine in an amino acid sequence, how is one to “untranslate” it to the RNA sequence? Was the glycine encoded by a GGG or a GGA? This property, where multiple codons are translated into the same amino acid, is known as codon degeneracy.
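To make the ambiguity concrete, the following Python sketch translates a short RNA and then tries to go backwards. It assumes only a small slice of the standard genetic code (the codons for methionine, glycine, and proline), not the full 64-codon table, and the names CODON_TABLE, translate, and back_translations are illustrative rather than drawn from any library.

```python
from itertools import product

# Partial codon table from the standard genetic code.
# Assumption: only three amino acids are included, for brevity.
CODON_TABLE = {
    "AUG": "M",
    "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "CCU": "P", "CCC": "P", "CCA": "P", "CCG": "P",
}

# Invert the table: each amino acid maps to every codon that encodes it.
CODONS_FOR = {}
for codon, amino_acid in CODON_TABLE.items():
    CODONS_FOR.setdefault(amino_acid, []).append(codon)


def translate(rna):
    """Read an RNA sequence three bases at a time into amino acids."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


def back_translations(peptide):
    """List every RNA sequence that would translate into this peptide."""
    choices = [CODONS_FOR[amino_acid] for amino_acid in peptide]
    return ["".join(combo) for combo in product(*choices)]


print(translate("AUGGGACCU"))  # MGP

candidates = back_translations("MGP")
print(len(candidates))  # 16, i.e. 1 x 4 x 4 codon choices
```

Translation gives exactly one answer, but the reverse direction gives sixteen equally valid ones for a peptide only three residues long, and the count multiplies with every additional degenerate amino acid.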
Crick FHC. On Protein Synthesis. (1958) Symposia of the Society for Experimental Biology, 12:138-163.
The first published use of the term “the central dogma” that I could find.
Crick FHC. Ideas on Protein Synthesis. (1956) Personal draft.
I found this manuscript here. It is almost beautiful in its sloppy excitement.
Central Dogma Reversed. (1970) Nature News and Views, 226:1198-1199.
Crick’s dissenters find “exceptions” to the central dogma. Note that this article was published anonymously - drama!
Crick FHC. The Central Dogma of Molecular Biology. (1970) Nature, 227:561-563.
Crick’s sharp response to the naysayers. Note that this was published in the very next issue of Nature - more drama!
Shapiro, JA. Revisiting the Central Dogma in the 21st Century. (2009) Natural Genetic Engineering and Natural Genome, 1178: 6-28.
A comprehensive and modern perspective on the central dogma – a very good read for the thorough.
Crick FHC. What Mad Pursuit: A Personal View of Scientific Discovery. Basic Books Publishing (1990).