DNA/RNA Transcription
DNA/RNA transcription is a fundamental biological process in which genetic information from DNA is copied into RNA. This process is essential for synthesizing proteins in eukaryotic cells and involves the enzyme RNA polymerase, which binds to a specific location on the DNA molecule, where the two strands are separated so that the nucleotide sequence can be read. During transcription, the noncoding strand of DNA serves as a template, producing a complementary RNA strand that contains uracil instead of thymine. The resulting messenger RNA (mRNA) carries this genetic information from the nucleus to the ribosomes, where proteins are synthesized with the help of transfer RNA (tRNA) and ribosomal RNA (rRNA).
Transcription is the first step in gene expression, which ultimately leads to the creation of proteins that are vital for various cellular functions. The nucleotide sequence in the RNA is organized into codons, which are three-nucleotide sequences that correspond to specific amino acids, facilitating the precise assembly of proteins. Understanding transcription has been pivotal in molecular biology, contributing to insights about genetics and the relationship between DNA and protein synthesis. Overall, this process plays a crucial role in the identity and functioning of biological organisms.
FIELDS OF STUDY: Biochemistry; Genetics; Molecular Biology
ABSTRACT
The process of DNA/RNA transcription is defined, and its importance in the biochemistry of living systems is discussed. The process of transcription is essential in living systems for the synthesis of proteins in eukaryotic cells, as well as a variety of other functions that require the extraction of genetic information from DNA.
The Structure of DNA versus RNA
The molecular structure of DNA is often likened to a zipper, with the two halves of the molecule matching up in a way that resembles the two halves of a zipper fitting together. However, the actual structure is much more complicated. A DNA molecule contains two complementary strands made up of specific combinations of nucleotides. Each nucleotide in a strand of DNA is composed of a molecule of the sugar deoxyribose bonded to a phosphate group and a nitrogenous base. Only four bases are used in DNA molecules: adenine, thymine, guanine, and cytosine.
Structurally, a DNA strand consists of a very long chain of alternating deoxyribose-and-phosphate units, forming what can be considered the backbone of the molecule, with the various bases appended to the deoxyribose. The complementary strand of a molecule of duplex DNA has the same basic structure but with a different sequence of bases. In duplex DNA, the bases form specific pairs, with adenine complementary to thymine and guanine complementary to cytosine.
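The pairing rule described above can be illustrated with a short Python sketch. The function name and the example sequence are hypothetical, chosen only for illustration. The sketch also assumes the standard convention, not stated above, that the two strands of duplex DNA run in opposite directions, so the complement is reversed to keep the usual 5'-to-3' reading direction.

```python
# Pairing rules for duplex DNA as stated in the text:
# adenine with thymine, guanine with cytosine.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand: str) -> str:
    """Return the complementary DNA strand, written 5'-to-3'.

    Assumption: the input is also written 5'-to-3', and the two strands
    are antiparallel, so the paired bases are emitted in reverse order.
    """
    return "".join(DNA_PAIRS[base] for base in reversed(strand))


example = "ATGGCC"  # hypothetical sequence, for illustration only
print(complementary_strand(example))  # GGCCAT
```

Applying the function twice returns the original sequence, which reflects the fact that each strand fully determines the other.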
The structure of the RNA molecule is very similar to that of a single DNA strand. There are two essential differences between them, however: the sugar component of the RNA molecule is ribose, not deoxyribose, and the base uracil is used in place of thymine. RNA usually exists as a single strand rather than as the long double helix familiar from DNA, although an RNA strand can fold back on itself to form short base-paired regions.
The DNA molecule carries the genetic information that defines the identity of biological organisms. The sequence of nucleotides in each DNA strand specifies the order in which amino acids are to be assembled into proteins, the basic components of all known life. Each cell in an organism must have DNA in its nucleus (or, in the case of a prokaryote, in its intracellular fluid) in order to produce the proteins and other compounds that are essential to its existence. The mechanism by which the instructions for protein assembly are read from the DNA is the transcription process, which is the fundamental first step in gene expression. Transcription can be described in simplified terms as making an RNA mold of the nucleotide sequence found in the parent DNA molecule; that RNA is then used to synthesize new proteins.
The Transcription Process
Transcription is the copying of the nucleotide pattern in a strand of DNA by an enzyme called RNA polymerase. An enzyme is a protein that carries out a specific chemical function, which is determined by the relative locations of various atoms and functional groups within its three-dimensional structure. The names of most enzymes have the suffix -ase, as in lipase, transcriptase, and polymerase. Some that were first named as proteins rather than enzymes have names that end with -in, such as trypsin and pepsin. The various proteins that participate in the transcription process, other than RNA polymerase, are called transcription factors.
With very few exceptions, the genetic information of a given gene in duplex DNA is transcribed from only one of the two strands. Due to the complementary nature of the two strands, the nucleotide sequence in the new RNA strand is identical to that of the DNA strand that was not transcribed, save for the substitution of uracil for thymine, and both are complementary to the DNA strand that served as the template. Because of this, the nontemplate strand is alternately called the "coding strand" or the "sense strand," while the template strand from which the RNA is assembled is called the "noncoding strand" or the "antisense strand."
Protein synthesis from DNA involves several types of RNA. Messenger RNA (mRNA) carries a copy of the genetic information from a DNA molecule, reproducing the nucleotide sequence of the coding strand. Transfer RNA (tRNA) carries the amino acids specified by the nucleotide sequence to the growing end of a polypeptide chain. Ribosomal RNA (rRNA) forms the ribosome, which is where polypeptide assembly takes place.
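The relationship between the template strand, the coding strand, and the RNA product can be checked with a minimal Python sketch. The function name and the two example sequences are hypothetical. Both strands are assumed to be written in the conventional 5'-to-3' direction, which is why the paired bases are reversed.

```python
# Pairing used when RNA is assembled on a DNA template:
# uracil (U) pairs with adenine (A) in place of thymine (T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand: str) -> str:
    """Return the RNA complementary to a template (antisense) strand.

    Assumption: the template is written 5'-to-3' and the RNA is returned
    5'-to-3' as well, so the paired bases are emitted in reverse order.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_strand))


coding = "ATGGCC"    # hypothetical coding (sense) strand
template = "GGCCAT"  # its complementary template (antisense) strand

rna = transcribe(template)
print(rna)  # AUGGCC

# The RNA matches the coding strand except that U replaces T.
print(rna == coding.replace("T", "U"))  # True
```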
In a eukaryote, transcription begins when an RNA polymerase attaches to an appropriate location on the duplex DNA molecule and helicase enzymes temporarily separate the two strands of the DNA at that location. The RNA polymerase begins to assemble nucleotides, pairing them with the noncoding strand in the appropriate sequence and building up a hybrid RNA-DNA duplex. Due to the structural differences between the ribose and deoxyribose sugars and the complementary pairing of adenine with uracil instead of thymine, this hybrid duplex is not stable, so when assembly is complete, the RNA strand separates as mRNA. The mRNA strand then moves to the ribosomes formed by rRNA, where tRNA units carrying different amino acids are matched to the mRNA strand in the order specified. The amino acids are joined to one another with peptide bonds, forming the primary structure of the particular protein that has been encoded. This assembly of a polypeptide from the mRNA sequence is called "translation." The overall process of synthesizing proteins from genetic information contained in DNA is called "gene expression," with transcription being the initial step.
Because DNA and RNA each use only four nucleotides to specify structure (adenine, cytosine, guanine, and either thymine or uracil) and proteins are synthesized from twenty different amino acids, individual amino acids are specified by three-nucleotide sequences called "codons." Since the four nucleotides can produce sixty-four distinct three-nucleotide combinations (4 × 4 × 4), most amino acids correspond to more than one codon, and some codons serve other purposes, such as initiating protein formation or signaling a stopping point. The codons between a start codon and a stop codon constitute what is called a "reading frame" for a specific nucleotide sequence. It is possible for multiple reading frames to overlap and the same sequence to code for different amino acids, depending on where in the sequence translation begins.
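The codon count and the idea of a reading frame can be made concrete with a small Python sketch. The helper name and the example sequence are hypothetical, and the sketch only splits a sequence into triplets; it does not look for start or stop codons.

```python
from itertools import product

BASES = "ACGU"  # the four RNA bases

# Every ordered triple of the four bases: 4 * 4 * 4 = 64 codons.
ALL_CODONS = ["".join(triple) for triple in product(BASES, repeat=3)]
print(len(ALL_CODONS))  # 64


def codons_in_frame(rna: str, frame: int) -> list:
    """Split an RNA sequence into complete codons, starting at offset 0, 1, or 2."""
    return [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]


rna = "AUGGCCUAA"  # hypothetical sequence, for illustration only
for frame in range(3):
    print(frame, codons_in_frame(rna, frame))
# 0 ['AUG', 'GCC', 'UAA']
# 1 ['UGG', 'CCU']
# 2 ['GGC', 'CUA']
```

The same nine nucleotides yield three different codon lists, which is why the starting position determines which amino acids a sequence specifies.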

Unraveling the Genetic Code
When biochemists recognized the role of mRNA in transcription, it became possible to investigate the genetic code in detail. By assembling synthetic mRNA molecules from just one type of nucleotide base, such as uracil (thus forming polyuridylic acid, or poly(U)), and examining the polypeptides that resulted, researchers could determine which nucleotide sequences code for certain amino acids in protein synthesis. Using this technique, which they developed in 1961, biochemists Marshall Nirenberg and J. Heinrich Matthaei discovered that the codon UUU produces the amino acid phenylalanine, the first time an individual codon was linked to a specific amino acid. Similar experiments with synthetic poly(A) (polyadenylic acid) and poly(C) (polycytidylic acid) determined that the codons AAA and CCC code for lysine and proline, respectively. Poly(G) (polyguanylic acid) was found to form an unusable stacked structure that did not translate into protein synthesis. Synthetic codons of mixed nucleotide units also revealed which codons function as "start" and "stop" signals in protein synthesis and which ones code for the same amino acids.
The complete set of genetic material in an organism, represented by the sequence of nucleotides in its DNA, is known as a "genome." If one were to transcribe the entire human genome as a sequence of nucleotides, using A, C, T, and G, the result would fill approximately one million densely typed pages. In February 2001, the science journal Nature published its report of the initial sequencing and analysis of the human genome. Further study of the genome has revealed, among other things, that all humans in the world today are descended from a mere handful of populations that originated in Africa; that the DNA of humans and chimpanzees differs by only approximately 2 percent; and that many modern humans, particularly those of Asian and European descent, carry Neanderthal genes, in some cases as much as 4 percent of their DNA.
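The logic of the homopolymer experiments can be sketched in a few lines of Python. This is an illustration of the reasoning only, not a model of the laboratory procedure. The function name is hypothetical, and the lookup table holds just the three codon assignments reported above.

```python
# Codon assignments reported in the text for the homopolymer experiments.
KNOWN_CODONS = {"UUU": "phenylalanine", "AAA": "lysine", "CCC": "proline"}


def translate_homopolymer(base: str, length: int) -> list:
    """Read a synthetic poly(base) RNA three nucleotides at a time.

    Codons missing from the small table above are reported as
    'not in this table' rather than guessed.
    """
    rna = base * length
    codons = [rna[i:i + 3] for i in range(0, len(rna) - 2, 3)]
    return [KNOWN_CODONS.get(codon, "not in this table") for codon in codons]


print(translate_homopolymer("U", 9))
# ['phenylalanine', 'phenylalanine', 'phenylalanine']
```

A chain built from a single base reads as the same codon no matter where reading starts, so the polypeptide contains only one kind of amino acid and the codon can be assigned without knowing the reading frame.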
PRINCIPAL TERMS
- complementary strand: one of the two strands of nucleotides that make up a DNA molecule, with each nucleotide in one strand corresponding to the position of its complementary nucleotide (cytosine for guanine, adenine for thymine, and vice versa) in the other.
- enzyme: a protein molecule that acts as a catalyst in biochemical reactions.
- gene expression: the process by which the information in genes, which are specific segments of the DNA molecule, is copied into RNA and used to synthesize either proteins or other types of RNA.
- RNA polymerase: the enzyme responsible for carrying out gene transcription by assembling a strand of RNA that is complementary to a DNA template.
- transcription factor: a protein that binds to DNA in order to initiate, regulate, or block gene transcription.
Bibliography
Berg, Jeremy M., John L. Tymoczko, and Lubert Stryer. Biochemistry. 7th ed. New York: Freeman, 2011. Print.
The Human Genome. Spec. issue of Nature 409.6822 (2001): 745–964. Print.
Lodish, Harvey, et al. Molecular Cell Biology. 7th ed. New York: Freeman, 2013. Print.
Pelczar, Michael J., Jr., E. C. S. Chan, and Noel R. Krieg. Microbiology: Concepts and Applications. New York: McGraw, 1993. Print.
Reece, Jane B., et al. Campbell Biology. 9th ed. Boston: Pearson, 2011. Print.