Protein synthesis

Significance: Cellular proteins can be grouped into two general categories: proteins with a structural function that contribute to the three-dimensional organization of a cell, and proteins with an enzymatic function that catalyze the biochemical reactions required for cell growth and function. Understanding the process by which proteins are synthesized provides insight into how a cell organizes itself and how defects in this process can lead to disease.

The Flow of Information from Stored to Active Form

The cell can be viewed as a unit that assembles resources from its environment into biochemically functional molecules, then organizes these molecules in three-dimensional space in a way that allows for cell growth and replication. In order to carry out this process, a cell must have the biosynthetic means to assemble resources into molecules, and it must contain the information required to produce the biosynthetic and structural machinery. Deoxyribonucleic acid, or DNA, represents the stored form of this information, whereas proteins are the end product. There are thousands of different proteins in cells, either serving a structural role or acting as enzymes that catalyze the biosynthetic reactions of the cell. Following the discovery of the structure of DNA in 1953 by James Watson and Francis Crick, scientists began to study the process by which the information stored in DNA is converted into protein molecules.


Proteins are linear, functional molecules composed of a unique sequence of amino acids. About twenty different amino acids are used to synthesize proteins. Although the sequence of amino acids for each protein is present in each DNA molecule, DNA cannot synthesize proteins by itself. Another, similar molecule, called ribonucleic acid (RNA), is necessary to decode the information contained in DNA and do the work of assembling proteins.

There are various types of RNA, each one distinguished by its function. The process of protein synthesis involves three types of RNA. Messenger RNA (mRNA) copies the information contained in a cell's DNA—the sequence of amino acids that makes up a particular protein—and carries it to a part of the cell called the ribosome, where protein synthesis takes place. Transfer RNA (tRNA) decodes the information carried by the mRNA and then transports the required amino acids to the appropriate location during synthesis. Ribosomal RNA (rRNA) acts as the engine that carries out most of the steps during protein synthesis. Together with a specific set of proteins, rRNA forms ribosomes that bind the mRNA, serve as the platform for tRNA to decode the mRNA, and catalyze the formation of peptide bonds between amino acids. Each ribosome is composed of two subunits: a small (40s) and a large (60s) subunit, each of which has its own function. The “s” in 40s and 60s is an abbreviation for Svedberg units, which are a measure of how quickly a large molecule or complex molecular structure sediments (or sinks) to the bottom of a centrifuge tube while being centrifuged. The larger the number, the larger the molecule.

Like all RNA, mRNA is composed of just four types of nucleotides: adenine (A), guanine (G), cytosine (C), and uracil (U). (DNA contains thymine in the place of uracil, but the other three bases are the same.) Each type bonds only with one other type: guanine with cytosine, and adenine with uracil (or, in DNA, thymine). Therefore, when an mRNA molecule is assembled along a strand of DNA, each of its nucleotides pairs with its partner in the DNA, and the finished mRNA carries a complementary copy of the nucleotide sequence contained in that strand. This step is called transcription.
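The pairing rules above can be illustrated with a minimal Python sketch. The function name `transcribe` and the example sequence are hypothetical, chosen only for illustration, and the sketch assumes the DNA template strand is written in its 3′-to-5′ direction so that the mRNA comes out in its 5′-to-3′ direction.

```python
# Base-pairing rules between a DNA template base and the RNA base it specifies.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') complementary to a DNA template given 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


# Illustrative (made-up) template strand.
print(transcribe("TACAGGTCG"))  # AUGUCCAGC
```

The sketch treats transcription as a simple character-by-character substitution; it ignores the enzymes and regulatory signals involved in the real process.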

Transcription is the first step in the process by which a DNA molecule composed of a linear sequence of nucleotides gives rise to a protein molecule composed of a linear sequence of amino acids. The second step is called translation, since it converts the “language” of nucleotides carried by the mRNA into the “language” of amino acids that make up a protein. This is achieved by means of three-nucleotide sequences called codons. Because each of the three positions in a codon can hold any of the four nucleotides (A, G, C, or U), there are sixty-four (4 × 4 × 4) possible codons. Each codon, apart from three that signal the end of a protein, corresponds to a specific amino acid. As there are only twenty amino acids coded for in DNA, most amino acids correspond to several different codons. For example, six different codons (UCU, UCC, UCA, UCG, AGU, and AGC) specify the amino acid serine, whereas only one (AUG) specifies the amino acid methionine. A molecule of mRNA, therefore, is simply a linear array of codons (that is, three-nucleotide “words” that are “read” by tRNAs together with ribosomes). The region within an mRNA containing this sequence of codons is called the coding region.
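The codon arithmetic can be checked with a short Python sketch. It builds the standard genetic code from a compact 64-letter string of one-letter amino acid codes (with `*` marking a stop codon); the names `CODON_TABLE` and `codons_for` are hypothetical helpers introduced here for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in UCAG order for each of the
# three codon positions ("*" marks a stop codon).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each three-nucleotide codon to its amino acid (or "*").
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid: str) -> list:
    """Return every codon that specifies the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


print(len(CODON_TABLE))   # 64
print(codons_for("S"))    # ['AGC', 'AGU', 'UCA', 'UCC', 'UCG', 'UCU']
print(codons_for("M"))    # ['AUG']
```

The table confirms the figures in the text: sixty-four codons, six for serine (S), and one for methionine (M).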

Before translation can occur in eukaryotic cells, mRNA molecules undergo processing steps at both ends to add features that will be necessary for translation. (These processing steps do not occur in prokaryotic cells.) Nucleotides are structured such that they have two ends, a 5′ and a 3′ end, that are available to form chemical bonds with other nucleotides. Each nucleotide present in an mRNA has a 5′-to-3′ orientation that gives a directionality to the mRNA, so that the RNA begins with a 5′ end and finishes with a 3′ end. The ribosome reads the coding region of an mRNA in a 5′-to-3′ direction. Following the synthesis of an mRNA from its DNA template, one guanine nucleotide is added to the 5′ end of the mRNA in an inverted orientation. It is the only nucleotide in the entire mRNA present in a 3′-to-5′ orientation and is referred to as the cap. A long stretch of adenosine nucleotides is added to the 3′ end of the mRNA to make what is called the poly-A tail.

Typically, mRNAs have a stretch of nucleotide sequence that lies between the cap and the coding region. This is referred to as the leader sequence and is not translated. Therefore, a signal is necessary to indicate where the coding region initiates. The codon AUG usually serves as this initiation codon; however, other AUG codons may be present in the coding region. Any one of three possible codons (UGA, UAG, or UAA) can serve as stop codons that signal the ribosome to terminate translation. Several accessory proteins assist ribosomes in binding mRNA and help carry out the required steps during translation.
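As a rough illustration of how the start and stop signals delimit the coding region, the following Python sketch takes the first AUG as the initiation codon and then reads in steps of three nucleotides until a stop codon appears. The function name `coding_region` and the example mRNA are hypothetical, and the sketch deliberately ignores the sequence context that helps a real ribosome choose the correct AUG.

```python
STOP_CODONS = {"UGA", "UAG", "UAA"}


def coding_region(mrna: str):
    """Return the codons from the first AUG up to (not including) the first
    in-frame stop codon, or None if no complete coding region is found."""
    start = mrna.find("AUG")
    if start == -1:
        return None
    # Read one codon (three nucleotides) at a time, staying in frame.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return mrna[start:i]
    return None


# Made-up mRNA: leader "GGCACC", then AUG UCU AGC, then the stop codon UAA.
# The out-of-frame "UAG" spanning UCU/AGC is not read as a stop.
print(coding_region("GGCACCAUGUCUAGCUAAGGAAA"))  # AUGUCUAGC
```

The example also shows why the reading frame matters: a stop-like triplet that straddles two codons is never seen by the ribosome as a codon.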

The Translation Process: Initiation

Translation occurs in three phases: initiation, elongation, and termination. The function of the 40s ribosomal subunit is to bind to an mRNA and locate the correct AUG as the initiation codon. It does this by binding close to the cap at the 5′ end of the mRNA and scanning the nucleotide sequence in its 5′ to 3′ direction in search of the initiation codon. Marilyn Kozak identified a certain nucleotide sequence surrounding the initiator AUG of eukaryotic mRNAs that indicates to the ribosome that this AUG is the initiation codon. She found that the presence of an A or G three nucleotides prior to the AUG and a G in the position immediately following the AUG were critical in identifying the correct AUG as the initiation codon. This is referred to as the “sequence context” of the initiation codon. Therefore, as the 40s ribosomal subunit scans the leader sequence of an mRNA in a 5′ to 3′ direction, it searches for the first AUG in this context and may bypass other AUGs not in this context.
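The scanning rule described above can be sketched in Python. This is a simplified model, not a description of how any real analysis tool works: it assumes both features Kozak identified (an A or G three nucleotides before the AUG and a G immediately after it) must be present, whereas real ribosomes respond to context in a graded way. The function name and the example sequence are hypothetical.

```python
def find_initiation_codon(mrna: str):
    """Scan 5'->3' and return the index of the first AUG that has an A or G
    three nucleotides upstream and a G immediately downstream, else None."""
    position = mrna.find("AUG")
    while position != -1:
        has_upstream_purine = position >= 3 and mrna[position - 3] in "AG"
        has_downstream_g = position + 3 < len(mrna) and mrna[position + 3] == "G"
        if has_upstream_purine and has_downstream_g:
            return position
        # This AUG is in a poor context; keep scanning toward the 3' end.
        position = mrna.find("AUG", position + 1)
    return None


# Made-up leader: the AUG at index 4 has C at -3 and C after it, so it is
# bypassed; the AUG at index 16 has A at -3 and G after it.
print(find_initiation_codon("GCUUAUGCCCGCCACCAUGGCG"))  # 16
```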

Nahum Sonenberg demonstrated that scanning by the 40s subunit can be impeded by stem-loop structures in the leader sequence. These structures form through base pairing between complementary nucleotides within the leader. Two nucleotides are said to be complementary when they join together by hydrogen bonds. For instance, the nucleotide (or base) A is complementary to U, and these two can form what is called a “base pair.” Likewise, the nucleotides C and G are complementary. Several accessory proteins, called eukaryotic initiation factors (eIFs), aid the binding and scanning of 40s subunits. The first of these, eIF4F, is composed of three subunits called eIF4E, eIF4A, and eIF4G. The protein eIF4E is the subunit responsible for recognizing and binding to the cap of the mRNA. The eIF4A subunit of eIF4F, together with another factor called eIF4B, removes stem-loop structures from the leader sequence by disrupting the base pairing between nucleotides in the stem. The protein eIF4G is the large subunit of eIF4F, and it interacts with several other proteins, one of which is eIF3. It is this latter initiation factor, eIF3, with which the 40s subunit first associates during its initial binding to an mRNA.
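To make the idea of a stem loop concrete, the following Python sketch tests whether two stretches of a leader sequence could pair to form a stem. It assumes only the A-U and G-C pairs named in the text, and that the two arms pair in opposite directions, as paired RNA strands do. The helper names and sequences are hypothetical.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that would pair with `rna`, read 5'->3'."""
    return "".join(PAIR[base] for base in reversed(rna))


def forms_stem(arm_5_prime: str, arm_3_prime: str) -> bool:
    """True if the two arms are fully complementary and can form a stem."""
    return arm_3_prime == reverse_complement(arm_5_prime)


# Made-up leader segment: stem arm + unpaired loop + complementary arm.
print(reverse_complement("GGCAC"))    # GUGCC
print(forms_stem("GGCAC", "GUGCC"))   # True
print(forms_stem("GGCAC", "AAAAA"))   # False
```

In this picture, a leader such as GGCAC-UUCG-GUGCC could fold back on itself, with the middle four nucleotides forming the loop.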

Through the combined action of eIF4G and eIF3, the 40s subunit is bound to the mRNA, and through the action of eIF4A and eIF4B, the mRNA is prepared for 40s subunit scanning. As the cellular concentration of eIF4E is very low, mRNAs must compete for this protein. Those that do not compete well for eIF4E will not be translated efficiently. This represents one means by which a cell can regulate protein synthesis. One class of mRNA that competes poorly for eIF4E encodes growth-factor proteins. Growth factors are required in small amounts to stimulate cellular growth. Sonenberg has shown that the overproduction of eIF4E in animal cells leads to a reduction in the competition for this protein, and mRNAs such as growth-factor mRNAs that were previously poorly translated when the concentration of eIF4E was low are now translated at a higher rate when eIF4E is abundant. This in turn results in the overproduction of growth factors, which leads to uncontrolled growth, a characteristic typical of cancer cells.

A protein that specifically binds to the poly-A tail at the 3′ end of an mRNA is called the poly-A-binding protein (PABP). Discovered in the 1970s, this protein was at first thought to serve only to protect the mRNA from attack at its 3′ end by enzymes that degrade RNA. Daniel Gallie demonstrated another function for PABP by showing that the PABP-poly-A-tail complex was required for the function of the eIF4F-cap complex during translation initiation. The idea that a protein located at the 3′ end of an mRNA should participate in events occurring at the opposite end of an mRNA seemed strange initially. However, RNA is quite flexible and is rarely present in a straight, linear form in the cellular environment. Consequently, the poly-A tail can easily approach the cap at the 5′ end. Gallie showed that PABP interacts with eIF4G and eIF4B, two initiation factors that are closely associated with the cap, through protein-to-protein contacts. The consequence of this interaction is that the 3′ end of an mRNA is held in close physical proximity to its cap. The interaction between these proteins stabilizes their binding to the mRNA, which in turn promotes protein synthesis. Therefore, mRNAs can be thought of as adopting a circular form during translation that looks similar to a snake biting its own tail. This idea is now widely accepted by scientists.

One additional factor, called eIF2, is needed to bring the first tRNA to the 40s subunit. Along with the initiator tRNA (which decodes the AUG codon specifying the amino acid methionine), eIF2 aids the 40s subunit in identifying the AUG initiation codon. Once the 40s subunit has located the initiation codon, the 60s ribosomal subunit joins the 40s subunit to form the intact 80s ribosome. (Svedberg units are not additive; therefore, a 40s and a 60s subunit joined together do not make a 100s unit.) This marks the end of the initiation phase of translation.

The Translation Process: Elongation and Termination

During the elongation phase, tRNAs bind to the 80s ribosome as it passes over the codons of the mRNA, and the amino acids attached to the tRNAs are transferred to the growing polypeptide. Binding of the tRNAs to the ribosome is assisted by an accessory protein called eukaryotic elongation factor 1 (eEF1). A codon is decoded by the appropriate tRNA through base pairing between the three nucleotides that make up the codon in the mRNA and three complementary nucleotides within a specific region (called the anticodon) within the tRNA. The tRNA binding sites in the 80s ribosome are located in the 60s subunit. The ribosome moves over the coding region one codon at a time, or in steps of three nucleotides, in a process known as translocation. When the ribosome moves to the next codon to be decoded, the tRNA containing the appropriate anticodon will bind tightly in the open site in the 60s subunit (the A site). The tRNA that bound to the previous codon is present in a second site in the 60s subunit (the P site). Once a new tRNA has bound to the A site, the ribosomal RNA itself catalyzes the formation of a peptide bond between the growing polypeptide and the new amino acid. This results in the transfer of the polypeptide attached to the tRNA present in the P site to the amino acid on the tRNA present in the A site. A second elongation factor, eEF2, catalyzes the movement of the ribosome to the next codon to be decoded. This process is repeated one codon at a time until a stop codon is reached.

The termination phase of translation begins when the ribosome reaches one of the three termination or stop codons. These are also referred to as “nonsense” codons, as the cell does not produce any tRNAs that can decode them. Accessory proteins, called release factors, are required to assist this stage of translation. They bind to the empty A site in which the stop codon is present, and this triggers the cleavage of the bond between the completed protein and the last tRNA in the P site, thereby releasing the protein. The ribosome then dissociates into its 40s and 60s subunits, the latter of which diffuses away from the mRNA. The close physical proximity of the cap and poly-A tail of an mRNA maintained by the interaction between PABP and the initiation factors (eIF4G and eIF4B) is thought to assist the recycling of the 40s subunit back to the 5′ end of the mRNA to participate in a subsequent round of translation.
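The elongation and termination steps can be summarized in a small Python simulation. It is a sketch under stated assumptions: the tRNA pool is a hypothetical, deliberately tiny set of four tRNAs, anticodons pair with codons using only A-U and G-C pairs, and the accessory factors (eEF1, eEF2, release factors) are not modeled. The names `TRNA_POOL`, `anticodon_for`, and `translate` are illustrative only.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}
STOP_CODONS = {"UGA", "UAG", "UAA"}

# Hypothetical tRNA pool: anticodon (written 5'->3') -> amino acid carried.
TRNA_POOL = {"CAU": "Met", "AGA": "Ser", "GCU": "Ser", "UUU": "Lys"}


def anticodon_for(codon: str) -> str:
    """Return the anticodon (5'->3') that base-pairs with the given codon."""
    return "".join(PAIR[base] for base in reversed(codon))


def translate(coding_region: str) -> list:
    """Decode codons one at a time until a stop codon is reached."""
    polypeptide = []   # growing chain, held by the tRNA in the P site
    position = 0       # start of the codon currently in the A site
    while position + 3 <= len(coding_region):
        codon = coding_region[position:position + 3]
        if codon in STOP_CODONS:
            return polypeptide  # no tRNA decodes a stop codon: release
        # A tRNA with the matching anticodon delivers its amino acid.
        polypeptide.append(TRNA_POOL[anticodon_for(codon)])
        position += 3  # translocation: move on by one codon
    raise ValueError("no stop codon found")


print(translate("AUGUCUAGCAAAUAA"))  # ['Met', 'Ser', 'Ser', 'Lys']
```

A codon whose matching tRNA is missing from the toy pool raises a `KeyError`; a real cell carries tRNAs for every codon that specifies an amino acid.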

Impact and Applications

The elucidation of the process and control of protein synthesis provides a ready means by which scientists can manipulate these processes in cells. In addition to infectious diseases, insufficient dietary protein represents one of the greatest challenges to world health. The majority of people now living are limited to obtaining their dietary protein solely through the consumption of plant matter. Knowledge of the process of protein synthesis may allow molecular biologists to increase the amount of protein in important crop species. Moreover, most plants contain an imbalance in the amino acids needed in the human diet that can lead to disease. For example, protein from corn is poor in the amino acid lysine, whereas the protein from soybeans is poor in methionine and cysteine. Molecular biologists may be able to correct this imbalance by changing the codons present in plant genes, thus improving this source of protein for those people who rely on it for life.

Key terms

  • amino acid: the basic subunit of a protein; there are twenty commonly occurring amino acids, any of which may join together by chemical bonds to form a complex protein molecule
  • peptide bond: the chemical bond between amino acids in a protein
  • polypeptide: a linear molecule composed of amino acids joined together by peptide bonds; all proteins are functional polypeptides
  • ribonucleic acid (RNA): a nucleic acid (chain of nucleotides) that serves various functions with respect to DNA, including protein synthesis and gene expression and regulation
  • translation: the process of forming proteins according to instructions contained in an RNA molecule
