Proteomics and genetics

SIGNIFICANCE: The study of proteomics and its relationship to genomics focuses on the vast family of gene-regulating proteins. These polypeptides and their functions affect the expression of various genetically related diseases, such as Alzheimer’s and cancer. By focusing on the interrelated groups of regulator functions, geneticists are learning the connections between structure, abundance within the cell, and how each protein relates to expression.

What Is Proteomics?

Historically, much of the focus in genetic research was on genes and the completion of the Human Genome Project. The focus has since shifted to a related topic, the proteome. Proteins are known to perform most of the important functions of cells. Proteomics is, therefore, essentially the study of the proteins in an organism and, most important, their function. There are many aspects to the understanding of protein function, including where a particular protein is located in the cell, what modifications occur during its activity, what ligands may bind to it, and its activity. Researchers are seeking to identify all the proteins made in a given cell, tissue, or organism and determine how those proteins interact with metabolites, with themselves, and with nucleic acids. By studying proteomics, scientists hope to uncover underlying causes of disease at the cellular level, invent better methods of diagnosis, and discover new, more efficient medicines for the treatment of disease.

Proteomics has moved to the forefront of molecular research, especially in the area of drug research. Neither the structure nor the function of a protein can be predicted from the DNA sequence alone. Although genes code for proteins, there is a large difference between the number of messenger RNA (mRNA) molecules transcribed from DNA and the number of proteins in a cell. In addition, some two hundred known modifications occur during the stages between transcription and post-translation, including phosphorylation, glycosylation, proteolytic processing, deamidation, sulfation, and nitration. Other factors that affect the expression of proteins include aging, stress, environmental forces, and medications. In addition, changes to the sequence of amino acids may occur during or after translation.
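The effect of such modifications on the mass that an instrument would measure can be illustrated with a short Python sketch. It is an illustration only, not a description of any particular laboratory's software: the mass shifts are commonly tabulated monoisotopic values for one typical form of each modification, and the function name modified_mass and the 1000-dalton starting mass are hypothetical.

```python
# Approximate monoisotopic mass shifts, in daltons, for one common form of
# each modification (assumed values, rounded to five decimal places).
PTM_MASS_SHIFT = {
    "phosphorylation": 79.96633,  # addition of HPO3
    "sulfation": 79.95682,        # addition of SO3
    "deamidation": 0.98402,       # amide group converted to an acid
    "nitration": 44.98508,        # a hydrogen replaced by NO2
}


def modified_mass(unmodified_mass, modifications):
    """Return the mass after applying every named modification once."""
    return unmodified_mass + sum(PTM_MASS_SHIFT[name] for name in modifications)


if __name__ == "__main__":
    base = 1000.0  # hypothetical peptide mass in daltons
    print(modified_mass(base, ["phosphorylation"]))
    print(modified_mass(base, ["phosphorylation", "deamidation"]))
```

Because phosphorylation and sulfation differ by less than 0.01 dalton, distinguishing them by mass alone would require a high-resolution instrument, which is one reason modified proteins are hard to characterize.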

Methods of Proteomic Research

To study the functions of a protein, it must be separated from other proteins or contaminants, purified, and structurally characterized. These are the major tasks facing researchers in the field.

In order to obtain a sufficient quantity of a particular protein for study, the coding DNA can be introduced into Escherichia coli bacteria, and the cells will then produce the protein many times over. Alternatively, the protein must be extracted from biological tissues. The desired protein must then be separated from cells or tissues that may contain thousands of unique proteins. This can be accomplished by homogenizing the tissue, extracting the proteins with solvents or by centrifugation, and further purifying the protein by various means, including high-pressure liquid chromatography (HPLC, separation by solubility differences) and two-dimensional (2-D) gel electrophoresis (separation of molecules by charge and molecular mass). A relatively recent development in laboratory technique is three-dimensional (3-D) gel electrophoresis, which allows for further separation and identification of proteins.

Structural characterization begins with establishing the order of linked amino acids in the protein. This can be accomplished by the classical technique of using proteases to cleave the protein into fragments and then analyzing the fragments by separation and spectroscopic analysis. The molecular mass of small polypeptides can be investigated by employing several techniques involving mass spectrometry (MS). Sequentially coupled mass spectrometers (the “tandem” MS/MS techniques) are being used to analyze the sequence and molecular masses of isolated larger polypeptides. These MS/MS analyses are sometimes coupled to a separation method, such as HPLC, to analyze mixtures of polypeptides.
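The fragment-and-weigh approach can be sketched in Python. The example below is a simplified illustration under stated assumptions: it models a single protease, trypsin, using the common rule that it cleaves after lysine (K) or arginine (R) unless the next residue is proline (P), and it uses standard monoisotopic residue masses rounded to five decimal places. The function names and the test sequence are hypothetical.

```python
# Monoisotopic masses of amino acid residues, in daltons (rounded).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water molecule completes the two ends of a peptide


def tryptic_digest(sequence):
    """Split a sequence after K or R, except when the next residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # trailing piece with no K or R
    return peptides


def peptide_mass(peptide):
    """Return the monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


if __name__ == "__main__":
    for fragment in tryptic_digest("AKRPGKC"):  # hypothetical sequence
        print(fragment, round(peptide_mass(fragment), 4))
```

Real digests are less tidy than this sketch suggests, since proteases occasionally miss cleavage sites and modified residues change the fragment masses.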

Historically, Linus Pauling used analytical data from X-ray diffraction (or crystallography) to determine the three-dimensional, helical structure of proteins. The method is still being used to investigate the structures of proteins and ligand-protein complexes. Such studies may lead to major improvements in the design of medicinal drugs. One serious drawback to analyzing protein structure by X-ray diffraction, however, is that the method requires a relatively large quantity (approximately 1 milligram) of the protein. Transmission electron microscopy (TEM), which uses electron beams to produce images and diffraction patterns from extremely small samples or regions of a sample, is therefore often preferable. The TEM method may involve auxiliary techniques to analyze data, including enhancement of images by means of computer software.

Although such methods provide valuable information in analyzing the structure of proteins, they suffer from the loss of spatial information that occurs when tissues are homogenized, when the protein is obtained from a manufactured, bacterial environment, or when it is otherwise isolated. Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry is a complementary method of analysis that does not yield structural information but provides protein profiles from intact tissue, allowing comparison of diseased versus normal tissue.

Large databases of mass spectroscopic data are being assembled to assist in future identification of known proteins. Further databases of proteome information include particular molecular masses, charges, and, in some cases, connections to the genes regulated or the parent gene of the peptide in question. Scientists hope to relate regulators and the complex web of peripheral proteins that affect the function of each gene.
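One way such databases can be used is to compare a set of measured peptide masses with the masses expected for each known protein and rank the candidates by the number of agreements. The Python sketch below illustrates the idea only. The two database entries, their masses, the measured values, and the 0.01-dalton tolerance are all hypothetical, and real search tools use far more elaborate scoring.

```python
def count_matches(observed, theoretical, tolerance=0.01):
    """Count observed masses lying within the tolerance of any expected mass."""
    return sum(
        any(abs(mass - expected) <= tolerance for expected in theoretical)
        for mass in observed
    )


def rank_candidates(observed, database, tolerance=0.01):
    """Return (protein, match count) pairs, best match first."""
    scores = {
        name: count_matches(observed, masses, tolerance)
        for name, masses in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical reference entries: protein name -> expected peptide masses.
TOY_DATABASE = {
    "protein_A": [217.1426, 500.2500, 842.5100],
    "protein_B": [300.1000, 500.2500, 999.9000],
}

if __name__ == "__main__":
    measured = [217.143, 842.509, 500.251]  # hypothetical measurements
    for name, score in rank_candidates(measured, TOY_DATABASE):
        print(name, score)
```

In this toy case all three measured masses agree with protein_A and only one agrees with protein_B, so protein_A ranks first.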

Challenges and Limitations of Current Methods

The sheer volume of data generated by proteomics research makes it difficult to organize and process the information obtained on proteins. The Human Proteome Organization (HUPO) and the European Bioinformatics Institute (EBI) are two organizations whose purposes include the management and organization of proteomics information and databases, and the facilitation of the advancement of this scientific endeavor.

Analyzing MS data from proteins and relating the complex array of proteins within a single cell to the linear genetic material of DNA present challenges to researchers that they are tackling through computer algorithms, programs, and databases. The SWISS-PROT database, for example, is an annotated protein-sequence database maintained by the Swiss Institute of Bioinformatics.

Other obstacles to relating proteins to parent genes include the loss of quaternary structure during separation and the presence of post-translational processing, which can alter the amino acid sequence to the extent that it becomes almost unrecognizable from the parent gene. A lack of protein amplification methods (techniques that would produce more copies of a protein to aid in study) requires sensitive analysis methods and increasingly strong detectors.

Disease

Proteins often act as markers for disease. As researchers study proteins, they have found that disease may be characterized by some proteins that are being overproduced, not being produced at all, or being produced at inappropriate times. As the correlation of proteins to disease becomes clearer, better diagnostic tests and drugs are being explored. For example, Alzheimer’s disease and Down syndrome are associated with a common protein fragment as the major extracellular protein component of senile plaques.

Researchers are investigating changes in protein expression in heart disease and heart failure, and several hundred cardiac proteins have already been identified. The study of proteomics in immunological diseases has revealed that there is a connection between the human neutrophil α-defensins (HNPs) and human immunodeficiency virus, HIV-1. HNPs are small, cysteine-rich, cationic antimicrobial proteins that are stored in the azurophilic granules of neutrophils and released during phagocytosis to kill ingested foreign microbes.

Similarly, cancer is being studied to find a roster of proteins that are present in cancerous cells but not in normal cells. The Clinical Proteomics Program, a joint effort of the National Cancer Institute (NCI) and the Food and Drug Administration, has searched for the differences between cancerous and normal cells, and also for protein “markers.” In 2011, NCI started the Clinical Proteomic Tumor Analysis Consortium, which sought to uncover cancers' molecular bases through proteomic research and technology.
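The search for proteins that are overproduced or missing in diseased tissue can be illustrated with a simple fold-change comparison. The Python sketch below is a minimal illustration, not the method of any program named above. The protein names and abundance values are hypothetical, and the pseudocount of 1 and the twofold threshold are assumed conventions chosen for the example.

```python
import math


def log2_fold_change(diseased, normal, pseudocount=1.0):
    """Return the log2 ratio of diseased to normal abundance.

    The pseudocount keeps the ratio defined when a protein is absent.
    """
    return math.log2((diseased + pseudocount) / (normal + pseudocount))


def candidate_markers(abundance, threshold=1.0):
    """Keep proteins whose abundance changes at least twofold either way.

    abundance maps a protein name to a (diseased, normal) pair.
    """
    candidates = {}
    for protein, (diseased, normal) in abundance.items():
        change = log2_fold_change(diseased, normal)
        if abs(change) >= threshold:
            candidates[protein] = change
    return candidates


if __name__ == "__main__":
    # Hypothetical abundances in arbitrary units: (diseased, normal).
    measurements = {"P1": (7, 1), "P2": (10, 9), "P3": (0, 3)}
    print(candidate_markers(measurements))
```

Here P1 is flagged as overproduced and P3 as lost in the diseased sample, while P2 barely changes. A real study would also need replicate measurements and statistical testing before calling any protein a marker.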

Possible Future Directions

Three fields related to proteomics have taken the lead in research: glycomics, metabolomics, and metabonomics. Glycomics addresses the importance of the sugar coatings of proteins and cells. This area of study has arisen because of the many roles of sugar coatings in important cellular functions, including immunological recognition sites, barriers, and sites for attack by pathogens. Metabolomics is the study of the small molecules (metabolites) left behind as the cell performs its processes. This field primarily looks at the small molecules produced as intermediates and by-products of metabolism. Metabonomics is often used interchangeably with metabolomics but differs in that it examines how metabolite profiles change when the cell responds to stresses, such as disease.

Key terms

  • chromatography: a separation technique involving a mobile solvent and a stationary, adsorbent phase
  • mass spectroscopy: a method of analyzing molecular structure in which sample molecules are ionized and the resulting fragmented particles are passed through electric and magnetic fields to a detector
  • peripheral proteins: proteins of the chromosome that do not directly affect transcription
  • protein folding structure: the three-dimensional structure of proteins created by the folding of linked amino acids upon each other; this structure is held together by intermolecular forces, such as hydrogen bonds and ionic attractions
  • protein marker: a sequence of DNA that chemically attracts a particular regulatory protein sequence or structure
  • regulators: proteins that control the transcription of a gene
  • senile plaques: protein sections that are no longer functional and clutter the intercellular space of the brain, disrupting proper processes
  • transcription: the process by which mRNA is formed using DNA as a template
  • translation: the process of building a protein by bonding amino acids according to the mRNA marker present
