DNA sequencing technology
DNA sequencing technology refers to the methods used to determine the precise order of the four nucleotides (adenine, cytosine, guanine, and thymine) in a strand of DNA. This technology has pivotal applications in biological research, medical diagnostics, forensics, and medical innovation. Since the discovery of DNA in the late 19th century, understanding of its structure and function has evolved significantly, especially with the advent of rapid sequencing methods in the 1970s that enabled researchers to analyze and compare genetic information effectively.
Modern sequencing techniques, including Sanger sequencing and Next-Generation Sequencing (NGS), allow for the rapid sequencing of entire genomes, significantly advancing the field of genomics. The Human Genome Project, completed in 2003, exemplifies the impact of sequencing technology by mapping the human genetic code, which has led to insights into hereditary diseases and the development of targeted therapies. Automated sequencing methods have further enhanced efficiency, enabling large-scale sequencing projects with minimal manual intervention.
As technologies improve, future advancements in DNA sequencing aim to increase accuracy and reduce costs, with potential applications ranging from personalized medicine to agricultural optimization and environmental monitoring. Overall, DNA sequencing technology plays a crucial role in advancing our understanding of genetics and its implications across various domains.
SIGNIFICANCE: The genetic code is contained in the ordered, linear arrangement of the four nucleotides attached to the sugar-phosphate backbone of a strand of DNA: adenine, cytosine, guanine, and thymine. DNA sequencing is the determination of this ordered arrangement, and is used in basic biological research, as well as diagnostic applications, forensic investigations, and medical innovations.
The Need for Sequencing
DNA was first discovered in 1869 as a viscous material in pus, and its basic chemical composition was well established by the 1930s. By 1950, the role of DNA as the hereditary material was clearly defined. In the 1950s, the classic papers by James Watson and Francis Crick and by Matthew Meselson and Frank Stahl gave scientists a clear picture of the structure and function of DNA. In 1961, Crick demonstrated that the genetic code consisted of sets of three nucleotides in sequence (triplet codons) that identified specific amino acids. However, there was no system to read the sequence and uncover the actual words that spelled out the code of life.
![DNA sequencing gels. By John Crawford (Photographer) [Public domain], via Wikimedia Commons](https://imageserver.ebscohost.com/img/embimages/ers/sp/embedded/94416453-89182.jpg?ephost1=dGJyMNHX8kSepq84xNvgOLCmsE2epq5Srqa4SK6WxWXS)
![DNA sequencing. Two male scientists wearing lab coats in a laboratory looking at a highlighted light board, reading the genetic code in the DNA. By Linda Bartlett (Photographer) [Public domain], via Wikimedia Commons](https://imageserver.ebscohost.com/img/embimages/ers/sp/embedded/94416453-89183.jpg?ephost1=dGJyMNHX8kSepq84xNvgOLCmsE2epq5Srqa4SK6WxWXS)
The discovery of rapid sequencing methods in the 1970s created a flood of new discoveries in biology. The coding regions and control elements of DNA could be identified and compared. The sequence changes in different alleles of the same gene could be evaluated, genes could be examined in divergent species, and evolutionary changes could be studied. Today, an entire genome can be sequenced, identifying every nucleotide in the correct order along every chromosome, in a matter of months. This ability to sequence the genomes of entire organisms has created a new field called genomics, the study and comparison of whole genomes of different organisms. Sequencing is now at the core of many of the new discoveries in biology. Modern DNA sequencing technology has allowed for the sequencing of the entire human genome, as well as many plant, animal, and microbial genomes.
The Human Genome Project, completed in 2003, isolated and identified the 3.2 billion base pairs (bp) comprising the entire set of genetic information contained in human DNA. The sequence has led to the discovery of genes associated with specific diseases, the isolation of DNA responsible for regulating cellular functions, and the development of gene-targeted drug therapies.
Principles of DNA Sequencing
Molecular biologists cannot observe DNA molecules directly, even through a microscope, so they must devise controlled chemical reactions whose outcomes are indicative of what occurs at the submicroscopic level. In DNA sequencing, the key is to use a chemical method that allows for the analysis of the base sequence one base at a time. Such a method needs to produce a collection of DNA fragments whose lengths can be used to detect the identity of the base located at the end of each different-sized fragment. For example, if fragments of the short DNA sequence ACGTCCGATCG can be predictably produced, then the size of each fragment could be used to determine the location of each base. If the fragment is cut to the right of each thymine base, fragments of 4 and 9 bp will be produced. Repeating the process for the other three nucleotides can identify their positions. The DNA sequence is obtained by reading from smallest to largest fragment and identifying which reaction generates each fragment. Although this is a very simple example, the principles apply to all current sequencing methods. Electrophoresis in denaturing polyacrylamide gels (to keep the DNA single-stranded) is used to separate fragments that are hundreds of base pairs in length but differ by only a single nucleotide. The DNA is labeled with either radioactive or fluorescent markers so that the bands of DNA fragments can be detected.
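The fragment-length principle can be illustrated with a short Python sketch. It is not a model of any real chemistry: it simply assumes an idealized reaction that cuts immediately to the right of every occurrence of one base and that only fragments carrying the labeled left-hand end are detected. The function names are hypothetical and chosen for illustration.

```python
def end_labeled_fragment_lengths(sequence, base):
    """Lengths of the labeled fragments made by cutting just right of each `base`."""
    return [i + 1 for i, b in enumerate(sequence) if b == base]


def read_sequence(lengths_by_base):
    """Rebuild the sequence by reading fragments from smallest to largest."""
    bands = sorted(
        (length, base)
        for base, lengths in lengths_by_base.items()
        for length in lengths
    )
    # Each fragment length occurs in exactly one reaction, so the reaction
    # that produced a fragment names the base at that position.
    return "".join(base for _, base in bands)


sequence = "ACGTCCGATCG"
reactions = {b: end_labeled_fragment_lengths(sequence, b) for b in "ACGT"}
print(reactions["T"])            # [4, 9]
print(read_sequence(reactions))  # ACGTCCGATCG
```

The thymine reaction yields the fragments of 4 and 9 described above, and sorting the fragments from all four reactions by size recovers the original order of bases.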
Maxam-Gilbert Sequencing
Maxam-Gilbert sequencing, also known as chemical sequencing, was developed in the mid-1970s and is based on the chemical modification and cleavage of a strand of DNA at specific bases. To sequence DNA with this method, the DNA fragment to be sequenced is isolated and the 5′ end of one of the strands is labeled with a radioactive phosphorus-32 atom in the terminal phosphate group. This label marks a fixed end from which the length of every cleavage fragment is measured. In separate tubes, the DNA is reacted with chemicals that cleave the backbone of the DNA strand at specific nucleotides. The method requires dangerous chemicals and does not easily lend itself to automation, so it is rarely used today.
Sanger Sequencing
Sanger sequencing, or chain-terminator sequencing, is named for its developer, Frederick Sanger. This method requires a short DNA segment of known sequence adjacent to the unknown region to be sequenced so that a short synthetic oligonucleotide can be made. The oligonucleotide acts as a primer for DNA synthesis in the direction of the DNA to be sequenced. The DNA to be sequenced is often cloned into a vector whose sequence is known, facilitating primer synthesis. The DNA is denatured and the primer is allowed to anneal to the DNA strand. A DNA polymerase is added to the reaction mixture, extending the DNA for a short distance in the presence of radioactive nucleotides, which labels the new DNA. The reaction is then divided into four equal parts and placed into four separate reaction tubes, each containing all four deoxynucleotides, the nucleotide precursors for DNA synthesis: deoxyadenosine triphosphate (dATP), deoxyguanosine triphosphate (dGTP), deoxycytidine triphosphate (dCTP), and deoxythymidine triphosphate (dTTP). One modified dideoxynucleotide (ddATP, ddGTP, ddCTP, or ddTTP) is also added to each reaction. The dideoxynucleotides are missing the 3′ hydroxyl group; without the hydroxyl group, no more nucleotides can be added and DNA elongation terminates. Since the dideoxynucleotide constitutes only a small percentage of the available nucleotides, DNA elongation will proceed normally until the DNA polymerase inserts a dideoxynucleotide in place of the normal deoxynucleotide.
Because each terminated fragment remains base-paired to the longer template strand, the DNA must be denatured by heat before electrophoresis on a polyacrylamide gel so that its migration corresponds accurately to the position of the terminated base. Each of the four reactions is run in a separate lane on the gel, and the DNA bands are visualized by autoradiography or UV light. The DNA sequence can then be read directly from the image, reading from the bottom of the gel to the top, that is, from the smallest fragment to the largest.
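As a rough illustration of the four-lane readout, the following Python sketch enumerates every fragment that chain termination could produce and then reads the bands from the smallest fragment to the largest. It is a simplification under stated assumptions: termination is treated as occurring once at every possible position rather than at random, the template and primer sequences are invented for the example, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Strand that base-pairs with `seq`, written 5' to 3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def sanger_lanes(template, primer):
    """Return {dideoxynucleotide base: fragment lengths} for the four reactions."""
    new_strand = reverse_complement(template)  # what the polymerase synthesizes
    if not new_strand.startswith(primer):
        raise ValueError("primer does not anneal to the 3' end of the template")
    lanes = {b: [] for b in "ACGT"}
    # A fragment ending at position `pos` appears in the lane whose
    # dideoxynucleotide matches the base incorporated at that position.
    for pos in range(len(primer), len(new_strand)):
        lanes[new_strand[pos]].append(pos + 1)  # length includes the primer
    return lanes


def read_gel(lanes):
    """Read bands across all four lanes from smallest to largest fragment."""
    bands = sorted((n, base) for base, sizes in lanes.items() for n in sizes)
    return "".join(base for _, base in bands)


template = "TTGACCGTAGGCATCC"  # invented example; known region at the 3' end
primer = "GGATGC"              # complementary to that known region
lanes = sanger_lanes(template, primer)
print(lanes["G"])       # [11, 12]
print(read_gel(lanes))  # CTACGGTCAA
```

The sequence read from the gel is that of the newly synthesized strand, which is complementary to the template; taking its reverse complement gives the unknown region of the template itself.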
Automated Sequencing
Automated sequencing methods are based on variations of chain-terminator methods. In dye-terminator sequencing, for example, each of the four dideoxynucleotides has a different fluorescent dye attached. When the DNA elongation is terminated, the fragment will be labeled with a specific color indicating which nucleotide is in the terminal position. As a result, only one reaction is needed instead of four separate reactions. Additionally, polymerase chain reaction (PCR) is often used in automated sequencing reactions, since it requires much smaller amounts of DNA than original sequencing methods and does not present a risk of sample contamination by cloning vectors. Modern automated sequencing also uses capillary electrophoresis, rather than gel electrophoresis. In this case, the reaction products are electrophoresed through a narrow capillary of polyacrylamide gel with a laser and fluorescence detector at the bottom. As the different-sized fragments reach the bottom, they pass the detector that registers the colors. The data are logged on a computer, which outputs the DNA sequence. This system can be automated so that robots move the samples into reaction tubes and load them into the capillaries. Computers compile and compare the sequence data. Automated sequencing methods can generate tens of thousands of bp of new sequence data per day, often with very little manpower.
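The single-reaction readout of dye-terminator sequencing can be sketched in a few lines of Python. This is only an illustration: the dye colors assigned to each base, the arrival times, and the function name are assumptions made for the example, and real instruments perform far more signal processing than a simple sort.

```python
# Hypothetical dye-to-base assignment; actual dye sets vary by instrument.
DYE_TO_BASE = {"green": "A", "blue": "C", "yellow": "G", "red": "T"}


def call_bases(detections):
    """Turn (arrival_time_seconds, dye_color) detector events into a sequence."""
    # Shorter fragments travel through the capillary faster, so ordering the
    # events by arrival time orders the fragments by length.
    ordered = sorted(detections)
    return "".join(DYE_TO_BASE[color] for _, color in ordered)


detections = [
    (12.4, "blue"),
    (10.1, "green"),
    (14.9, "yellow"),
    (17.2, "red"),
    (19.8, "blue"),
]
print(call_bases(detections))  # ACGTC
```

Because the color of each fragment identifies its terminal nucleotide, one lane of data is enough; the four separate reactions of the manual method are no longer needed.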
Impact
New technologies have led to an increased volume of sequencing throughout the scientific community by simplifying sample preparation and increasing the accessibility of sequencing chemistries and equipment. Emerging DNA sequencing strategies focus on larger-scale sequencing. Further, the increased efficiency of high-throughput sequencing technologies lowers the cost of sequencing relative to traditional methods. For example, Next-Generation Sequencing (NGS), which emerged in the mid-2000s, is a type of DNA sequencing that sequences millions of small fragments of DNA in parallel. With NGS, some labs were capable of sequencing more than 100,000 billion bases per year at a substantially reduced cost.
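To put that throughput in perspective, a back-of-the-envelope calculation can compare it with the 3.2 billion base pairs of the human genome cited earlier. The sketch below is only an order-of-magnitude illustration: it counts raw bases and ignores the many-fold redundant coverage that real genome projects require.

```python
BASES_PER_YEAR = 100_000 * 10**9  # "100,000 billion bases per year"
HUMAN_GENOME_BP = 32 * 10**8      # 3.2 billion base pairs

# Raw output expressed as single-pass human-genome equivalents.
genome_equivalents = BASES_PER_YEAR // HUMAN_GENOME_BP
print(genome_equivalents)  # 31250
```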
Additional DNA sequencing technologies include in vitro cloning to amplify DNA molecules; parallel sequencing, in which DNA is bound to a solid surface and many samples are sequenced simultaneously; and sequencing by ligation, which uses the enzyme DNA ligase to identify nucleotides in a strand of DNA.
The goal of future DNA sequencing is to expand the scale of sequencing—possibly to entire chromosomes or large genomes all at once—and to enhance the precision and decrease the error rate of sequencing reactions. Together, new technologies will have far-reaching applications in the diagnosis and treatment of disease, the development of new biofuels, the protection against chemical and biological warfare agents, the study of anthropology and evolution, the determination of personalized genomes, and the optimization of agriculture, livestock breeding, and bioprocessing of food products.
Key Terms
- automated fluorescent sequencing: a modification of chain-termination sequencing that uses fluorescent markers to identify the terminal nucleotides, allowing the automation of sequencing in which robots can carry out large-scale projects
- base pair (bp): two nucleotides on opposite strands of DNA that are linked by hydrogen bonds; in DNA, adenine always pairs with thymine and guanine always pairs with cytosine; often used as a measure of the size of a DNA fragment or the distance along a DNA molecule between markers; both the singular and plural are abbreviated bp
- Maxam-Gilbert sequencing: a method of base-specific chemical degradation to determine DNA sequence
- primer: a short piece of single-stranded DNA that can hybridize to denatured DNA and provide a start point for extension of DNA by a DNA polymerase
- Sanger sequencing: also known as chain-terminator sequencing, a method using nucleotides that are missing the 3′ hydroxyl group in order to terminate the polymerization of new DNA at a specific nucleotide
Bibliography
Garcia-Sancho, Miguel. Biology, Computing, and the History of Molecular Sequencing. New York: Palgrave, 2012. Print.
Lister, R., B. D. Gregory, and J. R. Ecker. “Next Is Now: New Technologies for Sequencing of Genomes, Transcriptomes, and Beyond.” Current Opinion in Plant Biology 12.2 (2009): 107–18. Print.
Mardis, E. R. “Next-Generation DNA Sequencing Methods.” Annual Review of Genomics and Human Genetics 9 (2008): 387–402. Print.
Maxam, Allan M., and Walter Gilbert. “A New Method for Sequencing DNA.” Proceedings of the National Academy of Sciences 74 (1977): 560. Print.
Reilly, Philip R. Abraham Lincoln’s DNA and Other Adventures in Genetics. Cold Spring Harbor: Cold Spring Harbor Laboratory P, 2000. Print.
Sanger, F., S. Nicklen, and A. R. Coulson. “DNA Sequencing with Chain-Terminating Inhibitors.” Proceedings of the National Academy of Sciences 74 (1977): 5463. Print.
Satam, Heena, et al. “Next-Generation Sequencing Technology: Current Trends and Advancements.” Biology, vol. 12, no. 7, 13 July 2023, doi: 10.3390/biology12070997. Accessed 5 Sept. 2024.
Smith, Lloyd M., et al. “Fluorescence Detection in Automated DNA Sequence Analysis.” Nature 321 (1986): 674. Print.
Trent, R. J. Molecular Medicine: Genomics to Personalized Healthcare. 4th ed. London: Academic, 2012. Print.
Wink, Michael, ed. An Introduction to Molecular Biotechnology. 2nd ed. Weinheim: Wiley, 2011. Print.