Human Genome Project

SIGNIFICANCE: The Human Genome Project will have a profound effect in the twenty-first century, providing the means to identify disease-causing mutations (including those involved in cancer), to design new drugs, to provide human gene therapy, to learn how genes control development, and to understand the origins and evolution of the human race.

Perspective

April 25, 1953, marked the publication of the double-helix model of DNA by James Watson and Francis Crick, based on the experimental data of Rosalind Franklin and others. It was fitting, then, that fifty years later, in April 2003, the completion of the human genome sequence was announced, one of the greatest achievements not only in genetics but in all of science. In the years since, thousands of scientists have mined these data for information about the human body, how its genes shape development and behavior, and the role mutations play in disease.

Origins of the Human Genome Project

The origins of the Human Genome Project (HGP) can be traced to the catastrophic events that ended World War II: the dropping of atomic bombs on the Japanese cities of Hiroshima and Nagasaki. Many survivors had been exposed to high levels of radiation, which is known to cause mutations. Such survivors were stigmatized by society and were considered poor marriage prospects because of potential genetic damage. The US Atomic Energy Commission, a predecessor of the US Department of Energy (DOE), established the Atomic Bomb Casualty Commission in 1947 to assess mutations in such survivors. However, no suitable methods existed to measure these mutations, and it would be many years before such techniques were developed. A known sequence of the human genome would be the most powerful tool for identifying human mutations.

Advances in Molecular Biology

As in all areas of science, progress in molecular biology was limited by available technology, and a series of advances made the HGP feasible. Starting in the 1970s, techniques were developed to isolate and clone individual genes. By 1977, Walter Gilbert and Frederick Sanger had independently developed methods for sequencing DNA, and that same year Sanger’s group published the sequence of the first genome, that of the small bacterial virus Phi X174. In 1985, Kary B. Mullis and colleagues developed the polymerase chain reaction (PCR), in which extremely small amounts of DNA can be amplified billions of times, providing significant amounts of specific DNA for analysis. Finally, in 1986, Leroy Hood and Applied Biosystems developed an automated DNA sequencer that could sequence DNA hundreds of times faster than was previously possible. Together with advances in computer technology, these developments made it possible to sequence the human genome.
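
The scale of PCR amplification follows from simple doubling: in an idealized reaction, each cycle copies every DNA molecule present, so the number of copies grows as a power of two. The short Python sketch below illustrates that arithmetic. The function names and the `efficiency` parameter are illustrative assumptions, not part of any laboratory protocol, and real reactions fall short of perfect doubling and eventually plateau.

```python
import math


def pcr_copies(initial_molecules: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of DNA copies after a given number of PCR cycles.

    Each cycle multiplies the copy count by (1 + efficiency); an efficiency
    of 1.0 means perfect doubling. This is a simplified model (assumption).
    """
    if initial_molecules < 0 or cycles < 0:
        raise ValueError("initial_molecules and cycles must be non-negative")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return initial_molecules * (1.0 + efficiency) ** cycles


def cycles_needed(fold_amplification: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles giving at least the requested fold increase."""
    if fold_amplification < 1.0:
        raise ValueError("fold_amplification must be at least 1")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return math.ceil(math.log(fold_amplification) / math.log(1.0 + efficiency))


if __name__ == "__main__":
    # Under perfect doubling, about thirty cycles turn one molecule into
    # roughly a billion copies.
    print(cycles_needed(1e9))
    print(pcr_copies(1, 30))
```

Under this idealized model, a billionfold amplification of a single starting molecule takes about thirty cycles.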

The “Holy Grail” of Molecular Biology

In 1985, a conference of leading scientists was held at the University of California, Santa Cruz, to discuss the feasibility of sequencing the entire human genome. Biologists were looking for the equivalent of a Manhattan Project for biology. The Manhattan Project was the concerted effort of physicists to develop atomic weapons during World War II and resulted in huge increases in government funding for physics research. Walter Gilbert called the HGP the Holy Grail of molecular biology. With impetus from the DOE and the National Research Council, the Human Genome Project was launched in 1990 with James Watson as head. The goal of the project was to completely sequence the three billion base pairs of the human genome by 2005 at a cost of $1.00 per base pair, or about $3 billion in total. In 1992, Watson resigned over a controversy surrounding the patenting of human sequences. Francis Collins took over as head of the HGP at what became the National Human Genome Research Institute (NHGRI) of the National Institutes of Health (NIH). Another goal of the NHGRI was to sequence the genomes of genetic model organisms in addition to the human genome. These included the bacterium Escherichia coli, yeast, the fruit fly Drosophila melanogaster, the roundworm Caenorhabditis elegans, and other organisms. Moreover, a share of the funding (3 to 5 percent) was to be directed toward studies of the ethical, legal, and social implications of learning the sequence of the human genome.

Competition Between the Public and Private Sectors

J. Craig Venter, a researcher at the National Institutes of Health, left the NIH and founded a private nonprofit research institute, The Institute for Genomic Research (TIGR). Using a different approach, known as the shotgun method, TIGR was able to sequence the 1.8 million-base-pair genome of the bacterium Haemophilus influenzae, the first free-living organism to be sequenced, in less than a year. In 1998, Venter, along with Perkin-Elmer Corporation, formed the biotech company Celera Genomics to sequence the human genome privately. Celera had more than three hundred of the world’s fastest automated sequencers and a supercomputer to analyze the data. Meanwhile, public funds supported scientists in the United States, the United Kingdom, Japan, Canada, Sweden, and fourteen other countries working on HGP sequencing. The public sector was now in competition with Celera. To ensure free access, new sequence data from the public projects were made available on the Internet each day.
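
The idea behind the shotgun method is to break the genome into many overlapping fragments, sequence each fragment separately, and then let a computer reassemble the whole from the overlaps. The following Python sketch is a toy illustration of that reassembly step under strong simplifying assumptions: the fragments are error-free, come from one strand, and overlap unambiguously. The function names and the greedy merging strategy are illustrative only and are not the algorithm used by TIGR or Celera, whose assemblers had to cope with sequencing errors, repeated sequences, and both DNA strands.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the longest such overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the largest overlap.

    Returns the remaining contiguous sequences ("contigs"). A toy greedy
    strategy (assumption), not a production assembler.
    """
    # Remove exact duplicates and fragments fully contained in another one.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # no remaining overlaps; leave the contigs separate
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


if __name__ == "__main__":
    reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]  # made-up example fragments
    print(greedy_assemble(reads))
```

Given three made-up fragments that overlap by four bases each, the sketch rebuilds a single fourteen-base sequence regardless of the order in which the fragments are supplied.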

The Human Genome Project Is Completed

In 2001, the first draft of the human genome sequence was published in the February 15 issue of Nature and the February 16 issue of Science. Because the genome contains many short, repeated sequences of DNA, certain regions were difficult to sequence and had to be sequenced again for accuracy, and the entire sequence had to be proofread for errors. In April 2003, the final sequence of the human genome was completed. Remarkably for a government-funded project, it was finished two and a half years ahead of schedule and under budget, thanks to continual improvements in the speed and accuracy of DNA sequencing technology. April 25, 2003, was designated National DNA Day, which has remained an annual occasion to educate the public, especially school-age children, about DNA and genetics.

Findings from the Human Genome Project

Perhaps the most surprising finding from the HGP is the relatively small number of genes in the human genome. Scientists had predicted that the human genome would contain about one hundred thousand functional genes, yet the actual number of protein-coding genes is approximately twenty-five thousand, and their coding sequences represent only about 1 percent of the entire genome. In comparison, yeast has about six thousand genes, the fruit fly about thirteen thousand, and the roundworm Caenorhabditis elegans about eighteen thousand. It was surprising that an organism as complex as a human has fewer than twice as many genes as the roundworm. The human genome also contains 740 genes that encode stable RNAs. The genome of the mouse, another model genetic organism, has provided interesting comparisons to the human genome.

Whose Genome Is It?

Although about 99.9 percent of the DNA sequence is identical among all humans, the remaining 0.1 percent amounts to approximately 3 million base pairs that differ between individuals in a genome of 3 billion base pairs. One important question, then, is whose genome was sequenced. Venter has acknowledged that Celera sequenced mostly his own DNA. However, the final sequence database is an “average” or “consensus” genome, a composite of the many individuals who contributed to the total sequence. Every human carries many DNA variants, some of them potentially harmful. Even before the HGP was completed, databases listing single nucleotide polymorphisms were being established. These databases list the types of genetic variations that occur at individual nucleotides in the genome. For example, a cancer gene database lists the types of mutations that have been identified in specific cancer-causing genes and the frequency of such mutations. Mutations in genes such as BRCA1 and BRCA2 are associated with hereditary breast and ovarian cancers, while mutations in the tumor suppressor gene p53 have been found in about half of all human tumors.
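
The relationship between a percentage difference and a count of differing base pairs, and the notion of a single nucleotide polymorphism itself, can be made concrete with a short Python sketch. Both functions and the example sequences are hypothetical illustrations: real SNP databases are built from carefully aligned sequences of many individuals, whereas this sketch simply compares two already aligned strings of equal length.

```python
def differing_base_pairs(genome_size: int, percent_different: float) -> int:
    """Number of base pairs that differ, given a percentage difference."""
    if genome_size < 0 or not 0.0 <= percent_different <= 100.0:
        raise ValueError("invalid genome size or percentage")
    return round(genome_size * percent_different / 100.0)


def find_snps(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """List single-base differences between two aligned, equal-length sequences.

    Each entry is (zero-based position, base in seq_a, base in seq_b).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()))
        if a != b
    ]


if __name__ == "__main__":
    # A genome of three billion base pairs with a 0.1 percent difference.
    print(differing_base_pairs(3_000_000_000, 0.1))
    # Two made-up sequences that differ at a single position.
    print(find_snps("ATGCGT", "ATGAGT"))
```

For a genome of three billion base pairs, a 0.1 percent difference corresponds to three million differing base pairs, and the two made-up sequences differ at one position.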

The Future: Genomics and Proteomics

The Human Genome Project has given rise to two new fields of study. The first, genomics, is the study of genomes. Such study requires databases and search engines that can extract information from genome sequences, and hundreds of such databases have already been established. Scientists can search for a complete gene sequence even if they know only a short segment of the gene. They can also look for related sequences within the same genome or among different species. From such information, one can study the evolution of particular genes.
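
The kind of search described above, finding a full gene from a short known segment, can be sketched in a few lines of Python. The tiny in-memory "database", the gene names, and the function names are all hypothetical, and the search uses exact matching on either DNA strand; real genome search tools work on far larger databases and tolerate mismatches and gaps.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite DNA strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_matches(segment: str, database: dict[str, str]) -> list[tuple[str, int, str]]:
    """Find database entries containing `segment` on either strand.

    Returns (entry name, zero-based position, strand) for the first exact
    match in each entry; "+" is the stored strand, "-" the opposite strand.
    """
    segment = segment.upper()
    rc = reverse_complement(segment)
    hits = []
    for name, sequence in database.items():
        seq = sequence.upper()
        pos, strand = seq.find(segment), "+"
        if pos == -1:
            pos, strand = seq.find(rc), "-"
        if pos != -1:
            hits.append((name, pos, strand))
    return hits


if __name__ == "__main__":
    # Made-up sequences standing in for a sequence database (assumption).
    toy_database = {
        "gene_a": "ATGGCCATTGTAATGGGCCGC",
        "gene_b": "ATGAAACCCGGGTTTTAA",
    }
    print(find_matches("CCATTG", toy_database))
```

Once a match is found, the full stored sequence for that entry is available, which is how a short known segment can lead to a complete gene.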

The next step is to define the human proteome, giving rise to the field of proteomics. Proteomics seeks to determine the expression patterns of genes, the functions of the proteins produced, and the structure of specific proteins derived from their DNA sequence. If a particular protein is involved in a disease process, specific drugs to interfere with it may be designed.

Since 2003, many projects have been developed to enhance knowledge of the human genome. Two notable projects are the Cancer Genome Atlas and the Cancer Genome Anatomy Project. The goals of both projects are to determine the genes that underlie cancer, to find targeted gene therapy treatments, and to prevent the disease. Several outcomes to date have furthered progress in understanding the human genome, including the identification of numerous cancer-related genes and the establishment of publicly accessible databases of expressed sequence tags found throughout the genome.

The successful sequencing of the human genome has been followed by the completed sequences of many other organisms, including the cow and dog genomes in 2004, five different domesticated pig breeds in 2005, the domesticated cat in 2007, the gorilla in 2012, the zig-zag eel in 2014, and the Canada lynx in 2018. The genomes of other strategically selected organisms have also been sequenced. The view of the National Human Genome Research Institute (NHGRI) is that the most effective way to study the essential functional and structural components of the human genome is to compare it with the genomes of other organisms. In 2021, researchers announced plans to sequence the genomes of all creatures with a backbone, which amounts to 71,657 species, and set a goal of 125 species per week. In 2023 alone, at least 1,000 different species had their genomes sequenced for the first time.

Another great achievement of the HGP has been the acceleration of innovative technologies that make use of sequence data. For example, copy number variants and single nucleotide polymorphisms (SNPs) are now being analyzed and used to develop genetic tests that were previously unavailable. Another technology, microarray analysis, uses the human genome sequence to examine large numbers of small segments of DNA that, if mutated, may cause disease. Direct results of the Human Genome Project also include the International HapMap Project, the 1,000 Genomes Project, and commercial "personal genotype sequencing." Interest in genomics also flourished outside of the United States. In 2012, the prime minister of the United Kingdom announced the launch of the 100,000 Genomes Project, which aims to work on behalf of the National Health Service in England to improve disease diagnosis and medical research. It was also hoped that this initiative would serve as the beginning of a larger genomics industry in the country. The 1,000 Genomes Project, which came to an end in late 2015, exceeded its goal by analyzing the DNA of more than 2,500 people and offered much insight into genetic differences as well as how genetic changes can lead to disease. The study of the human genome has allowed scientists to make breakthroughs not only in the basic understanding of DNA and the genome but also in understanding how the genome changes over time and between individuals, causing disease and driving evolution. Humanity is just beginning to reap the benefits of the Human Genome Project.

Key Terms

  • genome: the entire complement of genetic material (DNA) in a cell
  • genomics: the branch of genetics dealing with the study of genetic sequences
  • proteomics: the branch of genetics dealing with the expression, function, and structure of proteins
  • single nucleotide polymorphism (SNP): differences at the individual nucleotide level among individuals

Bibliography

Choudhuri, Supratim. Bioinformatics for Beginners: Genes, Genomes, Molecular Evolution, Databases, and Analytical Tools. Burlington: Elsevier Science, 2014. Digital file.

Collins, Francis, and Karin G. Jegalian. “Deciphering the Code of Life.” Scientific American 281.6 (1999): 86–91. Print.

Dennis, Carina, and Richard Gallagher. The Human Genome. London: Palgrave, 2002. Print.

Greenfieldboyce, Nell. "25 Down and 71,632 to Go: Scientists Seek Genomes of All Critters with a Backbone." NPR, 28 Apr. 2021, www.npr.org/sections/health-shots/2021/04/28/991474676/25-down-and-71-632-to-go-scientists-seek-genomes-of-all-critters-with-a-backbone. Accessed 29 Aug. 2024.

Hampton, Tracy. "Human Genome Initiatives Make Strides to Better Understand Health and Disease." JAMA: The Journal of the American Medical Association 309.14 (2013): 1449–51. MEDLINE with Full Text. Web. 25 July 2014.

Hyde, Michael J., and James A. Herrick. After the Genome: A Language for Our Biotechnological Future. Waco: Baylor UP, 2013. Print.

International Human Genome Sequencing Consortium. “Finishing the Eukaryotic Sequence of the Human Genome.” Nature 431 (2004): 931–45. Print.

International Human Genome Sequencing Consortium. “Initial Sequencing and Analysis of the Human Genome.” Nature 409 (2001): 860–921. Print.

Lythgoe, Luke. "1,000 Species Get Their Genomes Sequenced for the First Time." Sanger Institute, 14 Sept. 2023, sangerinstitute.blog/2023/09/14/1000-species-get-their-genomes-sequenced-for-the-first-time. Accessed 29 Aug. 2024.

Morris, Peter J. "From Mendel to the Human Genome Project." North Carolina Medical Journal 74.6 (2013): 477. MEDLINE Complete. Web. 25 July 2014.

Naidoo, Nasheen, et al. "Human Genetics and Genomics a Decade after the Release of the Draft Sequence of the Human Genome." Human Genomics 5.6 (2011): 577–622. MEDLINE Complete. Web. 25 July 2014.

"The 100,000 Genomes Project." Genomics England. Dept. of Health, n.d. Web. 21 Jan. 2016.

Sulston, John, and Georgina Ferry. The Common Thread: A Story of Science, Politics, Ethics, and the Human Genome. Washington, DC: Joseph Henry, 2002. Print.

Wade, Lizzie. "What 2,500 Sequenced Genomes Say about Humanity's Future." Wired. Condé Nast, 30 Sept. 2015. Web. 21 Jan. 2016.

Wolfsberg, Tyra G., et al. “A User’s Guide to the Human Genome.” Nature Genetics Supplement 32 (2002): 1–79. Print.