About the Human Genome Project

The Human Genome Project is an ambitious effort to understand the hereditary instructions that make each of us unique. The goal of this effort is to find the location of the 100,000 or so human genes and to read the entire genetic script, all 3 billion bits of information, by the year 2005.

Inside the nucleus of nearly every cell in the body, a complex set of genetic instructions, known as the human genome, is contained on 23 pairs of chromosomes. Chromosomes are mostly made of long chains of a chemical called DNA--deoxyribonucleic acid.

Even before it is complete, the Human Genome Project promises to transform both biology and medicine. Our genes orchestrate the development of a single-celled egg into a fully formed adult. Genes influence not only what we look like but what diseases we may eventually get. Understanding the complete set of genes, known as the human genome, will shed light on the mysteries of how a baby develops. It also promises to usher in an era of molecular medicine, with precise new approaches to the diagnosis, treatment, and prevention of disease. In short, the international Human Genome Project, which involves hundreds of scientists worldwide, is an investigation of ourselves. Launched in 1990, the project is supported in the United States by the National Institutes of Health and the Department of Energy.

Our genes are made of DNA, a long, threadlike molecule coiled inside our cells. Within the cell nucleus, the DNA is packaged into 23 pairs of chromosomes. Each chromosome, in turn, carries thousands of genes arrayed like beads on a string. Genes, which are simply short segments of DNA, are packets of instructions that tell cells how to behave. They do so by specifying the instructions for making particular proteins. The hereditary instructions are written in a four-letter code, with each letter corresponding to one of the chemical constituents of DNA: A, G, C, T. Genes are, in essence, the "paragraphs" in this DNA language, with a certain sequence of As, Gs, Cs, and Ts constituting a recipe for a specific protein. If the DNA language becomes garbled or a word is misspelled, the cell may make the wrong protein, or too much or too little of the right one--mistakes that often result in disease. In some cases, such as sickle cell anemia, just a single misplaced letter is sufficient to cause the disease.
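To make the "single misplaced letter" idea concrete, here is a minimal Python sketch. The DNA fragment is an illustrative stretch modeled on the start of the beta-globin gene, and the partial codon table and the function name `translate` are assumptions made for this example. Changing one A to a T turns the three-letter word GAG (glutamic acid) into GTG (valine), the kind of substitution that underlies sickle cell anemia.

```python
# Partial codon table: only the three-letter "words" used in this example.
CODON_TABLE = {
    "ATG": "M",  # methionine (start)
    "GTG": "V",  # valine
    "CAT": "H",  # histidine
    "CTG": "L",  # leucine
    "ACT": "T",  # threonine
    "CCT": "P",  # proline
    "GAG": "E",  # glutamic acid
    "AAG": "K",  # lysine
}

def translate(dna):
    """Read DNA three letters at a time and return one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3  # ignore any incomplete trailing codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TABLE[c] for c in codons)

# Illustrative fragment (an assumption, not a verified reference sequence).
normal = "ATGGTGCATCTGACTCCTGAGGAGAAG"
# Swap a single letter: the A at index 19 becomes a T (GAG -> GTG).
mutant = normal[:19] + "T" + normal[20:]

print(translate(normal))  # MVHLTPEEK
print(translate(mutant))  # MVHLTPVEK
```

Only one of the 27 letters differs between the two strings, yet the resulting protein carries a different amino acid at that position.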

Errors in our genes are responsible for an estimated 3000 to 4000 clearly hereditary diseases, including Huntington's disease, cystic fibrosis, neurofibromatosis, Duchenne muscular dystrophy, and many others. What's more, altered genes are now known to play a part in cancer, heart disease, diabetes, and many other common diseases. In these more common and complex disorders, genetic alterations increase a person's risk of developing that disorder. The disease itself results from the interaction of such genetic predispositions and environmental factors, including diet and lifestyle.

The Human Genome Project will develop tools to identify the genes involved in both rare and common diseases over the next 15 or 20 years. Such discoveries, in turn, are likely to bring improvements in the early detection and treatment of disease and new approaches to prevention. Once the molecular basis of a disease is revealed, scientists have a far better chance of defeating it. One approach is to design highly targeted drugs that act on the cause, not merely the symptoms, of disease. Another is to correct or replace the altered gene through gene therapy. Even before that, however, gene discovery can lead to predictive tests that can tell a person's likelihood of getting a disease long before symptoms appear. In some cases, preventive actions can then be undertaken that may avert the disease entirely or else detect it at its earliest stages, when treatment is more likely to be successful.

But finding disease genes can be harder than looking for the proverbial needle in a haystack. This is especially true when the disease is poorly understood at the start of the gene search, as was the case for cystic fibrosis. The problem lies in the vast size of the human genome, which consists of 3 billion chemical bases. If printed out, the entire human genome would fill 1,000 telephone books of 1,000 pages each. Somewhere in that mass of letters lurks the suspect gene--but where? Without clues to guide them, scientists have had to scour all the chromosomes, a practice that until recently could take up to 10 years. Not surprisingly, only about two dozen disease genes have been found this way.
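The telephone-book comparison can be checked with a line of arithmetic. The short Python sketch below uses only the figures quoted in the text; the variable names are illustrative. Whether a real directory page holds that many characters is an assumption the text leaves open.

```python
GENOME_BASES = 3_000_000_000   # chemical bases, as quoted in the text
BOOKS = 1000                   # telephone books
PAGES_PER_BOOK = 1000          # pages in each book

# Letters that must fit on each page for the comparison to hold.
letters_per_page = GENOME_BASES // (BOOKS * PAGES_PER_BOOK)
print(letters_per_page)  # 3000
```

At 3,000 letters per page, the comparison implies densely printed pages, which is roughly what a telephone directory offers.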

New technologies, such as fluorescent in situ hybridization, allow scientists to identify the location of specific pieces of DNA on a chromosome.

The Human Genome Project is designed to speed this process once and for all by providing new tools and techniques that will enable scientists to find genes quickly and efficiently. The first of these tools are maps of each chromosome. One type of map, called a genetic map, consists of thousands of landmarks--short, distinctive pieces of DNA--more or less evenly spaced along the chromosomes. Now very detailed, this map should enable researchers to pinpoint the location of a gene between any two markers. Another important step is to create what are called physical maps of each chromosome, a process that is also well under way. Physical maps consist of overlapping pieces of DNA spanning an entire chromosome. Once these maps are complete, investigators can localize a gene to a particular region of a chromosome by using a genetic map and then can simply go to the freezer, where the DNA for the physical map is stored, and pick out that piece to study rather than searching through the chromosomes all over again.
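The way the two kinds of map work together can be sketched in a few lines of Python. Everything here is hypothetical: the marker names, clone names, positions, and the functions `flanking_markers` and `clones_for_interval` are invented for illustration and do not describe any real chromosome or laboratory system. The sketch assumes a gene's approximate position has already been estimated; it finds the two markers that bracket it and then lists the stored DNA pieces overlapping that interval.

```python
from bisect import bisect_right

# Hypothetical genetic map: (marker name, position in millions of bases),
# sorted by position along one chromosome.
GENETIC_MAP = [("M1", 0.0), ("M2", 2.1), ("M3", 4.0), ("M4", 6.2), ("M5", 8.0)]

# Hypothetical physical map: overlapping stored DNA pieces as
# (clone name, start, end) in the same units.
PHYSICAL_MAP = [
    ("cloneA", 0.0, 2.5),
    ("cloneB", 2.0, 4.5),
    ("cloneC", 4.2, 6.5),
    ("cloneD", 6.0, 8.0),
]

def flanking_markers(position):
    """Return the two neighboring markers that bracket a position."""
    positions = [pos for _, pos in GENETIC_MAP]
    i = bisect_right(positions, position)
    if i == 0 or i == len(positions):
        raise ValueError("position lies outside the mapped region")
    return GENETIC_MAP[i - 1], GENETIC_MAP[i]

def clones_for_interval(start, end):
    """Return names of stored pieces that overlap the interval [start, end]."""
    return [name for name, s, e in PHYSICAL_MAP if s <= end and e >= start]

left, right = flanking_markers(5.0)
print(left[0], right[0])                       # M3 M4
print(clones_for_interval(left[1], right[1]))  # ['cloneB', 'cloneC', 'cloneD']
```

The list returned at the end plays the role of the pieces an investigator would pull from the freezer, instead of searching the whole chromosome again.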

The ultimate goal of the Genome Project is to decode, letter by letter, the exact sequence of all 3 billion nucleotide bases that make up the human genome. It will be a daunting task. Before plunging into massive sequencing, researchers from numerous fields--biology, physics, engineering, and computer science, to name a few--are developing automated technologies to reduce the time and cost of sequencing. Once the human genome sequence is completed, attention can shift from the job of finding genes, which will then simply be a matter of scanning a computer database, to understanding them.

In its first 5 years alone, the Human Genome Project has already had a profound effect. Thanks to tools emerging from the project, the pace of gene discovery has nearly quadrupled. The gene involved in cystic fibrosis, the most common lethal hereditary disease among Caucasians, was identified in 1989; already, a diagnostic test is available to identify gene carriers among high-risk families, and the first human gene therapy efforts are under way in federally approved clinical trials. In early 1994, scientists discovered two genes involved in a hereditary form of colon cancer. An estimated 1 million Americans carry misspelled copies of these genes, which give them a 70% to 80% likelihood of developing colon cancer. Now that the genes are in hand, a simple blood test to detect those high-risk individuals is not far off. This test will open the door to preventive strategies that promise to greatly reduce deaths from this disease (see below).

The newfound ability to probe our genes, however, can be a double-edged sword. For some diseases, for instance, our ability to detect the nonfunctional gene has outpaced our ability to do anything about the disease it causes. Huntington's disease is a case in point. Although a predictive test for high-risk families has been available for years, only a minority of these individuals have decided to be tested. The reason? There is no way to cure or prevent Huntington's disease, and some individuals would rather live with the uncertainty than with the knowledge that they will be struck, sometime in midlife, with a fatal disease. And what might happen if a health insurance company or a potential employer learns that an individual is destined to develop Huntington's disease--might that person be denied coverage or a job?

Because of such concerns, the Human Genome Project has, since its inception, devoted about 5% of its budget to research aimed at anticipating and resolving the ethical, legal, and social issues likely to arise from this research. This is one of the first times scientists have begun to explore the potential consequences of their research before a crisis has arisen. With careful attention to these ethical quandaries, and adequate safeguards when necessary, society can reap the full benefits of the Human Genome Project.

Hereditary Colon Cancer: Genetic Discoveries Offer Hope for Prevention

Case 1:
Beth M.'s father died of colon cancer, as did her grandmother. Now two of her brothers, both in their 40s, have been diagnosed with colon cancer. Beth, age 37, feels a curse is hanging over her family and is worried about her future and that of her children.

Case 2:
Paul C. was 35 when his doctor told him the grim news: he had advanced colon cancer. As far as he knew, Paul had no family history of the disease. But after checking, Paul learned that several aunts and uncles had died of colon cancer at an early age.

Further research revealed that some members of both Beth's and Paul's families carry an altered gene, passed from parent to child, that predisposes them to a form of inherited colon cancer known as hereditary nonpolyposis colorectal cancer (HNPCC). Sometimes difficult to diagnose, HNPCC is believed to account for one in six of all colon cancer cases. Cancers arise from a multistep process, which involves the interplay of multiple changes, or mutations, in several different genes, in combination with environmental factors such as diet or lifestyle. In the most common, noninherited forms of cancer, the genetic changes are acquired after birth. But individuals who have a hereditary risk for cancer are born with one altered gene--in other words, they are born one step into the cancer process. In hereditary nonpolyposis colorectal cancer, for instance, children who inherit an altered gene from either parent face a 70% to 80% chance of developing this disease, usually at an early age. Women also face a markedly increased risk of uterine and ovarian cancer.

Though scientists had known for years that an altered gene was to blame for this hereditary colon cancer, finding it was tricky, for they had few clues as to where, on any of the 23 pairs of chromosomes, the gene might reside. Finally, using tools emerging from the Human Genome Project, an international team tracked the gene to a region of chromosome 2. Seven months later, two teams zeroed in on the culprit. Just three months after that, they had identified a second gene, on chromosome 3, also at work in HNPCC. Together, these genes account for most cases of this inherited cancer.

These discoveries offer a preview of how the Human Genome Project is likely to transform medicine by opening up new approaches to prevention. The earliest beneficiaries will be those families facing a very high risk of colon cancer. First, for those who choose to take it, will come a simple blood test to determine who in these cancer-prone families does or does not carry the altered genes. The consequences could be enormous, for as many as 1 in 200, or 1 million Americans, may carry one or the other of these altered genes. Individuals found to carry an altered gene would likely be counseled to adopt a high-fiber, low-fat diet in the hope of preventing cancer. They would also be advised to start yearly examinations of the colon at about age 30. Such exams should help physicians detect any benign polyps, wart-like growths on the colon, early in the disease process and then remove them before they turn malignant. For those individuals who turn out not to carry the altered genes, the diagnostic test may be a huge relief, removing the fear they have lived under and sparing them the need for frequent colonoscopies.

Despite the life-saving potential of such diagnostic tests, numerous issues need to be resolved before they are introduced into general medical practice. Genetic testing is not so simple as drawing blood and telling someone the results. For one, the best way to test large numbers of individuals is by no means clear. In deciding whether or not to be tested, individuals need information not only about the disorder and its risk but also about the test and its limitations. Equally important, genetic testing must be accompanied by counseling to help people cope with information about their future risk, whatever the outcome of the test. Those who test positive and who are trying to decide what course to pursue will need to know how effective various strategies, such as frequent colonoscopy and polyp removal, actually are at preventing colon cancer. Definitive answers are still lacking for these questions. Broader, societal issues arise as well, such as how to protect the confidentiality of genetic information and ensure that it is not used to discriminate against individuals in employment or insurance.

Even before these colon cancer susceptibility genes were discovered, the Human Genome Project had begun planning pilot studies to address these and other questions about testing for cancer risk. It is important that these questions be answered now, before widespread testing begins. The identification of genes involved in hereditary colon cancer is just one in a long string of discoveries that can be expected as the Human Genome Project progresses. Careful attention to these social and ethical issues now will help prepare the public and the medical profession for the choices that lie ahead.

How to Conquer a Genetic Disease

Adapted with permission from material provided by the Howard Hughes Medical Institute.

Nearly 4,000 genetic diseases afflict human beings. Given enough time and effort, scientists can learn to prevent or treat a great many of them. This requires answering three questions--major landmarks on a trail of genetic discoveries:

  • Which altered gene causes the disease?
  • What protein does this gene normally produce?
  • Can the altered protein or gene be fixed or replaced?

Two different strategies may be used. Researchers may find the altered protein first (if it can be detected chemically in tissues that are affected by the disease) and then locate the gene that codes for it. When this is impossible, they use positional cloning: They find the gene first (by zeroing in on the DNA inherited with the disease, or by locating a similar gene in a mouse) and then identify the protein the gene makes.

The trail shown here illustrates positional cloning. This strategy recently led to spectacular progress toward diagnosing or treating cystic fibrosis (CF), Duchenne muscular dystrophy, neurofibromatosis, and other inherited disorders.

A Brief Key to Basic Genetics

A human cell
Each of the 100 trillion cells in the human body (except mature red blood cells) contains the entire human genome--all the genetic information necessary to build a human being. This information is encoded in 6 billion base pairs, subunits of DNA. (Egg and sperm cells each have half this amount of DNA.)

The cell nucleus

Inside the cell nucleus, 6 feet of DNA are packaged into 23 pairs of chromosomes (one chromosome in each pair coming from each parent).
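The "6 feet" figure follows from the 6 billion base pairs quoted in the preceding entry. The Python sketch below is a rough check that assumes each base pair adds about 0.34 nanometers to the length of the double helix, a commonly cited spacing that does not appear in the original text.

```python
BASE_PAIRS = 6_000_000_000     # base pairs per cell, as quoted in the text
RISE_PER_BASE_PAIR_NM = 0.34   # assumed spacing along the double helix
NM_PER_METER = 1_000_000_000
FEET_PER_METER = 3.28084

length_m = BASE_PAIRS * RISE_PER_BASE_PAIR_NM / NM_PER_METER
length_ft = length_m * FEET_PER_METER
print(round(length_m, 2), round(length_ft, 1))  # 2.04 6.7
```

Under that assumption the DNA in one cell stretches to about 2 meters, in line with the roughly 6 feet stated above.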

A chromosome

Each of the 46 human chromosomes contains the DNA for thousands of individual genes, the units of heredity.

A gene

Each gene is a segment of double-stranded DNA that holds the recipe for making a specific molecule, usually a protein. These recipes are spelled out in varying sequences of the four chemical bases in DNA: adenine (A), thymine (T), guanine (G) and cytosine (C). The bases form interlocking pairs that can fit together in only one way: A pairs with T; G pairs with C.
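Because the pairing rule is fixed, one strand of DNA fully determines the other. The following minimal Python sketch illustrates this; the function names and the sample sequence are made up for the example.

```python
# The pairing rule from the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand):
    """Return the partner base for each letter, in the same order."""
    return "".join(PAIRS[base] for base in strand)

def reverse_complement(strand):
    """Return the partner strand read in its own direction (reversed)."""
    return complement(strand)[::-1]

print(complement("GATTACA"))          # CTAATGT
print(reverse_complement("GATTACA"))  # TGTAATC
```

Applying `complement` twice returns the original strand, which is another way of saying the bases can fit together in only one way.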

A protein

Proteins, which are made up of amino acids, are the body's workhorses--essential components of all organs and chemical activities. Their function depends on their shapes, which are determined by the 50,000 to 100,000 genes in the cell nucleus.

Which Gene is at Fault?

1. A child (blue box) develops a currently incurable genetic disease. Scientists who wish to find a specific treatment or a means of preventing the disease in other children must trace it to its cause: an altered gene.
2. Various clues, such as a visibly missing piece of a chromosome, may reveal the gene's rough location on a chromosome. When there are no such clues, researchers look for markers of the disease by comparing the DNA of the affected child to that of parents, relatives, and persons in other families. When placed on a "genetic map," these markers reveal which chromosome carries the altered gene. As more markers are added to the map, the location is narrowed to the space between two known markers.
RESULT: Scientists may be able to diagnose the disease prenatally by following the inheritance of markers in an affected family. They may also recognize healthy carriers of the altered gene.
"Walking" or "jumping" toward the gene, scientists create a physical map, or a chain of overlapping segments of DNA in the space between the flanking markers. One of these segments must contain the altered gene. In the future, when researchers have covered all the chromosomes with overlapping fragments of DNA, any fragment they want will be available from a computer database.

Zeroing in on the altered gene, scientists analyze each segment: Is it different from normal DNA? Finally they find the guilty gene and determine the error in its sequence of bases. The most common error in the CF gene is a deletion of three DNA bases out of a total of 250,000.
RESULT: Scientists can test for the disease directly in patients and prenatally. They can identify healthy carriers of the altered gene in the general population--not just among members of an affected family. They can study the disease process in cultured cells and in animals, with a view to developing new treatments.

What Protein Does it Make?

Knowing the gene's sequence, scientists use the genetic code to determine which amino acids make up the protein. Then they study the protein and find out what it is supposed to do. The tens of thousands of proteins in the body have different shapes and do different jobs, depending on instructions encoded in the gene.
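As a small illustration of reading a protein from a gene sequence, the Python sketch below applies a partial genetic code to a short fragment and then to the same fragment with three bases deleted, the kind of error mentioned earlier for the CF gene. The fragment is illustrative, modeled on the region of the CF gene where the common deletion occurs, and the table and function name are assumptions made for this example.

```python
# Partial genetic code: only the codons needed for this example.
CODON_TABLE = {
    "ATC": "I",  # isoleucine
    "ATT": "I",  # isoleucine (a different spelling of the same amino acid)
    "TTT": "F",  # phenylalanine
    "GGT": "G",  # glycine
    "GTT": "V",  # valine
}

def translate(dna):
    """Read DNA three letters at a time and return one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3  # ignore any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# Illustrative fragment (an assumption, not a verified reference sequence).
normal = "ATCATCTTTGGTGTT"
# Delete three consecutive bases (indices 5, 6, and 7).
deleted = normal[:5] + normal[8:]

print(translate(normal))   # IIFGV
print(translate(deleted))  # IIGV
```

Because exactly three bases are removed, the remaining codons stay in step and the protein simply loses one amino acid (the F, phenylalanine) rather than being scrambled from that point onward.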

Can the Protein or Gene be Replaced?

  • Does the sick child's altered gene produce too little protein, a flawed protein, or no protein at all? Scientists need to understand just how the protein change causes the disease.
  • RESULT: As the mechanism of the disease becomes clear, scientists can devise new approaches to treatment involving either the protein or the gene. Understanding a relatively rare inherited disorder may also bring important insights into more common and complex diseases.
  • To make up for the genetic error, scientists may try to replace a missing or ineffective protein with a drug or with the normal protein. Such experiments are usually carried out first in cultured cells in the laboratory, then in animals, and finally in humans.
  • Another option is gene therapy. Some scientists "infect" cells with a virus into which they have inserted normal genes. Others use non-viral methods or even inject DNA directly into cells. Experiments that work in cultured cells are tried in animals and then in humans. For example, a patient's bone-marrow cells may be removed, treated with normal genes, and returned to the patient.
  • RESULT: Treatments are being developed for some genetic diseases. People will always carry genetic alterations, but in the future, prevention and treatment will vastly reduce suffering from genetic diseases.

Document Information

For more information about the Human Genome Project, contact the Office of Communications, National Human Genome Research Institute, National Institutes of Health, Building 31, Room 4B09, 9000 Rockville Pike, Bethesda, MD 20892, (301) 402-0911

DEPARTMENT OF HEALTH AND HUMAN SERVICES Public Health Service National Institutes of Health Hardcopy: NIH Publication No. 95-3897