Structure and Function of Genes

Genetic information is stored in DNA, and the expression of this information requires several steps that flow in one direction: DNA is transcribed into RNA, and RNA is translated into protein.

Genes are segments of DNA encoding information that ultimately:

Structure of DNA

The molecular structure of DNA forms a double helix. The "backbone" of each strand of the helix consists of a repeating ...sugar-phosphate-sugar-phosphate... polymer; the sugar is deoxyribose. Attached to each sugar ring is one of four nitrogen-containing bases: adenine (A), guanine (G), cytosine (C), and thymine (T). The Human Genome Project has confirmed that human DNA contains a little over 3 billion of these bases, and that over 99% of them are the same in all people. In 2001, the first major goal of the Human Genome Project -- a detailed working draft of the sequence of human DNA -- was published (Nature, Feb. 15, 2001; Science, Feb. 16, 2001).

The combination of one of these nitrogenous bases, a sugar molecule, and a phosphate molecule is called a nucleotide -- the basic building block of the DNA molecule.

The two strands of DNA wind around each other, forming a double helix that is held together by weak hydrogen bonds between each thymine and adenine base, as well as between each guanine and cytosine base; each of these pairs of bases is called a base pair, or "bp" for short. The two strands of DNA, then, are complementary; that is, if one strand has the sequence GCATGCCTA, the other strand would be CGTACGGAT. DNA is coiled very tightly -- in order to fit into the nucleus of a cell -- into structures called chromosomes. The DNA from a single human cell would actually stretch out to be more than 5 feet long, though only 50 trillionths of an inch in width.
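The base-pairing rule above can be written out as a short Python sketch. The names PAIRS and complement are illustrative only and are not part of the original text. Like the example in the paragraph, the sketch pairs bases position by position and ignores the fact that the two strands of a real double helix run in opposite directions.

```python
# Base pairing as described above: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


print(complement("GCATGCCTA"))  # CGTACGGAT
```

Applying the function twice returns the original strand, which is another way of saying that each strand fully determines the other.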

The structure of DNA has several important features:

 

Transcription and Translation of DNA

The transfer of genetic information from DNA into the primary structure of a protein has two basic steps: transcription and translation.

Transcription entails the synthesis of a single-stranded polynucleotide of RNA at an unwound section of DNA with one of the DNA strands serving as a template for the synthesis of the RNA. The product of this process is called an RNA transcript or an mRNA molecule. The result of transcription is that the genetic information encoded in DNA is transferred to RNA; this occurs in the nucleus of the cell.
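As a rough illustration of template-directed synthesis, the sketch below builds an RNA sequence from a DNA template strand by base pairing, with uracil (U) taking the place of thymine in the RNA. The template sequence and the names DNA_TO_RNA and transcribe are made up for this example, and strand direction is ignored for simplicity.

```python
# Each DNA template base specifies its RNA partner; RNA uses U where DNA uses T.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template: str) -> str:
    """Return the RNA sequence complementary to a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template.upper())


print(transcribe("TACGGCAAA"))  # AUGCCGUUU
```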

RNA then carries this information out of the nucleus and into the cytoplasm where it becomes directly involved with protein synthesis via translation.

Translation follows the movement of mRNA to the cytoplasm, where it interacts with structures called ribosomes to synthesize a protein. A protein is a linear sequence of amino acids, each of which is specified by the sequence of nucleotides in the RNA (which in turn was specified by the DNA from which it was transcribed).

Genetic information is encoded in sequences of three nucleotides termed codons. The four nucleotides of RNA (adenine, guanine, cytosine, and uracil, which replaces the thymine found in DNA) can be arranged in various combinations to form 64 codons, each containing three letters. Since nature draws on only 20 amino acids to create proteins, there are more than enough codons in the genetic code to specify all of them. Though coding for proteins is a critical function of DNA, less than 10% of the genome is actually involved in this.
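The codon count follows from simple arithmetic: four possible bases at each of three positions gives 4 x 4 x 4 = 64 combinations. The Python sketch below enumerates them and then reads a short mRNA three bases at a time. The mRNA sequence is invented for illustration, PARTIAL_CODE holds only five entries of the standard genetic code rather than the full table, and the stop codon it includes is a detail of the standard code that this text does not otherwise discuss.

```python
from itertools import product

RNA_BASES = "ACGU"

# Every ordered triplet of the four RNA bases.
codons = ["".join(triplet) for triplet in product(RNA_BASES, repeat=3)]
print(len(codons))  # 64

# A small excerpt of the standard genetic code (not the full 64-entry table).
PARTIAL_CODE = {
    "AUG": "Met",
    "UUU": "Phe",
    "GGC": "Gly",
    "GCA": "Ala",
    "UAA": "STOP",
}


def translate(mrna: str) -> list[str]:
    """Read an mRNA codon by codon until a stop codon or the end is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = PARTIAL_CODE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


print(translate("AUGUUUGGCGCAUAA"))  # ['Met', 'Phe', 'Gly', 'Ala']
```

Because the table is partial, translate raises a KeyError for any codon outside the five listed; a complete version would need all 64 entries.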

Gene Structure

The number of genes in the human genome is estimated to be about 35,000 to 40,000 -- considerably fewer than once thought -- dispersed throughout the set of chromosomes. Though the average gene is about 3,000 bases long, the smallest genes may be just a few hundred base pairs; the largest is over two million base pairs in length.
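The figures quoted in this document allow a rough back-of-the-envelope check of how much of the genome genes occupy. The sketch below multiplies the estimated gene count by the average gene length and divides by a genome size of 3 billion bases. It is an order-of-magnitude estimate only: the text does not say whether the 3,000-base average includes introns, and the names used are illustrative.

```python
GENOME_BASES = 3_000_000_000  # "a little over 3 billion" bases, rounded down
AVERAGE_GENE_BASES = 3_000    # average gene length quoted above


def gene_fraction(gene_count: int) -> float:
    """Fraction of the genome covered by gene_count genes of average length."""
    return gene_count * AVERAGE_GENE_BASES / GENOME_BASES


for gene_count in (35_000, 40_000):
    total_bases = gene_count * AVERAGE_GENE_BASES
    print(gene_count, total_bases, f"{gene_fraction(gene_count):.1%}")
```

Under these assumptions the genes span roughly 105 to 120 million bases, a few percent of the genome.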

Human genes, like most genes from multi-cellular organisms (eucaryotes), contain introns -- stretches of DNA located within the gene that are transcribed into RNA and then spliced out before the RNA is translated into protein. These stretches of DNA have no discernible coding function. However, it also appears that splicing may occur at various alternative points along the RNA transcript, allowing differing proteins to be made from what might otherwise appear to be a single 'gene.'
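Splicing can be pictured as keeping some stretches of the transcript and discarding the rest. In the Python sketch below, the transcript and the exon coordinates are invented for illustration; exons are written in upper case and introns in lower case purely so they are easy to tell apart. Choosing a different set of exons from the same transcript gives a different mature RNA, which is the idea behind alternative splicing.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments, given as (start, end) pairs with end excluded."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical transcript: exon, intron, exon, intron, exon (6 bases each).
pre_mrna = "AUGGCC" + "guaaag" + "UUUGGC" + "guccag" + "GCAUAA"

all_exons = [(0, 6), (12, 18), (24, 30)]
skip_middle = [(0, 6), (24, 30)]

print(splice(pre_mrna, all_exons))    # AUGGCCUUUGGCGCAUAA
print(splice(pre_mrna, skip_middle))  # AUGGCCGCAUAA
```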

The diagram also shows regions that contain the coding information that are both transcribed and translated into proteins -- these are termed exons.

On either side of a gene there are regions called flanking regions that play roles in the regulation of gene expression. In the first stage of transcription, an enzyme called RNA polymerase binds to a TATA base sequence in the 5' flanking region (at the "front end" of the gene) adjacent to where transcription is initiated. There are other sequences in this region that serve as binding sites for proteins that assist in transcription. This entire flanking region prior to the coding region of the gene is called the promoter.
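As a very simplified picture of locating the TATA sequence, the sketch below searches a made-up 5' flanking sequence for the literal string TATA. Both the sequence and the function name find_tata are hypothetical, and actual promoter recognition involves far more than an exact text match.

```python
def find_tata(flanking_region: str) -> int:
    """Return the index of the first TATA occurrence, or -1 if none is found."""
    return flanking_region.upper().find("TATA")


# Hypothetical 5' flanking sequence, for illustration only.
print(find_tata("GGCCGCTATAAAAGGCTGC"))  # 6
print(find_tata("GGGCCC"))               # -1
```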

On the far end of the gene, past the coding region of introns and exons, is the 3' flanking region which largely remains untranslated.

Once an mRNA is transcribed from the DNA coding region of a gene, it goes through several processing steps before it leaves the nucleus to be translated in the cytoplasm. This "processing" involves:

Translation of the mRNA at the ribosomes in the cytoplasm then involves several components, including other types of RNA: tRNA (transfer RNA, which transfers amino acids, one by one, to the growing protein being synthesized) and rRNA (ribosomal RNA, which makes up most of the ribosome itself).


Last updated Friday, February 23, 2001.
© 2001 by Fiddler and Pergament. All rights reserved.