This page, looking at the structure of DNA, is the first in a sequence of pages leading on to how DNA replicates (makes copies of) itself, and then to how information stored in DNA is used to make protein molecules.
These days, most people know about DNA as a complex molecule which carries the genetic code. Most will also have heard of the famous double helix.
I'm going to start with a diagram of the whole structure, and then take it apart to see how it all fits together. The diagram shows a tiny bit of a DNA double helix.
The backbone of DNA is based on a repeated pattern of a sugar group and a phosphate group. The full name of DNA, deoxyribonucleic acid, gives you the name of the sugar present - deoxyribose. Deoxyribose is a modified form of another sugar called ribose. I'm going to give you the structure of that first, because you will need it later anyway. Ribose is the sugar in the backbone of RNA, ribonucleic acid.
This diagram misses out the carbon atoms in the ring for clarity. Each of the four corners where there isn't an atom shown has a carbon atom. The heavier lines are coming out of the screen or paper towards you. In other words, you are looking at the molecule from a bit above the plane of the ring.
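So that's ribose. Deoxyribose, as the name might suggest, is ribose which has lost an oxygen atom - "de-oxy". One of the -OH groups on the ring (the one on the carbon numbered 2' in the scheme described below) is replaced by a hydrogen atom.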
The only other thing you need to know about deoxyribose (or ribose, for that matter) is how the carbon atoms in the ring are numbered. The carbon atom to the right of the oxygen as we have drawn the ring is given the number 1, and then you work around to the carbon on the CH2OH side group which is number 5.
You will notice that each of the numbers has a small dash by it - 3' or 5', for example. If you just had ribose or deoxyribose on its own, that wouldn't be necessary, but in DNA and RNA these sugars are attached to other ring compounds. The carbons in the sugars are given the little dashes so that they can be distinguished from any numbers given to atoms in the other rings. You read 3' or 5' as "3-prime" or "5-prime".
The other repeating part of the DNA backbone is a phosphate group. A phosphate group is attached to the sugar molecule in place of the -OH group on the 5' carbon.
The final piece that we need to add to this structure before we can build a DNA strand is one of four complicated organic bases. In DNA, these bases are cytosine (C), thymine (T), adenine (A) and guanine (G).
These bases attach in place of the -OH group on the 1' carbon atom in the sugar ring.
What we have produced is known as a nucleotide.
We now need a quick look at the four bases. If you need these in a chemistry exam at this level, the structures will almost certainly be given to you. Here are their structures:
The nitrogen and hydrogen atoms shown in blue on each molecule show where these molecules join on to the deoxyribose. In each case, the hydrogen is lost together with the -OH group on the 1' carbon atom of the sugar. This is a condensation reaction - two molecules joining together with the loss of a small one. The small molecule isn't necessarily water in every condensation reaction, but here it is: the hydrogen and the -OH group leave together as a molecule of water.
For example, here is what the nucleotide containing cytosine would look like:
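If you want to check the atom bookkeeping for these condensation reactions, a short Python sketch will do it. This is only an illustration: the molecular formulas are standard textbook values rather than anything taken from the diagrams, the phosphate is assumed to come from phosphoric acid (H3PO4), and the names condense, formula and nucleoside (the base joined to the sugar before the phosphate is added) are introduced here purely for convenience.

```python
from collections import Counter

# Molecular formulas as atom counts (standard textbook values, assumed here).
CYTOSINE = Counter({"C": 4, "H": 5, "N": 3, "O": 1})
DEOXYRIBOSE = Counter({"C": 5, "H": 10, "O": 4})
PHOSPHORIC_ACID = Counter({"H": 3, "P": 1, "O": 4})  # assumed phosphate source
WATER = Counter({"H": 2, "O": 1})


def condense(a, b, lost=WATER):
    """Join two molecules and remove the small molecule that is lost."""
    return a + b - lost


def formula(atoms):
    """Format atom counts as a formula string in a fixed element order."""
    order = ["C", "H", "N", "O", "P"]
    parts = []
    for element in order:
        count = atoms[element]
        if count > 0:
            parts.append(element + (str(count) if count > 1 else ""))
    return "".join(parts)


# Base + sugar, losing water at the 1' carbon.
nucleoside = condense(CYTOSINE, DEOXYRIBOSE)
# Add the phosphate at the 5' carbon, losing a second molecule of water.
nucleotide = condense(nucleoside, PHOSPHORIC_ACID)

print(formula(nucleoside))  # C9H13N3O4
print(formula(nucleotide))  # C9H14N3O7P
```

Each join removes two hydrogens and one oxygen from the total, which is exactly one molecule of water.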
A DNA strand is simply a string of nucleotides joined together. I can show how this happens perfectly well by going back to a simpler diagram and not worrying about the structure of the bases.
The phosphate group on one nucleotide links to the 3' carbon atom on the sugar of another one. In the process, a molecule of water is lost - another condensation reaction.
. . . and you can continue to add more nucleotides in the same way to build up the DNA chain. Now we can simplify all this down to the bare essentials!
What matters in DNA is the sequence the four bases take up in the chain. We aren't particularly interested in the backbone, so we can simplify that down. For the moment, we can simplify the precise structures of the bases as well. We can build the chain based on this fairly obvious simplification:
There is only one possible point of confusion here - and that relates to how the phosphate group, P, is attached to the sugar ring. Notice that it is joined via two lines with an angle between them.
By convention, if you draw lines like this, there is a carbon atom where these two lines join. That is the carbon atom in the CH2 group if you refer back to a previous diagram. If you had tried to attach the phosphate to the ring by a single straight line, that CH2 group would have got lost!
Joining up lots of these gives you a part of a DNA chain. The diagram below is a bit from the middle of a chain. Notice that the individual bases have been identified by the first letters of the base names (A = adenine, and so on). Notice also that there are two different sizes of base. Adenine and guanine are bigger because they both have two rings. Cytosine and thymine only have one ring each.
If the top of this segment was the end of the chain, then the phosphate group would have an -OH group attached to the spare bond rather than another sugar ring. Similarly, if the bottom of this segment of chain was the end, then the spare bond at the bottom would also be to an -OH group on the deoxyribose ring.
Have another look at the diagram we started from:
If you look at this carefully, you will see that an adenine on one chain is always paired with a thymine on the second chain. And a guanine on one chain is always paired with a cytosine on the other one.
So how exactly does this work?
The first thing to notice is that a smaller base is always paired with a bigger one. The effect of this is to keep the two chains at a fixed distance from each other all the way along.
But, more than this, the pairing has to be exactly . . .
adenine (A) pairs with thymine (T);
guanine (G) pairs with cytosine (C).
That is because these particular pairs fit exactly to form very effective hydrogen bonds with each other. It is these hydrogen bonds which hold the two chains together.
The base pairs fit together as follows.
The A-T base pair:
The G-C base pair:
If you try any other combination of base pairs, they won't fit!
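The pairing rules are simple enough to write down as a lookup table. The Python sketch below is just an illustration of the two points made above: the names RINGS, PAIRS, is_valid_pair and pair_width are made up for this example, and counting rings is only a rough stand-in for the real width of a base pair.

```python
# Number of rings in each base, as described in the text.
RINGS = {"A": 2, "G": 2, "C": 1, "T": 1}

# The only pairings that fit together to form effective hydrogen bonds.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def is_valid_pair(base1, base2):
    """Return True if the two bases form an A-T or G-C pair."""
    return PAIRS.get(base1) == base2


def pair_width(base1, base2):
    """Total ring count across a pair, a rough stand-in for its width."""
    return RINGS[base1] + RINGS[base2]


print(is_valid_pair("A", "T"))  # True
print(is_valid_pair("A", "C"))  # False
print({pair_width(b, PAIRS[b]) for b in "ACGT"})  # {3}
```

Every allowed pair has three rings in total, which is why the chains stay the same distance apart. Notice that A with C would also give three rings, but it is still rejected: size alone isn't enough, because the hydrogen bonding has to fit as well.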
A final structure for DNA showing the important bits
Notice that the two chains run in opposite directions, and the right-hand chain is essentially upside-down. You will also notice that I have labelled the ends of these bits of chain with 3' and 5'.
If you followed the left-hand chain to its very end at the top, you would have a phosphate group attached to the 5' carbon in the deoxyribose ring. If you followed it all the way to the other end, you would have an -OH group attached to the 3' carbon.
In the second chain, the top end has a 3' carbon, and the bottom end a 5'. This 5' and 3' notation becomes important when we start talking about the genetic code and genes. The genetic code in genes is always written in the 5' to 3' direction along a chain.
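Because the two chains run in opposite directions, working out the sequence of the second chain from the first takes two steps: swap each base for its partner, and then reverse the order so that the result is also read from 5' to 3'. Here is a minimal Python sketch of that idea. The function name partner_strand is invented for this example, and the sequence is assumed to contain only the letters A, T, G and C.

```python
# Base-pairing rules: A with T, G with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(seq_5_to_3):
    """Return the partner strand, also written in the 5' to 3' direction."""
    # Step 1: pair each base with its partner. This list runs 3' to 5'
    # because the partner chain points the opposite way.
    paired = [PAIRS[base] for base in seq_5_to_3]
    # Step 2: reverse it so that it reads 5' to 3'.
    return "".join(reversed(paired))


print(partner_strand("ATGC"))  # GCAT
```

Applying the function twice gets you back to the sequence you started with, which is a handy check that both steps are being done.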
Jim Clark (Chemguide.co.uk)
This material is based upon work supported by the National Science Foundation under Grant Number 1246120