Before considering the genetic evidence, this chapter gives a quick overview of some key concepts in genomics and genetics. It is not intended to read like a chapter from a biology textbook, but rather to highlight the salient points that are relevant to our overall topic. (FYI: Genetics is the study of single genes and their role in the way traits or conditions are passed from one generation to the next. Genomics is the study of an organism’s entire set of DNA, including all of its genes.)
DNA, chromosomes & genes
DNA (deoxyribonucleic acid) is a molecule containing the genetic instructions that help organisms function. Think of DNA as a set of instructions that describe how something should be built or function. RNA (ribonucleic acid) is a similar molecule, but its job is to carry out the instructions contained in the DNA.
Of the two, RNA is more versatile than DNA and performs a variety of tasks in an organism, but DNA is more stable and holds more complex information for longer periods of time. Precisely how long DNA remains ‘readable’ depends on how well it’s preserved. Research has shown that in perfect conditions it could remain readable for up to about 1.5 million years; however, few things are ever this well preserved.1
DNA is made up of chemicals called nucleotides which have nitrogenous bases, or ‘bases’ for short. In a way, bases are the genetic equivalent of letters in the alphabet. In DNA there are 4 bases: adenine (A), cytosine (C), guanine (G), and thymine (T). These bases always link in pairs, so A always pairs with T, and C always pairs with G.
There are about 3.2 billion base pairs in the human genome (the genome is all the DNA in an organism). Remember, DNA is a double helix, so one side of the DNA mirrors the other. Each side of 3.2 billion bases is as long as 1,000 Bibles, so both sides together contain as many characters as 2,000 Bibles.
The genome’s 3.2 billion base pairs are organised into chromosomes. If the genome were a book, the chromosomes would be its chapters. Human beings have 23 pairs of chromosomes in nearly every cell (46 chromosomes in total).
The chromosomes are further broken down into genes, which are the basic unit of genetics. Following the book analogy, if a chromosome is a chapter then genes would be the sentences. A gene is a set or sequence of DNA bases that codes for a specific protein, and there are about 20,000 of them in humans. The number of genes varies widely between chromosomes: the longest carries roughly 2,000 protein-coding genes, the smallest only a few hundred or fewer.
Within the genes, the sequences of bases or letters are structured into groups of three called codons. Thus, the ‘words’ of a gene would read something like GGC TCG TAC AGC ATG. In the book analogy, codons are equivalent to words.
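As a small illustration of the grouping just described, the Python sketch below splits a string of bases into three-letter codons. The function name split_into_codons is made up for this example, and the sequence is simply the sample ‘words’ quoted above. It assumes reading starts at the first base, which is a simplification.

```python
def split_into_codons(sequence):
    """Group a string of bases into consecutive three-letter codons.

    Leftover bases at the end (fewer than three) are ignored.
    """
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


codons = split_into_codons("GGCTCGTACAGCATG")
print(" ".join(codons))  # GGC TCG TAC AGC ATG
```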
A good way to picture the structure of DNA is to think of a ladder that has been twisted, and down one side of the ladder is a sequence of bases and on the other side of the ladder are the corresponding pairs. This structure is especially useful for two reasons:
- It means that DNA can easily make copies of itself. If we were to pull our ladder apart, it would be easy to manufacture another side for it by simply assembling the corresponding sequence of bases (i.e. matching up C with G, T with A, G with C, and A with T).
- The sequence in which these pairs are assembled is the ‘code’ for making proteins. Each protein that is produced is responsible for a particular function – which in turn determines how an organism functions, grows, develops, etc.
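The copying idea in the first point can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the names PAIRS and matching_side are invented for the example, the sequence is arbitrary, and the sketch ignores the fact that the two real strands run in opposite chemical directions.

```python
# Pairing rules from the text: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def matching_side(side):
    """Return the bases that pair with each base on one side of the ladder."""
    return "".join(PAIRS[base] for base in side)


one_side = "GGCTCGTAC"
other_side = matching_side(one_side)
print(one_side)
print(other_side)  # CCGAGCATG
```

Applying matching_side to the result gives back the original side, which is why either half of the ladder is enough to rebuild the whole.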
Key take-outs and an analogy
The explanation above is a very brief overview, but the key ‘take-out’ from this explanation is the length of the genome, made up of about 3.2 billion base pairs (i.e. pairs of ATs, CGs, GCs and TAs). That provides a lot of data to study by observing the patterns and sequences within the various genes, and even how some of the chromosomes are structured.
Another take-out from this is that because DNA is a chemical molecule, it can remain readable for hundreds of thousands of years in the right conditions. Whilst this might be disappointing from the perspective of reading DNA from dinosaurs that died many millions of years ago (sorry Jurassic Park fans), it’s perfect for our context of understanding human origins, and it gives researchers more than enough information to work with as they compare ancient and modern genomes to understand human origins.
To summarise the short explanation above into a crude analogy, we could think of the genome like this:
- Genome = Book.
- Chromosomes = Chapters.
- Genes = Sentences.
- Codons = Words.
- Bases = Letters of the alphabet.
Let’s build on this book analogy to explain some of the essentials of genomics and genetics that are relevant to our context.
Let’s pretend that many hundreds of years ago, there was a village. Each family in the village had something called a ‘Family Book’, and as part of the coming-of-age tradition every teenager had to re-write their own copy of the Family Book.
To do this, they strictly followed a process which meant they took certain chapters from their mother’s copy and certain chapters from their father’s copy, and made every effort to copy exactly what was there. This meant that if either parent had made a copy error, then they copied the error too, exactly as it was.
However, being human, and because these books were quite long, they made their own mistakes as well from time to time. There were various kinds of mistakes:
- A base change (swapping a letter that changes a word)
  - e.g.: “It was a good store” becomes “It was a good story”.
- Deletion (dropping a letter or word)
  - e.g.: “It was a good story” becomes “It was a god story”.
- Recombination of letters or words
  - e.g.: “It was a god story” becomes “It was a dog story”.
- Insertion (inserting a letter or word)
  - e.g.: “It was a dog story” becomes “It was a dog history”.
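The four kinds of copying mistake listed above can be mimicked with ordinary string operations. The Python sketch below is purely illustrative: the helper names (substitute, delete, invert, insert) are invented here, and the ‘recombination’ example is modelled as reversing a short run of letters, which is one possible reading of that bullet.

```python
def substitute(text, position, new_letter):
    """Swap the letter at `position` for `new_letter`."""
    return text[:position] + new_letter + text[position + 1:]


def delete(text, position, length=1):
    """Drop `length` letters starting at `position`."""
    return text[:position] + text[position + length:]


def invert(text, start, end):
    """Reverse the letters from `start` up to (not including) `end`."""
    return text[:start] + text[start:end][::-1] + text[end:]


def insert(text, position, extra):
    """Insert the letters in `extra` just before `position`."""
    return text[:position] + extra + text[position:]


book = "It was a good store"
book = substitute(book, 18, "y")
print(book)  # It was a good story
book = delete(book, 10)
print(book)  # It was a god story
book = invert(book, 9, 12)
print(book)  # It was a dog story
book = insert(book, 13, "hi")
print(book)  # It was a dog history
```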
These are very hypothetical changes, but they illustrate how small changes over time can completely transform the meaning of a saying. In real life, think of the example earlier of the chicken embryo where one small genetic tweak made the chicken embryo form a snout with a palate rather than a beak. Back to our story…
Assuming this tradition goes on for hundreds of years, eventually there would be a large number of Family Books to compare to one another. As the young adults married between families they would have brought their Family Books with them, and their children would have copied them as per the tradition. However, based on the content of specific paragraphs it would be possible to find these ‘textual mutations’ or some other kind of marker that helped to trace the family lineages.
By comparing the books of children to the books of their parents, we’d be able to find all sorts of interesting and helpful statistics, such as: how many mutations get made per generation (on average 130); how large certain families might have been (the Smiths became a huge family); which family groups tended to intermarry most (Smiths and Joneses); where various family groups tended to live (Smiths seem to live mostly in the north, Joneses in the east); who had some interesting ancestors (the Whyatts have some Brownes in their distant family tree) and so on.
This little analogy is a cameo of the broader fields of genomics and genetics. Genomics is the study of the whole book, whereas genetics is the study of what particular sentences mean.
Remember that each set of chromosomes in the human genome is as long as 1,000 Bibles, so that is a very rich source of sequences to compare. Even if someone were to say that only 1% of our genome had been sequenced, that would still equate to 10 Bibles’ worth of characters (bases) to study.
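The arithmetic behind that comparison is simple enough to check in a few lines of Python. The constants come from the figures quoted in this chapter; the size of one ‘Bible’ is not a measured value but is inferred here by dividing 3.2 billion by 1,000, and the function name bibles_worth is made up for the example.

```python
BASES_PER_GENOME_COPY = 3_200_000_000  # about 3.2 billion, as quoted in the text
BIBLES_PER_GENOME_COPY = 1_000         # the text's comparison

# Implied characters per Bible (an inference from the two figures above).
CHARS_PER_BIBLE = BASES_PER_GENOME_COPY // BIBLES_PER_GENOME_COPY


def bibles_worth(percent_sequenced):
    """How many Bibles' worth of bases a given percentage of the genome holds."""
    bases = BASES_PER_GENOME_COPY * percent_sequenced // 100
    return bases / CHARS_PER_BIBLE


print(CHARS_PER_BIBLE)   # 3200000
print(bibles_worth(1))   # 10.0
```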
DNA is especially important in the study of human origins, because the variations between two genomes can tell us how closely or distantly related they are. The mutations that were alluded to in the story above are the random changes that happen in all DNA sequences over time. While they can have significant consequences, most don’t and are quietly passed down to descendants. When genomes are sequenced and compared, these inherited mutations can be used to trace lineages and build a family tree of sorts. If two individuals share the same mutation, the most likely explanation is that they both inherited it from a common ancestor.
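To make the idea of comparing genomes concrete, here is a toy Python sketch that counts the positions at which short sequences differ. Everything in it is hypothetical: the three sequences are invented, the family names are borrowed from the story above, and count_differences is a made-up helper. Real comparisons involve far longer sequences that must first be aligned.

```python
def count_differences(seq_a, seq_b):
    """Count positions where two equal-length, aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Invented sequences for three hypothetical individuals.
samples = {
    "Smith":  "GGCTCGTACAGCATG",
    "Jones":  "GGCTCGTACAGCATA",
    "Whyatt": "GGATCGTTCAGCTTA",
}

names = list(samples)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        diff = count_differences(samples[first], samples[second])
        print(f"{first} vs {second}: {diff} differences")
```

In this made-up data the Smith and Jones sequences differ at only one position, so under the reasoning above they would be judged more closely related to each other than either is to Whyatt.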
But aren’t mutations lethal?
Some are, in which case the embryo doesn’t develop. Some are detrimental, but not lethal (like Down syndrome, which results from an extra copy of chromosome 21). Others occur in areas of the genome known as non-coding, which do not carry the instructions for a protein, and many of these changes appear to have little or no effect. Non-coding sequences make up most of our genome, so such mutations are quietly passed down from generation to generation. By tracking where these mutations occur, it’s possible to assess who is descended from whom. (Remember that, on average, there are roughly 130 new mutations in each generation.)
DNA as a mystery solver
Using DNA to solve human mysteries has come a long way over the last few decades. For example, DNA evidence has been robust enough to convict or exonerate people for crimes. Arrested in 1987 and convicted in 1988, Colin Pitchfork became the first murderer to be convicted using DNA evidence, after his genetic profile matched semen samples recovered from two murder victims.2 Meanwhile, at the opposite end of justice, Ron Williamson was convicted of murder and spent 11 years in prison before the evidence presented against him was DNA tested and found to be that of a man who, ironically, had been a witness at Williamson’s trial. (Incidentally, Williamson’s story is the subject of John Grisham’s non-fiction book The Innocent Man: Murder and Injustice in a Small Town.)3
In 2015, an Australian man was charged with multiple sexual assaults after police were able to match DNA from the crime scenes with that of a close relative whose DNA was already on record. This helped the police narrow down the suspects until he was caught.4
The point is that DNA has proven to be immensely valuable in solving mysteries, and so it’s easy to understand the benefits that genetic discoveries are bringing to archaeology and palaeoanthropology. So with that overview of genomics and genetics in mind, let’s return to our original discussion…
1 Morten Allentoft et al., The half-life of DNA in bone: measuring decay kinetics in 158 dated fossils (Proceedings of the Royal Society B: Biological Sciences, 2012) 279:4724-4733
2 https://en.wikipedia.org/wiki/Colin_Pitchfork (Accessed 2 Oct 2016)
3 https://en.wikipedia.org/wiki/Ron_Williamson (Accessed 2 Oct 2016)
4 Alleged ‘North Adelaide rapist’ charged over sexual assaults in 2012 after police use DNA testing for evidence (ABC News, 24 July 2015) https://goo.gl/ZTqb6W