Once Upon a DNA

Omar Shabana explores the ins and outs of the information in our DNA, and how we use it.

Image credit: Arek Socha, Pixabay

26 million. That’s the number of DNA testing kits reportedly sold by the four biggest genetic-testing companies by the end of 2018. These companies have had a very successful marketing strategy: a promise to show you your undeniable ancestry, an attractive offer to anyone who might be of mixed race or is simply curious. Since then, DNA testing, its accuracy, and its privacy implications have been under a strong spotlight and even stronger scrutiny by the media. A few key questions arise. How does testing work? Why might one get varying results depending on the company used? What other information could be derived from DNA tests? And what issues might all this raise?


“Your backstory is in your DNA” are the first words you’ll see if you visit AncestryDNA.com at the time of writing. Is this statement true? Not exactly, and here’s why:

The DNA code, consisting of the four nucleotides A, T, C, and G, is present in every living thing, and the machinery that reads it is (more or less) the same everywhere. Those nucleotides can be thought of as Lego bricks of four different colours, and a huge number of them align to form two long helical strands. When it comes to humans in particular, our DNA is made up of ~6 billion nucleotides arranged as multiple unbranched helical strands. The DNA sequence is read from left to right, just as you are reading this article. For example, AGCCCTCCAG is the beginning of the 1,481-nucleotide-long sequence for insulin.

But how does all of this explain anything about ancestry? To put it quite bluntly, it doesn’t. If I took your DNA and applied the most powerful sequencing technology available today, it would only show me a bunch of meaningless letters. They become meaningful only when they’re compared to other, already existing sequences. This is incredibly important to keep in mind. DNA left at a crime scene is meaningless until it’s compared to the DNA of a suspect. Similarly, your DNA alone doesn’t say anything about your origins or your ancestry. Instead, genetic companies feed your DNA sequence into software that compares it against thousands of others, and this is where SNPs come in. Humans have ~99.5% identical DNA, so the differences that separate you and me are present in only 0.5% of our sequence. Single Nucleotide Polymorphisms (or SNPs for short) are the most common type of genetic difference between us. Simply put, they’re a change of a single nucleotide in a person’s genome – my C in a certain gene could perhaps be a G in yours. These SNPs are heritable, passed from parent to child, and they are what is most commonly used to determine how genetically similar you are to someone else. The more SNPs you have in common with someone, the more closely related you are genetically and the more likely you are to share a recent common ancestor.
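To make the idea of "SNPs in common" concrete, here is a minimal Python sketch. It is only an illustration: the SNP names and nucleotides are invented, each person is reduced to a single letter per position (real genotypes carry two copies), and the function `shared_snp_fraction` is a hypothetical helper, not any company's actual method.

```python
def shared_snp_fraction(person_a, person_b):
    """Fraction of jointly tested SNP positions where both people carry the same nucleotide."""
    common = person_a.keys() & person_b.keys()
    if not common:
        raise ValueError("no SNP positions in common")
    matches = sum(1 for snp in common if person_a[snp] == person_b[snp])
    return matches / len(common)


# Hypothetical SNP identifiers and nucleotides, invented for illustration.
me = {"snp1": "C", "snp2": "A", "snp3": "T", "snp4": "G"}
sibling = {"snp1": "C", "snp2": "A", "snp3": "T", "snp4": "A"}
stranger = {"snp1": "G", "snp2": "A", "snp3": "C", "snp4": "A"}

print(shared_snp_fraction(me, sibling))   # 0.75
print(shared_snp_fraction(me, stranger))  # 0.25
```

On its own, either profile is just a list of letters; the number only appears once two profiles are compared, which is the point made above.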

Companies like AncestryDNA compare your ~300,000 SNPs to their own reference population – thousands of individuals who self-reported their ancestry and provided their DNA for testing. The clusters of SNPs that were found to be common to a specific geographical location were then divided into 43 categories and assigned labels such as “Sub-saharan African” or “West Asian”. AncestryDNA assumes that your DNA is a mixture of their 43 reference populations. The company looks at haplotypes (stretches of DNA) instead of every single SNP because our DNA is inherited in segments. 

But to understand what haplotypes are and how they’re inherited through generations, we first need to take a little detour. Each one of us has two copies of each chromosome, one inherited from the father and one from the mother. Each sperm and egg (i.e. each gamete) contains only one copy of each chromosome, so when the egg and sperm fuse, the baby will have the normal two copies. We might then ask: how does the body decide whether to use the father’s copy or the mother’s when making its own gametes? The answer is that it uses a mix of both. During meiosis (the process of producing gametes), every chromosome exchanges DNA fragments with its partner copy from the other parent. Those fragments are called haplotypes, and each contains a cluster of SNPs. Therefore, each chromosome in a resulting gamete is a mix of the two parental chromosomes, with a mixture of haplotypes. This happens in every single generation, so if you have a haplotype that is usually found only in individuals from elsewhere in the world, chances are one of your ancestors came from there.

AncestryDNA (2018) depicts how various pieces of DNA from your parents could shuffle and recombine. The red and blue chromosomes from one of your parents will connect with each other and exchange haplotypes. This will produce two chromosomes with a mix of red and blue haplotypes, and one of those chromosomes will be passed on to you.
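The shuffling described here can be mimicked with a short Python sketch. It assumes a single crossover point per chromosome pair and treats each chromosome as a list of six haplotype labels; both are simplifications chosen for illustration, and the function names are hypothetical.

```python
import random


def recombine(red, blue, rng):
    """Single crossover: swap the tails of two parental chromosomes.

    Each chromosome is a list of haplotype labels of equal length.
    Returns the two recombined chromosomes.
    """
    if len(red) != len(blue):
        raise ValueError("chromosomes must have the same length")
    point = rng.randint(1, len(red) - 1)  # crossover position, never at the ends
    return red[:point] + blue[point:], blue[:point] + red[point:]


def make_gamete_chromosome(red, blue, rng):
    """Pick one of the two recombined chromosomes to pass on to the child."""
    return rng.choice(recombine(red, blue, rng))


rng = random.Random(2018)  # fixed seed so the sketch is repeatable
red = ["red"] * 6
blue = ["blue"] * 6
child_copy = make_gamete_chromosome(red, blue, rng)
print(child_copy)  # a run of one colour followed by a run of the other
```

Real chromosomes can cross over more than once, but even this toy version shows why you inherit ancestry in stretches rather than one SNP at a time.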

23andMe (2019) shares the geographical distribution of the haplotypes they test for. One of the hundreds of haplotypes is the maternal haplogroup H. It is mainly concentrated around Europe, so if you have it, there’s a reasonable chance that one of your ancestors is European.

All the information from the haplotypes is then gathered, added up and put through a statistical model to give the results you see in advertisements. Both 23andMe and AncestryDNA use a hidden Markov model for this step.
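For the curious, here is a toy hidden Markov model in Python. It is a sketch under loud assumptions, not either company's model: there are only two made-up reference populations, two made-up haplotype variants ("h1" and "h2"), and every probability is invented. The hidden states are the population each stretch of a chromosome most resembles, and the standard Viterbi algorithm picks the most likely sequence of populations along the chromosome.

```python
import math
from collections import Counter


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely sequence of hidden states (log-space Viterbi).

    All probabilities must be greater than zero, since logarithms are taken.
    """
    # best[s] = (log-probability of the best path ending in s, that path)
    best = {
        s: (math.log(start_p[s]) + math.log(emit_p[s][observations[0]]), [s])
        for s in states
    }
    for obs in observations[1:]:
        step = {}
        for s in states:
            prev = max(states, key=lambda p: best[p][0] + math.log(trans_p[p][s]))
            score = best[prev][0] + math.log(trans_p[prev][s]) + math.log(emit_p[s][obs])
            step[s] = (score, best[prev][1] + [s])
        best = step
    return max(best.values(), key=lambda item: item[0])[1]


# Every label and number below is invented for illustration.
states = ["Europe", "West Asia"]
start_p = {"Europe": 0.5, "West Asia": 0.5}
# Neighbouring stretches usually share an ancestry, so staying put is likely.
trans_p = {
    "Europe": {"Europe": 0.9, "West Asia": 0.1},
    "West Asia": {"Europe": 0.1, "West Asia": 0.9},
}
# How common each haplotype variant is in each reference population.
emit_p = {
    "Europe": {"h1": 0.8, "h2": 0.2},
    "West Asia": {"h1": 0.3, "h2": 0.7},
}

observed = ["h1", "h1", "h1", "h2", "h2", "h2"]  # one customer's haplotypes
path = viterbi(observed, states, start_p, trans_p, emit_p)
shares = {state: count / len(path) for state, count in Counter(path).items()}
print(path)
print(shares)
```

The percentages in a results page are, in spirit, shares like these: the fraction of your DNA whose best-matching label is a given reference population.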

Going back to the “Your backstory is in your DNA” phrase, I hope you can now appreciate that your DNA doesn’t spell out your history; it isn’t compared to the preserved remains of humans across history. Instead, it’s compared to other individuals worldwide to get a good estimate of where your parents and grandparents might have moved from. It’s more of an educated guess than a foolproof statement.

Accuracy and precision

When it comes to ancestral DNA, there are multiple issues with precision, despite what the companies might claim. For starters, the reference population is not evenly spread across the geographical categories they use. An example is the disproportionate number of reference samples from Europe compared to Africa, so much so that some of the companies offered Africans free DNA testing in order to obtain a larger sample size from them. This means that if you are of African descent, your DNA results will be far less accurate and precise than those of your European counterparts.

Even though 23andMe and AncestryDNA have very similar numbers of individuals in their reference populations, they still manage to generate different ancestry results for the same customer. This is most likely because each reference group contains different participants. In addition, a reference population of around 14,000-16,000 (used by most companies) is most certainly not representative of a global population that is 7 billion strong. In AncestryDNA, North Africa was represented by only 41 people in 2018, when the actual population is closer to 200 million.
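A quick back-of-the-envelope calculation in Python shows how thin that coverage is. It uses only the figures quoted above; the helper name is hypothetical.

```python
def people_per_reference_sample(population, reference_size):
    """How many people each reference individual has to 'stand in' for."""
    return population / reference_size


# Figures quoted in the article: 41 reference individuals for ~200 million
# North Africans, and up to ~16,000 for a world of ~7 billion.
north_africa = people_per_reference_sample(200_000_000, 41)
world = people_per_reference_sample(7_000_000_000, 16_000)

print(f"{north_africa:,.0f}")  # 4,878,049
print(f"{world:,.0f}")         # 437,500
```

In other words, each North African reference individual stood in for nearly five million people, roughly eleven times more than the already stretched worldwide average.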

So different companies use slightly different models, with different reference populations, and different distributions of these populations. It isn’t a surprise then if you get different results depending on the company used. This, however, is changing rapidly. If you’ve done a test you’ll notice that your results will keep changing slightly as time goes on. This is because there are more and more people taking these tests now, which means that companies can dramatically increase their reference populations. This leads to more precise results, as the more data they have, the better their model can be. 

What else can my DNA show me?

Predisposition to diseases is perhaps the most important thing that your DNA could tell you. Some diseases are complicated to decipher through genes, while others are relatively straightforward. Cystic fibrosis and thalassemias have become very easy to spot through genetic testing, for example. Such diseases have been extensively studied, and their link to a particular mutation is so strong that it is now considered a fact that mutation x causes disease y. DNA testing of you and your partner could therefore determine the likelihood of your future child having certain diseases.
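For a single-gene recessive condition (cystic fibrosis is inherited this way), that likelihood follows from a simple Punnett square. The Python sketch below is an illustration only: "A" stands for a working copy of the gene and "a" for the disease-causing copy, both placeholder labels, and `affected_probability` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import product


def affected_probability(parent_one, parent_two, disease_allele="a"):
    """Chance that a child inherits two copies of a recessive disease allele.

    Each parent is a two-character genotype such as "Aa"; every cell of the
    Punnett square (one allele from each parent) is equally likely.
    """
    outcomes = list(product(parent_one, parent_two))
    affected = sum(
        1 for pair in outcomes if all(allele == disease_allele for allele in pair)
    )
    return Fraction(affected, len(outcomes))


print(affected_probability("Aa", "Aa"))  # 1/4 (two carriers)
print(affected_probability("Aa", "AA"))  # 0   (only one carrier)
print(affected_probability("aa", "Aa"))  # 1/2 (one affected parent, one carrier)
```

This only works because one gene tells almost the whole story; it is not a model for conditions shaped by many genes and by environment.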

That being said, when it comes to diseases such as depression, schizophrenia or cancer, things become much more complicated. Genetic links to such diseases are less clear-cut, and in addition, environmental and lifestyle factors come into play. Due to this ambiguity, you are instead said to have a predisposition to a disease. Metaphorically speaking, genetic predisposition is a loaded gun, your environment points it at you, and your lifestyle fires it. Some predispositions are much better established in the scientific literature than others, so if your results claim that you’re predisposed to something, I’d suggest you do your own research about it. 23andMe is arguably the most trustworthy private company to obtain such results from, as they are monitored by the Food and Drug Administration (FDA).

Speaking of food and drugs, there are two fields now on the rise: nutrigenomics and pharmacogenetics. Nutrigenomics studies your DNA to reveal how your body could react to certain foods. For example, links have been made between particular SNPs and an individual’s likelihood of gaining weight on a high-fat, high-carbohydrate diet. Those with the AA genotype in the FTO gene tended to have higher BMIs than those with the TT genotype when ingesting high levels of fat and carbohydrates. Pharmacogenetics applies the same principle to drugs. The field looks at how different SNPs could affect your reactions to certain drugs. The aim here is to give birth to personalised medicine, where the ‘one size fits all’ approach to drugs and their doses gives way to everyone receiving a personalised drug regimen depending on their genetic composition. Both nutrigenomics and pharmacogenetics are relatively young, so we’ll be waiting for several years before such DNA analyses are precise enough and, in the case of pharmacogenetics, approved for use.

Beyond that, things become very cloudy. Some genetic companies claim to be able to predict your child’s inherent athleticism or academic ability, and it wouldn’t be an exaggeration to label these claims as almost laughably bold. We don’t even have universal definitions of athleticism or intellect, never mind know enough about our genes to make these claims. Additionally, such characteristics are so heavily influenced by environment, lifestyle, and epigenetics that it ends up becoming a mess. Such bold claims bring me to one final point.

Have the terms ‘DNA’ and ‘gene’ become cultural buzzwords? 

Who owns DNA? Is it companies, governments, the police, or is it you? There doesn’t necessarily need to be one owner of your DNA. But given the amount of information one could determine about you from this molecular string, it would be wise to limit access to it to a handful of bodies. Just like your internet history and the recorded conversations between you and your smartphone, DNA should be private. The issue of DNA privacy is slowly becoming a boogeyman that will scare a lot of us. Once you give your DNA to a company, you give them authorisation to use it however they see fit, including potentially passing that information on to other institutes.

In 1995, Dorothy Nelkin and M. Susan Lindee published their book The DNA Mystique: The Gene as a Cultural Icon, warning of what they called “genetic essentialism”. They argued that there would be a rise in the severe social overestimation of what DNA is able to tell us, and a consequent over-reliance on genes to explain things such as criminality and political leanings. The book explored the eugenics movements in the early 90s and the concerning obsession people had with genes. Nelkin and Lindee said in their book, “DNA has assumed a cultural meaning similar to that of the biblical soul. It has become a sacred entity, a way to explore fundamental questions about human life, to define the essence of human existence.”

As Bill Clinton put it, “DNA is the language in which God created life.” It seems that some people are gradually turning to DNA to explain their likes and dislikes, their personality, and who they are as a person. That is misguided, to say the least. Biologists study DNA as the essence of life, as something that every single known organism possesses and uses in the same way. However, what is sometimes left out by the media is the fact that a lot of your genes are not expressed, or are expressed at different levels. Gene expression depends on many different factors – including external influences – that scientists are still trying to understand. So yes, your DNA explains you biologically, but it’s in your hands whether you make the best of it or the worst.

Written by Omar Shabana and edited by Ailie McWhinnie.
