Photo: iStock.com/Claude Dagenais
Mark Twain famously said: “Anyone who can only think of one way to spell a word obviously lacks imagination.”
Twain may have been referring to the English language, but the language of our genetic code is just as quirky when it comes to spelling.
The secret in the sequence
The human genome contains all the instructions that a human needs to develop, grow and function, written in genetic code. Each separate instruction is known as a gene.
In the same way as computers store digital information using 1s and 0s, humans (and almost every other living organism on the planet) use DNA to store their genetic blueprints.
DNA is incredibly efficient at storing large amounts of information. In theory, just a teaspoon of DNA could store the digital data of the whole world. It does this using an alphabet of four letters, known as A, T, C and G. The key to understanding the genetic code is the exact sequence of these DNA letters.
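To make the comparison with 1s and 0s concrete, here is a minimal sketch of how digital data could be written in a four-letter alphabet. With four letters, each letter can stand for two binary digits. The particular pairing of letters to bits below is an assumption chosen for illustration, not a standard, and real DNA storage schemes are more involved.

```python
# Hypothetical pairing: each DNA letter carries two bits of information.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Write bytes as a string of DNA letters, two bits per letter."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(dna: str) -> bytes:
    """Read a string of DNA letters back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = encode(b"Hi")
print(message)          # CAGACGGC
print(decode(message))  # b'Hi'
```

Each byte of ordinary computer data becomes four DNA letters, which is part of why so much information fits into so little space.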
Hunting for rare genetic spelling mistakes
With over 3 billion letters to choose from in the human genome, spelling mistakes can happen, leading to permanent damage. They can occur in any gene, including those with important roles in protecting us against cancer, although thankfully, very rarely.
Rare mistakes that are present from the time of conception can cause individuals, and their family members, to develop cancer at younger ages than expected.
Scientists here at The Institute of Cancer Research, London, were the first to discover the BRCA2 gene – mistakes in which cause high risks of breast, ovarian, prostate and other cancers.
Hunting for rare gene mistakes is complex as there is no single ‘right’ answer for what the human genome sequence should be. If there were, all seven billion humans on earth would be identical.
How do you spell yours?
At least 99.9% of the genome sequence is exactly the same in any two humans. The remainder is scientifically interesting because it not only makes us look unique (like giving us different eye and hair colours) but also affects our risk of many diseases, in combination with our lifestyle and environment.
Spelling varies in the remaining 0.1% of our genetic sequence, just as it does in the English language. Did you know that there are at least 155 ways of spelling the name ‘Caitlin’ (notwithstanding the originality of baristas at a certain coffee chain)?
These genetic variations are harmless and can occur at 88 million different places in the genome. Some variations are more common in certain populations of the world. A good analogy is British and American spelling – neither ‘donut’ nor ‘doughnut’ is a spelling mistake. Two people from the UK are likely to use ‘doughnut’ whereas a Brit and an American are more likely to use different versions – it all depends on where you come from.
Harmless variations can equally happen in cancer-linked genes like BRCA2. If we look at part of its DNA sequence:
A CAT AAA AGT
The ‘C’ of CAT can either be a DNA letter ‘C’ or ‘T’ and the ‘A’ of CAT can either be an ‘A’ or ‘G’, with no effect on cancer risk.
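As a small illustration, the accepted spellings of this fragment can be listed by trying each allowed letter at each variable position. The sketch below assumes the two positions vary independently of each other, which is a simplification made for the example; the names `fragment`, `alternatives` and `spellings` are hypothetical.

```python
from itertools import product

# The fragment from the text with the spaces removed.
fragment = "ACATAAAAGT"

# Positions (counting from 0) where an alternative letter is harmless:
# the 'C' of CAT may be C or T, and the 'A' of CAT may be A or G.
alternatives = {1: "CT", 2: "AG"}


def spellings(seq: str, alts: dict[int, str]) -> list[str]:
    """Return every version of seq allowed by the alternatives."""
    choices = [alts.get(i, base) for i, base in enumerate(seq)]
    return ["".join(letters) for letters in product(*choices)]


for version in spellings(fragment, alternatives):
    print(version)
```

Two variable positions with two letters each give four versions of the same ten-letter fragment, all of them equally acceptable under this assumption.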
We know this because cheaper and faster DNA sequencing technologies have allowed researchers to look at the genes of thousands of healthy people. These spelling alternatives were found too frequently in healthy people to be mistakes causing a high risk of cancer. Neither is ‘right’ or ‘wrong’; they are different accepted versions of the same sequence.
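The reasoning here can be sketched as a simple frequency check: if a spelling turns up in too large a share of healthy people, it is unlikely to be a mistake that causes a high risk of cancer. The cut-off and the counts below are made-up values for illustration only; real laboratories weigh frequency alongside other evidence and use thresholds suited to each gene and disease.

```python
def carrier_frequency(carriers: int, people_sequenced: int) -> float:
    """Share of sequenced healthy people who carry the spelling."""
    if people_sequenced <= 0:
        raise ValueError("people_sequenced must be positive")
    return carriers / people_sequenced


def too_common_for_high_risk(
    carriers: int,
    people_sequenced: int,
    max_frequency: float = 0.001,  # hypothetical cut-off, not a clinical standard
) -> bool:
    """True if the spelling is seen more often than the cut-off allows."""
    return carrier_frequency(carriers, people_sequenced) > max_frequency


# Made-up counts: seen in 300 of 10,000 healthy people, versus 1 of 10,000.
print(too_common_for_high_risk(300, 10_000))  # True
print(too_common_for_high_risk(1, 10_000))    # False
```

A rare spelling is not automatically harmful, so a `False` result only means that frequency alone cannot settle the question.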
High stakes (steaks?) and hot potatoes
As we test more genes in more people, we are increasingly finding spelling differences that have never been seen before, whose effects are unknown. These so-called ‘variants of unknown (or uncertain) significance’ (VUS for short) are a hot potato for geneticists.
Deciding whether a VUS is a harmless spelling variant or a harmful spelling mistake matters hugely for genes like BRCA2. It can affect the treatment of patients with cancer. It can also affect whether those without cancer get a window of opportunity to reduce their risks, such as preventative surgery. Irreversible life-changing decisions hang in the balance, meaning stakes are high to make the right call.
There is no fool-proof method to interpret a VUS; scientists have to use a balance of probabilities, using complicated mathematical calculations and clever computer software.
In many cases, the available evidence is incomplete or conflicting, meaning subjective judgement calls have to be made, which may even change over time. This can lead to disagreements amongst laboratories and added anxiety for patients and their families.
A global genetic spelling bee
It has been 13 years since we first read all 130 volumes of genetic instructions that make up our genome. Sadly we are not yet fluent enough in this genetic language to understand or interpret exactly what we have read, let alone spell its words correctly.
Collaborative initiatives such as the Global Alliance for Genomics and Health, the Human Variome Project and ClinVar are attempting to collect comprehensive information about all of our genetic variations and mistakes. The aim is to come up with a list of which are definitely linked with disease and which are harmless. It is a gargantuan task, hampered by the fact that incorrect assumptions made in the past have been difficult to undo.
BRCA2 is one of 20,000 genes in the human genome and, despite being one of the most closely studied over the years, there is still much we do not know about it. Only after we understand what is ‘normal’ can we start to unpick more causes of genetic diseases, like rare inherited forms of cancer.
What is for sure is that hundreds of thousands, even millions, more people need to have their genes sequenced before we can truly appreciate how imaginative and variable our genetic language can be in finding right and wrong ways to spell (or should it be ‘skin’?) A CAT.
This article was a joint winner of the Professor Mel Greaves Science Writers of the Year 2016, originally presented at the ICR annual conference in June 2016.
Dr Sabrina Talukdar is a Clinical Research Fellow at the ICR.