What we do


We consider a wide variety of health information from our cohort volunteers.
We also link these data to "omics" information that is typically obtained from blood samples donated by our volunteers. This includes:

Genomics - DNA can be represented by a readout that is six billion letters long across 22 pairs of chromosomes (one chromosome in each pair from each of our birth parents) plus the chromosomes that determine our biological sex (the X and Y chromosomes - typically, XX is female and XY is male). There are only four letters in the DNA alphabet (A, C, T, and G) and we all have the same letters at most positions. The positions where we differ help to make us who we are by influencing things such as hair colour, eye colour, height, and risk of diseases.
Epigenomics - whereas genetic information is fixed throughout our lives, chemical additions to our DNA can help to turn our genes up or down, a bit like a dimmer switch for a light bulb. One example is DNA methylation (DNAm), which is referred to as an epigenetic modification. DNAm involves the addition of a chemical group to the C letter in the DNA alphabet. While the DNA alphabet can influence where these additions take place, they are also influenced by environmental factors. For example, we can count the number of cells with DNAm at a certain gene to identify individuals who are likely to be current smokers, former smokers, or never smokers. Some of these DNAm marks can remain for up to 30 years after someone has stopped smoking. As DNAm is sensitive to both genetic and environmental factors, it has great potential to improve risk prediction for many disease outcomes.
Proteomics - our blood contains many proteins that keep us healthy, and these proteins are the targets of many drugs. The human genome codes for around 20,000 proteins. The levels of some proteins in our blood are linked to health outcomes like cognitive decline and dementia. Protein levels can be influenced by our underlying genetics and can be regulated by epigenetic factors, such as DNAm.
By looking at how we differ in terms of our genetics and epigenetics, we can better understand how differences might arise in both protein levels and subsequent health outcomes. If we identify associations between a specific protein and a health outcome, we often want to know if higher levels of the protein are increasing disease risk or if the disease process is increasing the levels of the protein. We can use a variety of analysis methods that link genetic and epigenetic factors to both the protein and the health outcome to disentangle what is cause and what is consequence.


We are also interested in combining omics information (such as genetics, epigenetics, and proteomics) to improve risk prediction of disease outcomes. We try to identify patterns in these omics data that help us to predict, more accurately than the current gold-standard predictors, who will develop a disease.

Risk prediction 
It’s important to note the difference between “probabilistic” and “deterministic” risk. The latter suggests that something is definitely going to happen – if I jump into a swimming pool then I will definitely get wet – whereas the former tries to quantify chance. For example, it doesn’t always rain when the weather forecast suggests it will. In the biomedical field, we typically make probabilistic predictions: based on certain characteristics (genetics, epigenetics, lifestyle etc.), you might be at a higher risk of developing a disease than someone with a different pattern of characteristics. Of course, it is entirely possible that you end up not developing the disease but the other person does. However, across a large group of people, a greater share of those at high risk will develop the disease than of those at low risk. How accurate these predictions are depends on how accurate our statistical models are and how well they translate across different groups of people.
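For readers who like to see the idea in numbers, the short Python sketch below simulates probabilistic risk. It is an illustration only and is not one of our statistical models: the two risk values (20% and 5%), the group size, and the function name simulate_group are all made up for this example.

```python
import random


def simulate_group(risk, n_people, rng):
    """Count how many of n_people develop the disease when each person
    independently develops it with probability `risk`."""
    return sum(rng.random() < risk for _ in range(n_people))


# Hypothetical values chosen purely for illustration.
rng = random.Random(42)   # fixed seed so the run is repeatable
n_people = 100_000
high_risk = 0.20          # assumed risk in the "high risk" group
low_risk = 0.05           # assumed risk in the "low risk" group

high_cases = simulate_group(high_risk, n_people, rng)
low_cases = simulate_group(low_risk, n_people, rng)

high_rate = high_cases / n_people
low_rate = low_cases / n_people

print(f"High-risk group: {high_cases} of {n_people} developed the disease ({high_rate:.1%})")
print(f"Low-risk group:  {low_cases} of {n_people} developed the disease ({low_rate:.1%})")
print(f"High-risk people who stayed healthy: {n_people - high_cases}")
```

In a run like this, most people in the high-risk group stay healthy and some people in the low-risk group still become ill, yet the share of people who develop the disease is clearly larger in the high-risk group. That is what a probabilistic prediction means: it describes what happens across many people, not what will happen to any one person.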