Making science work for health

Polygenic scores – one size does not fit all

PHG Foundation Season 1 Episode 4

Dr Sowmiya Moorthie, Senior Policy Analyst at the PHG Foundation, explains polygenic scores: their potential, and their limitations, in predicting risk of common diseases.

Welcome back to Making science work for health, the PHG Foundation podcast that explains the most promising developments in science and their implications for healthcare.
 
In each episode, host Ofori Canacoo discusses with a PHG Foundation policy analyst the underpinning science, the ambitions for improving population health and the impact it could have on patients, on society and on the people delivering your healthcare.
 
Lots more about polygenic scores can be found on the PHG Foundation website, phgfoundation.org, where you can also read our latest report on the topic, Evaluation of polygenic score applications, which was authored by Dr Moorthie.

If you have any questions about the topic then you can email us at intelligence@phgfoundation.org

Ofori: 0:12

Welcome to 'Making science work for health', the PHG Foundation's podcast, exploring developments in genomics and related emerging health technologies. Social media and the many digital news outlets now mean more of us than ever are aware of the progress being made by teams of intrepid scientists and researchers around the world. Many of the latest advances feature genomics and 'omics related technologies, the field in which the PHG Foundation has 25 years of experience, helping policymakers get to grips with practical, on the ground delivery. 'Making science work for health' aims to strip away the gloss and explain what new science means for patients, health professionals, and members of society. My name is Ofori Canacoo, part of the communications team at the PHG Foundation and host of 'Making science work for health'. For this episode, we're talking about polygenic scores. Genetic research is ever increasing and one area of particular interest is using knowledge of common variants, or commonly occurring genetic changes, to determine the potential risk of disease susceptibility. Joining me for this episode is Dr. Sowmiya Moorthie, Senior Policy Analyst for Epidemiology at the PHG Foundation. Hello, Sowmi.

Sowmiya: 1:23

Hello.

Ofori: 1:24

How are you?

Sowmiya: 1:25

I'm good, thank you.

Ofori: 1:26

Would you like to tell us a bit about yourself?

Sowmiya: 1:28

I am a Senior Policy Analyst at the PHG Foundation. I work as part of the science group. My background is in epidemiology, science, public health, and I lead the work program on polygenic scores.

Ofori: 1:45

Great. So, polygenic scores is why we're here today. They're being talked about a lot at the moment in the context of disease prevention, particularly for cancers and other common diseases. Is that correct?

Sowmiya: 1:57

Yes, well, they've been talked about quite a lot, actually, over the past 10 years. And definitely from the perspective of common diseases, like cancer, diabetes, cardiovascular disease. But also, they've been discussed in the context of Mendelian diseases. So, when you think about cancers that run in families, or cardiac arrhythmias that have an inherited basis, and so on, as a way to improve risk prediction.

Ofori: 2:21

So why is there such an interest in polygenic scores?

Sowmiya: 2:24

Because we know that diseases have a genetic component, and we already use knowledge of the involvement of genes in disease to help treatment and management of people with cancer or rare diseases. Often the genetic changes we're interested in in such cases are rare, or they're not very common, and they can be considered as a small subset of the genetic underpinnings of a disease. So for many years people have been trying to understand the role of common genetic variants in disease, and to develop products that enable application of this knowledge as part of clinical and public health practice, so that we can use that information as well to help improve disease prevention, management and so on. That's where polygenic scores come in. And as the name suggests, common variants are found in more people, so such information ultimately could be relevant across more populations or more of the population.

Ofori: 3:19

I believe the concept of polygenic scores is rather complicated, so are you able to provide a user friendly definition to help us grasp what we'll be talking about today, at least for my benefit?

Sowmiya: 3:29

So you can think of polygenic scores as one way of assessing an individual's probability of developing certain outcomes, including diseases, but essentially what they're capturing is the genetic component of that risk. But there are some provisos in that. So, I often think of them as a proxy biomarker or measure, calculated on the basis of knowing what we call an individual's genotype. And they capture some proportion of genetic risk. So not all your genetic risk, because it depends what we know, and that genetic risk is only a part of overall risk.

Ofori: 4:10

Okay, so going a bit further on that then, what can polygenic scores tell us about a person's risk of developing a common disease such as prostate cancer or heart disease?

Sowmiya: 4:23

So this will differ depending on the disease and exactly what you're trying to predict. So if you think about common diseases, or more often now common complex diseases, they are common but they're also complex, because lots of things contribute to risk or your probability of developing that disease. There's your biology or genetics. There's also your lifestyle, and then also other environmental or what some people call random factors that you can't quite capture, and often it's those random factors that play a much larger role. And often for common complex diseases our risk is not just due to one of those things, it's an interplay between all those different things. So genetics plays a role and we can use polygenic scores to estimate some of this genetic component. So for heart disease, polygenic scores can provide an estimate of the genetic contribution to risk of disease, but it only captures a proportion of overall genetic risk, i.e. the part that you can find from those common genetic variants that have been identified in epidemiological studies to be associated with disease. We only know a little bit, because we know that a lot of genes, a lot of rare variants, and a lot of common variants will contribute to heart disease. We've identified some rare variants, we've identified some common variants, but not the whole totality of it. And that's why it's quite important to emphasize that it doesn't capture overall genetic risk, but a proportion based on what we know currently. Overall risk, as I said, is modulated by other non-genetic factors like your lifestyle, especially in the case of complex diseases. So you have to be really careful in the way you interpret information from polygenic scores.
And because we're at different stages of identifying the genetics underlying different diseases, and in developing models that can help you calculate a polygenic score and then interpret it, what you can tell about your risk of developing a disease differs from disease to disease, based on how you want to use the score and our understanding of the genetics.

Ofori: 6:24

In what ways do polygenic scores differ from other biomarkers?

Sowmiya: 6:28

I think the major way I would say they differ from other genetic biomarkers is that they're a calculated score. So usually when we use genetics, such as in the context of rare diseases or cancer, you're using it to diagnose whether someone's got a condition or not. So for example, if someone comes to you presenting with symptoms of cystic fibrosis, you might run a diagnostic test. In this context, you're looking for the presence or absence of specific variants to then make a decision about whether or not they have that particular condition. With polygenic scores, what you're doing is you're looking across a series of variants, what we call single nucleotide polymorphisms, or SNPs for short. So they're individual changes in the genome and you're looking across many of these, often hundreds of them across the genome, bringing it together and producing a score and using that score to give you some information about whether that person may or may not develop a particular condition. So I think it's important to bear in mind that they differ from other genetic biomarkers in that you're not looking directly for the presence or absence of a variant. You're bringing together information from across the genome, and you're using an algorithm to bring that information together and calculate a score. And that score tells you something about whether or not an individual may develop a particular condition. So it's quite different to other genetic biomarkers, and conceptually they're also different because it's giving you probabilistic risk information and you can think of it as a proxy biomarker. It's not directly measuring the presence or absence of variants, but using a calculated score to give some indication of whether or not someone will develop a disease.
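
[Editor's illustration] The contrast described here, between checking for the presence or absence of one variant and calculating a score across many SNPs, might be sketched as below. This is only a minimal sketch: the SNP identifiers, weights and genotype are invented for the example, and a real score would typically use many more variants with weights taken from published studies.

```python
# Hypothetical SNP weights (effect sizes); identifiers and values are made up.
WEIGHTS = {"snp_a": 0.10, "snp_b": 0.05, "snp_c": -0.02}


def carries_variant(genotype, snp_id):
    """Diagnostic-style check: is a specific variant present or absent?"""
    return genotype.get(snp_id, 0) > 0


def polygenic_score(genotype, weights):
    """Calculated score: sum of weight x number of risk alleles (0, 1 or 2).

    SNPs missing from the genotype are counted as 0 here, purely to keep
    the example short; that is an assumption, not a recommended practice.
    """
    return sum(w * genotype.get(snp_id, 0) for snp_id, w in weights.items())


# One person's (invented) genotype: number of risk alleles at each SNP.
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

has_variant = carries_variant(person, "snp_c")   # a yes/no answer
score = polygenic_score(person, WEIGHTS)         # a number on a continuous scale
```

The first function gives a yes or no answer about one variant; the second gives a number that only becomes meaningful once it is compared with the scores of other people.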

Ofori: 8:11

If at all possible, could you take us through the potential ways of calculating polygenic scores?

Sowmiya: 8:16

Yep, so this requires a number of steps and I'll go through each one a little bit. So the first step is identifying SNPs. So we have studies which have identified many common genetic variants, or as I said SNPs, that are associated with disease. And we know that each variant only has a small effect on disease. So if you look at each individually, it doesn't tell you a lot about susceptibility of getting something. Also, individuals have different sets of these variants. So it makes using these SNPs individually for disease risk prediction not very user friendly. But you can look at their collective impact, which is what polygenic scores are doing, and use this information. So you aggregate all the information across the SNPs, and that's what a polygenic score is actually telling you. So in order to aggregate and calculate it, what you need to do is develop a model. And a model is basically just an algorithm that allows you to bring together all this information from these SNPs and calculate a score. And I always go back to cardiovascular disease and the QRISK tool as a good example. So the QRISK tool is widely used within clinical practice and what it's doing is taking into consideration things like your age, sex, BMI, maybe a cholesterol measure, your blood pressure, and it brings it all together, applies an algorithm to this information, and calculates a score that tells you where you lie on a distribution of risk. So if you think about polygenic scores, they're essentially doing the same thing. So you're taking each of these individual SNPs, using an algorithm to bring information across these SNPs together, and calculating a score, and you can look at where you fall in a distribution and translate that into an indication of risk. So in a very simplistic way, they're essentially a process of calculating a score. So usually what happens is that within research, you do research to identify the SNPs, usually through genome wide association studies.
Then you have further studies that go on to identify which SNPs are best used as part of a model. So these are polygenic score modeling studies. There's also research that goes on in creating different algorithms and models to explain what's the best way of bringing together the SNP information. There's a lot of underpinning research, which is trying to identify which are the best SNPs to include in a model and what weight you should give them. And so this is really about which is more important in relation to disease. And this is based on looking at their effect sizes, and what's the best way of bringing this information together that can give you the best prediction. And if you go back to what I said originally, polygenic scores give you a little bit of information on your overall risk. So what we know is that you can use a model to calculate your polygenic score, but it doesn't really tell you much about your overall risk of developing a particular outcome and you really want to think about all the other factors that also contribute to risk. So, similar to the cardiovascular disease tool, you might want to think about age, sex, BMI and bring all of those together with a polygenic score. And you can include that as well as part of this, and this is what is called an integrated risk model or integrated risk calculation, to then improve your estimation of risk. And I think this is what's been put across now, and is becoming more accepted: that polygenic scores by themselves are not very predictive. But if you put them to use as part of integrated risk prediction, they have much more power to identify someone who might be at increased risk.
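
[Editor's illustration] Two of the steps described above, seeing where a score falls in a distribution and then combining it with other risk factors in an integrated risk model, could be sketched roughly as follows. Everything in the sketch is an assumption made for illustration: the reference scores, the coefficients and the logistic form are invented, and this is not QRISK or any validated tool.

```python
import math
from statistics import mean, pstdev


def standardise(score, reference_scores):
    """Express a raw score as a z-score relative to a reference population."""
    return (score - mean(reference_scores)) / pstdev(reference_scores)


def percentile(score, reference_scores):
    """Percentage of the reference population with a score at or below this one."""
    at_or_below = sum(1 for s in reference_scores if s <= score)
    return 100.0 * at_or_below / len(reference_scores)


def integrated_risk(age, bmi, pgs_z, coefficients):
    """Toy logistic model combining conventional risk factors with a polygenic score."""
    linear = (coefficients["intercept"]
              + coefficients["age"] * age
              + coefficients["bmi"] * bmi
              + coefficients["pgs"] * pgs_z)
    return 1.0 / (1.0 + math.exp(-linear))


# Invented polygenic scores for a small reference population.
reference = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40]

# Invented coefficients; a real model would estimate these from study data.
COEFFS = {"intercept": -6.0, "age": 0.05, "bmi": 0.04, "pgs": 0.3}

z = standardise(0.35, reference)          # position relative to the average score
pct = percentile(0.35, reference)         # position in the distribution

# Same age and BMI, with this person's score versus an average score (z = 0).
risk_with_score = integrated_risk(55, 27, z, COEFFS)
risk_average_score = integrated_risk(55, 27, 0.0, COEFFS)
```

In this toy model the polygenic score is just one more input alongside age and BMI, so a higher than average score shifts the estimated probability upwards without replacing the other factors.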

Ofori: 11:34

So all of this work that has gone into polygenic scores, would it not be adding more complexity and potentially more work for disease prevention pathways?

Sowmiya: 11:45

Yes and no. It depends on how PGS are being used and the pathway. As I said before, PGS are a series of different biomarkers and they can be used in different healthcare contexts. So if you take that together, it means that you're using information generated from such analysis differently across different healthcare pathways. For example, one of the use cases that is often discussed is incorporation of polygenic scores into existing risk tools. So here in the NHS, people between 40 and 74 are eligible for an NHS health check. As part of this, risk of cardiovascular disease is assessed, usually using a risk tool called QRISK, which essentially brings together information on different risk factors like age, sex, cholesterol, BMI, et cetera, to calculate the probability of having an event in the next five or 10 years. So in this use case, PGS would be another data point, like cholesterol and so on, that you can add to this tool. So the humanities team has led some work on this and what they've found is that as long as you have a validated mechanism for getting that information into the tool, it really doesn't add too much complexity to the pathway. You may need extra information for GPs and so on, but it largely doesn't cause a lot of complexity. Another use case, taking again cardiovascular disease as an example, is screening people earlier, at a younger age, for cardiovascular disease. This is because although you have quite a lot of events after 40, some people have an event at a younger age. And it might be good to identify these people early to offer them some interventions. But from a health system perspective, this is a more complex thing to deliver because we don't currently do it. So you have to think about what age group you might want to invite. You have to think about what risk assessment tool you might want to offer them. We don't have a validated tool at the moment.
You have to think about how that would work from a system perspective, you know, who do you need in place? What interventions would you offer them? And you also need to think about how effective that would be. Again, you have to think about each use case carefully. For some things, it would be an added benefit and won't make much difference in terms of implementation. In other cases, especially when it's a new or novel use of risk information, it's much more complex.

Ofori: 13:56

If I were to ask a healthcare professional to assess my polygenic score for a particular disease, is that an easy and realistic request?

Sowmiya: 14:06

I would say yes and no. And it's quite nice to bring up a specific disease because, as I said before, it's quite important to bear in mind that for some diseases it's realistic and for some it's not. So coming to the yes, in terms of calculating a score, in some senses, it's relatively straightforward. Once you have your genotype data, you have a validated algorithm to apply to that data to calculate a score. And there's science that allows you to interpret it. Fairly straightforward. And you could consider, to some extent, it's fairly cheap and easy to do. The main reason that it's not a realistic request currently is because we don't have such tests widely available in the NHS to provide this information. There are some niche uses being trialed and there are companies developing products, but there are some challenges in getting together the evidence base to really understand, is this a really good test? Does it work well and how should we be providing it? So there is quite a lot of discussion that's ongoing on how you evaluate such products and decide if they're useful. And so that means we don't use it widely at the moment. And we need to start thinking about that next, really.

Ofori: 15:18

So, based on everything you've said so far, would it be fair to say that the use of polygenic scores is quite versatile?

Sowmiya: 15:26

Yes, definitely, yeah, it is. I think that's one of the advantages of polygenic scores in that sense. In terms of their calculation, they're so easy. You need genotype data, which you can get quite easily these days and quite cheaply. You need to apply an algorithm to it, and yeah, I think we're getting quite good at developing these algorithms. If you have algorithms for different diseases, you can calculate scores for different diseases for different contexts. So in some senses, that's the easy part. The difficult part is thinking, okay, when is it useful? How is it useful? Are these algorithms well validated, well developed? Are they ready for implementation? What kind of information should we give health professionals? Should we give the public? Should we give patients in interpreting them? And so on. So I think that's the difficult part.

Ofori: 16:15

As you said at the beginning of the episode, you've done a lot of work on polygenic scores for the PHG Foundation, and have produced a fair few reports and publications. Would you be able to talk us through some of them?

Sowmiya: 16:26

Yep, sure. So we started off looking at polygenic scores, was it five or six years ago, when in some senses they reemerged as an area of interest with papers being published saying that they could have value in cardiovascular disease prevention. And so that's one of the big areas there's a lot of interest in. So we did some work looking at readiness for implementation in relation to cardiovascular disease. And following on from that, what we found was that one of the key gaps is that there are different understandings of utility, or usefulness, in relation to this kind of information. So we also did some work exploring this concept of clinical utility, or usefulness, and how that might apply to polygenic scores. And also, in parallel, the humanities team led some work really looking at some of the implementation issues. So, if you did have a test that was ready to be implemented within practice, what are some of the things that would need to happen for it to be implemented responsibly and effectively? And really carrying on from that, what a lot of those two or three pieces of work showed is that there's a big gap in understanding how you think about polygenic scores as a biomarker, and products that produce the score as a test, i.e. in terms of how it's used and how you apply it, and then how you gather together the evidence to show that it has some value and should be implemented. And so our next big piece of work is around how you evaluate such products. And we've also done some work looking at, you know, all the uses within the cancer landscape. And hopefully we'll continue within this space doing more.

Ofori: 18:00

Do you think the integration of polygenic scores is on the near horizon for disease prevention?

Sowmiya: 18:08

I think it's getting there. I think the main bottleneck at the moment is building up consensus on the evidence base, and I think everyone recognises that we don't need perfect evidence, but it's just making sure that we have the right pieces of evidence, and I certainly think there are niche use cases where it will be useful. Certainly, I think over the next five years or so, we might be moving more towards implementation possibly.

Ofori: 18:33

Sowmi, this has been really informative. So thank you, but before I let you go, could you tell us what you're working on at the moment?

Sowmiya: 18:39

As I said, I'm wrapping up a report looking at evaluation of polygenic scores and we're hoping to take that work forward. I also do a lot of work with collaborators at the primary care unit looking at some of the challenges in sort of delivering risk stratified bowel cancer screening.

Ofori: 18:56

Well, that just leaves me to say thank you for joining us, Sowmi.

Sowmiya: 19:00

Thank you.

Ofori: 19:07

Well, that brings us to the end of the episode. If you liked it, please leave us a rating and review and make sure to subscribe. If you would like to find out more about what was discussed in this episode, there are useful links included in the podcast description. You can also find additional information on our website, phgfoundation.org. And if you have any further questions about the topic, then you can email us at intelligence@phgfoundation.org. Thank you for listening. My name is Ofori Canacoo and I look forward to bringing you a new topic in the next episode.