In genetics, a polygenic score (PGS), also called a polygenic risk score (PRS), genetic risk score, or genome-wide score, is a number that summarizes the estimated effect of many genetic variants on an individual’s phenotype, typically calculated as a weighted sum of trait-associated alleles.
It reflects the estimated genetic predisposition for a given trait and can be used as a predictor for that trait.
PRS gives an estimate of how likely an individual is to have a given trait only based on genetics, without taking environmental factors into account.
Polygenic risk scores (PRS) are numerical scores that estimate an individual’s genetic susceptibility to a certain trait or disease based on multiple genetic variations, usually single nucleotide polymorphisms (SNPs), across their genome.
These scores are calculated by assigning weights to each genetic variant based on its association with the trait or disease in large-scale genome-wide association studies (GWAS).
GWAS compare the genomes of large cohorts of individuals with and without the trait or disease of interest to identify the specific genetic variants associated with it.
Each genetic variant is assigned a weight or effect size based on its strength of association with the trait or disease.
This weight reflects the contribution of that specific variant to the overall risk.
The polygenic risk score is calculated by summing up the weighted effects of all selected genetic variants that an individual carries.
The more risk-associated variants an individual has, the higher their polygenic risk score will be.
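The weighted sum described above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the variant IDs, effect sizes, and genotypes are hypothetical, and skipping variants that are missing from the individual's data is an assumption of the sketch.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele counts.

    dosages: dict mapping variant ID -> number of risk alleles carried (0, 1, or 2)
    weights: dict mapping variant ID -> per-allele effect size taken from a GWAS
    Variants absent from `dosages` are skipped (an assumption of this sketch).
    """
    return sum(weights[v] * dosages[v] for v in weights if v in dosages)


# Hypothetical effect sizes and genotypes, for illustration only.
weights = {"snp_a": 0.25, "snp_b": 0.5, "snp_c": -0.125}
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_score(person, weights)
print(score)  # 0.25*2 + 0.5*1 + (-0.125)*0 = 1.0
```

A negative weight, as for the hypothetical `snp_c`, represents an allele associated with lower risk.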
The polygenic risk score is then compared to a reference population to determine an individual’s relative risk for the trait or disease.
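One simple way to express a score relative to a reference population is as a z-score or a percentile. The sketch below assumes a small, hypothetical list of reference scores; real reference panels are far larger and should be matched to the individual's ancestry.

```python
from statistics import mean, pstdev


def standardize(score, reference_scores):
    """Express a raw score in standard deviations from the reference mean."""
    mu = mean(reference_scores)
    sigma = pstdev(reference_scores)
    return (score - mu) / sigma


def percentile(score, reference_scores):
    """Percentage of reference scores that fall strictly below `score`."""
    below = sum(1 for s in reference_scores if s < score)
    return 100.0 * below / len(reference_scores)


# Hypothetical reference distribution, for illustration only.
reference = [0.5, 1.0, 1.5, 2.0, 2.5]

z = standardize(2.5, reference)
pct = percentile(2.5, reference)
print(z, pct)
```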
This risk estimation can help predict an individual’s likelihood of developing a specific condition or their predisposition for a particular trait.
PRS have been used in predicting complex diseases such as cardiovascular disease, diabetes, cancer, and psychiatric disorders.
However, polygenic risk scores are still an active area of research, and their predictive power varies depending on the trait or disease being assessed.
Additionally, complex traits and diseases are influenced by both genetic and environmental factors, so polygenic risk scores should not be used as the sole determinant for medical decisions.
Polygenic scores are widely used in animal breeding and plant breeding due to their efficacy in improving livestock breeding and crops.
In humans, polygenic scores are typically generated from genome-wide association study (GWAS) data.
Genetic studies have enabled the creation of polygenic predictors of complex human traits, including risk for many important complex diseases.
Diseases are typically affected by many genetic variants that each confer a small effect on overall risk.
Polygenic scores are used to estimate an individual’s genetic predisposition or risk for certain traits, conditions, or diseases.
Genome-wide association studies (GWAS) are conducted to identify genetic variations associated with specific traits, conditions, or diseases.
Genetic data are collected from thousands or even millions of individuals, and genetic patterns are compared to identify common genetic variants linked to the trait of interest.
Weights, or effect sizes, are assigned to these genetic variants based on their statistical association with the trait; variants that have a larger effect on the trait receive larger weights.
To calculate the polygenic score for an individual, their genetic data is compared to the GWAS data.
The individual’s genetic makeup is assessed for the presence of the specific genetic variants associated with the trait, and their corresponding weights are applied.
The scores from each variant are then summed together to provide an overall estimate of the individual’s genetic risk or predisposition for the trait or condition.
The polygenic score represents the cumulative effect of multiple genetic variants on the trait of interest.
Higher polygenic scores indicate a higher genetic predisposition or risk for that trait.
Polygenic scores are statistical estimates, not absolute predictions, and provide insight into a person’s genetic predispositions.
Polygenic scores can help identify individuals at higher risk and tailor interventions or screenings accordingly.
In a polygenic risk predictor, the lifetime risk for a disease is captured by a score that depends on the states of thousands of individual genetic variants, that is, single nucleotide polymorphisms.
DNA is a string of four nucleotide bases (thymine, guanine, cytosine, and adenine) distributed across 23 pairs of chromosomes.
A single (haploid) copy of the human genome contains about 3 billion base pairs.
The human genome can be broadly separated into coding and non-coding sequences.
The coding genome makes up a small portion of all the bases and encodes instructions for genes, some of which code for proteins.
Genome-wide association studies enable mapping phenotypes to variation in nucleotide bases in human populations.
An individual’s breeding value is the sum of its single nucleotide polymorphism genotypes, each weighted by its effect on a trait.
A polygenic score (PGS) is constructed from the weights derived from a genome-wide association study (GWAS).
In a GWAS, single nucleotide polymorphisms (SNPs) are tested for association with the trait by comparing cases and controls.
The results from a GWAS provide the strength of the association at each SNP, expressed as an effect size, together with a p-value for statistical significance.
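As a simplified illustration of where an effect size and p-value come from, the sketch below runs a basic allelic test on a single SNP: it computes the log odds ratio from case and control allele counts and a two-sided p-value from a normal approximation. This is an assumption-laden teaching example; real GWAS typically use regression models with covariates such as ancestry components, and the counts here are hypothetical.

```python
import math


def allelic_association(case_alt, case_ref, control_alt, control_ref):
    """Log odds ratio, its standard error, and a two-sided p-value
    from a 2x2 table of allele counts (all counts must be positive)."""
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    beta = math.log(odds_ratio)  # effect size on the log-odds scale
    se = math.sqrt(1 / case_alt + 1 / case_ref + 1 / control_alt + 1 / control_ref)
    z = beta / se
    # Two-sided p-value under a standard normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, p_value


# Hypothetical allele counts, for illustration only.
beta, se, p_value = allelic_association(
    case_alt=300, case_ref=700, control_alt=200, control_ref=800
)
print(beta, se, p_value)
```

The `beta` returned here plays the role of the per-allele weight that a polygenic score would later use for this SNP.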
The effect size derived from GWAS for a SNP is often referred to as the weight.
A polygenic risk score is then calculated as a weighted sum across a large number of SNPs: for each SNP, the number of risk-modifying alleles the individual carries is multiplied by the weight for that SNP, and the products are added together.
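Written as a formula, this calculation takes the following form. The notation is an assumption introduced here for illustration: $i$ indexes individuals, $j$ indexes the $M$ SNPs included in the score, $\hat{\beta}_j$ is the GWAS-derived weight for SNP $j$, and $x_{ij}$ is the number of risk-modifying alleles individual $i$ carries at SNP $j$.

```latex
\[
  \mathrm{PRS}_i \;=\; \sum_{j=1}^{M} \hat{\beta}_j \, x_{ij},
  \qquad x_{ij} \in \{0, 1, 2\}
\]
```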
Generating polygenic scores is difficult, and methods for doing so are an active area of research.
Considerations in developing polygenic scores include which SNPs and how many SNPs to include, how to estimate the regression of the trait on each genetic variant, and how to ensure that each marker is approximately independent.
Independence of each SNP is important for the score’s predictive accuracy.
SNPs that are physically close to each other are more likely to be in linkage disequilibrium, meaning they are often inherited together and therefore don’t provide independent predictive power.
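One common way to obtain approximately independent markers is to measure the correlation between SNPs and discard those that are strongly correlated with a SNP already selected. The sketch below is a simplified greedy pruning procedure, not any specific published method: it uses the squared correlation of allele counts as a proxy for linkage disequilibrium, visits SNPs in order of their GWAS p-value, and ignores physical distance. The genotypes, p-values, and the 0.5 threshold are hypothetical.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two SNPs' allele counts."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a SNP with no variation carries no correlation signal
    return cov * cov / (var_x * var_y)


def prune(genotypes, p_values, r2_threshold=0.5):
    """Keep a SNP only if its r^2 with every already-kept SNP is below the threshold.

    SNPs are visited from the smallest to the largest p-value, so the most
    strongly associated SNP in each correlated group is the one retained.
    """
    kept = []
    for snp in sorted(genotypes, key=lambda s: p_values[s]):
        if all(r_squared(genotypes[snp], genotypes[k]) < r2_threshold for k in kept):
            kept.append(snp)
    return kept


# Hypothetical allele counts for six individuals, for illustration only.
genotypes = {
    "snp_a": [0, 1, 2, 1, 0, 2],
    "snp_b": [0, 1, 2, 1, 0, 2],  # identical to snp_a, so fully correlated
    "snp_c": [1, 0, 1, 2, 1, 1],  # uncorrelated with snp_a in this sample
}
p_values = {"snp_a": 1e-8, "snp_b": 1e-6, "snp_c": 1e-4}

print(prune(genotypes, p_values))  # ['snp_a', 'snp_c']
```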
As the number of genome-wide association studies has increased, along with rapid advances in methods for calculating polygenic scores, they have found greater application in clinical settings for disease prediction and risk stratification.
Because an individual’s inherited genotype is fixed, the polygenic contribution to genetic liability does not change over that individual’s lifespan.
The risk arising from one’s genetics has to be interpreted in the context of environmental factors.
An individual in the top 1% of polygenic risk scores has a lifetime cardiovascular risk greater than 10%, which is comparable to that of individuals with rare genetic variants.
Polygenic risk scores have been studied in obesity, coronary artery disease, diabetes, breast cancer, prostate cancer, Alzheimer’s disease and psychiatric diseases.
Most polygenic scores are not predictive enough to diagnose disease, but they could be used in addition to other covariates, such as age, BMI, and smoking status, to improve estimates of disease susceptibility.
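A common way to combine a polygenic score with other covariates is a logistic model, in which each predictor contributes to the log-odds of disease. The sketch below only shows the shape of such a model. The feature names, coefficients, and intercept are hypothetical placeholders rather than estimates from any study; in practice they would be fitted to cohort data.

```python
import math


def predicted_risk(features, coefficients, intercept):
    """Logistic model: convert a weighted sum of predictors into a probability."""
    linear = intercept + sum(coefficients[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients (log-odds per unit of each predictor).
coefficients = {"prs_z": 0.5, "age_decades": 0.3, "smoker": 0.7}
intercept = -3.0

low_prs = predicted_risk({"prs_z": -1.0, "age_decades": 5.0, "smoker": 0}, coefficients, intercept)
high_prs = predicted_risk({"prs_z": 2.0, "age_decades": 5.0, "smoker": 0}, coefficients, intercept)
print(low_prs, high_prs)
```

With a positive coefficient on the standardized score, two people with identical covariates receive different risk estimates according to their polygenic score.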
There is poorer predictive performance in individuals of non-European ancestry, limiting widespread use.
Embryo genetic screening is common with millions biopsied and tested each year worldwide, and the embryo genotype can be determined to high precision.
Testing for aneuploidy and monogenic diseases has become increasingly established.
Testing for polygenic diseases has begun to be employed in embryo selection more recently.
The use of polygenic scores for embryo selection has been criticized due to alleged ethical and safety issues as well as limited practical utility.
Polygenic scores for well over a hundred phenotypes have been developed from genome-wide association statistics, categorized as anthropometric, behavioural, cardiovascular, non-cancer illness, psychiatric/neurological, and response to treatment/medication.
A PGS gives a continuous score that estimates the risk of having or getting the disease, within some pre-defined time span.
PGS predictor performance increases with the dataset sample size available for training.
Methods to construct polygenic predictors are sensitive to the ancestries present in the data.
The majority of current usage of PGS by individuals is through consumer genetic testing, where reports present PRS for a number of diseases and traits.
An individual’s germ-line genetic risk can be calculated at birth for a variety of diseases after sequencing their DNA one time.
Recognizing an increased genetic burden earlier can allow clinicians to intervene earlier and avoid delayed diagnoses.
Combining polygenic scores with traditional risk factors increases clinical utility.
PRS may help improve the diagnosis of diseases, such as distinguishing Type 1 from Type 2 diabetes, and may reduce invasive diagnostic procedures, as in celiac disease.
Polygenic risk scores may empower individuals to alter their lifestyles to reduce their risk of disease, through behavior modification prompted by knowing their genetic predisposition.
Polygenic scores can identify a subset of the population at high risk that could benefit from screening, for example for breast cancer and heart disease.
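Selecting such a high-risk subset can be as simple as ranking a cohort by score and taking the top fraction. The sketch below assumes hypothetical individual IDs and scores, and the 20% cutoff is arbitrary; an appropriate threshold would depend on the disease and the screening programme.

```python
def high_risk_subset(scores, top_fraction=0.1):
    """Return the IDs of individuals whose scores fall in the top fraction of the cohort.

    scores: dict mapping individual ID -> polygenic score
    At least one individual is always returned.
    """
    n_selected = max(1, int(len(scores) * top_fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_selected])


# Hypothetical cohort of ten individuals, for illustration only.
cohort = {f"id_{i}": float(i) for i in range(10)}

print(high_risk_subset(cohort, top_fraction=0.2))  # the two highest scorers
```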
Polygenic risk scores were originally designed to predict the prevalence and etiology of complex, heritable diseases.
These diseases are typically affected by many genetic variants that individually confer a small effect on overall risk.
A polygenic score can be used in several different ways: to test whether heritability estimates may be biased; as a measure of genetic overlap of traits, which might indicate shared genetic bases for groups of disorders; as a means to assess group differences in a trait such as height; to examine changes in a trait over time due to natural selection, indicative of a soft selective sweep; in Mendelian randomization; to detect and control for the presence of genetic confounds in outcomes; or to investigate gene–environment interactions and correlations.
Polygenic scores also have useful statistical properties in (genomic) association testing.
A benefit of polygenic scores is that they can be used to predict future outcomes for crops, animal breeding, and humans alike.