Background.—

In principle, it is generally accepted that DNA methylation measures can be used to predict mortality. However, as of yet, no epigenetic metric has been successfully incorporated into underwriting procedures. In part, this failure results from the relative incompatibility of many DNA methylation measures with conventional underwriting practices.

Objective.—

To compare the ability of previously established epigenetic markers of smoking, drinking, and diabetes with that of standard lipid-based approaches for predicting mortality.

Method.—

We constructed a series of Cox proportional hazards models for mortality using clinical data from the Framingham Heart Study together with DNA methylation data from 4 previously described loci.

Results.—

The addition of vital signs and standard lipid and diabetes laboratory assessments to a base model consisting of age and sex only modestly increased prediction of mortality, from an area under the curve (AUC) of 0.732 to 0.741. However, the addition of epigenetic marker information for smoking and drinking to the base model markedly increased prediction (AUC=0.787), while the addition of an epigenetic marker for diabetes increased prediction even further (AUC=0.792).

Conclusion.—

These results demonstrate the potential of simple, interpretable epigenetic models to predict mortality in a manner compatible with standard underwriting procedures. Potentially, implementing this epigenetic approach with rapid methylation-sensitive digital PCR procedures that can utilize saliva or whole blood DNA would increase prediction power even further while facilitating more accurate accelerated underwriting assessments of mortality.

The Framingham Heart Study (FHS) is one of the nation’s premier resources for understanding the relationship of behaviors and medical illness to mortality.1  In particular, the FHS Offspring Cohort, characterized in more than 10 waves of comprehensive medical assessments, is well known to the Life Insurance Industry for its value in understanding the relationship of smoking, drinking, and other lifestyle choices to cardiovascular and cancer outcomes.2 

Critically, many of the laboratory and clinical assessment procedures used in the FHS are used by underwriters to assess the presence or absence of conditions, such as diabetes, that predict mortality. For example, hemoglobin A1c (HbA1c) levels are routinely used by underwriters to determine the presence and/or severity of diabetes while serum lipid levels, in conjunction with other information, are used to assess the likelihood of coronary heart disease.3  These results and those from clinical measures are then incorporated along with other information into algorithms to rate mortal risk.

These risk classification algorithms are typically based on both the medical literature and retrospective analyses of existing portfolios.4  Because the accurate classification of mortal risk is vital to maintain solvency, deviations from accepted underwriting procedures can introduce risk to profitability. As such, changes to underwriting practices are generally undertaken in increments with intense attention given to the effects of the proposed changes on predicting actual mortality.

Coincidentally, the FHS can also be used to understand the relationship of epigenetic factors to mortality. In the 8th Exam Wave of the Offspring Cohort, genome-wide DNA methylation data were gathered.5  Because these participants have also been assessed with many of the measures used in current underwriting procedures, not only can the FHS be used to understand the relationship of epigenetic factors to these medical and other mortality related outcomes, it can also be used to compare the potential effectiveness of epigenetic based metrics to conventional underwriting procedures.

The use of epigenetics to predict mortality is not a new concept in either the general scientific or the life insurance-specific literature. Fraga and Esteller formally introduced the concept of using epigenetics to infer age in 2007.6  In 2013, using data from the newly introduced Illumina Infinium HumanMethylation450 BeadChip (aka, 450K array), groups led by Hannum and Horvath constructed the first widely used “epigenetic clocks.”7,8  However, neither of these algorithms was specifically devised to predict mortality. In contrast, the GrimAge clock, introduced in 2019, was specifically designed to predict mortality. Still, despite the efforts of start-ups such as FOXO Life, neither GrimAge nor any of a host of other clocks has been adopted for use in general underwriting.9,10 

There are many reasons for the failure of this technology to be embraced by the Life Insurance Industry. The first reason is cost. The cost to process a single sample on an Illumina array is over $200, with sample acquisition and data processing further increasing cost. Second, the content of arrays continues to evolve. Production of the 450K chip and its immediate successor, Infinium MethylationEpic v1.0, on which much of the epigenetic clock literature has been developed, has been discontinued. It is also well known that the performance of individual probes varies with the version of the array. Third, because of the nature of the arrays and the way these algorithms are formulated, epigenetic clocks also incorporate genetic heritability in their predictions and can suffer from racial bias, which raises legal and ethical challenges to their use.11,12  Finally, array measurements of methylation are not very precise, with test-retest differences for epigenetic age commonly exceeding 5 years.13,14 

However, perhaps the most telling reason they have not gained widespread use is that their broad output, typically expressed as accelerated age, is not easily integrated with current underwriting practices. Specifically, the algorithms that underwriters use focus on determining the presence or absence of major lifestyle risk factors, such as smoking or drinking, or significant medical illness to determine risk. Although epigenetic clocks can predict the likelihood that certain conditions such as coronary heart disease are present,15  the informativeness of their predictions is low and not consistent with the stringencies necessary for underwriting purposes. As such, their output is of limited value in today’s underwriting environment.

Conceivably, an epigenetic index that directly assessed risk factors or conditions could find utility in the underwriting space. Ideally, the tool should precisely load on critical parameters and be integrated with both conventional and accelerated underwriting practices.

In this communication, we provide proof of principle for such an index. Specifically, we construct Cox regression models to compare the performance of algorithms containing epigenetic array information from 3 loci contained in our Smoke Signature® (cg05575921) and Alcohol Signature™ (cg04987734 and cg02583484) assays, together with data for cg19693031, a locus that is differentially methylated in diabetes, to that of a model that uses standard lipid-based assessment procedures for predicting all-cause mortality in the FHS.

The data used in this study were obtained from the Framingham Heart Study.2,16  The use of these data and analytic procedures in this study were approved by the University of Iowa Institutional Review Board (IRB 201503802).

The clinical and epigenetic data used in this study were extracted from a larger dataset of 2295 individuals who participated in the 8th Examination wave of the FHS Offspring Cohort Study. A description of the procedures used to obtain and prepare these data has been previously reported,17,18  with the full dataset needed for replication of these results residing in the Database of Genotypes and Phenotypes (dbGaP) maintained by the National Center for Biotechnology Information (https://www.ncbi.nlm.nih.gov/gap/).

The methods used to prepare the genome-wide DNA methylation data for analysis have been described in detail previously.5  From this data set, we extracted the methylation values for 4 epigenetic probes (cg05575921, cg04987734, cg02583484, and cg19693031). In parallel, the demographic and clinical data for these subjects were extracted from the master data files and then merged with the epigenetic data to form a final dataset.

A series of Cox proportional hazards models were used to assess the performance of the markers compared to demographic, lipid, and vital measures for predicting all-cause mortality.19  The metric for comparison was the time-dependent area under the ROC curve (AUC).20  In the estimation of AUC, bootstrap cross validation methods were applied, and AUC values were calculated as an average across 100 bootstrap datasets.21  The proportional hazards assumption was investigated, and age was treated as a time varying covariate to account for violation of the assumption.22 
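The evaluation metric described above can be illustrated with a minimal Python sketch. It is not the authors' code: the analysis in the paper cites the riskRegression R package, whereas this version uses only the standard library, ignores inverse-probability-of-censoring weights, and resamples fixed risk scores instead of refitting the Cox model on each bootstrap dataset. The function names and the toy data are assumptions introduced for illustration.

```python
import random

# Toy data, invented for illustration only (not FHS values).
TIMES = [1, 2, 3, 6, 7, 8]                   # follow-up time in years
EVENTS = [1, 1, 0, 0, 1, 0]                  # 1 = death observed, 0 = censored
SCORES = [0.9, 0.8, 0.3, 0.2, 0.4, 0.1]      # model risk score (higher = riskier)


def auc_at_horizon(times, events, scores, horizon):
    """Cumulative/dynamic AUC at `horizon`, without censoring weights.

    Cases are subjects who died at or before the horizon; controls are
    subjects still under observation after it. Subjects censored before
    the horizon are dropped (a simplification of the weighted estimator).
    """
    rows = list(zip(times, events, scores))
    cases = [s for t, e, s in rows if e == 1 and t <= horizon]
    controls = [s for t, e, s in rows if t > horizon]
    if not cases or not controls:
        return None  # AUC is undefined without both groups
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5  # ties count as half
    return concordant / (len(cases) * len(controls))


def bootstrap_mean_auc(times, events, scores, horizon, n_boot=100, seed=0):
    """Average the AUC over `n_boot` bootstrap resamples of the subjects."""
    rng = random.Random(seed)
    n = len(times)
    aucs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        auc = auc_at_horizon([times[i] for i in idx],
                             [events[i] for i in idx],
                             [scores[i] for i in idx],
                             horizon)
        if auc is not None:
            aucs.append(auc)
    if not aucs:
        raise ValueError("no bootstrap sample contained both cases and controls")
    return sum(aucs) / len(aucs)


if __name__ == "__main__":
    print(auc_at_horizon(TIMES, EVENTS, SCORES, horizon=5))
    print(bootstrap_mean_auc(TIMES, EVENTS, SCORES, horizon=5))
```

In a full bootstrap cross-validation, each resample would be used to refit the model and the held-out subjects would be scored, which guards against the optimism of evaluating a model on its own training data.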

From the best fitting model, we created a composite score using the parameter estimates of the epigenetic measures, adjusting for age and sex. The composite was transformed to a z-score and quartiles of the composite z-score were used to create risk classes. Descriptive statistics of the risk classes were calculated, and a Kaplan-Meier curve for each class was created.
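The scoring steps in this paragraph (linear composite, z-transformation, quartile-based classes) can be sketched as follows. The coefficient values, including their signs, are invented placeholders because the fitted parameter estimates are not given in this text; the class labels follow the preferred/standard/substandard scheme used in the results, and the function names are hypothetical.

```python
import statistics

# Hypothetical coefficients; NOT the fitted estimates from the study.
BETAS = {
    "cg05575921": -3.0,
    "cg04987734": 2.0,
    "cg02583484": -1.5,
    "cg19693031": -2.5,
}

# Toy methylation beta values for 8 subjects, invented for illustration.
ROWS = [
    {
        "cg05575921": 0.90 - 0.05 * i,
        "cg04987734": 0.50 + 0.01 * i,
        "cg02583484": 0.80 - 0.01 * i,
        "cg19693031": 0.75 - 0.02 * i,
    }
    for i in range(8)
]


def composite_scores(rows, betas):
    """Weighted sum of methylation values using the model coefficients."""
    return [sum(betas[probe] * row[probe] for probe in betas) for row in rows]


def z_scores(values):
    """Standardize to mean 0 and sample standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def risk_classes(z):
    """Bottom quartile -> preferred, top quartile -> substandard, else standard."""
    q1, _median, q3 = statistics.quantiles(z, n=4)
    labels = []
    for value in z:
        if value <= q1:
            labels.append("preferred")
        elif value > q3:
            labels.append("substandard")
        else:
            labels.append("standard")
    return labels


if __name__ == "__main__":
    z = z_scores(composite_scores(ROWS, BETAS))
    for value, label in zip(z, risk_classes(z)):
        print(f"{value:+.2f}  {label}")
```

Quartile cut points depend on the interpolation rule, so the thresholds produced by `statistics.quantiles` on a given sample may differ slightly from those of other software.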

Table 1 delineates the demographic and key laboratory characteristics of the sample. At intake, the male (n=1039; 66.2 ± 8.9 yrs) and female subjects (n=1239; 66.5 ± 9.0 yrs) were all White with both sexes having an average age in the mid-60s. Although diastolic blood pressures were lower in females, systolic blood pressures were similar. Both total cholesterol and high density lipoprotein (HDL) cholesterol were higher in females. Finally, levels of the 4 epigenetic markers were significantly different between the sexes with cg05575921 levels pointing to higher rates of smoking intensity and cg04987734 and cg02583484 indicating higher levels of alcohol consumption in men. Interestingly, cg19693031 methylation indicated more risk for diabetes in men despite Hemoglobin A1c levels being similar.

Table 1.

Clinical and Demographic Characteristics of the FHS Cohort


Using these FHS data, Cox survival analysis was run for the following 4 models: 1) age, sex; 2) age, sex, total cholesterol, HDL cholesterol, triglycerides, A1c, systolic blood pressure, diastolic blood pressure; 3) age, sex, cg05575921, cg04987734, cg02583484; and 4) age, sex, cg05575921, cg04987734, cg02583484, cg19693031. For each model listed in Table 2, we have 2272 participants and 297 events. Table 2 displays the time dependent AUC values for each model, as well as the AUC difference and confidence interval for the comparison to the other models at 5 years. Model 1 (age and sex) does not differ from Model 2 (age, sex, lipid panel, vitals), while Models 3 & 4 (age, sex, epigenetic markers) have higher AUC values than both Models 1 & 2.
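For reference, the four models share the standard Cox proportional hazards form. Written out for Model 4, with coefficient symbols introduced here only for illustration and with the time-varying treatment of age omitted, the hazard is:

```latex
h(t \mid \mathbf{x}) = h_0(t)\,\exp\bigl(
    \beta_{\text{age}}\,\text{age}
  + \beta_{\text{sex}}\,\text{sex}
  + \beta_1\,\text{cg05575921}
  + \beta_2\,\text{cg04987734}
  + \beta_3\,\text{cg02583484}
  + \beta_4\,\text{cg19693031}
\bigr)
```

Here $h_0(t)$ is the unspecified baseline hazard and $\exp(\beta)$ is the hazard ratio per unit increase in the corresponding covariate. Model 3 omits the $\beta_4$ term, Model 1 keeps only age and sex, and Model 2 replaces the methylation terms with the lipid, A1c, and blood pressure measures.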

Table 2.

Time Dependent AUC for Each Model and AUC Difference (95% CI) Between Models


Using the parameter estimates for the epigenetic markers from Model 4, an epigenetic composite score was calculated and transformed to a z-score, where higher scores correspond to increased smoking and/or alcohol use (See Table 3). Quartiles of the z-score distribution (Figure 1) were used to create 3 classes (preferred: bottom 25%, standard: middle 50%, substandard: top 25%). Age and epigenetic means and standard deviations for each class are shown in Table 3.

Figure 1.

Epigenetic Composite Z-Score Distribution.

Table 3.

Descriptive Statistics by Epigenetic Composite Z-Score Groups


Figure 2 displays the Kaplan-Meier curves for each class. Those with the lowest z-scores (preferred: z ≤ -0.67) had the lowest mortality risk, and the risk increased with z-score for the other 2 classes (standard: -0.67 < z ≤ 0.43; substandard: z > 0.43). In terms of hazard ratios, after adjusting for age and sex, the substandard class has a higher mortality risk than the standard (HR=4.6 [1.7, 12.3]) and preferred classes (HR=6.1 [1.7, 21.9]).
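As a minimal sketch of how per-class survival curves of this kind are computed, the following Python implements the Kaplan-Meier product-limit estimator using only the standard library. The function names and the toy data are assumptions made for illustration; they are not taken from the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    `events` uses 1 for an observed death and 0 for censoring.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def km_by_class(times, events, classes):
    """Compute a separate Kaplan-Meier curve for each risk class label."""
    curves = {}
    for label in sorted(set(classes)):
        idx = [i for i, c in enumerate(classes) if c == label]
        curves[label] = kaplan_meier([times[i] for i in idx],
                                     [events[i] for i in idx])
    return curves


if __name__ == "__main__":
    # Toy follow-up data, invented for illustration only.
    times = [1, 2, 3, 4, 5, 2, 6, 7]
    events = [1, 0, 1, 1, 0, 0, 1, 0]
    classes = ["substandard"] * 5 + ["preferred"] * 3
    for label, curve in km_by_class(times, events, classes).items():
        print(label, curve)
```

Censored subjects reduce the number at risk without lowering the curve, which is why the estimator remains usable when many participants are still alive at the end of follow-up.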

Figure 2.

Kaplan-Meier Curves by Epigenetic Composite Z-Score Groups.


These data show that, in theory, a simple, genetics-free epigenetic index can effectively partition individuals into mortality risk groups, and that this index markedly outperforms a model constructed from a set of laboratory measures obtained for underwriting. The index does so by objectively and effectively capturing the 2 largest behavioral risk factors for early mortality: smoking and excessive drinking.

Although their derivation is more recent and, to some, more mysterious, epigenetic assessments are very much like other laboratory measures, such as serum cholesterol levels, in that they are simply biomarkers that can be used to predict an outcome. The real questions for the insurance industry are whether epigenetic biomarkers can predict critical outcomes better than existing measures and whether they can do so in an affordable and scalable manner.

Although these results from the FHS suggest that there is considerable promise in epigenetic methods, there are several significant differences between the modeling approach used in this study and actual underwriting practices. First, given the age of the individuals in this study and depending on the face value of the policy, most, if not all, underwriting assessments of these subjects would have been considerably more extensive. Typically, most of the older subjects would have been assessed with electrocardiograms and both NT-proBNP and urinary cotinine determinations.23  Second, attending physician statements would likely be required from those with a significant medical history or evidence of disease on laboratory examination. Still, none of these clinical measures is perfect. In actual practice, urinary cotinine assessments can have high rates of false negatives. In the 2014 examination of over 6 million life insurance applicants by Palmier and colleagues, urine cotinine tests were negative for 498,426 of 938,944 (53%) self-reported tobacco users.24  Furthermore, although they provide additional protective value, both NT-proBNP and electrocardiograms load on the same biological diathesis for heart disease as do the serum cholesterol and diabetes measures, so some of the information that they provide is redundant. Therefore, it is difficult to state exactly the degree of improvement that would be had by engaging in these extra assessments.

Similarly, there are also marked differences between the epigenetic measures used in the FHS and current DNA methylation assessment procedures. Hybridization arrays are research tools that are known to be error prone, with error rates for methylation beta values as high as 10% being noted for technical replicates.25  In contrast, newly developed methylation-sensitive digital PCR (MSdPCR) and sequencing techniques are more precise, with error rates of 1% for replicate samples being commonly observed and no batch effects.26,27 

Given the rate of advancement of epigenetic diagnostic tools, it is conceivable that these MSdPCR measures, together with commercially available tests for coronary heart disease29  and cancer30  could supplant current fluid-based assessments used for underwriting prospective clients similar to those found in the FHS. However, given the financial peril, any implementation of these or similar approaches would need to be extensively evaluated from actuarial, financial, and regulatory perspectives.

Instead, we believe that the greatest opportunity for epigenetics in the life insurance industry is for assessing younger prospective clients currently being assessed using intensive accelerated underwriting. Although exact numbers are not known, there is considerable mortality slippage in accelerated portfolios secondary to undeclared smoking,31  while the slippage secondary to undisclosed excess alcohol use is completely unknown. Because our epigenetic tests can use saliva DNA as their testing substrate and can be completed within a matter of hours, by using overnight couriers and video-monitored assessment procedures, it should be possible to seamlessly incorporate epigenetic testing procedures for these risky lifestyle behaviors into current accelerated underwriting practices. Furthermore, since electrocardiograms and NT-proBNP are relatively low yield in young adults and the advent of vaping has made cotinine determinations less valuable, it should be possible to substitute epigenetic assessments for many of the more traditional blood-based underwriting assessments of younger clients as well.

However, to do this rationally, it will be necessary to first examine the predictive value of epigenetic assessments in diverse cohorts representative of those being underwritten, then test the algorithm under actual underwriting conditions. If successful, the resulting epigenetically driven approach could not only facilitate more accurate actuarial assessment but could also lay the groundwork for the use of epigenetics in continuous underwriting paradigms.32 

This work was supported by R44CA285136 to Dr. Philibert. The authors would like to express their gratitude to the Framingham Heart Study collaborators and participants for their timeless contribution to our nation’s health.

Statement of Conflict

Dr. Philibert is the Chief Executive Officer of Behavioral Diagnostics. The use of cg05575921 to assess smoking status is covered by existing and pending patents including US Patents 8,637,652 and 9,273,358. Similarly, the use of DNA methylation to assess alcohol is covered by existing and pending patents including European Union Patent 3149206.

1. Dawber TR, Meadors GF, Moore FE Jr. Epidemiological approaches to heart disease: the Framingham Study. American Journal of Public Health and the Nations Health. 1951;41:279-286.
2. Cupples L, D’Agostino R, Kiely D. The Framingham Heart Study, Section 35. An Epidemiological Investigation of Cardiovascular Disease Survival Following Cardiovascular Events: 30 Year Follow-up. Lung and Blood Institute; 1988.
3. Braun RE. Laboratory Testing and Risk Classification. In: Brackenridge’s Medical Selection of Life Risks. Springer; 2006:241-250.
4. Jansen M, Nguyen H, Shams A. Rise of the machines: The impact of automated underwriting. Forthcoming at Management Science. 2023.
5. Philibert RA, Dogan MV, Mills JA, Long JD. AHRR Methylation is a Significant Predictor of Mortality Risk in Framingham Heart Study. J Insur Med. 2019.
6. Fraga MF, Esteller M. Epigenetics and aging: the targets and the marks. Trends in Genetics. 2007;23:413-418.
7. Hannum G, Guinney J, Zhao L, et al. Genome-wide Methylation Profiles Reveal Quantitative Views of Human Aging Rates. Molecular Cell. 2013;49:359-367.
8. Horvath S. DNA methylation age of human tissues and cell types. Genome Biology. 2013;14:3156.
9. Horvath S. DNA methylation age of human tissues and cell types. Genome Biology. 2013;14:3156.
10. Gilyard B. Minneapolis-based Foxo Technologies warns of possible bankruptcy, lays off employees. Star Tribune. July 24, 2023.
11. Dupras C, Song L, Saulnier KM, Joly Y. Epigenetic Discrimination: Emerging Applications of Epigenetics Pointing to the Limitations of Policies Against Genetic Discrimination. Frontiers in Genetics. 2018;9.
12. Philibert R, Beach SRH, Lei MK, et al. Array-Based Epigenetic Aging Indices May Be Racially Biased. Genes. 2020;11:685.
13. Higgins-Chen AT, Thrush KL, Wang Y, et al. A computational solution for bolstering reliability of epigenetic clocks: Implications for clinical trials and longitudinal tracking. Nature Aging. 2022;2:644-661.
14. Welsh H, Batalha CMPF, Li W, et al. A systematic evaluation of normalization methods and probe replicability using Infinium EPIC methylation data. Clinical Epigenetics. 2023;15:41.
15. Roetker NS, Pankow JS, Bressler J, Morrison AC, Boerwinkle E. Prospective Study of Epigenetic Age Acceleration and Incidence of Cardiovascular Disease Outcomes in the ARIC Study (Atherosclerosis Risk in Communities). Circ Genom Precis Med. 2018;11:e001937.
16. Tsao CW, Vasan RS. Cohort Profile: The Framingham Heart Study (FHS): overview of milestones in cardiovascular epidemiology. Intl J Epidem. 2015;44:1800-1813.
17. Dogan MV, Beach SRH, Philibert RA. Genetically contextual effects of smoking on genome wide DNA methylation. Am J Med Genetics Part B: Neuropsychiatric Genetics. 2017;174:595-607.
18. Dogan MV, Grumbach IM, Michaelson JJ, Philibert RA. Integrated genetic and epigenetic prediction of coronary heart disease in the Framingham Heart Study. PLoS One. 2018;13:e0190549.
19. Frank EH. Regression Modeling Strategies With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis. Springer; 2015.
20. Gerds TA, Olesen JB, Ozenne B. riskRegression: Risk Regression Models and Prediction Scores for Survival Analysis with Competing Risks. 2023.
21. Gerds TA, Schumacher M. Efron-type measures of prediction error for survival analysis. Biometrics. 2007;63:1283-1287.
22. Zhang Z, Reinikainen J, Adeleke KA, Pieterse ME, Groothuis-Oudshoorn CG. Time-varying covariates and coefficients in Cox regression models. Annals of Translational Medicine. 2018;6:121.
23. Fulks M, Kaufman V, Clark M, Stout RL. NT-proBNP predicts all-cause mortality in a population of insurance applicants, follow-up analysis and further observations. J Insur Med. 2017;47:107-113.
24. Palmier J, Lanzrath B, Idowu O, Dixon A. Demographic predictors of false negative self-reported tobacco use status in an insurance applicant population. J Insur Med. 2014;44:110-117.
25. Dedeurwaerder S, Defrance M, Bizet M, et al. A comprehensive overview of Infinium HumanMethylation450 data processing. Briefings in Bioinformatics. 2013;15:929-941.
26. Philibert R, Dawes K, Moody J, et al. Using Cg05575921 methylation to predict lung cancer risk: a potentially bias-free precision epigenetics approach. Epigenetics. 2022;17:1-13.
27. Philibert R, Miller S, Noel A, et al. A Four Marker Digital PCR Toolkit for Detecting Heavy Alcohol Consumption and the Effectiveness of Its Treatment. J Insur Med. 2019;48:90-102.
28. Dawes K, Andersen A, Reimer R, et al. The relationship of smoking to cg05575921 methylation in blood and saliva DNA samples from several studies. Scientific Reports. 2021;11:21627.
29. Philibert R, Dogan TK, Knight S, et al. Validation of an Integrated Genetic-Epigenetic Test for the Assessment of Coronary Heart Disease. J Amer Heart Assoc. 2023;12:e030934.
30. Hall MP, Aravanis AM. The Galleri Assay. In: Circulating Tumor Cells: Advances in Liquid Biopsy Technologies. Springer; 2023:633-664.
31. De Zilwa S, Edwards E, Irwin N, Inyang M. Smoke Signals. Verisk; 2022.
32. Bernard PI, Godsal J, Kotanko B, Reich A. The future of life insurance. 2020.
