Objectives.—To document the various laboratory and demographic/historical correlates of NT-proBNP levels in applicants for life insurance, and to explore the accuracy of a prediction model based on those variables.

Method.—NT-proBNP blood test results were obtained from 1.34 million insurance applicants between the ages of 50 and 85 years, beginning in 2003. Exploratory data analysis was carried out to document correlations with other laboratory variables, sex, age, and the presence of relevant diseases. Further, predictive models were used to quantify the proportion of the variance of NT-proBNP that can be explained by a combination of these other, more easily determined variables.

Results.—NT-proBNP shows the expected negative correlation with estimated glomerular filtration rate (eGFR), is markedly higher in those with a history of heart disease, and is somewhat higher in those with a history of hypertension. A strong, unexpected negative correlation between NT-proBNP and albumin was discovered. Of the variables evaluated, the automated selection procedure of a multivariate adaptive regression spline (MARS) model selected 7 variables (age, sex, albumin, eGFR, BMI, systolic blood pressure, cholesterol, and history of heart disease). Variable importance evaluation determined that age, albumin and eGFR were the 3 most important continuous variables in the prediction of NT-proBNP levels. An ordinary least squares (OLS) model using these same variables achieved an R-squared of 24.7%.

Conclusion.—Expected ranges of NT-proBNP may vary substantially depending on the value of other variables in the prediction equation. Albumin is significantly negatively correlated with NT-proBNP levels. The reasons for this are unclear.

Amino-terminal pro-B-type natriuretic peptide (NT-proBNP) is the non-active amino-terminal fragment of proBNP, a pro-hormone produced by the heart in response to left ventricular strain.1,2 The active, carboxy-terminal fragment, BNP, was first recognized as a clinically useful analyte in the differentiation of heart failure from pneumonia in patients presenting to the emergency room with shortness of breath and varying degrees of hypoxia, leukocytosis and radiographic abnormalities on chest X-ray.3

Since then, it has been recognized that NT-proBNP has a longer half-life and is more stable in serum than BNP, making it more suitable for testing. In this publication and others, NT-proBNP has been recognized as a strong predictor of mortality,4,5 not only in the insurance setting, but also in various clinical contexts, most notably congestive heart failure,6 valvular heart disease7 and stable coronary disease.8,9 Recently, NT-proBNP has attracted attention as a potential marker of heart failure suitable for clinical monitoring, though evidence of the superiority of this approach has been lacking.10 Complicating the interpretation of elevated NT-proBNP levels is the fact that normal values vary by age, sex,11 renal function,12 and BMI.13 These other factors may affect NT-proBNP levels to such an extent that it becomes difficult to interpret borderline elevations. Also, with so many valid correlates, one may wonder whether the NT-proBNP level might be accurately predicted from other, more easily obtained variables.

The purpose of this article is to explore and quantify these correlates of NT-proBNP, with the goal of describing how these various quantities interact to account for some fraction of the total NT-proBNP level. Also, an online application is produced that predicts the mean level and higher quantiles of NT-proBNP when supplied with the predictor variables.

NT-proBNP test results were collected from subjects applying for life insurance. Analysis was restricted to those aged 50-85, since very few NT-proBNP levels are available outside that range. For a subset of these, height, weight, and blood pressure were measured at the time of blood collection. Blood pressures were entered as averages of up to 3 repeated measurements. All applicants filled out a questionnaire with simple yes/no questions, asking about a personal history of heart disease, hypertension, diabetes and cancer. Those not answering lab slip questions were categorized as “NS” (not stated). Collected blood was tested for NT-proBNP, as well as other substances depending on the individual life insurers’ testing protocol. Analysis was restricted to those who had non-missing values for all selected numeric variables, which included age, BMI, systolic and diastolic blood pressure, serum creatinine, estimated glomerular filtration rate by the CKD-Epi equation,14 cholesterol, and albumin.
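For readers who want to see how eGFR can be derived from serum creatinine, the following is a minimal Python sketch of the 2009 CKD-EPI creatinine equation (reference 14). The function name and arguments are illustrative and are not taken from the study code, which was written in R. The text does not state how the race coefficient was handled, so it appears here as an optional argument that defaults to off.

```python
def ckd_epi_2009(creatinine_mg_dl, age_years, female, black=False):
    """Estimated GFR in ml/min/1.73 m2 from the 2009 CKD-EPI creatinine equation.

    Hypothetical helper for illustration; not the study's own implementation.
    """
    kappa = 0.7 if female else 0.9         # sex-specific creatinine scaling constant
    alpha = -0.329 if female else -0.411   # sex-specific exponent below the threshold
    ratio = creatinine_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


# Example inputs: an 80-year-old man at two creatinine levels (mg/dl).
egfr_low_creatinine = ckd_epi_2009(1.1, 80, female=False)
egfr_high_creatinine = ckd_epi_2009(1.7, 80, female=False)
```

Because age enters the equation directly, eGFR computed this way is correlated with age by construction, which is one reason an age adjustment is applied before modeling.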

Correlations with NT-proBNP were studied in univariate and multivariate analyses. Because of a highly skewed distribution, NT-proBNP was log-transformed before inclusion in correlation studies or linear models. Correlation was explored with Pearson r tests for continuous variables and t-tests for categorical variables (non-responders were censored for the purpose of the t-tests). The variables of eGFR and systolic blood pressure were noted to be correlated with age (r = −0.41 for eGFR; r = 0.16 for systolic blood pressure). Therefore, linear models were used to remove the contribution of age and sex from these variables, producing age-corrected eGFR and systolic blood pressure. Variable selection for inclusion in the final model was performed using multivariate adaptive regression splines (MARS) with a grid search for the degree of interaction (up to 3) and the number of included terms (up to 27). Variable importance was computed using change in the residual sum of squares with inclusion/exclusion of the variable, scaled to 100 for the most important variable. The final model utilized ordinary least squares (OLS) linear regression and was restricted to only those terms that showed non-zero importance in the MARS model. To enable non-linear effects, restricted cubic splines were utilized with 4 default knots in the OLS analysis.
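The age and sex correction described above can be sketched as taking residuals from a linear model. The example below is an assumption-laden illustration in Python with synthetic data: the text does not give the exact model form, so a simple additive model (intercept, age, sex) is assumed, and the function name `residualize` is hypothetical.

```python
import numpy as np


def residualize(y, age, male):
    """Return y minus its least-squares fit on an intercept, age, and sex."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([
        np.ones_like(y),
        np.asarray(age, dtype=float),
        np.asarray(male, dtype=float),
    ])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef


# Synthetic data only: eGFR that declines with age, plus noise.
rng = np.random.default_rng(0)
age = rng.uniform(50, 85, size=500)
male = rng.integers(0, 2, size=500)
egfr = 120 - 0.8 * age + 3 * male + rng.normal(0, 10, size=500)

egfr_adj = residualize(egfr, age, male)  # "age-corrected" eGFR
```

By construction, the residuals are uncorrelated with age and sex, so any remaining association between the corrected variable and NT-proBNP cannot be attributed to a linear effect of age or sex.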

Analyses were carried out using R version 4.1 and the following packages: tidyverse, rms, nephron, broom, skimr, matrixStats, table1, caret, vip, and recipes.

Table 1. Baseline Characteristics

The baseline characteristics of the study group are displayed in Table 1. The average age in the study was 63 years and was similar between men and women. The prevalences of heart disease, diabetes and hypertension were higher among men (5.8%, 10.6% and 36.9%, respectively) than among women (2.5%, 8.9%, and 34.3%). Cancer prevalence was marginally higher among women (4.6% vs 4.3%). The non-response rate was higher for cancer (35.3%) than for the other conditions (near 1%) because the cancer question was only recently added to the lab slip. Median levels of albumin, creatinine, eGFR, BMI, and blood pressure were all higher in men. Cholesterol was slightly higher in women, as was the median level of NT-proBNP.

Essentially all numeric variables had statistically significant correlations with NT-proBNP (Table 2), which is not unexpected given the large sample size. The strongest correlations were found with age (0.416), albumin (−0.25), and eGFR (−0.26). For all categorical variables, statistically significant differences in the mean log(NT-proBNP) level were found between “Yes” and “No” answers. The largest difference was found for heart disease (0.839 log units). In Table 2, the mean differences in log units are translated to a mean % change as one goes from “No” to “Yes”. Thus, the 131% change for heart disease history means that the level of NT-proBNP in those with a history of heart disease is, on average, more than double the level in those without. Figure 1 demonstrates the distribution of NT-proBNP values by disease status. For all conditions, positive status results in a thicker rightward “tail” to the distribution, though the effect is much stronger for heart disease than for the others.
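The conversion from a difference in log units to a percent change is a one-line calculation. It is worked here for the heart disease figure, assuming natural logarithms (the base is not stated in the text, but natural logs reproduce the reported value):

```latex
\%\,\text{change} = \left(e^{\Delta} - 1\right) \times 100
                  = \left(e^{0.839} - 1\right) \times 100
                  \approx (2.314 - 1) \times 100
                  \approx 131\%
```

A ratio of about 2.3 between the "Yes" and "No" groups is what "more than double" refers to.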

Table 2. Correlations

Figure 1. NT-proBNP by disease status.

Before modeling, the data were split into training (80%) and test (20%) sets, to provide more generalizable error estimates. The automated MARS selection process selected 7 variables (age, sex, albumin, eGFR, BMI, systolic blood pressure, cholesterol, and history of heart disease), as well as 2 additional interaction terms (age-sex and age-eGFR). When evaluated using residual sum of squares for variable importance, the interaction terms had zero importance and were removed from consideration (Figure 2).
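An 80/20 split of this kind can be sketched in a few lines. The text does not say whether the split was simple random or stratified, so the Python example below assumes a simple random split. The function name and the seed are illustrative.

```python
import numpy as np


def train_test_indices(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row indices and return (train_idx, test_idx) for a random split."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_rows)
    n_test = int(round(n_rows * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Example with 1000 rows: 800 go to training, 200 are held out for testing.
train_idx, test_idx = train_test_indices(1000)
```

Models are then fitted only on the training rows, and error estimates are reported on the held-out test rows.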

Figure 2. Variable importance (by residual sum of squares) for MARS model.

The variables selected by the MARS model were then used in an ordinary least squares (OLS) model in which all continuous variables were entered using restricted cubic splines with 4 default knots. In this final model, all terms, including main effects and spline terms, were significant at the 99.9% confidence level (Table 3). A plot of partial effects (Figure 3) demonstrates a very strong effect of age across the included range. The effect of albumin is also quite strong and mostly log-linear. BMI appears to be associated with elevated NT-proBNP only when the BMI is below about 27 kg/m2. Similarly, there is some evidence of a threshold for eGFR near 90 ml/min/1.73 m2, and for systolic BP near 130 mmHg.
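To make "restricted cubic splines with 4 knots" concrete, the sketch below builds the spline basis for one continuous predictor in Python. It uses the standard truncated-power form that is linear beyond the outer knots. The knot quantiles (0.05, 0.35, 0.65, 0.95) are an assumption about what "default" means for 4 knots, and the albumin values are synthetic.

```python
import numpy as np


def rcs_basis(x, knots):
    """Restricted cubic spline basis: column 0 is x, then k-2 nonlinear terms."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    scale = (t[-1] - t[0]) ** 2  # keeps nonlinear terms on roughly the scale of x

    def cube_plus(u):
        return np.maximum(u, 0.0) ** 3

    columns = [x]
    for j in range(len(t) - 2):
        term = (cube_plus(x - t[j])
                - cube_plus(x - t[-2]) * (t[-1] - t[j]) / (t[-1] - t[-2])
                + cube_plus(x - t[-1]) * (t[-2] - t[j]) / (t[-1] - t[-2]))
        columns.append(term / scale)
    return np.column_stack(columns)


# Synthetic albumin values (g/dl) with knots at assumed default quantiles.
rng = np.random.default_rng(1)
albumin = rng.normal(4.4, 0.3, size=1000)
knots = np.quantile(albumin, [0.05, 0.35, 0.65, 0.95])
basis = rcs_basis(albumin, knots)  # 3 columns: linear term + 2 spline terms
```

With 4 knots, each continuous predictor contributes 3 columns to the OLS design matrix, which is how the model can show thresholds such as those described for BMI, eGFR and systolic BP while remaining linear in its coefficients.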

Table 3. Regression Model Results

Figure 3. Model of main effects and spline terms.

The final model has an R-squared of 25.7%, which means that the predictors, when combined in the manner described, can account for about a fourth of the variation in NT-proBNP levels in this population. The root mean square error on test data was 188.4, meaning that a typical prediction is in error by approximately 188 pg/ml. However, it is more informative to examine the distribution of prediction errors. This is done in Figure 4, which plots predicted vs actual values from the test data, and in Figure 5, which is a histogram of the individual prediction errors. These plots show that the OLS model is more likely to produce a highly negative error (predicting a value that is far too low) than a highly positive error (predicting a value that is far too high).
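The two summary statistics above, and the signed errors behind the histogram, can be computed as follows. This Python sketch uses a tiny made-up data set. The sign convention (predicted minus actual) is chosen to match the text, where a negative error means the prediction was too low. Whether the study computed R-squared on the log scale or the original scale is not stated.

```python
import numpy as np


def prediction_errors(actual, predicted):
    """Signed errors, predicted minus actual, so under-predictions are negative."""
    return np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float)


def rmse(actual, predicted):
    """Root mean square error of the predictions."""
    return float(np.sqrt(np.mean(prediction_errors(actual, predicted) ** 2)))


def r_squared(actual, predicted):
    """Proportion of variance in `actual` accounted for by `predicted`."""
    actual = np.asarray(actual, dtype=float)
    ss_res = np.sum(prediction_errors(actual, predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Made-up values in pg/ml: one very high actual value is badly under-predicted.
actual = [40.0, 80.0, 120.0, 900.0]
predicted = [60.0, 90.0, 110.0, 200.0]
errors = prediction_errors(actual, predicted)
```

In this toy example a single large under-prediction dominates the RMSE, which illustrates why the full error distribution is more informative than the single summary number when the outcome is right-skewed.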

Figure 4. Predicted vs actual values from test data.

Figure 5. A histogram of the individual prediction errors.

The results presented here confirm the findings of many other studies that have shown a dependence of NT-proBNP on age, renal function, heart disease history, sex, and BMI. A surprising finding is the strong dependence on the albumin level. This finding has not been previously reported in the medical literature, and any rationale can only be speculative. The dependence on albumin is fairly strong even well within the normal range, so it would not appear to be a signal of those illnesses that cause markedly low albumin levels (cancer, cachexia, etc.). It may be that serum albumin interferes with the NT-proBNP assay, but this has not been described. It is also conceivable that there is some sort of interaction with, or binding to, albumin within the serum. A recent study found a correlation between high NT-proBNP levels and the risk of cancer.15 Although the authors did control for various laboratory values, albumin was not one of them. There is a significant likelihood that the reported association is confounded by the correlation between high NT-proBNP and low levels of albumin, which are known to be associated with elevated cancer death rates.

The overall performance of the model suggests that NT-proBNP is not simply an expression of other physiological phenomena as measured by laboratory testing. Rather, much as expected, NT-proBNP brings unique value, which cannot be accurately predicted from other values. However, the degree to which it can be predicted is, perhaps, surprising since about a quarter of its variation is due to other measured factors. Likely, though, prediction is insufficiently accurate to justify a testing strategy based on the output of the OLS model. For instance, among those who were predicted to have NT-proBNP levels below 50 pg/ml, 15% had actual levels over 200 pg/ml.

In the insurance context, this study can help modify expectations for the NT-proBNP level in applicants who have various combinations of the other predictors, and can help account for the new finding of a negative correlation with albumin level. For instance, an 80-year-old man with no history of heart disease, a BMI of 28 kg/m2, a systolic BP of 130 mmHg, a creatinine of 1.1 mg/dl, and an albumin of 4.5 g/dl would have a predicted NT-proBNP level of 110 pg/ml. Increasing the creatinine to 1.7 mg/dl results in a predicted NT-proBNP of 152 pg/ml. A similar NT-proBNP of 154 pg/ml would be predicted if the creatinine remained at 1.1 mg/dl but the albumin was lowered to 4 g/dl.

To help medical directors and underwriters with the various influencers of the NT-proBNP level, an interactive application has been created at https://sjrigatti.shinyapps.io/BNPpredict/. This application takes any combination of inputs for the predictors discussed in this paper and returns a prediction for the NT-proBNP level. It also uses prediction errors to predict higher quantiles of the NT-proBNP distribution. This can help determine which levels may be abnormally high based on the other characteristics of the individual.
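The text does not describe how the application turns prediction errors into higher quantiles. One plausible approach, offered here purely as an assumption, is to shift the log-scale prediction by an empirical quantile of the log-scale residuals and then back-transform. The Python sketch below illustrates that idea with synthetic residuals. The function name is hypothetical, and the method assumes the residual spread does not depend on the predictors.

```python
import numpy as np


def predicted_quantile(pred_log, residuals_log, prob):
    """Shift a log-scale prediction by an empirical residual quantile, then exponentiate.

    residuals_log are (actual - predicted) on the log scale from held-out data.
    Illustrative only; not the application's documented method.
    """
    shift = np.quantile(np.asarray(residuals_log, dtype=float), prob)
    return float(np.exp(pred_log + shift))


# Synthetic log-scale residuals and a mean prediction of 110 pg/ml.
rng = np.random.default_rng(2)
residuals_log = rng.normal(0.0, 0.9, size=5000)
pred_log = np.log(110.0)

median_level = predicted_quantile(pred_log, residuals_log, 0.50)
upper_level = predicted_quantile(pred_log, residuals_log, 0.95)
```

Under this approach, a measured NT-proBNP above the 95th percentile value for an applicant's own predictor profile would be flagged as unusually high for that profile, even if it falls below a fixed population cutoff.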

1. Daniels LB, Maisel AS. Natriuretic peptides. J Am Coll Cardiol. 2007;50:2357-2368.
2. Hall C. Essential biochemistry and physiology of (NT-pro)BNP. Eur J Heart Fail. 2004;6:257-260.
3. Maisel AS, Clopton P, Krishnaswamy P, et al. Impact of age, race and sex on the ability of B-type natriuretic peptide to aid in the emergency diagnosis of heart failure: results from the Breathing Not Properly (BNP) multinational study. Am Heart J. 2004;147:1078-1084.
4. Clark M, Kaufman V, Fulks M, et al. NT-proBNP as a Predictor of All-Cause Mortality in a Population of Insurance Applicants. J Insur Med. 2014;44:7-16.
5. Simsek MA, Degertekin M, Turer Cabbar A, et al. NT-proBNP levels and mortality in a general population-based cohort from Turkey: a long-term follow-up study. Biomark Med. 2018;12:1073-1081.
6. Kang SH, Park JJ, Choi DJ, et al. Prognostic value of NT-proBNP in heart failure with preserved versus reduced EF. Heart. 2015;101:1881-1888.
7. Gomez Peres M, Ble M, Cladellas M, et al. Combined use of tissue Doppler imaging and natriuretic peptides as prognostic marker in asymptomatic aortic stenosis. Int J Cardiol. 2017;228:890-894.
8. Ruwald MH, Goetze JP, Bech J, et al. NT-proBNP independently predicts long-term mortality in patients admitted for coronary angiography. Angiology. 2014;65:31-36.
9. Niccoli G, Conte M, Marchitti S, et al. NT-proANP and NT-proBNP circulating levels as predictors of cardiovascular outcome following coronary stent implantation. Cardiovasc Revasc Med. 2016;17:162-168.
10. Khan MS, Siddiqi TJ, Usman MS, et al. Does natriuretic peptide monitoring improve outcomes in heart failure patients? A systematic review and meta-analysis. Int J Cardiol. 2018;263:80-87.
11. Redfield MM, Rodeheffer RJ, Jacobsen SJ, et al. Plasma brain natriuretic peptide concentration: impact of age and gender. J Am Coll Cardiol. 2002;40:976-982.
12. McCullough PA, Duc P, Omland T, et al. B-type natriuretic peptide and renal function in the diagnosis of heart failure: an analysis from the Breathing Not Properly Multinational Study. Am J Kidney Dis. 2003;41:571-579.
13. Suthahar N, Meijers WC, Ho JE, et al. Sex-specific associations of obesity and N-terminal pro-B-type natriuretic peptide levels in the general population. Eur J Heart Fail. 2018;20:1205-1214.
14. Levey AS, Stevens LA, Schmid CH, et al. A new equation to estimate glomerular filtration rate. Ann Intern Med. 2009;150:604-612.
15. Tuñón J, Higueras J, Tarín N, et al. N-Terminal Pro-Brain Natriuretic Peptide Is Associated with a Future Diagnosis of Cancer in Patients with Coronary Artery Disease. PLoS ONE. 2015;10:e0126741.