Objectives.—To document the various laboratory and demographic/historical correlates of NT-proBNP levels in applicants for life insurance, and to explore the accuracy of a prediction model based on those variables.
Method.—NT-proBNP blood test results were obtained from 1.34 million insurance applicants between the ages of 50 and 85 years, beginning in 2003. Exploratory data analysis was carried out to document correlations with other laboratory variables, sex, age, and the presence of relevant diseases. Further, predictive models were used to quantify the proportion of the variance of NT-proBNP that can be explained by a combination of these other, easier-to-determine variables.
Results.—NT-proBNP shows the expected negative correlation with estimated glomerular filtration rate (eGFR), is markedly higher in those with a history of heart disease, and is somewhat higher in those with a history of hypertension. A strong, unexpected, negative correlation between NT-proBNP and albumin was discovered. Of the variables evaluated, an automated selection procedure based on a multivariate adaptive regression spline (MARS) model selected 7 variables (age, sex, albumin, eGFR, BMI, systolic blood pressure, and history of heart disease). Variable importance evaluation determined that age, albumin, and eGFR were the 3 most important continuous variables in the prediction of NT-proBNP levels. An ordinary least squares (OLS) model using these same variables achieved an R-squared of 24.7%.
Conclusion.—Expected ranges of NT-proBNP may vary substantially depending on the value of other variables in the prediction equation. Albumin is significantly negatively correlated with NT-proBNP levels. The reasons for this are unclear.
Amino-terminal pro-B-type natriuretic peptide (NT-proBNP) is the non-active amino-terminal fragment of proBNP, a pro-hormone produced by the heart in response to left ventricular strain.1,2 The active, carboxy-terminal fragment, BNP, was first recognized as a clinically useful analyte in the differentiation of heart failure from pneumonia in patients presenting to the emergency room with shortness of breath and varying degrees of hypoxia, leukocytosis, and radiographic abnormalities on chest X-ray.3
Since then, it has been recognized that NT-proBNP has a longer half-life and is more stable in serum than BNP, making it more suitable for testing. In this publication and others, NT-proBNP has been recognized as a strong predictor of mortality,4,5 not only in the insurance setting, but also in various clinical contexts, most notably congestive heart failure,6 valvular heart disease7 and stable coronary disease.8,9 Recently, NT-proBNP has attracted attention as a potential marker of heart failure suitable for clinical monitoring, though evidence of the superiority of this approach has been lacking.10 Complicating the interpretation of elevated NT-proBNP levels is the fact that normal values vary by age, sex,11 renal function,12 and BMI.13 These other factors may affect NT-proBNP levels to an extent that it becomes difficult to interpret borderline elevations. Also, with so many valid correlates, one may wonder if the NT-proBNP level might be accurately predicted from other, more easily obtained variables.
The purpose of this article is to explore and quantify these correlates of NT-proBNP, with the goal of describing how these various quantities interact to account for some fraction of the total NT-proBNP level. Also, an online application is produced that predicts the mean level and higher quantiles of NT-proBNP when supplied with the predictor variables.
Methods
NT-proBNP test results were collected from subjects applying for life insurance. Analysis was restricted to those aged 50-85, since very few NT-proBNP levels are available outside that range. For a subset of these, height, weight, and blood pressure were measured at the time of blood collection. Blood pressures were entered as averages of up to 3 repeated measurements. All applicants filled out a questionnaire with simple yes/no questions asking about a personal history of heart disease, hypertension, diabetes, and cancer. Those not answering lab slip questions were categorized as “NS” (not stated). Collected blood was tested for NT-proBNP, as well as other substances depending on the individual life insurer's testing protocol. Analysis was restricted to those who had non-missing values for all selected numeric variables, which included age, BMI, systolic and diastolic blood pressure, serum creatinine, estimated glomerular filtration rate by the CKD-EPI equation,14 cholesterol, and albumin.
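The two derived variables named above, BMI and eGFR, can be computed from the measured values along the following lines. This is a minimal sketch, not the authors' code: the text does not say which version of the CKD-EPI creatinine equation was used, so the 2021 race-free version is assumed here, and the function names are hypothetical.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def ckd_epi_2021(creatinine_mg_dl: float, age_years: float, female: bool) -> float:
    """eGFR (mL/min/1.73 m^2) from serum creatinine.

    ASSUMPTION: uses the 2021 race-free CKD-EPI creatinine equation;
    the article only cites "the CKD-EPI equation" without a version.
    """
    kappa = 0.7 if female else 0.9          # sex-specific creatinine threshold
    alpha = -0.241 if female else -0.302    # exponent below the threshold
    ratio = creatinine_mg_dl / kappa
    egfr = (
        142.0
        * min(ratio, 1.0) ** alpha
        * max(ratio, 1.0) ** -1.200
        * 0.9938 ** age_years
    )
    return egfr * 1.012 if female else egfr


# Illustrative inputs: an 80-year-old man at two creatinine levels.
for creatinine in (1.1, 1.7):
    print(creatinine, round(ckd_epi_2021(creatinine, 80, female=False), 1))
```

Under this assumed equation, raising creatinine from 1.1 to 1.7 mg/dl in an 80-year-old man lowers the estimated GFR from the high 60s to about 40 mL/min/1.73 m².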
Correlations with NT-proBNP were studied in univariate and multivariate analyses. Because of its highly skewed distribution, NT-proBNP was log-transformed before inclusion in correlation studies or linear models. Correlation was explored with Pearson correlation coefficients (r) for continuous variables and t-tests for categorical variables (non-responders were excluded for the purpose of the t-tests). eGFR and systolic blood pressure were noted to be correlated with age (r = −0.41 for eGFR; r = 0.16 for systolic blood pressure). Therefore, linear models were used to remove the contribution of age and sex from these variables, producing age-corrected eGFR and systolic blood pressure. Variable selection for inclusion in the final model was performed using multivariate adaptive regression splines (MARS) with a grid search over the degree of interaction (up to 3) and the number of included terms (up to 27). Variable importance was computed as the change in the residual sum of squares with inclusion/exclusion of the variable, scaled to 100 for the most important variable. The final model used ordinary least squares (OLS) linear regression and was restricted to those terms that showed non-zero importance in the MARS model. To allow non-linear effects, restricted cubic splines with 4 knots at default locations were used in the OLS analysis.
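The age correction and the log-scale correlation described above can be illustrated with a short sketch. It is not the authors' R code. All data below are synthetic and the coefficients used to generate them are arbitrary assumptions. The helper names `residualize` and `pearson_r` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# SYNTHETIC data, for illustration only (not the study data).
age = rng.uniform(50, 85, n)
male = rng.integers(0, 2, n)
egfr = 110 - 0.6 * age + 2.0 * male + rng.normal(0, 12, n)
nt_probnp = np.exp(1.5 + 0.04 * age - 0.01 * egfr + rng.normal(0, 0.9, n))


def residualize(y, *covariates):
    """Return the residuals of y after a least-squares fit on the covariates."""
    X = np.column_stack([np.ones(len(y)), *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


# Remove the linear contribution of age and sex from eGFR.
egfr_corrected = residualize(egfr, age, male)

r_raw = pearson_r(egfr, age)                  # negative by construction
r_corrected = pearson_r(egfr_corrected, age)  # numerically zero
# Correlate on the log scale because NT-proBNP is right-skewed.
r_log = pearson_r(np.log(nt_probnp), egfr_corrected)

print(round(r_raw, 2), round(r_corrected, 2), round(r_log, 2))
```

Because the residuals of a least-squares fit are orthogonal to the fitted covariates, the corrected variable is uncorrelated with age by construction. Any remaining association with log NT-proBNP then reflects information beyond age and sex.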
Analyses were carried out using R version 4.1 and the following packages: tidyverse, rms, nephro, broom, skimr, matrixStats, table1, caret, vip, and recipes.
Results
The baseline characteristics of the study group are displayed in Table 1. The average age in the study was 63 years and was similar between men and women. The prevalences of heart disease, diabetes, and hypertension were higher among men (5.8%, 10.6%, and 36.9%, respectively) than among women (2.5%, 8.9%, and 34.3%). Cancer prevalence was marginally higher among women (4.6% vs 4.3%). The non-response rate was higher for cancer (35.3%) than for the other conditions (near 1%) because the cancer question was added to the lab slip only recently. Median levels of albumin, creatinine, eGFR, BMI, and blood pressure were all higher in men. Cholesterol was slightly higher in women, as was the median level of NT-proBNP.
Discussion
The results presented here confirm the findings of many other studies that have shown a dependence of NT-proBNP on age, renal function, heart disease history, sex, and BMI. A surprising finding is the strong dependence on the albumin level. This finding has not been previously reported in the medical literature, and any rationale can only be speculative. The dependence on albumin is fairly strong even well within the normal range, so it does not appear to be a signal of those illnesses that cause markedly low albumin levels (cancer, cachexia, etc). Serum albumin may interfere with the NT-proBNP assay, but this has not been described. It is also conceivable that there is some sort of interaction with, or binding to, albumin within the serum. A recent study found a correlation between high NT-proBNP levels and the risk of cancer.15 Although the authors did control for various laboratory values, albumin was not one of them. There is a significant likelihood that this association is confounded by the correlation between high NT-proBNP and low levels of albumin, which are known to be associated with elevated cancer death rates.
The overall performance of the model suggests that NT-proBNP is not simply an expression of other physiological phenomena as measured by laboratory testing. Rather, much as expected, NT-proBNP brings unique value that cannot be accurately predicted from other values. However, the degree to which it can be predicted is, perhaps, surprising, since about a quarter of its variation is explained by other measured factors. Even so, prediction is likely not accurate enough to justify a testing strategy based on the output of the OLS model. For instance, among those who were predicted to have NT-proBNP levels below 50 pg/ml, 15% had actual levels over 200 pg/ml.
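The two performance measures used in this paragraph, the proportion of variance explained and the share of low predictions that turn out to be high, can be computed as in the following sketch. The function names are hypothetical and the five paired values are made up purely to show the calculation. They are not study data.

```python
import numpy as np


def r_squared(actual, predicted):
    """Proportion of variance in `actual` explained by `predicted`."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def discordant_fraction(predicted_pg_ml, actual_pg_ml,
                        predicted_below=50.0, actual_above=200.0):
    """Among subjects predicted below one cutoff, the fraction whose
    measured level exceeds a second, higher cutoff."""
    predicted = np.asarray(predicted_pg_ml, dtype=float)
    actual = np.asarray(actual_pg_ml, dtype=float)
    low = predicted < predicted_below
    if not low.any():
        return float("nan")
    return float(np.mean(actual[low] > actual_above))


# MADE-UP toy values (pg/ml), only to exercise the functions.
actual = np.array([40.0, 250.0, 30.0, 500.0, 45.0])
predicted = np.array([45.0, 48.0, 40.0, 300.0, 60.0])

fraction = discordant_fraction(predicted, actual)
# R-squared is evaluated on the log scale, matching the log-transformed model.
fit = r_squared(np.log(actual), np.log(predicted))
print(round(fraction, 3), round(fit, 3))
```

In the toy data, three subjects are predicted below 50 pg/ml and one of them has a measured level above 200 pg/ml, so the discordant fraction is one third.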
In the insurance context, this study can help modify expectations for the NT-proBNP level in applicants who have various combinations of the other predictors, and account for the new finding of a negative correlation with albumin level. For instance, an 80-year-old man with no history of heart disease, a BMI of 28 kg/m², a systolic BP of 130 mmHg, a creatinine of 1.1 mg/dl, and an albumin of 4.5 g/dl would have a predicted NT-proBNP level of 110 pg/ml. Increasing the creatinine to 1.7 mg/dl results in a predicted NT-proBNP of 152 pg/ml. A similar NT-proBNP of 154 pg/ml would be predicted if the creatinine remained at 1.1 mg/dl but the albumin was lowered to 4 g/dl.
To help medical directors and underwriters with the various influences on the NT-proBNP level, an interactive application has been created at https://sjrigatti.shinyapps.io/BNPpredict/. This application takes any combination of inputs from the predictors discussed in this paper and returns a prediction for the NT-proBNP level. It also uses prediction errors to predict higher quantiles of the NT-proBNP distribution. This can help determine which levels may be abnormally high based on the other characteristics of the individual.
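One way prediction errors can be turned into upper quantiles is sketched below. This is an illustration under stated assumptions, not a description of how the application works: it assumes the model residuals on the log scale are roughly normal, treats the back-transformed prediction as the center of that distribution, and uses a hypothetical residual standard deviation, since the text reports none.

```python
from math import exp, log
from statistics import NormalDist


def upper_quantile(predicted_pg_ml: float, residual_sd_log: float, p: float) -> float:
    """Quantile p of NT-proBNP given a predicted level.

    ASSUMPTION: residuals of log(NT-proBNP) are normal with standard
    deviation `residual_sd_log`, centered on log(predicted_pg_ml).
    """
    z = NormalDist().inv_cdf(p)  # standard normal quantile
    return exp(log(predicted_pg_ml) + z * residual_sd_log)


# HYPOTHETICAL residual SD on the log scale; not taken from the article.
HYPOTHETICAL_RESIDUAL_SD = 0.9

# 110 pg/ml is the predicted level from the worked example in the text.
for p in (0.50, 0.90, 0.95):
    print(p, round(upper_quantile(110.0, HYPOTHETICAL_RESIDUAL_SD, p)))
```

An empirical alternative that avoids the normality assumption is to add the observed quantile of the log-scale residuals to the log prediction before back-transforming.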