Discussion
In this national cohort, we show that, among patients with PFTs recorded, PFT data are independently associated with all-cause mortality above and beyond the CAN score alone. We additionally show both an increase in total PFT data and a concurrent decline in the structured PFT data available for clinical use, highlighting significant gaps in data accessibility for real-world healthcare applications. This lack of production-usable data is not limited to PFTs. We previously reported using data from the VA EHR merged with ECG data pulled from vendor databases.18–20 Most VA facilities have not configured vendor products (eg, ECG, PFT, echocardiogram) to migrate data into the VistA EHR despite the proven clinical value of these data. Our findings underscore the need for improved data integration and accessibility to support decision-making in healthcare, especially in large, complex healthcare systems.
This study assessed data from 95 392 individuals and yielded several clinically relevant findings. We find that, among people with FEV1 recorded, FEV1 %pred is independently associated with all-cause mortality at 1 and 5 years even after adjusting for the CAN score. We chose to focus primarily on FEV1 because some forms of airway obstruction show a decline in FEV1 before the FEV1/FVC falls below the lower limit of normal, and because major guidelines for the management of COPD emphasise FEV1 as a key indicator of disease severity.14 21 22 However, for the sake of completeness, we also show that our findings are replicated with FVC %pred, FEV1 z-score and FVC z-score: all these measures are associated with all-cause mortality at 1 and 5 years above and beyond what is captured by the CAN score. This suggests that PFT testing captures an important dimension of mortality risk that is not reflected in the CAN score. Furthermore, the two classification criteria (%pred vs z-score) show similar associations with all-cause mortality in this cohort.
Additionally, sensitivity analyses show that FEV1 %pred is independently associated with all-cause mortality at 1 and 5 years, even after adjusting for the CAN score, in individuals both with and without lung disease. The CAN-adjusted ORs are higher in the cohort without lung disease. Since the CAN score incorporates clinical diagnoses,2 it is a more robust predictor of all-cause mortality in patients with an established diagnosis of lung disease, leaving less independent associative value to be captured by the addition of FEV1 %pred. Similar results are seen when FVC %pred, FEV1 z-score and FVC z-score are used as explanatory variables.
Prior studies support our findings and show the clinical importance of PFT data. In a prospective cohort study, FEV1 %pred was inversely related to all-cause mortality after adjusting for age, body mass index, systolic blood pressure, educational attainment and smoking status.5 In the Gutenberg Health Study, another large prospective cohort study in patients without lung disease, both FEV1 %pred and FVC %pred were associated with all-cause mortality after controlling for cardiac function.6 In patients undergoing cardiac surgery, the addition of FEV1 improved the performance of the EuroSCORE, an established risk prediction model, in predicting mortality.4 Our study adds to the current literature by showing that FEV1 %pred is statistically significantly associated with an increased risk of both 1-year and 5-year mortality, above and beyond a validated mortality risk score, in patients who have had PFTs.
In addition to showing that PFT measures are associated with mortality above and beyond the CAN score, our study also quantifies the decrease in available structured PFT data in the VA. The ATS recommends migration of PFT data into structured EHR data.23 24 If this is not done, text mining of unstructured and semistructured PFT notes can be performed, but it is limited by the availability of notes and by variability in note structure between VA sites.25–27 Researchers can acquire data from disparate sources for testing and modelling, but to deploy a model for decision-making in a production environment, the data need to be available in that environment. This report shows a simultaneous increase in total PFT data and a decrease in available structured PFT data in a large health system. The VA is currently in the process of transitioning its EHR system to Cerner, and several sites have already begun this transition. However, this transition was not the reason for the loss of integrated PFT data; the sites that transitioned to Cerner accounted for only 1–1.5% of all PFTs in the 5 years prior to their transition, resulting in only a slight undercount of PFTs performed in the VA between 2020 and 2023.
The lack of production-usable data is not limited to PFTs. ECG data have also been shown to predict all-cause mortality above and beyond the CAN score, yet few VA facilities incorporate ECG data into the EHR.19 28 This siloing of important clinical data limits our ability to improve care using our increasing computational capabilities.
This retrospective cohort study has some limitations. We used one-time PFTs rather than repeated or longitudinal PFT values. Additionally, the majority of the cohort was male, which is typical of veteran cohorts. However, previous research has established PFT testing as a long-term predictor of survival in both sexes.5 We assessed a single healthcare system, although we used national data from that system. It is possible that other healthcare systems are experiencing similar challenges in integrating vital data into their EHR. Patients with a lower FEV1 %pred were more likely to have certain sociodemographic characteristics and clinical diagnoses at baseline. However, these factors are all included in the CAN score, which was controlled for in all our models.2