Abstract
We sought to explore the technical and legal readiness of healthcare institutions for novel data-sharing methods that allow clinical information to be extracted from electronic health records (EHRs) and submitted securely to the Food and Drug Administration's (FDA's) blockchain through a secure data broker (SDB).
This assessment was divided into four sections: an institutional EHR readiness assessment, legal consultation, institutional review board application submission, and a test of healthcare data transmission over a blockchain infrastructure.
All participating institutions reported the ability to electronically extract data from EHRs for research. Formal legal agreements were deemed unnecessary to the project but would be needed in future tests of real patient data exchange. Data transmission to the FDA blockchain met the success criteria of data connection from within the four institutions' firewalls, externally to the FDA blockchain via an SDB.
The readiness survey indicated advanced analytic capability in hospital institutions and highlighted inconsistency in Fast Healthcare Interoperability Resources format utilization across institutions, despite requirements of the 21st Century Cures Act. Further testing across more institutions and annual exercises that exchange data over a blockchain infrastructure are recommended to determine the feasibility of this approach during a public health emergency and to broaden the understanding of technical requirements for multisite data extraction.
The FDA's RAPID (Real-Time Application for Portable Interactive Devices) program, in collaboration with Discovery, the Critical Care Research Network's PREP (Program for Resilience and Emergency Preparedness), identified the technical and legal challenges and requirements for rapid data exchange to a government entity using the FDA blockchain infrastructure.
In the previous few decades, the world has been challenged by a barrage of public health emergencies (PHEs), from natural disasters to the infectious disease threats of SARS (severe acute respiratory syndrome), H1N1, Zika, Ebola, and the COVID-19 pandemic. We have learned that PHEs are imminent and that the need for preparedness is paramount to our nation's resiliency.1
In the wake of COVID-19, widespread data collection is needed to understand the virus's impact and the effectiveness of treatment plans. However, the United States' ability to rapidly collect multisite patient data to understand the impact of a disease and develop a unified and effective response remains a considerable vulnerability despite significant health system and federal investment in electronic health records (EHRs).2,3 The all-hazards core data set, created in 2015 to characterize serious illness, injuries, and resource requirements to devise a robust response to PHEs, remains a challenge to collect given technological and regulatory limitations3 in regard to data sharing. This has been observed in the response to COVID-19, where the lack of data has hindered consensus on effective treatment protocols.4–6
Several barriers exist to data sharing in PHEs, including academic competition and inadequate human and technological resources during responses to emergency.7–10 Neither a standard approach to data sharing nor a method to negotiate and enforce the requisite data legal agreements exists.11,12 Moreover, effective methods for addressing deficiencies or advancing data sharing in response to PHEs are lacking.12–14 A clear need exists to explore novel methods to secure data collection to bridge the gap in knowledge sharing during PHEs.
The complexity of data sharing from disparate sources is a problem experienced in other industries. The finance sector requires the highest level of security to manage financial transactions with speed and integrity. Blockchain technology emerged in the finance industry as a disruptive technology aimed at facilitating a decentralized, secure, and distributed ledger of transactions on a global scale.15,16 Blockchain technology works as blocks of information across a computer network; when chained together, these blocks create a single data asset.
Blockchain has been suggested as an information infrastructure that can be used to advance knowledge sharing in the public sector.17 The decentralized nature of blockchain allows for interoperability,15 which is a key functionality needed to enable data sharing among hospital systems. The use of blockchain in medicine has the potential to revolutionize healthcare's approach to data access, storage, and security17–19 by providing a method to share confidential patient information across many sites regardless of the local technical infrastructure. Large-scale data sharing would contribute to more robust medical research, advanced analytics (e.g., artificial intelligence), and the ability to benchmark the quality of care across institutions.
The Food and Drug Administration (FDA) partnered with the Society of Critical Care Medicine's Discovery, the Critical Care Research Network's Program for Resilience and Emergency Preparedness (PREP; referred to as “Discovery PREP” hereafter) to explore the feasibility of using blockchain for multisite healthcare data collection in preparation for the required rapid data sharing during a PHE. Discovery PREP is one of many networks forming the Resilience Intelligence Network (RIN) with a combined focus on the nation's resilience, preparedness, and response.2
Discovery PREP and the FDA Real-Time Application for Portable Interactive Devices (RAPID) program20 collaborated to test the use of RAPID's blockchain technology to determine the technical, legal, and resource challenges in the healthcare context. The RAPID program was designed to facilitate the automated extraction of key information from EHR systems needed to respond to adverse events without adding to the burden of data collection on healthcare practitioners.
Objectives
A federal contract was executed across Discovery PREP institutions to (1) assess the readiness of sites for automated EHR extraction, (2) identify the legal implications of data sharing, (3) address challenges to data sharing while protecting human research subjects through institutional review boards (IRBs), and (4) establish a connection between the local institutions and the FDA's RAPID blockchain.
Materials and Methods
This technical assessment was divided into four sections, including a readiness assessment, legal requirement evaluation, IRB application submission, and technical assessment of healthcare data connections and equipment over a blockchain infrastructure. Institutions involved in each section are identified in Table 1.
EHR Readiness Survey
To ensure design of the infrastructure would be viable at a broad range of institutions, an EHR readiness survey was sent to Discovery PREP sites and the seven other associated acute care clinical research networks affiliated with the RIN.2 The survey sought to understand the variance in data extraction capabilities across healthcare institutions and the resources allocated for data extraction.
Legal Impact
The technical assessment included exploring the legal requirements needed for unidirectional, multisite data sharing to the FDA blockchain. Legal entities from four sites were consulted on the data use agreement (DUA) and business associate agreement (BAA) requirements to share healthcare data using a blockchain infrastructure.
Human Research Subject Protection
To understand the stance of local IRBs toward automated data extraction and submission to FDA's secure data broker (SDB), research protocols were submitted to IRBs at local institutions. The IRB protocols described gathering observational influenza data using the infrastructure outlined for automated data submission to the FDA blockchain.
Infrastructure Design
Four academic medical centers participated in the infrastructure design study (Table 1). A physician principal investigator and an information technology (IT) technical contact were identified at each site. The FDA engaged Booz Allen Hamilton (BAH) as the technology partner to design, build, and host the infrastructure for the SDB.20 SDBs are an alternative approach to blockchain infrastructure that leverages a third-party intermediary to store and transfer assets privately.21 During this study, the SDB facilitated the collection of data from multiple sites and provided it to the FDA GovCloud. The SDB was mandated by FDA security to eliminate ambiguity in data ownership and prevent storage or sharing in the government secure IT environment. To facilitate secure encrypted data sharing, the infrastructure was designed to incorporate a distributed key exchange system, and a decentralized file system called InterPlanetary File System.19 Smart contracts on the blockchain described who owned a file and with whom that file had been shared. The actual data were stored in an encrypted form in the decentralized file system. When and if a file was shared, smart contracts together with the key exchange system made the key available to the receiving party (illustrated in supplemental digital content Figure 1, available at www.aami.org/bit).
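The division of labor described above (smart contracts recording ownership and sharing, encrypted files held in a content-addressed store, and a key released only to parties with whom a file has been shared) can be illustrated with a minimal in-memory sketch. This is not the RAPID, Quorum, or InterPlanetary File System code; the class and method names are hypothetical, the ciphertext is treated as opaque bytes produced elsewhere, and the SHA-256 content identifier is an assumption standing in for the file system's own addressing scheme.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """What the (hypothetical) smart contract tracks for one file."""
    owner: str
    shared_with: set = field(default_factory=set)


class SharingRegistry:
    """Toy stand-in for smart contracts, file store, and key exchange."""

    def __init__(self):
        self._records = {}  # content id -> FileRecord (ownership and shares)
        self._blobs = {}    # content id -> encrypted bytes (file system stand-in)
        self._keys = {}     # content id -> decryption key (key exchange stand-in)

    def register(self, owner, ciphertext, key):
        # Address the encrypted file by a hash of its content.
        content_id = hashlib.sha256(ciphertext).hexdigest()
        self._blobs[content_id] = ciphertext
        self._keys[content_id] = key
        self._records[content_id] = FileRecord(owner=owner)
        return content_id

    def share(self, content_id, caller, recipient):
        record = self._records[content_id]
        if caller != record.owner:
            raise PermissionError("only the owner can share a file")
        record.shared_with.add(recipient)

    def request_key(self, content_id, caller):
        # The key is released only to the owner or to a party it was shared with.
        record = self._records[content_id]
        if caller != record.owner and caller not in record.shared_with:
            raise PermissionError("file has not been shared with caller")
        return self._keys[content_id]
```

In this sketch, a receiving party that calls `request_key` before the owner calls `share` is refused, which mirrors the "when and if a file was shared" condition in the text.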
Test scenario. To test the feasibility of sending healthcare data to the Food and Drug Administration (FDA) via blockchain infrastructure, each of the four participating institutions was tasked to install a Fast Healthcare Interoperability Resources (FHIR) server and docker container to send data from their local environment to the FDA's RAPID secure cloud environment. Abbreviation used: AWS, Amazon Web Services.
In the supplemental digital content Figure 1 (available at www.aami.org/bit), the logical model for the SDB is illustrated, beginning with external systems known as the individual institutions. The institutions could use the client, desktop, or mobile to enter data or connect to the BAH cloud environment leveraging a direct connection to the institutions' EHRs in the web zone.
Data were processed through a BAH application layer and data zone and sent to the FDA front end through the BAH web zone. Data passed through to the FDA GovCloud through the FDA front end. Data were available through an admin console in the FDA back end. The blockchain stack that was deployed at individual sites consisted of multiple containerized services that included the blockchain software (Quorum), the decentralized filesystem software, key exchange system, and custom services that were written by the SDB. Docker containers were used to address the portability requirements necessitated by the multiple deployment environments.
Clinical and informatics experts from the four participating institutions, FDA, and BAH selected Fast Healthcare Interoperability Resources (FHIR)22 as the preferred standard format for data on the blockchain. The 21st Century Cures Act specification of a FHIR application programming interface (API) supported the selection of this standard to support broad interoperability needed for future scale.23,24
A test scenario for connectivity and data transfer was defined based on the use of FHIR (Figure 1). Each institution was expected to create a dedicated server and install the RAPID blockchain software within their local firewall. The software comprised a collection of docker containers including the FDA's RAPID software, a querying function, and a virtual FHIR server (HAPI [HL7 API] FHIR version 2.3). The docker container was installed as an executable file on the server, thereby allowing it to read from the data source and send pertinent information through OpenVPN unidirectionally to the SDB (Figure 2 and supplemental digital content Figure 1).
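As a rough illustration of what a querying function against a local FHIR server might request, the sketch below builds a standard FHIR search URL. It is not the RAPID software's query code; the base URL is a hypothetical local address, and the choice of an Observation search by LOINC code is an assumption made only for the example.

```python
from urllib.parse import urlencode


def build_search_url(base_url, resource_type, **params):
    """Build a FHIR search URL of the form [base]/[type]?name=value&..."""
    url = f"{base_url.rstrip('/')}/{resource_type}"
    # Sort parameters so the same search always yields the same URL.
    query = urlencode(sorted(params.items()))
    return f"{url}?{query}" if query else url


# Hypothetical local server address; 8310-5 is the LOINC code for body temperature.
example_url = build_search_url(
    "http://localhost:8080/fhir",
    "Observation",
    code="http://loinc.org|8310-5",
    _count=50,
)
```

The `code` parameter uses the FHIR token form `system|code`, and `_count` limits the page size returned by the server.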
Architectural diagram of deployment. The logical model for the secure data broker (SDB) is illustrated, beginning with external systems known as hospital networks. The hospital networks housed a Fast Healthcare Interoperability Resources (FHIR) server, docker application, and OpenVPN client to transmit data to the Food and Drug Administration's (FDA's) Real-Time Application for Portable Interactive Devices (RAPID) SDB. Data were processed through the SDB's application layer and sent to the FDA GovCloud. Abbreviations used: TBD, to be determined; VPN, virtual private network.
Seasonal influenza, with a focus on severe acute respiratory infections, was used as the clinical case model, commencing with synthetic data structured to replicate EHR data. The synthetic data mirrored the clinical variables (e.g., demographics, diagnosis, lab results, treatment details) used in the 2019 Discovery PREP's multisite observational influenza study.25 Given the novelty of our approach, synthetic patient data were used to simplify the initial assessment, thereby avoiding any IRB or legal barriers that may be posed using real patient data. In other words, the synthetic data reduced the risk of compromising patient privacy while still accomplishing the goal of testing a healthcare institution's connection to the blockchain. Online software libraries were used to create the FHIR server26 and generate the synthetic data.27 Detailed mapping of each variable in relation to the FHIR standard was performed, and the data were then converted and loaded onto the dedicated server.
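The variable-to-FHIR mapping step can be sketched for a single synthetic record as follows. The flat field names (`patient_id`, `sex`, `birth_date`, `temp_c`) are hypothetical and do not come from the study's case report form; only one demographic resource and one measurement are shown, and the mapping actually used in the study is not reproduced here.

```python
# FHIR administrative gender values; the single-letter inputs are assumed.
GENDER_MAP = {"M": "male", "F": "female"}


def to_fhir(record):
    """Map one flat synthetic record to a FHIR Patient and Observation."""
    patient = {
        "resourceType": "Patient",
        "id": record["patient_id"],
        "gender": GENDER_MAP.get(record["sex"], "unknown"),
        "birthDate": record["birth_date"],  # expected as YYYY-MM-DD
    }
    observation = {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [
                {
                    "system": "http://loinc.org",
                    "code": "8310-5",
                    "display": "Body temperature",
                }
            ]
        },
        "subject": {"reference": f"Patient/{record['patient_id']}"},
        "valueQuantity": {
            "value": record["temp_c"],
            "unit": "Cel",
            "system": "http://unitsofmeasure.org",
            "code": "Cel",
        },
    }
    return patient, observation
```

Keeping the mapping in one function per record makes it straightforward to review each variable against the FHIR standard before the converted data are loaded onto a server.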
Test Plan
As the SDB, BAH developed software to read/write individual records from the FHIR server to the FDA's RAPID blockchain. IT engineers at the four participating institutions were presented with the installation procedure, which was performed at each site. After the dockerized blockchain software and required applications were installed within the site's infrastructure, the docker container read from the internal FHIR data set and sent the data to the blockchain infrastructure through the SDB following local IT security protocols for data exchange and storage. Success was defined as the ability to connect and send data variables from the internal networks of the four institutions to the FDA GovCloud via the SDB.
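The read-and-forward loop in the test plan can be sketched as below, assuming the FHIR server returns search results as paged Bundles. The `fetch` and `send` callables are hypothetical placeholders for the HTTP read from the local FHIR server and the transmission toward the SDB; neither reflects the actual BAH software.

```python
def iter_resources(fetch, first_url):
    """Yield every resource from a paged FHIR search, following 'next' links."""
    url = first_url
    while url:
        bundle = fetch(url)
        for entry in bundle.get("entry", []):
            yield entry["resource"]
        # A Bundle advertises its next page, if any, in its link array.
        url = next(
            (link["url"] for link in bundle.get("link", [])
             if link.get("relation") == "next"),
            None,
        )


def forward_all(fetch, send, first_url):
    """Read each record and hand it to the sender; return the number sent."""
    sent = 0
    for resource in iter_resources(fetch, first_url):
        send(resource)
        sent += 1
    return sent
```

Passing the reader and sender in as arguments keeps the loop testable with synthetic data and without a network connection, which matches the staged approach taken in this assessment.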
Results
EHR Readiness
The survey was sent on Aug. 20, 2018, and closed on Sept. 17, 2018. The 12 institutions that participated in the readiness assessment reported having the ability to electronically extract data from the EHR for research. The EHR software used at each site was distributed across EHR vendors with 50% Epic, 33% Cerner, 8% Allscripts, and 8% other. A majority of participants (83%) had a data warehouse or repository to store EHR data, and 58% reported participation in research where data were electronically extracted from the EHR. However, only 17% of respondents reported being aware of clinical data stored in FHIR format at their institution, 25% did not have the capability to extract data via FHIR, and 58% did not have knowledge of their institution's ability to store data in FHIR format.
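Because the percentages above are rounded shares of 12 respondents, they can be cross-checked against whole-number counts. The counts in the sketch below are not reported in the survey results; they are inferred here as the only integers out of 12 that round to the stated percentages, and should be read as an assumption.

```python
TOTAL_RESPONDENTS = 12


def percent(count, total=TOTAL_RESPONDENTS):
    """Share of respondents, rounded to the nearest whole percent."""
    return round(100 * count / total)


# Inferred counts (assumption): not stated in the survey results.
vendor_counts = {"Epic": 6, "Cerner": 4, "Allscripts": 1, "Other": 1}
fhir_counts = {"aware": 2, "cannot_extract": 3, "do_not_know": 7}

vendor_pct = {name: percent(n) for name, n in vendor_counts.items()}
fhir_pct = {name: percent(n) for name, n in fhir_counts.items()}
# vendor_pct -> {"Epic": 50, "Cerner": 33, "Allscripts": 8, "Other": 8}
# fhir_pct   -> {"aware": 17, "cannot_extract": 25, "do_not_know": 58}
```

The vendor percentages sum to 99 rather than 100 because each share is rounded independently.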
The majority of institutions described having advanced analytic capability with 83% currently using electronic decision support tools, 83% using automatic alerts for sepsis and acute clinical deterioration, 58% working directly with hospital IT to create clinical decision support for the intensive care unit, and 50% using EHR tools to conduct screening, randomization, and delivery of interventions to support clinical trials. Despite their advanced IT capability, these sites reported that the following resources were typically used to extract data from the EHR: self 25%, analyst 50%, informatics core 25%, and hospital IT 8%. Data could be extracted in the following formats: 25% CSV file, 42% Excel file, 8% R, 17% Oracle, 8% Structured Query Language (SQL), and 8% direct to REDCap.
Legal Impact
DUAs and BAAs were deemed unnecessary to the project given the use of synthetic patient data for transfer in the assessment but would be required by all institutions in the event of real patient data transfer. Execution of the requisite federal subaward contracts at the four participating sites identified contractual language pertaining to requirements of the Federal Information Security Management Act (FISMA) and National Institute of Standards and Technology (NIST) that were a significant hindrance at two of the institutions. The resources required to prove FISMA and NIST compliance were estimated to exceed the value and timeline of the subcontracted awards. As a result, written exceptions were obtained from the FDA contracting officers for these subcontracts to be executed without the need to provide FISMA and NIST compliance.
Human Subjects Research Review
IRB approval or exemption for the study protocol highlighting the use of the FDA blockchain infrastructure was obtained at all participating institutions between four and 50 days from IRB application submission. However, one site required patient consent for these data to be collected, indicating that some healthcare institutions may require informed consent from patients for the data to be collected, irrespective of the data extraction modality.
Technical Assessment
Planning, execution of subaward contracts, and successful testing of the infrastructure spanned from September 2017 to November 2019. All sites used an EHR with a data warehouse. Two of the four sites had an Epic EHR system, and the remaining two had Cerner. Data extraction and connection to the FDA blockchain met the success criteria for all four of the participating institutions. However, implementation was met with unanticipated challenges. For example, the technical implementation procedure of the October 2019 software version required at least three hours of dedicated time for local engineers to collaborate with FDA engineers over video conference or email communication.
Moreover, challenges were observed at the individual institutions in regard to infrastructure installation and local technical preference. For instance, the use of OpenVPN in the software was flagged as a security concern at two institutions, even with the use of synthetic data. In addition, the implementation teams were confronted with different data-mapping profiles within the FHIR specification. (For example, “PaO2 available” was present on the clinical case report form, but whether it was arterial or venous PaO2 was not specified.) The FHIR format requires differentiation between arterial and venous PaO2.
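The data-mapping ambiguity described above can be made explicit in code by refusing to map an oxygen partial pressure value until the specimen source is known. The sketch below is illustrative only: the function name is hypothetical, and the code strings are placeholders rather than real terminology codes, which a site would need to look up in its chosen code system.

```python
# Placeholder identifiers (assumption): substitute real terminology codes.
PO2_CODES = {
    "arterial": "EXAMPLE-PO2-ARTERIAL",
    "venous": "EXAMPLE-PO2-VENOUS",
}


def map_po2(value_mmhg, specimen=None):
    """Map an oxygen partial pressure value, requiring the specimen source."""
    if specimen not in PO2_CODES:
        # The case report form did not say which one was meant, so the
        # mapping stops here instead of guessing.
        raise ValueError("specimen must be 'arterial' or 'venous'")
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {"coding": [{"code": PO2_CODES[specimen]}]},
        "valueQuantity": {"value": value_mmhg, "unit": "mm[Hg]"},
    }
```

Failing loudly at this point surfaces gaps in the case report form during mapping, before any data reach the shared infrastructure.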
Further, some institutions were unable to install a virtual FHIR server locally to convert the data variables into the standard FHIR format. To mitigate this, two institutions batch uploaded non-FHIR data in .txt format and one institution used a third-party platform hosted in Amazon Web Services to automatically format data in the FHIR standard and import data to the docker container internal to the institution.
Each institution had a choice to install a Microsoft or Linux operating system (OS). Using Linux simplified docker container installation steps; therefore, each institution opted to use the Linux OS for their virtual machines. However, docker installation was not a plug-and-play implementation as anticipated using Linux. (For example, the default limit on log generation in CentOS required alterations to be made at the OS level, resulting in additional troubleshooting and delay in deployment.)
Discussion
This technical evaluation was the first step toward exploring the feasibility of applying a proven finance industry approach to secure data sharing to healthcare data. The FDA identified the use of blockchain infrastructure as a potential solution to advance the United States' capability to rapidly collect data from multiple institutions during PHEs, when these data are needed most. This study demonstrated that a connection could be made to a blockchain infrastructure from within the networks of four academic medical centers; however, additional evaluation of local EHR readiness, regulation, and technology approach is needed before this solution can be considered ready for scale.
EHR Readiness
The survey indicated a general availability of electronic data for consumption but also highlighted that the FHIR format as a standard was not consistently used across institutions. To address this gap, an integration software solution may be required to ingest local data from a variety of internal sources and present it in a standard FHIR format to the FDA's RAPID blockchain infrastructure. Varying methods of connection to reduce the technical barrier to entry would enable sites with a variety of IT capabilities to join the blockchain.
Legal Impact and Human Research Subjects' Review
Data-sharing agreements outlining the terms of use of data leveraging a blockchain infrastructure will be needed for transfer of real patient data. Utility of multidirectional data sharing between institutions (i.e., one-to-many relationship), as opposed to unidirectional data sharing from institutions to FDA SDB (i.e., one-to-one relationship), needs to be discussed to advise on the creation of legal agreements. Prepositioning these agreements and IRB review of the clinical protocol prior to a PHE for prospective or retrospective studies are essential given the variance in time for review observed in our study (as long as two months). The legal agreements should mention the 45 CFR 164.512(b) portion of the HIPAA (Health Insurance Portability and Accountability Act) Privacy Rule to allow release of patient information to the FDA.
Technical Assessment
Although data transfer from healthcare institutions to the RAPID blockchain was successful, key areas of improvement were identified. The first area was the need for a graphical user interface–based installation procedure aimed at reducing the level of technical expertise required at individual sites to connect to the FDA's RAPID blockchain. Second, at two of the four sites, synthetic data were sent via nonproduction environments leveraging their research data infrastructures instead of the patient care infrastructure. In the event of a PHE, response would require real-time data sharing to government entities, necessitating the use of the clinical production environment. The use of synthetic data, as opposed to real patient data, combined with testing from varying infrastructures was a pragmatic and parsimonious approach; however, these factors still constitute significant limitations of the study.
Consideration of data source and which infrastructure will be used to gather data in a PHE is needed to advance the assessment of using blockchain technology for multisite data extraction. Moreover, institutional data security concerns regarding networking requirements need to be addressed through more networking and enterprise security discussions.
Conclusion
This study took a novel approach to addressing clinical data reporting during PHEs by using a large, federally funded, established clinical research network to test data security techniques that have been successful in other industries and apply them to healthcare. Thus, this pilot begins to address a key national strategic vulnerability during PHEs that was evident during the COVID-19 pandemic.1 Specifically, our approach demonstrated the potential for secure information transfer between health systems and government agencies using a blockchain-based protocol. The results of this technology evaluation are the foundation for developing a functional infrastructure that could be implemented prior to its need, in order to ensure preparedness for future PHEs.
Given the findings of this study, we recommend the following actions to improve resilience and emergency preparedness:
Annual preparedness exercises are indicated to evolve the technological approach described herein and expand the “warm base” of connected institutions (academic, community, and even military). For example, authentic patient data from multiple sites can be transferred into the FDA's SDB annually during the influenza season. Annual exercises such as these would improve responses to sporadic threats, such as a pandemic or regional disaster (e.g., annual hurricanes affecting the Gulf Coast and its medical infrastructure). This is essential to real-time data aggregation, analysis, and reporting in the setting of a PHE.
Execution of legal agreements needed to facilitate information exchange prior to a PHE would enable rapid data transfer immediately following an adverse event.
Expansion of testing is needed to broaden understanding of technical requirements and varying EHR and data models at multiple organizations.2 Testing across a larger number of academic, nonacademic, and community institutions with various EHRs would help to identify technical and legal barriers that may not have surfaced within this study.
Alternate technology infrastructure approaches (e.g., conventional data exchange using JSON [JavaScript Object Notation], common extraction using SQL technologies, or electronic data capture into a REDCap database from local EHRs) should be tested. Having multiple modalities for connection and data sharing would promote flexibility and reduce technology barriers for sites with varying IT infrastructure and expertise.
Analysis of the data postconnection was not included. Future efforts to advance this work should incorporate analysis of the data, data integration across sites, the potential need for a master patient index, authentication, and security and access to other institutions' data across the blockchain.
Disclaimers
The authors are solely responsible for the content of this article, which does not necessarily reflect the views, opinions, or policies of the FDA or Department of Health & Human Services. Mention of trade names, commercial products, or organizations does not imply endorsement by the U.S. government.
Some of the material described herein was presented in abstract form at the 49th Annual Critical Care Congress, Feb. 16–19, 2020, Orlando, FL (Crit Care Med. 2020;48[1]:131).
Funding
This work was funded in part by contracts from the FDA and Biomedical Advanced Research and Development Authority (HHSF223201400115C and HHSF223201810034C).
References
Author notes
Joan Brown, EdD, MBA, CCE, is an associate administrator of clinical operations business intelligence in the Keck Hospital at the University of Southern California in Los Angeles, CA. Email: joan.brown@med.usc.edu
Manas Bhatnagar, MS, is the director of analytics in the Department of Surgery at the Keck School of Medicine of the University of Southern California in Los Angeles, CA. Email: manas.bhatnagar@med.usc.edu
Hugh Gordon, MD, is the chief technology officer at Akido Labs in Los Angeles, CA. Email: hgordon@gmail.com
Karen Lutrick, PhD, is an assistant professor of family & community medicine in the College of Medicine at the University of Arizona in Tucson. Email: klutrick@arizona.edu
Jared Goodner is the chief product officer at Akido Labs in Los Angeles, CA. Email: jared@akidolabs.com
James Blum, MD, FCCM, is the chief medical information officer in the Department of Anesthesiology at the University of Iowa in Iowa City. Email: james-blum@uiowa.edu
Raquel Bartz, MD, is the division chief of critical care medicine in the Department of Anesthesia and Medicine at the Duke University School of Medicine in Durham, NC. Email: raquel.bartz@duke.edu
Daniel Uslan, MD, MBA, is the clinical chief and a clinical professor in the David Geffen School of Medicine at the University of California Los Angeles in Los Angeles, CA. Email: duslan@mednet.ucla.edu
Ernesto David-DiMarino, MS, is the head of enterprise applications and data at Cortica Advanced Therapies for Autism and Neurodevelopment in Los Angeles, CA. Email: edimarino@corticacare.com
Alfred Sorbello, DO, MPH, is a medical officer in the Office of Translational Sciences at the Center for Drug Evaluation and Research of the Food and Drug Administration in Silver Spring, MD. Email: alfred.sorbello@fda.hhs.gov
Gregory Jackson is a program management officer in the Office of Translational Sciences at the Center for Drug Evaluation and Research of the Food and Drug Administration in Silver Spring, MD. Email: gregory.jackson@fda.hhs.gov
Jeremy Walsh is a chief technologist in the Strategic Innovation Group at Booz Allen Hamilton in McLean, VA. Email: walsh_jeremy@bah.com
Lauren Neal, PhD, is the vice president of Strategic Innovation Group at Booz Allen Hamilton in McLean, VA. Email: neal_lauren@bah.com
Marek Cyran is a chief technologist in the Strategic Innovation Group at Booz Allen Hamilton in McLean, VA. Email: cyran_marek2@bah.com
Henry Francis, MD, is an associate director for data mining and informatics evaluation and research in the Office of Translational Sciences at the Center for Drug Evaluation and Research of the Food and Drug Administration in Silver Spring, MD. Email: henry.francis@fda.hhs.gov
J. Perren Cobb, MD, FACS, FCCM, is the director of surgical critical care, a professor, and a clinical scholar in the Departments of Surgery and of Anesthesiology at Keck School of Medicine of the University of Southern California in Los Angeles, CA. Email: jpcobb@med.usc.edu