There is an old joke about whether artificial intelligence (AI) will ever be able to overcome natural stupidity. Yet, I would say that it is time to stop joking about AI and start taking it seriously—for all its good (the benefits it provides) and bad (the unique set of new threats we need to learn to manage). Now, I don't want to write a book on the topic (and neither should I), so this discussion will be narrowed to the use of AI in healthcare and cybersecurity.
Specifically, we will look at three topic areas and how AI is used (1) to improve healthcare, (2) to aid in cybersecurity, and (3) by malicious actors to develop new attack techniques.
I will not dive into the specifics of machine learning (ML) and AI technology, but for those seeking additional information, an excellent overview of the use of ML and AI in healthcare was provided in a recent HIStalk blog.1 (Reference 1 cites the most recent blog post, which also contains links to all previous articles in the series.)
First Off, Some Definitions
Some confusion and, granted, some actual overlap commonly exist between what is typically referred to as ML and AI. We refer to ML as an algorithm's capability to learn without being explicitly programmed, whereas AI goes beyond that by providing capabilities of perception, decision making, and autonomy (i.e., the capability of a machine to imitate intelligent human behavior).
In a sense, ML can be considered a subset of AI, meaning that the majority of AI systems use aspects of ML to address the need to obtain knowledge (i.e., the “learning”), but ML systems do not provide AI's advanced features of acting based on sensing, reasoning, and adaptation. I would venture that from a marketing perspective, the preference would be to call it all AI—it just sounds so much more futuristic.
ML and AI are used in areas that share common characteristics and provide certain benefits:
Enable rapid and reliable analysis of large and complex data sets
Derive results from data with no apparent logical or discernable pattern
Allow for the use of all available data elements, regardless of whether they seem relevant
In other words, ML/AI enables us to progress in areas where we have reached the limitations of human capability and capacity, providing reliable and tireless analysis of vast and complex data sets without making any assumptions about rules and logic. ML/AI also has proven useful where traditional software algorithms fail because they were not able to predictively define the correct logic path at the outset.
Examples can be found in all areas where large amounts of data require detailed analytics or offer the opportunity for new knowledge, including clustering, classification, regression, extraction and prediction, anomaly detection, and pattern detection (e.g., in the areas of image analysis and speech recognition).
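To make one of these task types concrete, here is a minimal clustering sketch in pure Python: a toy two-cluster k-means on one-dimensional data, with made-up numbers. Real systems use far richer features and dedicated libraries, but the principle of grouping data without predefined rules is the same.

```python
# A toy k-means clustering (k=2) on one-dimensional data: the algorithm
# groups the points without being told what the groups mean.
def kmeans_1d(points, iters=20):
    centroids = [min(points), max(points)]  # naive initialization for k=2
    clusters = [[], []]
    for _ in range(iters):
        clusters = [[], []]
        for p in points:
            # assign each point to its nearest centroid
            idx = min((0, 1), key=lambda i: abs(p - centroids[i]))
            clusters[idx].append(p)
        # move each centroid to the mean of its assigned points
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

data = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
centroids, clusters = kmeans_1d(data)
print(sorted(round(c, 1) for c in centroids))  # prints [1.0, 9.1]
```

No one told the algorithm that the data contain two groups centered near 1 and 9; it discovered that structure from the data alone.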
ML/AI has proven useful in criminology,2 astronomy,3 finance and investing,4 traffic management,5 and air traffic control.6 Algorithms have even composed music7 and written poetry8—they may not have produced Shakespearean-quality sonnets, but neither have you or I.
And certainly, many of these characteristics and benefits apply to cybersecurity as well as healthcare. Lots of data—check; complex data sets—check; need to derive results where humans fail to see patterns—check; ability to overcome limitations of human attention span—check.
Another interesting aspect of ML/AI systems is that we can no longer separate the algorithm from the data because the data have become part of the algorithm. This will pose a number of challenges. We are used to managing algorithms (as represented by software code) via versioning, and it typically can be assumed that two instances of a software system with the same version will produce identical results. However, when I deploy two identical ML/AI systems to two different hospitals with a different patient population and different users, after a while they will no longer be the same. Is one now better than the other or are both good—but just tuned differently to match their respective environments?
Finally, ML/AI systems also raise a number of regulatory and compliance questions. For example, under several regulations, such as GDPR (General Data Protection Regulation), a customer's data should be removed from a system if it is no longer needed or if the customer requests removal. Now, with an ML/AI system, how do I back out a specific user's data that have become part of the algorithm? (How do I unlearn that particular part of the system?)
AI in Healthcare
ML/AI already has demonstrated benefits in areas of healthcare, including diagnosis, image processing, precision treatment, virtual assistance, drug creation, and cost reduction.9 According to a HIMSS Analytics study from 2017,10 the greatest area of interest for AI in healthcare is population health (24%), followed by clinical decision support and diagnosis (both 20%), and precision medicine (14%). Less opportunity exists for benefiting hospital/doctor workflow (8%), security (6%), revenue cycle (2%), and drug discovery (1%), according to HIMSS Analytics. All of these areas share a common theme: They require analysis of large amounts of data, and that analysis needs to occur with little understanding of the results or patterns we are looking for.
Areas where ML/AI has proven itself useful, not surprisingly, are image interpretation and improved speech recognition. Again, these areas require making sense of large and complex data sets. Several leading hospitals have begun using AI to support diagnosis, as well as to enable predictive analysis for complex situations (e.g., disease outbreaks, propagation).
Will AI be able to replace doctors and nurses? Maybe in some areas, but we can certainly say that it will very much complement humans (e.g., with the reading of radiology images, where it can be used to provide a second opinion). AI also can help overcome human limitations such as attention span, fatigue, distraction, or limitation in the amount of data we can ingest and analyze.
We certainly have a ways to go. Any new technology involves growing pains, and we will need to gather some experience and expend some effort to make it right. Many companies and institutions in the United States and abroad are making serious investments.11 Most recently, the U.K. National Health Service announced a £250-million boost for the use of AI in health services.12 But not all early efforts will succeed. For example, IBM's vision to use Watson to revolutionize cancer diagnosis has thus far not been fulfilled.13 It will, eventually, I am sure, but not yet.
AI/ML-based algorithms in healthcare also create a challenge from a regulatory perspective. Any medical device, including software-based medical devices (Software as a Medical Device [SaMD]), requires some degree of regulatory controls and/or premarket approval for initial release and for product changes. The Food and Drug Administration (FDA) recently proposed a regulatory framework for modifications to AI/ML-based SaMD.14
The challenge is that the traditional 510(k) modifications guidance focuses on the risk to users/patients resulting from a software change. For example, a software change may require resubmission when it introduces a new risk, changes risk controls, or involves a change that significantly affects clinical functionality or performance. In that sense, an AI/ML-based device would require premarket submission when the AI/ML software's ongoing learning process significantly affects performance, safety, or effectiveness, changes the device's intended use, or introduces a major change to the algorithm.
This would, obviously, create challenges with any self-modifying algorithm, as is the case with AI/ML systems. The FDA already has approved several AI/ML-based devices, but these have only included algorithms that are “locked” prior to marketing (i.e., the learning happens before market release). Under today's regulatory frameworks, FDA premarket review likely would be required for algorithm changes beyond the original market authorization.
As the FDA points out, the challenge now is creating an updated regulatory approval process that allows continuously learning AI/ML systems to benefit patients while providing reasonable assurance of safety and effectiveness. Following initial market release, these types of continuously learning and adaptive AI/ML algorithms may provide a different output compared with the output that initially received regulatory clearance.
The proposed novel process would be based on a total product life cycle regulatory approach, which includes establishing clear expectations on quality systems and good ML practices; conducting review for SaMD technologies that require premarket submission; establishing expectations that manufacturers will monitor the AI/ML device; and enabling increased transparency among users and the FDA through postmarket performance reporting to ensure continued safety and effectiveness. Such an approach would include considerations around the healthcare situation and condition to which the algorithm would be applied, as well as the significance of the information provided (e.g., does it drive or provide input to treatment, diagnosis, or management?). More information and helpful links can be found on the FDA's “Artificial Intelligence and Machine Learning in Software as a Medical Device” webpage.15
AI in Cybersecurity
Applying AI in cybersecurity involves considerations similar to those for AI's use in healthcare. For example, cybersecurity also involves large, complex data sets that require continual attention and that may contain hidden meaning that is not discernable to externally applied logic. ML and AI can help, especially in today's world where cybersecurity talent is sparse and expensive.
A classic example, and an area where ML has taken an early foothold, is the analysis of security events and log files. Vast amounts of data requiring continual attention and real-time correlation of events across complex data sources and networks are needed for such analyses. In a sense, ML can augment human talent and extend our speed and reach, perform boring tasks reliably (e.g., log analysis), and provide the same attention and reliability, regardless of whether it is 2 p.m. or 2 a.m.
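As a simple illustration of this kind of log analytics, the sketch below flags hours whose login-failure counts deviate sharply from a learned baseline. The counts and the three-standard-deviation threshold are hypothetical, and production systems use far more sophisticated models, but the tireless, around-the-clock nature of the analysis is exactly the point.

```python
import statistics

# Learn what "normal" hourly login-failure counts look like from history,
# then flag hours whose counts deviate strongly from that baseline
# (a simple statistical anomaly detector over log tallies).
def find_anomalies(history, current, threshold=3.0):
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return [(hour, count) for hour, count in current
            if abs(count - mean) / stdev > threshold]

baseline = [12, 9, 11, 14, 10, 13, 12, 11, 10, 12]     # typical failures/hour
today = [("01:00", 11), ("02:00", 13), ("03:00", 250)]  # hypothetical tallies
print(find_anomalies(baseline, today))  # prints [('03:00', 250)]
```

The detector works the same way at 2 p.m. or 2 a.m., which is precisely what makes this class of task a natural first target for ML.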
A second area where ML/AI already is being used is threat analysis. Today's threat landscape changes continually and confronts us with vast numbers of emerging exploits that are difficult for humans to analyze in a timely manner. ML/AI can help to reduce these challenges, as well as cover the spectrum from detecting new malware and threats to picking up slight variations in malware and behavior.
For the most part, today's systems continue to fall under the category of supervised or trained ML, but we are on the verge of systems that will be able to deliver unsupervised ML. This means that we are about to leave the realm of systems that need to be trained to learn and will introduce systems that “learn to learn” on their own. Another imminent next step is to move from single, specific tasks (e.g., log or threat analysis) to more complex tasks (e.g., system-wide detection, remediation, and mitigation).
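The supervised/unsupervised distinction can be illustrated with a toy classifier: a supervised model must be trained on labeled examples before it can classify new inputs, whereas an unsupervised system would have to discover the groupings on its own (much as a clustering algorithm does). The feature and values below are hypothetical.

```python
# Supervised ML in miniature: the model is trained on labeled examples and
# then classifies new, unseen inputs. An unsupervised system would instead
# have to discover the "benign"/"malware" grouping without any labels.
# Hypothetical feature: fraction of high-entropy bytes in a file.
def train(examples):
    """Compute one mean feature value (centroid) per label."""
    sums, counts = {}, {}
    for value, label in examples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(model, value):
    """Assign the label whose centroid is nearest to the input."""
    return min(model, key=lambda label: abs(value - model[label]))

model = train([(0.2, "benign"), (0.3, "benign"),
               (0.8, "malware"), (0.9, "malware")])
print(predict(model, 0.75))  # prints malware
```

Everything this model "knows" came from the labels a human supplied; removing that dependency is what the move to unsupervised systems is about.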
The general objective of using ML/AI in cybersecurity is to improve intelligence, detection, response, and recovery—to become faster and more reliable. Some success has already been demonstrated, as these systems performed well in defending against WannaCry and NotPetya.
AI as a Tool for the Bad Guys
Adversaries also have recognized the opportunity inherent in these new technologies. Their use of ML/AI falls into three main categories: (1) improving attacks and techniques, (2) circumventing or manipulating defensive ML/AI systems, and (3) using ML/AI for novel purposes.
For example, ML/AI can be used to develop more sophisticated attacks and to discover new vulnerabilities faster (including zero-days; i.e., software vulnerabilities that are discovered and used by an adversary before a patch or remediation has been made available) and rapidly craft new exploits targeting them. Adversaries also are aware of our use of ML/AI and are developing attack techniques for the purpose of (1) model extraction (i.e., crafting attacks that allow them to analyze and then circumvent our defenses) or (2) poisoning (i.e., feeding our systems with specific input data so as to create bias and steer the systems in the wrong direction).
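A toy sketch of the poisoning idea, using a deliberately naive learned threshold and hypothetical traffic rates, shows how mislabeled training data can shift a detector's decision boundary in the attacker's favor:

```python
# Illustration of training-data poisoning: an attacker feeds mislabeled
# "benign" samples into a simple learned detector so that real attacks
# slip under its decision threshold. Hypothetical feature: requests/second.
def learn_cutoff(samples):
    benign = [x for x, label in samples if label == "benign"]
    malicious = [x for x, label in samples if label == "malicious"]
    # place the cutoff midway between the highest benign
    # and the lowest malicious training value
    return (max(benign) + min(malicious)) / 2

clean = [(10, "benign"), (15, "benign"), (200, "malicious"), (220, "malicious")]
print(learn_cutoff(clean))        # prints 107.5: traffic at 150 req/s is flagged

# The attacker gradually injects fake "benign" samples at ever-higher rates...
poisoned = clean + [(120, "benign"), (140, "benign"), (160, "benign")]
print(learn_cutoff(poisoned))     # prints 180.0: 150 req/s now evades detection
```

Real detectors are far more robust than a single midpoint threshold, but the principle scales: if an adversary can influence what a system learns from, the adversary can steer what it concludes.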
Examples of the malicious use of ML/AI have been demonstrated by security researchers. One example that is not far-fetched in the least is the creation of a much more realistic political disinformation campaign through fake news, social media posts, or even so-called deepfake16 photos and videos that are impossible (or at least quite difficult) to distinguish from the originals. This is already possible with today's systems, as was demonstrated by presenting a (fake, with permission) live video of Tom Perez, chair of the Democratic National Committee, at a recent cybersecurity conference.17
Obviously, the potential implications are significant. Not only will it be difficult for the viewer to differentiate between a real and a fake video/photo/audio recording, it will be equally difficult for the affected individual to prove that the item is not real. We have seen the impact of fake social media posts and other cyberattacks on the 2016 U.S. presidential election and situations abroad.18 Things will become more difficult going forward, especially because many of the nation states with highly sophisticated cyber capabilities are not necessarily friends of the United States.
These attacks will not only harm high-visibility personalities like politicians or celebrities. Unfortunately, we already have examples of women having to fight for their reputation because somebody superimposed their face (or paid to have it done) onto actors in deepfake porn videos.19
How close are ML/AI-based security attacks to moving from the concept phase to being legitimate day-to-day concerns? Apparently, pretty close. For many years, we have been dealing with so-called business email compromise (BEC),20 which essentially is the use of online ploys (including but no longer limited to email) to trick representatives of organizations and companies into performing financial transactions in favor of cyber criminals or to enable an attacker to breach a network for other purposes.
Most commonly, the attacker researches an organization and then contacts specific employees with targeted emails (or other forms of messaging) that pretend to represent a senior executive (e.g., chief executive officer, chief financial officer). The email provides instructions on behalf of the senior executive regarding, for example, a wire transfer, other forms of payments, or release of sensitive data. These emails often are well crafted and use advanced social engineering techniques to accomplish their goal. The Federal Bureau of Investigation estimated that the global financial impact of these types of attacks totaled more than $3 billion in 2017 and is expected to rise to $20 billion by 2020.
Returning to AI, Symantec recently provided evidence of three cases in which deepfake audio clips were used to imitate a senior executive's voice in a phone call supporting a BEC scam. Although this sounds futuristic, unfortunately, the future is here.21
As far as healthcare is concerned, we have recently learned that security researchers have been able to use an AI-based approach to manipulate computed tomography scans by adding or removing cancerous nodules. The resulting images were so convincing that the majority of radiologists who viewed them considered them real, reading benign images as pathological and vice versa.22 The opportunity is obvious. An attack using such technology could target individual patients for a variety of purposes, such as ending a politician's or high-visibility executive's career. Scary times, indeed.
AI-based attacks are complex, require skill and computational power, and are expensive to develop. But as we have seen in the past, commoditization of new attack techniques happens quite quickly.
In a previous CyberInsights article, I discussed the shift in cybersecurity priorities we have been observing from the focus on confidentiality (of information such as health data) toward concerns about availability (of systems and, for example, provided health services).23 We are only now starting to become concerned about information integrity, and with the use of AI, that risk will only increase. As the deepfake examples discussed above illustrate, this is especially troubling because AI-driven integrity manipulation is so convincing that it will be difficult to know, or prove, that something is fake.
ML and AI tools will enable highly targeted and fine-tuned social media misinformation campaigns. Some security researchers are concerned that after we have entered the realm of not being able to tell facts from fakes, or provide evidence of what is true or false, “reality apathy” will set in and we will enter an information crisis. Some are even going as far as calling it a looming information apocalypse.24
Most recently, in an op-ed in the Washington Post, Sen. Ben Sasse (R-NE) wrote, “I spoke recently with one of the most senior U.S. intelligence officials, who told me that many leaders in his community think we're on the verge of a deep-fakes ‘perfect storm.’”25 In Sasse's characterization, this perfect storm is the confluence of easy-to-use technology, hostile foreign governments, and an American electorate bitterly at odds with itself.
As a result, we may no longer be able to recognize the real as real and the fake as fake. Citizens will start to distrust politicians (even more) and the press and will retreat into their preformed opinions as they stop paying attention and fail in their obligation to stay informed—meaning a very basic requirement for a democracy to function is at risk of failing. Returning to healthcare, if patients can't trust their doctors, or doctors can't trust the data they are looking at (be it real or fake; doubt alone is a sufficient disruptor), we have a public health problem of unforeseen proportions.
These concerns are exacerbated by our cyber resilience being far from where it should be and the fact that we remain massively underprepared for a serious cyberattack.26 In addition, business and political interests may be preventing us from seriously tackling these cybersecurity challenges as quickly and as aggressively as we should.27
In a sense, we should be concerned that the Terminator wars of the future will not be wars among robots but conflicts of computers waging information warfare.