Natural Language Processing (NLP) in healthcare offers clinicians and patients countless opportunities to improve care. This post lists a few of those opportunities within WellSky. To show how we leverage our rich sources of corpora, I will summarize some of the algorithms we use and describe specific use cases within WellSky Home. However, there are caveats to using NLP in healthcare, such as legal and policy barriers, noisy data, and implicit inequality biases, and I will briefly describe ways of handling them.
Intro
NLP has made many headlines recently. Even non-practitioners of machine learning have heard about algorithms like generative pre-trained transformers (GPT) and their impressive capabilities1. In this blog post, we will show how NLP can empower our clinicians to make better decisions and provide better care. Yet despite these advances, even the latest models can complicate the care delivery process: they can perpetuate inequality, miss important information, and propose inadvisable actions. At WellSky, these risks are the main reasons why we carefully investigate NLP models before adoption.
Rule-Based NLP
NLP as a field of study started in the 1950s, around the time artificial intelligence (AI) emerged as a field2. In essence, it is a subfield of computer science blended with other fields of study: linguistics, cognitive science, psychology, philosophy, and mathematical logic. There are three types of NLP: rule-based, statistical, and neural NLP3. The earliest phase of NLP comprised rule-based algorithms. Partly due to the limited computing power of the time, these methods could not do much, but techniques such as pattern matching and parsing are still widely used. They are easy to interpret and can be quite accurate if knowledgeable practitioners establish the proper rules. However, language is flexible and complex; it is impossible to enumerate all the rules and their exceptions. Despite this, rule-based NLP does a good job capturing most of the information, and we use it for data quality purposes. Certain fields in our EHRs, such as temperature and blood pressure, are entered and stored as text, giving our users a flexible documentation platform. Conceptually, these fields represent numerical values, but unrelated text often finds its way in. Parsing and pattern matching serve as a quick first line of defense for cleaning up this noisy data.
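As a concrete sketch, a rule-based cleanup of such free-text vitals fields might look like the following. The field formats, function names, and sanity ranges here are illustrative assumptions, not WellSky's actual rules:

```python
import re

# Hypothetical free-text EHR values; the example strings are invented.
def parse_blood_pressure(raw: str):
    """Extract a systolic/diastolic pair like '120/80' from noisy text."""
    match = re.search(r"(\d{2,3})\s*/\s*(\d{2,3})", raw)
    if match:
        return int(match.group(1)), int(match.group(2))
    return None  # flag for manual review rather than guessing

def parse_temperature(raw: str):
    """Extract a plausible body temperature (Fahrenheit) from noisy text."""
    match = re.search(r"(\d{2,3}(?:\.\d)?)", raw)
    if match:
        value = float(match.group(1))
        if 90.0 <= value <= 110.0:  # simple sanity-range rule
            return value
    return None

print(parse_blood_pressure("BP 120/80 sitting, left arm"))  # (120, 80)
print(parse_temperature("temp 98.6 F oral"))                # 98.6
print(parse_temperature("refused"))                         # None
```

Note that when no pattern matches, the functions return `None` instead of guessing, which lets downstream data-quality checks route the record for review.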
Statistical NLP
Statistical NLP is another approach. Developed since the 1980s as an advance over the rule-based approach, it is considerably more sophisticated in both development and usage. With statistical and machine learning algorithms, statistical NLP can extract features on its own, removing the need for explicit direction from previously established rules. These features are typically fed as inputs into other machine learning models. The approach still requires domain expertise and analysis, but it is more hands-off. One of the simplest models is known as bag-of-words (BoW).4 Given a collection of text documents, also known as a corpus, we create our vocabulary by extracting all the words that appear throughout the corpus. Then, for each document, we count the number of times each word in the vocabulary appears. By the end of this process, each document of words is replaced with a vector of numbers.5 This is necessary because models cannot consume raw text; they extract patterns from numbers.
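The counting process above can be sketched in a few lines of Python; the two-note toy corpus is invented for illustration:

```python
from collections import Counter

# Toy corpus of two invented clinical notes.
corpus = [
    "patient reports mild chest pain",
    "no chest pain reported today",
]

# Build the vocabulary from every word that appears in the corpus.
vocabulary = sorted({word for doc in corpus for word in doc.split()})

def bag_of_words(doc: str) -> list[int]:
    """Represent one document as a vector of per-word counts."""
    counts = Counter(doc.split())
    return [counts[word] for word in vocabulary]

vectors = [bag_of_words(doc) for doc in corpus]
print(vocabulary)
print(vectors)
```

Each note is now a fixed-length numeric vector, ready to be fed into a downstream model. Real pipelines would also lowercase, strip punctuation, and handle tokenization more carefully.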
Much of our data comes from EHRs, like our home health, hospice, and home care solutions, where a variety of health professionals jot down their notes and thoughts about a patient's or client's care. From those text fields, we can use bag-of-words and other, more advanced models to extract comorbidities, similar to what was done by Dess et al. (2021)6.
Neural NLP
Since 2013, much of the advancement in NLP has come from deep learning. Neural NLP, the most advanced approach, comprises many different neural network architectures: feed-forward, convolutional, recurrent, and transformer architectures7. In the current age of NLP, the most discussed and successful models are transformers, with BERT and GPT coming to mind. These models are successful for many reasons: thoughtful design, scrupulous training, gigabytes of data, and the indispensable computing power of modern machines. This comes at a cost, though: GPT-4 reportedly cost over $100 million to build8. Thankfully, we can all benefit from the gains these models have made. WellSky now has a gamut of options for utilizing our data, including patient embeddings, visit-level embeddings, de-identification, summarization, and more.9
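To illustrate what embeddings buy us: once patients or visit notes are mapped to vectors, "how similar are these two notes?" reduces to vector math such as cosine similarity. The four-dimensional vectors below are invented toys; real transformer embeddings have hundreds of dimensions:

```python
import math

# Toy "visit note" embeddings, invented for illustration only.
note_a = [0.9, 0.1, 0.4, 0.0]   # e.g., a note about wound care
note_b = [0.8, 0.2, 0.5, 0.1]   # a similar wound-care note
note_c = [0.0, 0.9, 0.1, 0.8]   # an unrelated note

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

print(round(cosine_similarity(note_a, note_b), 3))  # close to 1.0
print(round(cosine_similarity(note_a, note_c), 3))  # much lower
```

The same arithmetic underlies tasks like finding similar patients or retrieving related documentation, just at much higher dimension.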
Noisy Data
Even though I commended the performance of these latest models, using them is not “plug and play.” We have a duty to our clients to give them the best solutions possible, because if our solutions are not high quality, then neither is the care of their patients. If a model’s performance suffers, it is likely due to messy data and the complexities of language. Messiness can come from misspellings, medical jargon, shorthand, and abbreviations. An external or internal documentation standard, or an agency QA department, lessens these issues, but because our data comes from all over the healthcare space, they never disappear entirely. Unfortunately, they can lead to costly mistakes: a misspelling like “hyptension” can mean either “hypertension” or “hypotension,” and each requires a different treatment.
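To see why such misspellings are risky to auto-correct, consider a naive spell-corrector based on edit distance. For "hyptension," the closest candidate is actually "hypotension" (one edit away, versus two for "hypertension"), so blind nearest-match correction could silently flip the clinical meaning. A small Levenshtein-distance sketch demonstrates this:

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

misspelling = "hyptension"
for candidate in ("hypertension", "hypotension"):
    print(candidate, edit_distance(misspelling, candidate))
# hypertension is 2 edits away; hypotension only 1
```

This is why correction of clinical text needs context (medications, vitals, diagnoses) rather than string distance alone.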
Another obstacle is the complexity of language itself. At times it is explicit; at other times, subtle. An explicit sentence such as “The cancer has worsened” is clearer than the more nuanced “The tumor mass has grown to other regions of the body.” To understand, we read and glean information from the context, naturally and without much thought. For computers, the tools of neural NLP are the best at extracting context. Off-the-shelf large language models (LLMs) require further training, since they are trained on all types of subjects but less so on health-specific texts due to privacy restrictions. Such a process is known as transfer learning.10
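A tiny example of why context matters: plain keyword matching treats "the tumor mass has grown" and "no evidence tumor has grown" identically. The sketch below adds a simplified, NegEx-style negation check; the trigger list and look-back window are illustrative assumptions, not a production rule set:

```python
# Simplified negation triggers; real rule sets (e.g., NegEx) are far larger.
NEGATION_TRIGGERS = {"no", "not", "denies", "without"}

def mentions_finding(note: str, finding: str) -> bool:
    """Keyword match that also checks for a negation word shortly before the finding."""
    tokens = note.lower().split()
    for i, token in enumerate(tokens):
        if token == finding:
            window = tokens[max(0, i - 4):i]  # look back up to four tokens
            if not NEGATION_TRIGGERS & set(window):
                return True
    return False

print(mentions_finding("the tumor mass has grown", "grown"))     # True
print(mentions_finding("no evidence tumor has grown", "grown"))  # False
```

Even this small rule breaks on longer sentences and double negatives, which is exactly where context-aware neural models earn their keep.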
Careful Usage
Once a model has been properly vetted, it is deployed into our solutions and used by our clinicians. Adoption is key for patient care and outcomes to improve; however, it is not guaranteed. A major impediment is interpretability. We do not encourage blanket trust in our models; they are decision-support tools. Thankfully, our clients think similarly: they rely on the interpretation of a model’s predictions and pair these tools with their own clinical experience and knowledge. Simple models are generally easy to interpret, but as complexity increases, this becomes more difficult.11 The field of explainable AI (XAI) has made many strides here, and we keep up with its advancements to encourage adoption.
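For a simple model like a linear classifier over bag-of-words features, interpretation can be as direct as reading the coefficients: which words in a note pushed the score up or down. The words and weights below are invented to illustrate the idea, not taken from any deployed model:

```python
# Hypothetical coefficients from a linear risk model over bag-of-words features.
# Positive weights push the risk score up; negative weights push it down.
coefficients = {
    "fall": 1.8, "dyspnea": 1.5, "confusion": 1.2,
    "stable": -0.9, "ambulatory": -0.6,
}

def explain(note: str, top_n: int = 2) -> list[str]:
    """Rank the words in a note by the size of their contribution to the score."""
    present = [w for w in note.lower().split() if w in coefficients]
    return sorted(present, key=lambda w: abs(coefficients[w]), reverse=True)[:top_n]

note = "patient had a fall and reports dyspnea but otherwise stable"
print(explain(note))  # the two most influential words in this note
```

For deep models, XAI techniques such as attention inspection or perturbation-based attribution aim to recover a similar "which inputs mattered" story.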
Challenges
The last challenge is a necessary one: security and safety. Our data handling follows the legal restrictions placed at the federal and state levels, as well as our individual client agreements. Some of these restrictions can limit the data available for training.
We uphold a level of integrity for the patient care our clients provide. Models learn from real-world data, and like all data generated in the world today, that data can carry injustices and biases. We must do our part to debias models so that all patients receive the best care equitably. This is one of the reasons why XAI is so important: if we are not aware of the biases present in our models, we cannot combat them to the degree that is needed.
Our federal government and certain state governments are aware of these limitations and are currently in the process of passing legislation to combat these issues. The Office of Science and Technology Policy has released a blueprint for the AI Bill of Rights12, whose main purpose is to protect the rights of Americans from the potentially deleterious use of artificial intelligence. Additionally, the Food and Drug Administration (FDA) has published a set of guidelines stating that certain machine learning models should be regulated like medical devices13. Additional oversight of NLP-powered models is inevitable, but this is needed to provide safe and effective solutions.
Conclusion
From the simplest to the most complex, NLP models each have their place in healthcare. Even small models can have a large impact, such as improved data quality and more accurate downstream models. With the more advanced NLP models, the auxiliary tasks of providing care become more efficient: clinicians and other healthcare professionals can document more quickly and streamline their workflows. These are just a few of the ways NLP can improve health outcomes for patients.