MedDRA and Natural Language Processing (NLP)

 

The Coding Process

[Figure: Chart of the coding process]

In the most general terms, "coding" is the process of analyzing unstructured text to extract certain features or characteristics of the text that are of particular interest and to assign each feature a standardized code that is independent of language, spelling, and grammatical variant. The resulting structured data, including term codes and standardized term text, can be used to search for similar texts or to perform apples-to-apples statistical analyses across large numbers of texts. Generally, a standardized terminology like MedDRA is used to code text for a set of features. MedDRA, for example, is used to code medical history, indications, events, lab tests, and cause of death in adverse event report narratives. It can also be used to code other types of text, such as clinical visit reports, drug labels, and social media posts, for the same or different features. Other terminologies, like WHODrug, can be used to code text features related to drug treatments, such as product formulation and dosage.

The Coding Pipeline

[Figure: MedDRA coding pipeline process chart]

If we take a closer look at the coding process, we see that there are two major sequential sub-processes: term extraction and term normalization.

"Term Extraction" refers to the process of identifying the specific parts of the text that characterize it for a particular purpose and the type of each feature. The text phrases of interest, sometimes referred to as "verbatim" terms, and their corresponding term types are the output of this first step in the process. For example, if an AE case narrative contained the text "The patient had previously experienced bouts of abdominal discomfort of unknown cause", the verbatim term "abdominal discomfort" with a term type of "medical history" might result from the term extraction process.

"Term Normalization" is the process of transforming verbatim terms into specific codes in a standardized terminology. In this case, the verbatim term "abdominal discomfort" has an exact match LLT in MedDRA so the code "10000059" would be assigned. In many cases, however, there will not be an exact match and the dictionary must be searched to find a similar standardized term. For example if the verbatim term had been "tummy upset", coding would require a means to find and then discriminate among several standard alternatives including "stomach upset", "GI discomfort", and "abdominal discomfort". This may be straight-forward but can require judgment based on the context of the narrative for resolution.

Traditional AER Narrative Coding

When a Marketing Authorization Holder (MAH) receives an adverse event report from any source, it must be coded in MedDRA. A medical professional (e.g. nurse, pharmacist) trained in AE case processing reads the case narrative and manually identifies all of the verbatim terms representing the patient's past medical history, indications for treatment, adverse events, lab tests and cause of death if applicable (i.e. term extraction). The list of verbatim terms and term types for each case must then be coded in MedDRA (i.e. term normalization). In smaller organizations, the same person may code the case, but larger organizations often transfer the verbatim terms electronically to a centralized coding group where expert coders perform that task.

Regardless of the specific process, manual coding of AE cases is a time-consuming process that requires the attention of highly trained medical professionals. It is costly and still prone to human error. As a result, organizations constantly seek ways to automate portions of the task to improve productivity and quality.

Common Coding Automation - "Autocoding"

The most common coding automation is a form of term normalization called "autocoding". Simply put, autocoding is the process of looking up verbatim terms in a database to find the correct MedDRA code. The first step in autocoding is to look up verbatim terms in MedDRA to see if there is an exact LLT match. If so, the term is coded automatically to that LLT. Because many AERs are generated by health care providers, the terms in the narrative are often correct medical terms that can be autocoded. It is common for 40-50% of verbatim terms to be autocoded with MedDRA alone.

A 40-50% productivity improvement is valuable but inadequate for companies processing thousands of AERs each day. Those companies may invest in their own "synonym" or "verbatim term assignment" (VTA) databases for further productivity gains. Previously coded AE terms are reviewed for accuracy and stored in a company database. If a verbatim term fails to autocode in MedDRA, it is then looked up in the VTA database to determine whether it has been coded before. If so, the verbatim term is autocoded to the code in the VTA database. In addition to the productivity boost from VTA autocoding, there is a quality improvement as well: the same verbatim term is always coded to the same MedDRA LLT, thereby eliminating intra-coder differences during manual coding.
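A minimal sketch of this two-stage autocoding, assuming invented stand-in tables for both the MedDRA LLTs and a company VTA database, might look like this:

```python
# Both tables are invented stand-ins: term text (lower-cased) -> MedDRA LLT code.
MEDDRA_LLTS = {"abdominal discomfort": "10000059"}
VTA_DATABASE = {"tummy upset": "10000059"}   # a previously reviewed company synonym

def autocode(verbatim: str):
    """Stage 1: exact match in MedDRA; stage 2: company VTA database; else manual."""
    key = verbatim.strip().lower()
    if key in MEDDRA_LLTS:
        return MEDDRA_LLTS[key], "autocoded (MedDRA exact match)"
    if key in VTA_DATABASE:
        return VTA_DATABASE[key], "autocoded (VTA database)"
    return None, "route to manual coding"

for v in ["Abdominal discomfort", "tummy upset", "funny feeling in stomach"]:
    print(v, "->", autocode(v))
```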

Use of a VTA database in conjunction with exact matching in MedDRA can autocode more than 80% of verbatim terms in many cases. However, it requires a significant investment to build and maintain the VTA database, which must be created by each company for its own use. There is no open source or commercial VTA database available. In a recent MSSO Blue Ribbon Panel on MedDRA and Information Technology, the panelists and audience identified creation of a VTA or synonym database by the MSSO as a valuable contribution for it to consider. It should be noted that even with the VTA list, verbatim terms still must be extracted manually from case narratives, and on the order of 20% of those terms must be coded manually.

Natural Language Processing (NLP) Technology and AER Coding

Ideally, it would not be necessary to manually extract terms from case narratives, maintain VTA lists, or manually code any terms. Instead, information technology would be used to completely automate the coding process. This is the holy grail for AER automation and the focus of research in the field of Natural Language Processing or NLP.

According to Wikipedia, "Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. Challenges in natural language processing frequently involve speech recognition, natural language understanding, and natural-language generation." The NLP we are discussing falls into the category of natural language understanding. We want the computer to read and understand an adverse event narrative and extract certain features from it, just as a person would.

NLP is an exceedingly difficult computational task because the human brain does not process language in the linear manner of most computer programs. The human brain is built for pattern matching and resolution of ambiguous input, which allows us to handle slang, misspellings, poor grammar, contextual meaning and negation easily. Computers, not so much. Early NLP efforts focused on rule-based evaluation, followed by statistical analysis. More recently, artificial intelligence, machine learning and neural nets have been applied to the problem with some success.

Most of the NLP research in adverse event extraction and normalization has focused on analyzing patient posts in social media to detect potential adverse drug reactions. Regulatory agencies like the FDA and EMA have been intensely interested over the past several years in the possibility of "sentinel" systems based on social media for early AE detection, which has sparked the research interest. The results have fallen short of what was once hoped. The state of the art is best described by the results of a recent competitive challenge among 19 research groups around the world to determine the best algorithm for AE extraction and normalization in social media[i]. The latest technology could only achieve a term extraction efficiency (F1-score) of 65% and a term normalization efficiency of 43%. The conclusion was that computers can assist humans with evaluating social media, but full human review is still required.
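For readers unfamiliar with the metric, the F1-score is simply the harmonic mean of precision and recall; the values below are illustrative and not taken from the challenge paper:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Illustrative values only: a system finding 70% of true AE terms (recall)
# with 61% of its extractions correct (precision) lands near an F1 of 0.65.
print(round(f1_score(0.61, 0.70), 2))   # 0.65
```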

There have been few published studies looking at term extraction/normalization efficiency in adverse event report narratives as opposed to social media. One study from the University of Verona suggests that much higher coding efficiency is possible in AER narratives than in social media[ii]. This is not surprising. Social media posts are very casual and rife with slang, misspellings, informal abbreviations and emoticons. AER narratives, on the other hand, are often produced by medical professionals, which simplifies the term extraction/normalization task significantly. A spin-off company from the University of Verona, MedBrains, has commercialized the narrative coding software as MagiCoder Pro. On their website, https://medbrains.it, they claim 90% coding efficiency. A more careful look at their paper, however, is less encouraging. While they were able to achieve nearly 90% coding efficiency in very short narratives containing only MedDRA terms, their efficiency in coding longer narratives that contain vernacular terms or misspellings is no better than the results for social media.

Natural Language Search and MedDRA Coding

[Figure: Natural language search in MedDRA]

While there has only been limited success to date with NLP autocoding, NLP does play an important role in one aspect of MedDRA: searching.

Whether coding a term, building a surveillance query, or working for another reason, MedDRA users often need to search MedDRA to find a particular term. One aspect of NLP – natural language search – offers many benefits to help users find the best term quickly and accurately. This is one of the many advanced features of our MedDRA coding tool, Mtools.

Google® is the best example of a natural language search engine.  When you type a phrase into Google, it instantly provides you with a list of web pages that may be relevant to your inquiry.  Unlike traditional database queries, Google easily handles plurals, stems (e.g. “…es”, “…ing”), minor misspellings, word order differences and even synonyms to help you find what you want.  MedDRA natural language search engines are like that except that they search for relevant MedDRA terms instead of web pages.

The diagram above describes the process.  The first step is building a search index.  The search engine reads all the term text in MedDRA and creates a word index linking each word to its corresponding MedDRA term.  Then, when you issue a search query, it looks up each of the words in the query and finds the terms containing the same words.  The lookup is very sophisticated so that related terms – not just exact matches – are found.  The results are presented as a list of MedDRA terms sorted in “relevance” order to help find the best match quickly.
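The toy sketch below illustrates the general idea of a word index and relevance ranking; the terms, tokenization, and scoring are deliberately simplified assumptions, not how Mtools or any particular MedDRA search engine is actually implemented:

```python
from collections import defaultdict

# A handful of invented example terms standing in for the full MedDRA term list.
TERMS = ["Abdominal discomfort", "Abdominal pain", "Discomfort chest", "Headache"]

# Step 1: build the word index (word -> set of terms containing that word).
index = defaultdict(set)
for term in TERMS:
    for word in term.lower().split():
        index[word].add(term)

# Step 2: look up each query word and rank terms by how many query words they share.
def search(query: str):
    scores = defaultdict(int)
    for word in query.lower().split():
        for term in index.get(word, ()):
            scores[term] += 1
    return sorted(scores, key=scores.get, reverse=True)  # crude "relevance" order

print(search("abdominal discomfort"))
```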

Future Directions

There have been some notable advances in NLP in the past few years, most prominently Google's open-source release of Bidirectional Encoder Representations from Transformers (BERT). The brainiacs at Google developed a generalized language-understanding neural network that can be trained for named entity recognition in particular domains. Training BERT to identify MedDRA terms in AER narratives offers an interesting avenue for semi-automated term extraction and normalization. We're trying to obtain a corpus of annotated narratives from FAERS to begin investigating this possibility and would love to collaborate with any interested parties.
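As a rough sketch of what such an experiment could look like, the open-source Hugging Face transformers library can load a BERT-style model for token classification (named entity recognition); the model name below is only a placeholder, since a useful system would need a model fine-tuned on annotated AER narratives, which is exactly the training data we are seeking:

```python
# Sketch only: assumes the open-source "transformers" package is installed.
from transformers import pipeline

# Placeholder model: bert-base-cased has NOT been fine-tuned for MedDRA term
# recognition, so its labels are meaningless here; it only shows the API shape.
ner = pipeline("token-classification",
               model="bert-base-cased",
               aggregation_strategy="simple")

narrative = ("The patient had previously experienced bouts of "
             "abdominal discomfort of unknown cause.")

for entity in ner(narrative):
    print(entity["word"], entity["entity_group"], round(float(entity["score"]), 2))
```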

 

[i] Weissenbacher, et al. Overview of the Fourth Social Media Mining for Health (#SMM4H) Shared Task at ACL 2019. Proceedings of the Fourth Social Media Mining for Health Applications Workshop & Shared Task, pages 21-30, Florence, Italy, August 2, 2019. Association for Computational Linguistics.

[ii] Combi, et al. From narrative descriptions to MedDRA: automagically encoding adverse drug reactions. Journal of Biomedical Informatics 84 (2018): 184-199.
