Safety Analysis of Medicinal Drug Product Use: Implications for a Standard Drug Dictionary

The temporal association of an adverse event following exposure to one or more medicinal drug product(s) naturally raises questions of causality. Did the drug or a combination of drugs cause the adverse event? The determination of causality is a complex analysis and beyond the scope of this post, but suffice it to say that having detailed information about the drug product(s) involved in the exposure is a critical component of the analysis. One relies on two major data sources to conduct these analyses: clinical trials information and post-marketing safety information (i.e., from both passive and active surveillance). Also relevant is information from other regulatory agencies and the published literature.

When the suspected drug is the investigational drug in a trial, sufficient information about the investigational drug product is typically available and the analysis can be fairly straightforward. However, because study subjects typically are on multiple medications, one also examines concomitant medication use. The analysis of concomitant medication (CM) information in clinical trials is valuable not only to help identify drug-related safety signals that may not be associated with the investigational drug, but also to help identify clinically significant drug-drug interactions with the investigational drug. CM information can be particularly useful since clinical trial data provide reliable denominator counts to calculate incidences. Furthermore, since many trials are controlled, the data permit comparisons of incidences with placebo or another control. Analysis, however, remains challenging because CM information is often incomplete and/or insufficiently standardized. Because drug product information is complex, being made up of various component concepts such as the proprietary name, active moiety, active ingredient, strength, dosage form, pharmacologic class, etc., the opportunity for missing or incomplete data is high. Knowing as much as possible about the CM, including exposure data, is important to support meaningful analysis of this information. Standardization of CM information is also important to maximize its usefulness, particularly when pooling data across clinical trials. Pooling across studies is often desirable to increase power to evaluate safety signals, particularly those with low incidences.
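To make the denominator and pooling points concrete, here is a minimal sketch in Python. The trial names and counts are invented purely for illustration, and the function names are my own. It assumes the CM of interest has already been coded consistently across studies, which is exactly the standardization problem described above.

```python
from typing import Dict, List

# Hypothetical counts among subjects taking one specific concomitant medication.
# "n" is the number of exposed subjects (the denominator), "events" the number
# who had the adverse event of interest. All numbers are made up.
trials: List[Dict] = [
    {"study": "A", "drug_events": 3, "drug_n": 150, "control_events": 1, "control_n": 150},
    {"study": "B", "drug_events": 2, "drug_n": 250, "control_events": 1, "control_n": 250},
    {"study": "C", "drug_events": 5, "drug_n": 600, "control_events": 2, "control_n": 600},
]


def incidence(events: int, n: int) -> float:
    """Crude incidence proportion: subjects with the event / subjects exposed."""
    if n <= 0:
        raise ValueError("denominator must be positive")
    return events / n


def pooled_incidence(studies: List[Dict], arm: str) -> float:
    """Crude pooled incidence for one arm ('drug' or 'control') across studies."""
    events = sum(s[f"{arm}_events"] for s in studies)
    n = sum(s[f"{arm}_n"] for s in studies)
    return incidence(events, n)


pooled_drug = pooled_incidence(trials, "drug")
pooled_control = pooled_incidence(trials, "control")
risk_difference = pooled_drug - pooled_control

for s in trials:
    print(s["study"],
          round(incidence(s["drug_events"], s["drug_n"]), 4),
          round(incidence(s["control_events"], s["control_n"]), 4))
print("pooled", round(pooled_drug, 4), round(pooled_control, 4), round(risk_difference, 4))
```

Simple summing like this is only a crude pooled estimate. A real analysis would likely stratify by study and account for differences in exposure time, but the sketch shows why low-incidence signals are easier to see once the denominators add up.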

Similar problems exist with drug product information available from spontaneous post-marketing safety reports. The information may be incomplete or insufficiently standardized, making data aggregation and analysis challenging. Post-marketing cases submitted to the publicly available FAERS (the FDA Adverse Event Reporting System) often contain more than one suspect product, and frequently a number of CMs. In reviewing these cases, it would be most helpful to know all the attributes of the suspect products, to enable identification of instances where a specific formulation of a product is associated with a different risk than its other formulations, or instances where a specific lot is associated with new adverse reactions not seen for that product overall. However, many product attributes are often not available, making product identification standardization critical. At this time, the Proprietary Name and active ingredient(s) are the two critical elements for identifying products in post-marketing reports, followed by the application number for the primary suspect, manufacturer, then dosage, route, etc., when available.

One often sees only the ingredient populated in the structured drug field, although the specific Trade name is stated in the narrative. Through informal discussions with companies who use WHO Drug Dictionary Enhanced (WHO-DDE) to code and report suspect products in post-marketing reports, it appears that most record both the Trade name and Active ingredient in their internal databases, but report only the ingredient (‘Preferred Name’ record ending in 001 in WHO-DDE for single ingredient products). Analysis would be improved if standard information for the Trade name (when known) as well as the active ingredient were routinely available.

For historical reasons (space limitation) the WHO-DDE hierarchy for multi-ingredient drugs is different. The Preferred Name (record ending with 001) is the first trade name on the market for that specific composition, not the ingredient combination. Thus reporting on this level may represent the name of a product which was not administered at all. Further, every new salt version of the specific ingredient combination gets a new numeric identifier (Drecno), making data aggregation very challenging. The Uppsala Monitoring Center (UMC), as the WHO-DDE maintenance organization, recognizes these challenges and is proposing changes to the dictionary to facilitate accurate reporting and aggregation of data for WHO-DDE users.

To support the clinical trials and postmarketing use cases, one needs to know the active moiety of each drug product, since important drug-related safety signals typically correlate with this concept (e.g. tacrine-associated hepatotoxicity). The moiety can be derived if the active ingredient is known.  Since some drug products are combination drugs containing two or more active ingredients / active moieties, it is important to identify each active moiety present in any multi-ingredient drug product. Drug-related safety signals sometimes also correlate with the pharmacologic class (“class effect,” e.g. bone fractures associated with proton pump inhibitors); therefore, one also needs to know the pharmacologic class of each product or of its active moiety. Pharmacologic class is itself a complex concept made up of other concepts such as mechanism of action, physiologic effect, or chemical structure. A single active moiety may have more than one recognized or clinically meaningful pharmacologic class. This gives rise to potentially 1:many:many relationships (Product:ActiveMoiety:PharmClass; see Figure below).
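The Product:ActiveMoiety:PharmClass relationship can be sketched as a small data model. This is only an illustration under my own assumptions: the class and function names are hypothetical, the two products are invented, and the pharmacologic class labels are informal rather than terms taken from any particular dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class PharmClass:
    name: str
    kind: str  # e.g. "mechanism of action", "physiologic effect", "chemical structure"


@dataclass
class ActiveMoiety:
    name: str
    pharm_classes: List[PharmClass] = field(default_factory=list)  # one moiety, many classes


@dataclass
class Product:
    name: str
    moieties: List[ActiveMoiety] = field(default_factory=list)  # one product, many moieties

    def pharm_class_names(self) -> Set[str]:
        """All pharmacologic classes reachable from this product via its moieties."""
        return {pc.name for m in self.moieties for pc in m.pharm_classes}


def group_by_moiety(products: List[Product]) -> Dict[str, List[str]]:
    """Index product names by active moiety, so signals can be aggregated per moiety."""
    index: Dict[str, List[str]] = {}
    for product in products:
        for moiety in product.moieties:
            index.setdefault(moiety.name, []).append(product.name)
    return index


# Informal, illustrative class labels (not dictionary terms).
omeprazole = ActiveMoiety("omeprazole", [
    PharmClass("proton pump inhibitor", "mechanism of action"),
    PharmClass("substituted benzimidazole", "chemical structure"),
])
naproxen = ActiveMoiety("naproxen", [
    PharmClass("cyclooxygenase inhibitor", "mechanism of action"),
    PharmClass("propionic acid derivative", "chemical structure"),
])

# Hypothetical products: one single-ingredient, one combination.
single = Product("Hypothetical Product A", [omeprazole])
combo = Product("Hypothetical Combo B", [naproxen, omeprazole])

print(sorted(combo.pharm_class_names()))
print(group_by_moiety([single, combo]))
```

With a structure like this, a class-effect query ("which reports involve any proton pump inhibitor?") becomes a lookup through the moiety rather than a search on trade names.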

Key Product Information from a Drug Dictionary

A standard drug dictionary should support the validation and coding of each reported drug product by its name and active ingredient(s), or by the active ingredient(s) when that is the only available information.  A standard drug dictionary should further support the reporting and/or easy look-up of active moiety and pharmacologic class information for each drug product.

Identifying the pharmacologic class information for each drug product is not always straightforward because there may be multiple classes, and there is no concept of a primary default pharmacologic class. When the indication for a product use is known, this may provide information for selecting a class, but the product’s activity may always be considered as part of another assigned class. It is therefore useful to know all the pharmacologic classes associated with a drug product.


Therapeutic Area Standards

As I mentioned in my previous post, the FDA is engaged in a broad effort called  CFAST (The Coalition for Accelerating Standards and Therapies) to standardize “research concepts” for use in clinical trials for various therapeutic areas. This has led to the idea of a “Therapeutic Area (TA) Standard.”  What exactly is a TA Standard? Here are my personal thoughts on this important topic.

“Standard” vs. “Data Standard”

When discussing TA standards, it’s useful to draw a distinction between a standard and a data standard. A standard is defined in dictionary.com as “something considered by an authority or by general consent as a basis of comparison; an approved model.”  There are many different kinds of standards: manufacturing standards, measurement standards, data standards, etc.  

To understand the distinction, consider a ruler. The ruler can be marked in inches or centimeters. Which ruler one uses to measure length depends on the measurement standard that one has selected for the task. Once one selects a measurement standard, then the data standard provides a consistent approach to document and share the measurement. If the measurement standard is inches, and the measurement is 10 inches, then the data standard describes whether it’s 10”, 10 in, or 10 inches. 

The distinction between a standard and a data standard is important when considering TA standards. How to represent a measurement (i.e. an observation) requires two decisions: a business decision (what to measure, which measurement standard to use), followed by a data standards decision (how to standardize the representation of the measurement, which data standard(s) to use).
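The ruler example can be put into a few lines of Python to show the two decisions separately. This is a sketch under my own assumptions: the `Length` type and the rendering rule (number, one space, unit abbreviation) are hypothetical choices, not an established data standard.

```python
from dataclasses import dataclass

CM_PER_INCH = 2.54  # one inch is defined as exactly 2.54 cm


@dataclass(frozen=True)
class Length:
    value: float
    unit: str  # the measurement standard chosen for the task: "in" or "cm"


def render(length: Length) -> str:
    """Data standard (hypothetical): number, one space, unit abbreviation."""
    return f"{length.value:g} {length.unit}"


def to_cm(length: Length) -> Length:
    """Re-express a length under the centimeter measurement standard."""
    if length.unit == "cm":
        return length
    if length.unit == "in":
        return Length(length.value * CM_PER_INCH, "cm")
    raise ValueError(f"unknown unit: {length.unit}")


measured = Length(10, "in")   # business decision: measure in inches
print(render(measured))       # data standards decision: how to write it down
print(render(to_cm(measured)))
```

Choosing inches versus centimeters changes what `Length` holds. Choosing between 10", 10 in, and 10 inches only changes what `render` returns.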

What is a Therapeutic Area Standard?

There is no widely established “standard definition” for a TA standard. One working (perhaps prevailing?) definition is that a TA standard is a data standard for a therapeutic area or indication. However, close inspection indicates that a TA standard is not a data standard. Let’s examine the definition of a TA standard more closely.

Let’s consider a clinical observation, specifically a clinical laboratory test:  hemoglobin A1C (HbA1C). The standardization of HbA1C data is straightforward. CDISC provides controlled terminology for the HbA1C lab test (represented by the NCI EVS code C64849). The CDISC SDTM IG describes how to represent lab test data (which includes HbA1C data) using the LB domain. The result is a numeric value, and CDISC terminology provides standard terms for units of measure. Anyone conducting a clinical trial that includes the collection of HbA1C need only look at the SDTM IG and CDISC controlled terminology to understand how to standardize this information. No additional data standards are needed.
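As a rough illustration, a single HbA1C result in the LB domain might look like the record below. This is a sketch only: the study and subject identifiers and the result value are invented, the list of variables checked is my own shorthand, and the variable names and terms should be confirmed against the current SDTM IG and CDISC controlled terminology before use.

```python
from typing import Dict, List

# Variables this sketch treats as mandatory for an LB record (an assumption
# for illustration; consult the SDTM IG for the authoritative list).
REQUIRED_VARIABLES = ("STUDYID", "DOMAIN", "USUBJID", "LBSEQ", "LBTESTCD", "LBTEST")

hba1c_record: Dict[str, object] = {
    "STUDYID": "EXAMPLE-001",       # hypothetical study identifier
    "DOMAIN": "LB",                 # laboratory test results domain
    "USUBJID": "EXAMPLE-001-0001",  # hypothetical unique subject identifier
    "LBSEQ": 1,
    "LBTESTCD": "HBA1C",            # short test code from controlled terminology
    "LBTEST": "Hemoglobin A1C",     # test name (NCI EVS code C64849)
    "LBORRES": "7.2",               # result as originally collected (invented value)
    "LBORRESU": "%",                # original units
    "LBSTRESN": 7.2,                # numeric result in standard units
    "LBSTRESU": "%",                # standard units
}


def missing_variables(record: Dict[str, object]) -> List[str]:
    """Return the mandatory variables that are absent or empty in a record."""
    return [v for v in REQUIRED_VARIABLES if record.get(v) in (None, "")]


print(missing_variables(hba1c_record))
```

Nothing in the record is specific to diabetes, which is the point: the existing exchange and terminology standards already cover it.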

Let’s now consider a single therapeutic area: Diabetes Mellitus. Let’s assume that, for the purpose of determining efficacy of a new diabetes drug, only one outcome measure is necessary:  the HbA1C. So what does a Diabetes Mellitus TA Standard then look like? What are we “standardizing” that isn’t already standardized?

One can envision a separate Diabetes TA Standards document that says, “if you’re studying a new drug to treat diabetes, you should collect HbA1C and here is how you should represent HbA1C data using these existing standards: SDTM + CDISC controlled terminology.” For this document to be truly useful, an independent scientific and/or regulatory body should first decide what design features and clinical observations are relevant for diabetes studies. This can be described as a “good clinical research practice guideline” for diabetes. One could consider this a standard but it is not a data standard. Such a guideline is analogous to a manufacturing or building standard. Just as a builder might say: “A hurricane-resistant building must/should contain these materials: ….,” a clinical researcher would say: “A good diabetes study must/should contain HbA1C testing.” 

The FDA publishes such guidelines. These are called indication-specific guidances. During my time at FDA, I was the principal author of the template used to standardize the format and content of indication-specific guidances that CDER issues. These guidances help sponsors design their development programs, including the pivotal clinical trials, to support U.S. approval of new drugs for a given indication. Other organizations may publish similar guidelines: professional societies, other government agencies (e.g. NIH), consortia, etc.

In this simple example, a researcher would only need to read the clinical research guideline for diabetes, and understand how to represent HbA1C data using existing FDA-supported exchange and terminology standards. No additional documentation is necessary. A Diabetes TA user guide is not needed for this simple example. The “standard” for a diabetes trial is the clinical research guideline itself and the existing data standards.

Of course therapeutic areas are much more complicated than this. Each TA has multiple relevant clinical observations, and the observations themselves have additional metadata needed to interpret the observation. In this setting a TA “user guide” is useful to demonstrate how to represent all TA-relevant data and metadata using existing data standards. But the user guide itself is not a data standard. The data standards are the exchange and terminology standards that the user guide references.

So an alternative definition for a TA Standard is a best practice guideline for conducting clinical trials for a specific therapeutic area, with an accompanying illustration (“user guide”; TA “use case”) on how to use existing data standards (exchange and terminology standards) for that TA. If CDER generates the guideline, then it would be an indication-specific guidance, and any available user guide would ideally be incorporated by reference to the guidance.

An analogy would be a best practice specification for a kitchen. The kitchen “standard” would say: it must have cabinets, a refrigerator, a sink and faucet, and a stove and oven. It may have a garbage disposal, dishwasher, and trash compactor. The kitchen TA user guide might say, “this is what your kitchen would look like if you use standard Ikea cabinets and General Electric appliances.”

How should TA Standards Be Managed?

First we should recognize that a TA standard is not a data standard. It is a use case for data standards. The data standards are the exchange and terminology standards that are used to standardize TA-specific data. A TA standard has two components:

  1. A clinical research ‘best research practice guideline’ or standard (i.e. the data requirements)
  2. A description of how existing data exchange and terminology standards can represent the TA data requirements. A user guide may be useful (but not necessary) to illustrate how this is done.

In the trivial example described here, there is really no need for a separate Diabetes TA user guide because it is clear how to represent HbA1C information using existing data standards. The problem arises when the data requirements are complex and the existing data standards do not provide a clear and unambiguous representation of the clinical data. Then, a user guide is helpful and necessary.

For a TA standard to be effective, it must meet FDA’s regulatory needs. We need a process to ensure that:

  1. The TA best practice research guideline is accurate (the TA-specific data requirements). From FDA’s perspective, this is ideally captured in an indication-specific guidance, and
  2. Data standards exist (or have been adequately modified) to represent the data requirements in a standard format


Research Concepts

There is quite a lot of activity going on today to standardize research concepts. These are concepts typically used to assess the efficacy or safety of new drugs in clinical trials for various therapeutic areas. TransCelerate BioPharma has teamed up with CDISC and the Critical Path Institute and launched CFAST to conduct this work. The FDA is actively engaged. Recently, a friend and colleague wrote about Research Concepts on his blog.

This is extremely important work, which helps standardize data collection and data exchange in clinical trials. My main concern is that the term "Research Concept" is too narrowly focused. These are Clinical Concepts that happen to be used in research. Why is this distinction important? The clinical concepts used in clinical trials to establish safety and efficacy are the same clinical concepts that are used in health care to determine the safe and effective use of the product for any given patient. These clinical concepts are also useful for various other use cases.

My point is that standardizing clinical concepts to support a single use case is not enough. Standardization efforts need to take other use cases into account, and the development process needs to incorporate these standard concepts into EHR systems, so that they are available for those other use cases.

Currently, the National Institutes of Health is engaged in the standardization of clinical concepts for numerous areas in medicine. These are stored and made publicly available in the NIH Common Data Elements Repository hosted by the National Library of Medicine (NLM). The Office of the National Coordinator (ONC) for Health IT recognizes the importance of incorporating standard common data elements into EHRs. The Structured Data Capture  (SDC) initiative is teaming up with the NIH to do just that.

It is not clear to me how CFAST and the NIH CDE/ONC SDC activities fit together. Are they parallel, duplicative efforts? Is there a mechanism for CFAST clinical concepts to find their way into the NIH/ONC efforts? I'm not sure. If they are parallel efforts, then they need to come together somehow.

Then there is CIMI, the Clinical Information Modeling Initiative. How does that fit in?

I welcome your thoughts and perspectives.


Please fill out this form.

It continues to happen. I recently went to see a new health care provider. "Please fill out this form." This time it was six pages: the usual questions...demographic information, past medical history, allergies, current medications, previous surgeries, family history, social history. It went on and on.

This experience serves to remind me how much work remains to improve how we access medical information. We should all have a personal health record that we individually manage and update, and can share with medical personnel. We maintain full control over who sees what.

No more six-page forms.

Is this drug safe and effective?

I've been with the FDA since 1996 (except for a one-year "sabbatical" -- that's a long story). Much of my work has focused on modernizing how FDA assesses the safety and effectiveness of new drugs. Pharmaceutical manufacturers ("sponsors") conduct the clinical trials. They send FDA the data, and the Agency conducts an independent analysis of the results.

Analyzing clinical trial data is extremely tedious and remains largely a manual process. I have recently spent a good deal of my time at FDA on projects that try to make the data more accessible, standardize the data (to make the data more useful and to enable automated analytic processes), and develop, identify, and deploy tools (e.g., analytic software) to help physicians, statisticians, and other scientists understand and interpret the data. The goal is to make safety and efficacy determinations more efficiently and effectively, to get the right new drugs to patients more quickly. I have written about these three components of an improved informatics infrastructure for clinical research.

I've now decided to leave the FDA. I feel my work there is done and it's time to pass the torch to the next generation of scientist-informaticists who will continue to advance the ongoing medical informatics revolution. I will continue to monitor and contribute to this revolution wherever I can, and will document my experiences here.

Why am I here?

So I never thought I would blog. I routinely read other people's blogs with interest but didn't think I'd have the time or inclination to blog myself. Until recently. Now I think "hey, maybe I do have something to say."

I'm a physician (Neurologist by training) who, over the course of my career, accidentally found myself working in medical informatics. I say "accidentally" because it was never planned. In medical school, I didn't know what medical informatics was. I never knew it existed. When I tell people what I do, they ask "what is that?" Then I explain as best I can. Medical Informatics (for me) is all about finding new and better ways to use information technology (hardware, software, networks) to make better medical decisions and improve people's lives.

It's only natural for me that I wound up in this field. I've always been a frustrated "techie." As a senior at Princeton many years ago, I was one of the few who used a word processor on the university's mainframe computer to write my thesis. Almost everyone else was using a typewriter. I eagerly bought a Commodore 64, and then an Amiga (who remembers that?) and spent hours in front of those screens. As a chemistry major, I learned to program in APL to digitally probe the three-dimensional structure of proteins, looking for empty spaces where water molecules could possibly hide. As a neurology resident, I wrote a program using BASIC to generate call schedules automatically, taking into account individual residents' desire for vacation days or other time off. I became a Neurologist to study how the brain, our incredibly complex biologic computer, works.

In 1983, I did my internship at Letterman Army Medical Center in San Francisco. I remember working on the Oncology ward, taking care of extremely sick cancer patients. Every day, we'd collect blood samples in the morning. Every day, at 3pm, I'd park myself outside the hematology lab waiting for the paper lab slips that would contain the complete blood counts so we could decide who needed a blood transfusion, a six pack of platelets, or who needed their chemotherapy modified. I remember thinking...there's gotta be a better way to access these results!

I also remember sleepless nights on call, where we would admit patients to the ward, and then spend hours at 2, 3, 4 in the morning flipping through pages and pages of outpatient records (most in unintelligible handwriting), trying to decipher the important clues about a patient's medical history that might shed light on what was going on or what we needed to do. I remember thinking then, there's gotta be a better way to document a patient's medical record.

Not long ago, my elderly mother went to the emergency department at her local clinic. There they did an abdominal CT scan that came back "acute colitis." She was severely dehydrated and had a low serum sodium and was transferred to the local community hospital about a mile away. Two days later, the hospital physician still couldn't get a hold of the original CT scan. So what did he do? He ordered another one. What a waste. What if that original CT had been available over the internet? She's on Medicare, so you and I paid for that second, unnecessary CT scan. Here again I wonder, "there's gotta be a better way."

I now work in Medical Informatics because I passionately believe that better use of information technology will transform medicine in the 21st century, and will also help make it cheaper to deliver higher quality medicine. I want to be part of that transformation. My goal with this blog is to document that journey and share my thoughts on how well I think we're doing, solely from one little perspective in cyberspace. Thanks for reading.