Modeling Adverse Event Information

What is an event? It's an occurrence; something that happens. There are countless examples: a baseball game, a wedding, a picnic. All are examples of an event. Let's also include adverse events. One common feature of events is that they persist in time. They have a beginning and an end. Are they observations? No. But we use observations to determine that an event is taking place. One makes numerous observations: the sunny day, the location in a park, the presence of a cooler full of food and drinks, a blanket to lie on the grass, a charcoal grill. Add them all up and you come up with a picnic. We then observe the absence of many of these observations to conclude the event is over. Pretty straightforward.

Adverse Events are no different. Symptoms or signs (observations) begin on a certain date and end on a certain date. One must interpret multiple observations over time to determine that an AE is taking place, or that an AE has resolved. This interpretation in clinical medicine is known as an Assessment, and it is the "A" in the medical encounter record known as a SOAP note. I wrote about modeling clinical data and the relevance of SOAP in a previous post. Although observations also technically have a beginning and an end (e.g. a venipuncture does not occur instantaneously), they should be considered for practical reasons to occur instantaneously. They are a "snapshot in time" of the subject's well-being, or lack thereof.

Another thing to keep in mind is that Adverse Events are Medical Conditions (disease, disorder, injury, transient physiological state that can impair health) that are temporally associated with an intervention of some kind (e.g. drug administration); when one is noted for the first time in a Subject, it is called a Diagnosis. It's also important to have a qualified Assessor to establish the presence of the correct Adverse Event. Sometimes the Assessor is the patient, who assesses, for example, their headache patterns, concludes they have tension headaches, and self-administers an over-the-counter analgesic. Being able to self-assess your medical condition is in fact a regulatory requirement for a drug to be sold over the counter. Makes sense. But often one needs a trained Assessor (e.g. a physician or nurse) to determine that the correct AE is present. Sometimes that assessment is not done properly (or not documented properly) and then problems occur, and second opinions (re-assessments by new assessors) are necessary. Assessments are often associated with Assessment Criteria. These are rules that describe how observations are analyzed and interpreted to determine the presence and severity of a medical condition. A useful example is a simple blood pressure measurement that is abnormally high, say 150/100 mmHg. Does a single high BP measurement imply that the person has an underlying medical condition known as hypertension? The answer is clearly NO. The proper assessment requires that serial BP measurements are conducted over a period of time to establish the persistence of a clinical event (in this case a disorder) known as hypertension.
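To make the idea of Assessment Criteria a little more concrete, here is a minimal Python sketch of the blood pressure example. The cut-off values, the minimum number of readings, and the minimum time span are made-up numbers for illustration only (not clinical guidance), and the names BloodPressureObservation and assess_hypertension are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class BloodPressureObservation:
    """A single observation: a snapshot in time."""
    taken_on: date
    systolic: int   # mmHg
    diastolic: int  # mmHg


# Hypothetical assessment criteria, for illustration only.
SYSTOLIC_LIMIT = 140
DIASTOLIC_LIMIT = 90
MIN_ELEVATED_READINGS = 3
MIN_SPAN_DAYS = 14


def is_elevated(obs: BloodPressureObservation) -> bool:
    return obs.systolic >= SYSTOLIC_LIMIT or obs.diastolic >= DIASTOLIC_LIMIT


def assess_hypertension(observations: List[BloodPressureObservation]) -> bool:
    """Assessment: True only if elevated readings persist over time."""
    elevated = sorted(
        (o for o in observations if is_elevated(o)),
        key=lambda o: o.taken_on,
    )
    if len(elevated) < MIN_ELEVATED_READINGS:
        return False  # a single high reading is not enough
    span_days = (elevated[-1].taken_on - elevated[0].taken_on).days
    return span_days >= MIN_SPAN_DAYS


single_reading = [BloodPressureObservation(date(2020, 1, 1), 150, 100)]
print(assess_hypertension(single_reading))  # False
```

The point of the sketch is only that the criteria live outside the observations: the same readings could be re-assessed later with different criteria or by a different assessor.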

So currently, Adverse Event reporting, whether it's in clinical trials or post-marketing safety monitoring, is hampered by the fact that the observations (that are used to assess the presence of an AE) and the AE itself are often mixed together, and the analyst must do his or her own Assessment after the fact. Take, for example, the following report of a patient who takes a dose of drug X and then 2 days later develops a sore throat, runny nose, nasal congestion, cough, sinus pain, and viral nasopharyngitis. Not all of these are AEs. The first five are in fact observations that support the presence of the sixth, the true medical condition at play here. Sometimes the observations don't clearly support the presence of a medical condition, in which case a "differential diagnosis" is developed, which is essentially a list of all the medical conditions that could possibly cause the observations, followed by a systematic collection of more observations to identify the correct diagnosis.

There is a strong desire within FDA and elsewhere to automate the detection of adverse events. This is quite a challenging task, but it should be made clear that the following must take place before any system or tool can succeed in adverse event detection.

  1. We need to distinguish observations from events
  2. We need qualified assessors to analyze/interpret the observation results
  3. As much as possible, we need to standardize the assessment process by documenting the assessment criteria necessary to identify an AE with high confidence. 

Adverse Event Identification and Characterization
Here is my proposal for a workable data model that can be used to automate AE detection some day. It should be made clear that it deviates from the SDTM and BRIDG notion of an event, as I don't believe these models have it quite right. Remember that observations must undergo an Assessment to determine if a medical condition / AE is present. Sometimes more than one Assessment is done (e.g. second opinions). Finally, observations don't get treated; rather, the medical condition(s) that cause the abnormal observations are the targets of treatment.
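As a rough illustration of the relationships just described, here is a small Python sketch. It is not a complete or official rendering of the model; the class and field names are hypothetical, and the dates are invented. It uses the drug X report from earlier: five observations, one medical condition, two assessments (the second standing in for a second opinion), and a treatment that targets the condition rather than the observations.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Observation:
    name: str
    observed_on: date  # treated as a snapshot in time


@dataclass
class MedicalCondition:
    name: str
    start: date
    end: Optional[date] = None      # None while the condition persists
    is_adverse_event: bool = False


@dataclass
class Assessment:
    assessor: str
    observations: List[Observation]
    conclusion: MedicalCondition


@dataclass
class Treatment:
    name: str
    target: MedicalCondition  # the condition is treated, not the observations


onset = date(2020, 1, 3)  # invented date
symptoms = [
    Observation(name, onset)
    for name in ("sore throat", "runny nose", "nasal congestion", "cough", "sinus pain")
]
condition = MedicalCondition("viral nasopharyngitis", start=onset, is_adverse_event=True)
first_opinion = Assessment("investigator", symptoms, condition)
second_opinion = Assessment("second assessor", symptoms, condition)
treatment = Treatment("supportive care", target=condition)
```

Note that both assessments point at the same observations and the same condition, so the event is recorded once even though it was assessed twice.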


On Observations in Clinical Trials, or, "Did I get that observation right?"

I live in Florida, a state almost surrounded by water. How long is its coastline? How does it compare with the coastline of other states? So, like many others, I turn to ... Google. In a few seconds, I find these results posted on Wikipedia:

You can predict my reaction. How can the method make such a big difference in the results? The web site provides detailed information about each method and it becomes a relatively easy, though highly manual, task to determine which method is more appropriate for one's use case. The take home lesson is clear: the method of observation may affect the results.

Then there is the famous Heisenberg Uncertainty Principle in Physics, which states that the position and velocity of an object cannot both be measured exactly at the same time, even in theory. For large objects, like an automobile, the uncertainty is negligible, but for sub-atomic particles, this is a big deal. The uncertainty is due in part to the act of making the observation, i.e. the method of observation. In other words, any attempt to measure precisely the velocity of an electron, for example, "will knock it about in an unpredictable way, so that a simultaneous measurement of its position has no validity."

Just so you don't think this concern is limited to physics and geography, consider this well-known medical school fact. A standard blood pressure cuff, when used on significantly obese individuals, will typically provide a falsely high reading when compared to the same observation performed using an over-sized cuff. So take note:

The method of observation may affect the results.

This gives rise to another "aha!" moment: Observations are Interventions. The observer must intervene in the subject's normal daily routine and execute a specific method of observation to obtain the observation result. Sometimes the method is innocuous, like answering a question on a questionnaire, but sometimes it can be quite invasive, like a cardiac catheterization to measure coronary artery diameters. Often the observation and results are combined with an interpretation of the observation(s) (i.e. an Assessment) to establish the presence and severity of a Medical Condition (e.g. Coronary Artery Disease). More and more, an Intervention to make an observation is combined with an attempt to alter the natural history of the Medical Condition (i.e. a "Therapeutic Intervention"), as in the case of a diagnostic cardiac catheterization during which a drug-eluting stent is inserted.

The bottom line is we need to recognize that observations are interventions whose main purpose is to measure the physical, physiological, or psychological state of an individual, and that the details of the method used to make the observation can be very important and may introduce bias in the results.

The take home message of this blog is: An observation doesn't just happen. Someone intervened to make it happen and the method of intervention can affect the results.

Both the SDTM and BRIDG consider observations (called "findings") as different from interventions. It's time to update that thinking. "Findings" are a type of intervention. Furthermore, SDTM considers findings, interventions, and events as different types of observations. I disagree. Events, for example, are not observations. This last statement is the topic of a future blog.

Thank you for your comments.


Do We Need a Study Data Reviewer's Guide?

As part of a robust study data standardization program, the U.S. FDA publishes the Study Data Technical Conformance Guide. The purpose of this document is to provide "technical specifications for sponsors for the submission of animal and human study data and related information in a standardized electronic format" for investigational and marketing applications. Section 2.2 of the guide recommends the submission of a Study Data Reviewer's Guide (SDRG) to "describe any special considerations or directions or conformance issues that may facilitate an FDA reviewer's use of the submitted data and may help the reviewer understand the relationships between the study report and the data." Although FDA doesn't recommend the use of any specific SDRG template, it references a standard template developed by the Pharmaceutical Users Software Exchange (PhUSE).

Let's take a close look at this template. The Purpose of the document is to provide "context for tabulation datasets and terminology that benefit from additional explanation beyond the Data Definitions document (define.xml).  In addition, this document provides a summary of SDTM conformance findings." 

Here is some of the information suggested for inclusion in the SDRG

  1. Is the study ongoing? If so, describe the data cut or database status?
  2. Were SDTM used as sources for the analysis datasets?
  3. Do the submission datasets include screen failures?
  4. Were any domains planned but not submitted because no data were collected?
  5. A tabular listing of eligibility criteria that are not included in IE domain

Before we tackle the question posed at the top of this post, let's ponder a broader question: why do we need standardized data? This one is easy. Standardized data enable process efficiencies and automation. In the case of clinical trials data, reviewers are instantly familiar with the structure of the data, because it is the same across all SDTM-based study datasets. This immediate familiarity with the data structure certainly leads to review process efficiencies. But it only starts there. A common structure and common vocabularies lead to the development of standard analyses that can be automated and reused across studies.

If standardized data leads to increased familiarity with data structures then this should lead to a decrease in additional materials needed to explain the data. But we now have yet another document to explain the data that we didn't have before. The fact that a document like the SDRG is needed at all implies that there are additional data, or additional meaning behind the data, that are not captured in the datasets. 

If we had a truly semantically interoperable data exchange, there would be no need for an SDRG. The meaning behind the data would be with the data, not locked up somewhere else in a human-readable text document. In other words, the need for a Study Data Reviewer's Guide represents a failure of the data standards and/or the implementation of the data standards in achieving an adequate degree of semantically interoperable data exchange.

Sounds harsh? I believe this last statement is true. Let's look at some examples. The SDRG should describe if the study is ongoing. The data contain a study start date and study end date. If a study is ongoing, the study end date should be null. Because a null value for this variable could be due to other reasons, a separate variable (similar to the HL7 null flavor) can describe why the end date is null. Controlled vocabulary can describe the various possible reasons. This approach provides both a standard machine-readable and a human-interpretable way of knowing if the study is ongoing. One could even add an 'ongoing study' flag in the trial summary (TS) domain if desired.
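Here is a minimal Python sketch of that idea. The three reasons in the controlled vocabulary are hypothetical placeholders (they are not taken from HL7 or any CDISC terminology), and the names StudyDates and is_ongoing are my own.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class EndDateNullReason(Enum):
    # Hypothetical controlled terms, for illustration only.
    STUDY_ONGOING = "study ongoing"
    NOT_COLLECTED = "not collected"
    UNKNOWN = "unknown"


@dataclass
class StudyDates:
    start: date
    end: Optional[date] = None
    end_null_reason: Optional[EndDateNullReason] = None

    def __post_init__(self) -> None:
        # A reason for a null end date makes no sense if the end date is present.
        if self.end is not None and self.end_null_reason is not None:
            raise ValueError("a null reason only applies when the end date is missing")


def is_ongoing(study: StudyDates) -> bool:
    """Machine-readable answer to 'is the study ongoing?'."""
    return study.end is None and study.end_null_reason is EndDateNullReason.STUDY_ONGOING


study = StudyDates(start=date(2020, 1, 15), end_null_reason=EndDateNullReason.STUDY_ONGOING)
print(is_ongoing(study))  # True
```

A reviewer's tool could then answer the question from the data alone, without consulting a separate text document.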

Here's another one: Were SDTM used as the source for analysis datasets? If one described each data point as a resource, each with a unique resource identifier (URI), then a system can easily determine where that resource came from. One could see that a data point in an analysis dataset is the same data point (i.e. resource) as what is in the SDTM. These URIs make traceability/data provenance analyses so much easier.
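A toy Python sketch of such a check follows. The example.org URIs, the record layout, and the function name trace are all hypothetical; a real implementation would more likely sit on top of a linked-data store than on plain dictionaries.

```python
from typing import Dict, List, Tuple

# Hypothetical SDTM data points, keyed by their URI.
sdtm_points: Dict[str, dict] = {
    "https://example.org/study01/VS/0001": {"test": "SYSBP", "value": 150},
    "https://example.org/study01/VS/0002": {"test": "DIABP", "value": 100},
}

# Hypothetical analysis records; each names the resource it came from.
analysis_records: List[dict] = [
    {"param": "SYSBP", "aval": 150, "source": "https://example.org/study01/VS/0001"},
    {"param": "DIABP", "aval": 100, "source": "https://example.org/study01/VS/0002"},
    {"param": "PULSE", "aval": 72, "source": None},  # no known source
]


def trace(records: List[dict], sources: Dict[str, dict]) -> Tuple[List[dict], List[dict]]:
    """Split records into those traceable to a known source URI and those not."""
    traced, untraced = [], []
    for record in records:
        if record.get("source") in sources:
            traced.append(record)
        else:
            untraced.append(record)
    return traced, untraced


traced, untraced = trace(analysis_records, sdtm_points)
print(len(traced), len(untraced))  # 2 1
```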

How about this one: Do the submission datasets include screen failures? Each subject should be linked to an administrative study activity called 'EligibilityTest' (or something similar), the possible outcomes of which are TRUE or FALSE. A subject with EligibilityTest=TRUE passed screening and is eligible to continue in the study. EligibilityTest=FALSE means they failed screening. A quick scan of the data for any subjects with EligibilityTest=FALSE would determine whether screen failures are present in the datasets. (Note that the rules for determining TRUE or FALSE are the eligibility criteria themselves, which have a bearing on the next example.)
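The "quick scan" could be as simple as the following Python sketch. The subject identifiers are invented, and I am assuming the EligibilityTest outcomes are available as a simple mapping from subject to TRUE/FALSE.

```python
from typing import Dict, List


def screen_failures(eligibility_test: Dict[str, bool]) -> List[str]:
    """Return the subjects whose EligibilityTest outcome is FALSE."""
    return sorted(
        subject for subject, passed in eligibility_test.items() if passed is False
    )


# Invented example data: subject -> EligibilityTest outcome.
outcomes = {"SUBJ-001": True, "SUBJ-002": False, "SUBJ-003": True}

failures = screen_failures(outcomes)
includes_screen_failures = bool(failures)
print(failures, includes_screen_failures)  # ['SUBJ-002'] True
```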

Another example: the SDRG should contain a tabular listing of eligibility criteria not found in IE (the inclusion/exclusion criteria dataset). All study activities should have well-described start rules. The start rules that determine whether the study activity Randomization can begin are themselves the eligibility criteria. A description of the Randomization start rule is incomplete without a listing of these criteria. Their presence in the data would make it unnecessary to repeat them in an SDRG.

So what is the answer to our initial question? If data standards and their implementation were adequate, we would not need an SDRG. The fact that we need SDRGs today should be a sign our study datasets still lack important meaning that analysts need to interpret/analyze the data. It implies that more standards development and better implementation of standards are needed to increase semantically interoperable data exchange. The SDRG should eventually not be needed and should disappear. I think I'm not alone in wishing for the SDRG's eventual demise.

Please share your thoughts and ideas. 



Rules in Study Protocols

When you read a study protocol, you are bombarded by rules. Some are explicitly stated. Many are implicit and must be teased out. Rules are extremely important in ensuring that protocols are conducted correctly. Rules are critical for a good study outcome. Unfortunately, we don't have a good way to standardize protocol rules. This makes it challenging to automate study conduct activities and quickly analyze if a study "followed the rules."

Let's dissect the components of a rule. A rule basically looks like this:

IF {condition A} THEN {Do B} ELSE {Do C}

The ELSE clause is optional; if it is omitted, it is assumed to default to "do nothing" when condition A is not met.
Rules can be complex:

IF {condition A} THEN {
  IF {condition E} THEN {Do F} ELSE {Do G}
}

Evaluating a Rule is an Activity whose outcome is binary:  either the condition(s) is/are met ("true") or not met ("false"). One could argue for a 3rd category, not applicable, for cases where the reason to have a rule in the first place doesn't apply (e.g. when to conduct a pregnancy test in a male) 
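Here is one way this could look in Python, as a sketch only. The names RuleOutcome, evaluate, and apply_rule are hypothetical, and I am assuming the facts needed by a rule are passed in as a plain dictionary. The optional applies check is how the sketch represents the "not applicable" category.

```python
from enum import Enum
from typing import Any, Callable, Dict, Optional

Context = Dict[str, Any]


class RuleOutcome(Enum):
    TRUE = "true"
    FALSE = "false"
    NOT_APPLICABLE = "not applicable"


def evaluate(
    condition: Callable[[Context], bool],
    context: Context,
    applies: Optional[Callable[[Context], bool]] = None,
) -> RuleOutcome:
    """Evaluate a rule condition; NOT_APPLICABLE if the rule has no reason to exist here."""
    if applies is not None and not applies(context):
        return RuleOutcome.NOT_APPLICABLE
    return RuleOutcome.TRUE if condition(context) else RuleOutcome.FALSE


def apply_rule(
    condition: Callable[[Context], bool],
    then_action: Callable[[Context], Any],
    context: Context,
    else_action: Optional[Callable[[Context], Any]] = None,
    applies: Optional[Callable[[Context], bool]] = None,
) -> Any:
    """IF {condition} THEN {then_action} ELSE {else_action}; ELSE defaults to doing nothing."""
    outcome = evaluate(condition, context, applies)
    if outcome is RuleOutcome.TRUE:
        return then_action(context)
    if outcome is RuleOutcome.FALSE and else_action is not None:
        return else_action(context)
    return None


# A pregnancy-test rule evaluated for a male subject is not applicable.
outcome = evaluate(
    condition=lambda c: c["visit"] == "SCREENING",
    context={"sex": "M", "visit": "SCREENING"},
    applies=lambda c: c["sex"] == "F",
)
print(outcome)  # RuleOutcome.NOT_APPLICABLE
```

A complex rule is then just a then_action that itself calls apply_rule with the inner condition.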

In clinical studies, rules often depend on other activities. I call these prerequisite activities (PA). For example:
IF {migraine occurs} THEN {take study drug}
or a more precise way of expressing it:
IF {headache assessment = MIGRAINE} THEN {take study drug} 

In this case the prerequisite activity is a headache assessment and the condition is met when the headache assessment outcome indicates that a migraine is present. 

Regarding how prerequisite activities are evaluated, sometimes it is sufficient that the PA simply has begun (PA status = started) or completed (PA status = complete) or, more commonly, completed with a specific expected outcome (PA expected outcome = migraine). 

When looking at rules more closely, they can be expressed as start rules for other activities. Let's call these target activities (TA). 

Target Activity: StudyDrugAdministration_01
Start Rule: MigrainePresent_01
  Prerequisite Activity: HeadacheAssessment_01
  PA Expected ActivityOutcome: MIGRAINE

StudyDrugAdministration_01 is a planned activity that just sits there, waiting to be performed. As soon as the headache assessment is performed and its documented outcome is MIGRAINE, the rule outcome is set to TRUE and the target activity can begin.
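As a sketch of how a start rule might be checked in code, consider the following Python. The Activity and StartRule classes and the three status values are my own simplification, not part of any standard.

```python
from dataclasses import dataclass
from typing import Optional

# Simplified activity life cycle, in order.
STATUS_ORDER = ["planned", "started", "complete"]


@dataclass
class Activity:
    name: str
    status: str = "planned"
    outcome: Optional[str] = None


@dataclass
class StartRule:
    name: str
    prerequisite: Activity
    required_status: str = "complete"
    expected_outcome: Optional[str] = None  # None means any outcome will do

    def is_met(self) -> bool:
        """True when the prerequisite has reached the required status and outcome."""
        reached = STATUS_ORDER.index(self.prerequisite.status)
        required = STATUS_ORDER.index(self.required_status)
        if reached < required:
            return False
        if self.expected_outcome is None:
            return True
        return self.prerequisite.outcome == self.expected_outcome


headache = Activity("HeadacheAssessment_01")
rule = StartRule("MigrainePresent_01", headache, expected_outcome="MIGRAINE")
drug = Activity("StudyDrugAdministration_01")

print(rule.is_met())  # False: the assessment has not been performed yet

headache.status = "complete"
headache.outcome = "MIGRAINE"
if rule.is_met():
    drug.status = "started"  # the target activity can begin
```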

One can add qualifiers to the rule to describe exactly when the target activity is performed. For example, delay = 30 minutes means wait exactly thirty minutes after the condition is met before starting the activity; maxdelay = 60 minutes means wait no more than 60 minutes before starting the activity; and mindelay = 30 minutes means wait a minimum of 30 minutes before starting the activity.
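These qualifiers translate naturally into an allowed start window. The Python sketch below is one possible reading of them; the function name start_window is hypothetical, and I am assuming that an exact delay takes precedence if it is given together with the other two.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def start_window(
    condition_met_at: datetime,
    delay: Optional[timedelta] = None,
    mindelay: Optional[timedelta] = None,
    maxdelay: Optional[timedelta] = None,
) -> Tuple[datetime, Optional[datetime]]:
    """Return (earliest, latest) allowed start times; latest is None if unbounded."""
    if delay is not None:
        exact = condition_met_at + delay
        return exact, exact
    earliest = condition_met_at + (mindelay if mindelay is not None else timedelta(0))
    latest = condition_met_at + maxdelay if maxdelay is not None else None
    return earliest, latest


met = datetime(2020, 1, 1, 9, 0)
print(start_window(met, delay=timedelta(minutes=30)))     # exactly 09:30
print(start_window(met, maxdelay=timedelta(minutes=60)))  # between 09:00 and 10:00
print(start_window(met, mindelay=timedelta(minutes=30)))  # 09:30 or later
```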

I have tried this paradigm in multiple scenarios and so far it seems to work (randomization, eligibility determination, delayed start, assessments). 

In a future post, I'd like to explore how these rules can be expressed computationally using RDF.


Is Death an Adverse Event?

Throughout the course of a drug's marketing life cycle, it is critical that sponsors and regulatory agencies understand if a drug administration is associated with the death of the patient, but is death an Adverse Event? I argue that death is not an adverse event but instead may be the outcome of an AE. I think that it's important to draw this distinction to improve AE reports and automate death/causality analyses.

Here is my argument. First, let's review some definitions.

The U.S. Code of Federal Regulations (21 CFR 312.32) defines an adverse event as
  • any untoward medical occurrence associated with the use of a drug in humans, whether or not considered drug related. 
The bolded text is mine, as I want to consider how to interpret these terms.

Untoward is an adjective meaning unexpected, inappropriate, inconvenient (Oxford Dictionary). Synonyms include inconvenient, unlucky, unexpected, surprising, unusual.

A medical occurrence is more difficult to define. I take it to mean a medical condition: i.e. disease, disorder, injury, or transient physiological state that impairs or may impair health.

Taking all of this into consideration, my working definition of an Adverse Event is any unexpected medical condition (disease, disorder, injury, transient physiological state) that impairs or may impair health and emerges or worsens after a medical intervention (e.g. drug use).

Notice that I broaden the definition to medical interventions such as medical device use, or medical procedure, because AEs in those scenarios are equally relevant from a public health perspective.

With these definitions in mind, I think one of the pitfalls in causality assessments is incorrect use of the term adverse event. I believe Adverse Events are medical conditions. Medical conditions fluctuate over time. An AE can remain stable, improve, or resolve over time. It can also worsen to the point where the patient dies. All of these are potential outcomes of an AE.

Here is a hypothetical case of crazy conclusions that can emerge if our definitions are not precise. Imagine that one is developing software to automate AE causality assessments. 

Consider a 13 y/o male on chronic treatment with Drug X for seizures. Over the course of the year, he grew 4 inches and gained 25 lbs, yet his dose of anti-convulsant medications did not change to keep pace with his increased body mass. Towards the end of the year, he was having breakthrough seizures with a documented low therapeutic level of Drug X in his blood. A computer program may select the following concepts for further causality assessment:
Suspect Drug:  Drug X

Adverse Events:  Seizures, Weight Gain

A computer program would look at this and ask: did Drug X cause weight gain and seizures in this case? On a superficial level, it is a fair question since the subject was on Drug X when those events happened.

But wait a minute. A human assessor immediately recognizes the frivolity of such a question in this case. Clearly the growth and weight gain are consistent with a puberty-associated growth spurt. The problem is the dose of Drug X wasn't increased in response to the increased body mass.

The fact that the causality question even arises reflects a misinterpretation of what an Adverse Event is. Applying our working definition of an AE, Seizure is not an Adverse Event in this case. It's in fact the Indication, a medical condition that is the target of a medical intervention (i.e. anticonvulsant therapy). But, you say, it's also a medical condition that worsened with treatment. The key here is that it didn't unexpectedly worsen. A sub-therapeutic dose of an anticonvulsant can be expected to result in breakthrough seizures.

How about weight gain? Again, this is not an AE. It's an observation; well, technically it's an analysis of two (or more) observations of body weight over time. Is it indicative of an AE? Not necessarily. In this case, it is the consequence of normal growth. Obesity, on the other hand, would be an AE depending on how it's defined. There is no evidence in this report of obesity.

So what is the causality assessment for this case?  The assessment cannot be done. There is no adverse event.

Here's another case:  A person takes Ambien, gets drowsy and drives a car and is involved in a motor vehicle accident.
Suspect Drug:  Ambien
Adverse Event: Motor Vehicle Accident (MVA)

Based on our definition, an MVA is not an Adverse Event since it is not a medical condition. The AE here is drowsiness. The MVA is a consequence/outcome of the AE.

So let's get back to death. It's commonly listed as an AE in safety reports. Is it a Medical Condition? No. Death marks the end of a complex physiologic process we call Life and it happens to all living organisms. Death (like an MVA, or a bad fall) is or may be a consequence/outcome of a Medical Condition. In the case of Death we call it the Cause of Death. The cause of death may or may not be an AE.
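If one were writing software along these lines, the distinction might be captured roughly as in the Python sketch below. The class and function names are hypothetical. The point is only that death and accidents are recorded as outcomes of a medical condition, never as conditions (or AEs) themselves.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MedicalCondition:
    name: str
    is_adverse_event: bool
    outcomes: List[str] = field(default_factory=list)  # consequences, e.g. "death"


def adverse_events(conditions: List[MedicalCondition]) -> List[str]:
    """Only medical conditions can be AEs; outcomes never appear in this list."""
    return [c.name for c in conditions if c.is_adverse_event]


def cause_of_death(conditions: List[MedicalCondition]) -> Optional[MedicalCondition]:
    """The condition, AE or not, whose outcome was death, if any."""
    for condition in conditions:
        if "death" in condition.outcomes:
            return condition
    return None


# The Ambien case: the AE is drowsiness; the accident is its outcome.
drowsiness = MedicalCondition("drowsiness", True, ["motor vehicle accident"])
print(adverse_events([drowsiness]))  # ['drowsiness']
print(cause_of_death([drowsiness]))  # None
```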

This reminds me of the clinical data lifecycle: Observations lead to Assessments/Adjudications to diagnose/assess Medical Conditions (including AEs), which lead to Interventions to treat, mitigate, cure, or prevent those conditions, which lead to more Observations. In turn, Medical Conditions have consequences/outcomes sometimes beyond our control, like Death (or Falls, or Motor Vehicle Accidents).

These information buckets are well established and distinct in clinical medicine. We save a lot of confusion and misinterpretation of clinical data when we classify them appropriately. We need to do a better job of distinguishing AEs from observations, and of not confusing the consequence(s) of an AE with the AE itself. Recognizing this distinction is important to automate adverse event assessment, including those resulting in death.