Rethinking the Three CDISC General Observation Classes

*Please note: this post has been revised to reflect changes to some of the clinical definitions it references. The links have been updated as well. These revisions do not change the overall message or conclusions.

The CDISC Study Data Tabulation Model (SDTM) is now on version 1.4, with future versions already in design to accommodate data for additional therapeutic area requirements. As data from more and more therapeutic areas are standardized, I see an almost endless cycle of new variables, new domains, version upgrades, and their associated implementation costs and challenges. I think it is worth exploring improvements to the model so that new requirements could be incorporated more easily, perhaps as easily as adding new terms to dictionaries, with less need for changes to the model or for nonstandard variables. But what does that improved model look like? Here are some thoughts. I certainly welcome comments.

First let's look at where we are today. The SDTM has been quite consistent over time in defining three general observation classes in clinical studies: Interventions, Findings, and Events. Here is how they are described in SDTM 1.4:

  • The Interventions class ... captures investigational, therapeutic and other treatments that are administered to the subject (with some actual or expected physiological effect) either as specified by the study protocol (e.g., “exposure”), coincident with the study assessment period (e.g., “concomitant medications”), or other substances self-administered by the subject (such as alcohol, tobacco, or caffeine). 
  • The Events class ... captures planned protocol milestones such as randomization and study completion, and occurrences, conditions, or incidents independent of planned study evaluations occurring during the trial (e.g., adverse events) or prior to the trial (e.g., medical history). 
  • The Findings class ... captures the observations resulting from planned evaluations to address specific tests or questions such as laboratory tests, ECG testing, and questions listed on questionnaires. The Findings class also includes a sub-type “Findings About” which is used to record findings related to observations in the Interventions or Events class. 
It turns out these definitions do not completely align with the way clinicians generally think about observations. Furthermore, this categorization does not follow well-established conventions for documenting, storing, and using clinical data in practice. I think it is useful to re-examine these concepts, because I believe it leads to a better and more useful data model.

In another post, I discuss definitions of common clinical terms. Here are two I'd like to revisit. 

Clinical Observation: a measure of the physical, physiological, or psychological state of a Person or individual.
Medical Condition: a disease, injury, disorder, or transient physiologic state that interferes or may interfere with well-being. A medical condition persists in time.

How do these definitions work? Health care processes focus on identifying Medical Conditions that afflict patients. Once the Medical Condition is identified, one can then determine how best to deal with it. Sadly, patients don't walk into a clinic or hospital with a sign on their forehead saying "I have Multiple Sclerosis." The clinician acts as a detective, documenting clues that can lead to the correct diagnosis. Those clues are clinical observations. The clues must be put together, like a jigsaw puzzle, to determine the most likely diagnosis (i.e., the medical condition) that afflicts the patient. This in turn determines the plan (interventions) to make the patient better or keep them from getting sick.

This process gives rise to the clinical data lifecycle that, for many decades, has been routinely documented in patient records. It goes by the mnemonic SOAP. Here are the SOAP components:

  • Subjective observations - what are the observations that the patient reports?
  • Objective observations - what are the observations that the clinician observes (which may include the use of tools, such as a BP cuff, ophthalmoscope, laboratory diagnostic device, or imaging device)?
  • Assessment - what medical condition is most likely associated with the observations? What are important attributes of the medical condition, such as severity, measured at the time the assessment is made?
  • Plan - how should the medical condition be treated? This usually involves some interventions (drug administration, surgery, device implantation, etc.).

If one applies the working definition of a clinical observation to the CDISC general observation classes, only the Findings class fits as a true clinical observation that would typically be documented in the "O" section of a SOAP note in a patient's record.

Let's now look at Interventions. The word Intervention comes from the verb to intervene, which is defined by the Merriam-Webster Dictionary as "to become involved in something ... in order to have an influence on what happens." As it is commonly used in health care, an intervention is some activity that is intended to change, alter, or otherwise affect a Medical Condition. Usually the intent is to treat, but other purposes of Interventions can be to prevent, cure, diagnose, or mitigate. Examples of Interventions include a drug administration, surgery, or device implantation.

It turns out that observations and interventions are very similar from a process perspective. Both have a performer, both are associated with some process for carrying out the activity, both may involve one or more devices, and both may involve collecting and analyzing a biospecimen. Observation and Intervention records therefore need to link to information about these other classes as needed. In fact, an observation is a type of intervention, because the observation wouldn't occur unless the observer took some intervening action. From a modeling perspective, it makes sense to treat observations as interventions. The two are distinguished by the purpose or intent of the intervention: affecting or identifying a disease vs. observing the state of an individual.
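To make the idea concrete, here is a minimal sketch in Python of a single record type that covers both observations and other interventions, with the purpose as the distinguishing attribute. The class and field names (Purpose, Intervention, performed_on, and so on) and the sample values are hypothetical illustrations, not part of the SDTM or any CDISC standard.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Purpose(Enum):
    """Hypothetical purpose codes for an intervention."""
    OBSERVE = "observe"  # measure the state of an individual
    AFFECT = "affect"    # act on a medical condition (treat, prevent, ...)


@dataclass
class Intervention:
    """One record shape for observations and other interventions alike."""
    performed_on: date
    purpose: Purpose
    performer: str
    procedure: str
    device: Optional[str] = None
    biospecimen: Optional[str] = None
    result: Optional[str] = None  # expected only when purpose is OBSERVE

    @property
    def is_observation(self) -> bool:
        return self.purpose is Purpose.OBSERVE


# Illustrative records only; the values are made up.
bp = Intervention(
    performed_on=date(2015, 3, 2),
    purpose=Purpose.OBSERVE,
    performer="study nurse",
    procedure="blood pressure measurement",
    device="BP cuff",
    result="128/82 mmHg",
)
dose = Intervention(
    performed_on=date(2015, 3, 2),
    purpose=Purpose.AFFECT,
    performer="study nurse",
    procedure="drug administration",
)
```

Both records share the performer, procedure, device, and biospecimen attributes; only the purpose, and the presence of a result, sets the observation apart.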

So what about CDISC Events? Except for certain administrative events in the SDTM, events are in fact Medical Conditions. Think of Medical History (MH) data: Hypertension, Diabetes Mellitus, Hypothyroidism. All are medical conditions. Think of the events that go in the Clinical Events (CE) domain or the Adverse Events (AE) domain: all are Medical Conditions, or at least they should be. (Sometimes in practice, an observation about an event is confused with the event itself. More on this later.)

Medical Conditions all share common attributes. They persist in time, i.e. they have a start date and an end date (which is null if the condition is ongoing or its status is unknown). For practical reasons, the start date would be the date of the first clinical observation associated with the condition, although in reality the pathophysiology is usually well underway by the time the first symptom or sign appears. A condition also has a diagnosis date, which corresponds to the date of completion of the assessment that first identified the medical condition. This can be much later than the start date.
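As a rough sketch of these shared attributes, assuming a hypothetical MedicalCondition record (the names and dates below are illustrative only), the three dates could be represented like this in Python:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class MedicalCondition:
    """Hypothetical record holding the attributes every condition shares."""
    name: str
    start_date: date                       # date of first associated observation
    diagnosis_date: Optional[date] = None  # date the identifying assessment was completed
    end_date: Optional[date] = None        # None: ongoing or status unknown

    def has_ended(self) -> bool:
        return self.end_date is not None

    def days_to_diagnosis(self) -> Optional[int]:
        """Days between the first observation and the diagnosis, if diagnosed."""
        if self.diagnosis_date is None:
            return None
        return (self.diagnosis_date - self.start_date).days


# Made-up dates, purely for illustration.
condition = MedicalCondition(
    name="Hypertension",
    start_date=date(2014, 1, 1),
    diagnosis_date=date(2014, 1, 31),
)
```

A null end date deliberately does not distinguish "ongoing" from "unknown"; a real model would probably need a separate status attribute for that.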

Let's look at adverse events briefly. An AE is a medical condition that is temporally associated with an Intervention. It is identified after an assessment of one or more clinical observations, in which the assessor concludes that a medical condition is present. It is not an observation! In practice, the assessment is often not documented, so many people overlook it. When the onset of an AE is critical information, e.g. renal graft failure following transplant, a formal adjudication process may exist to ensure that the right clinical observations were collected and the correct diagnostic criteria were applied during the assessment to conclude that renal graft failure has occurred.

In other cases, an observation about an AE is mistaken for the AE itself. Take the example of "hyponatremia." This is a clinical disorder characterized by low serum sodium, normal serum osmolality, and possibly a constellation of other clinical observations, such as changes in mental status or seizures. Does a low serum sodium (the observation) mean the patient has hyponatremia (the medical condition)? Maybe, maybe not. It requires an assessment by a qualified assessor to determine that association. Maybe the serum osmolality was abnormal; maybe it's lab error. One doesn't always need details of the assessment, but sometimes they are important.

Our high-level concept map looks like this:

Now we're starting to see a picture of what new and improved SDTM domains look like. These were discussed in a previous post.
1. Medical Condition domain: can contain everything you may want to know about the medical conditions that afflict the subject, e.g. start date, link to the observation record considered the first sign/symptom of the condition, date of diagnosis (with a link to the assessment record that led to the diagnosis), severity, toxicity grade, seriousness, end date, etc. Because medical conditions commonly fluctuate in severity or toxicity over time, a severity or toxicity rating is typically the severity or toxicity measured at the time the assessment is made. It is a finding about the medical condition.
2. Assessment domain: can contain everything you may want to know about an assessment: date of the assessment, assessor (link to qualifications), observations used for the assessment, link to the diagnostic criteria used for the assessment, link to the rating scale used, and link to the outcome of the assessment, i.e. the medical condition record.
3. New and improved Interventions domain: can contain everything you may want to know about the intervention: date performed, purpose of the intervention (observe; affect), performer (link to qualifications), device used (link to device info), procedure/process done (link to procedure info), biospecimen collected and/or analyzed (link to biospecimen information), and observation result (if an observation).
4. Planned Interventions domain: every medical condition is associated with a Plan (i.e. a care plan). One can consider a Planned Interventions domain where each record is linked to the Medical Condition for which the planned intervention is intended. There should be an ability to link from the planned intervention to the actual intervention record(s).
There are lots of links across these new domains. These would be facilitated by having unique resource identifiers for each record in each new domain.
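The linking idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Record, RecordStore, the "urn:uuid:" identifier scheme, and the link role names ("observation", "outcome", "diagnosed_by") are all hypothetical, and each role points to a single record for simplicity, whereas a real assessment could draw on many observations.

```python
from dataclasses import dataclass, field
from typing import Dict
from uuid import uuid4


@dataclass
class Record:
    """A generic domain record with a unique identifier and named links."""
    domain: str
    attributes: Dict[str, str]
    links: Dict[str, str] = field(default_factory=dict)  # role -> target record id
    id: str = field(default_factory=lambda: f"urn:uuid:{uuid4()}")


class RecordStore:
    """Resolves links between records by their unique identifiers."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def add(self, record: Record) -> Record:
        self._records[record.id] = record
        return record

    def follow(self, record: Record, role: str) -> Record:
        """Return the record that `record` links to under `role`."""
        return self._records[record.links[role]]


store = RecordStore()

# An observation is stored as an intervention whose purpose is "observe".
observation = store.add(Record(
    "Interventions",
    {"purpose": "observe", "test": "serum sodium", "result": "low"},
))
condition = store.add(Record("Medical Condition", {"name": "hyponatremia"}))
assessment = store.add(Record(
    "Assessment",
    {"assessor": "investigator"},
    links={"observation": observation.id, "outcome": condition.id},
))
# Link the condition back to the assessment that diagnosed it.
condition.links["diagnosed_by"] = assessment.id
```

With identifiers on every record, one can walk from a medical condition to the assessment that established it, and from there to the observation it relied on, without adding new variables to any domain.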

If enough attributes are present for each suggested new domain, I hypothesize that new clinical data requirements can be more easily met using this model. As a next step, I would like to create some domains based on dummy data and explore what adding new data requirements might look like.

Thank you for reading and for your comments.


          1. Very insightful post. It is well noted that if clinical research wants to effectively reuse health information, it makes sense to align data and terminologies with healthcare rather than living in a separate silo. I also think SOAP and the conceptual model aligns well with HL7 FHIR, notably the resources for Condition, ClinicalImpression, Observation, Medication & Procedures, and even CarePlan. It's time for clinical research to think more like clinicians.

            1. Thank you, Wayne. I agree with your comments, particularly around terminologies and existing FHIR resources.
