Modeling Clinical Data

There is a great deal of interest in how to model clinical data for numerous use cases, and I share that interest. I should say up front that I am not a modeler. I have worked with modelers for many years, so I understand a little about how modeling works, and I'm still learning.

The purpose of this post is to review how clinical data are generated and used, to help inform best clinical modeling practices and facilitate the management and use of clinical data across multiple use cases, such as patient care, clinical research, and public health.

Any medical student will agree: a major component of the standard medical school curriculum is devoted to understanding, organizing, documenting, and using clinical data. Clinical data are grouped by dates/patient encounters. The well-known mnemonic every medical student learns to document those encounters is SOAP:
  1. Subjective Observations
  2. Objective Observations
  3. Assessment
  4. Plan (and its execution)
These four steps are then repeated for the next encounter in an almost endless cycle. One can imagine the stopping rules, but I won't go over them here. This cycle describes what I call the “clinical data lifecycle.” The healthcare provider first collects subjective and objective observations on the patient; the provider then analyzes, interprets, or assesses the observations. This assessment identifies one or more medical conditions and their important attributes (e.g. severity, change from last assessment). The provider then develops a plan to address the medical conditions, which is then executed. The cycle is then repeated for the next encounter to determine the effect of the plan on the medical conditions, and to identify any new conditions if necessary.

The Clinical Data Lifecycle
From an information management perspective, one can state the following.


  • Observations are collected and recorded, ideally without interpretation or bias
  • The target of the observation is the patient or a part of the patient (e.g. biospecimen)
  • The observer can be a person (investigator, patient, caregiver) or a device (EKG, MRI)
  • Sometimes the observer uses a device to perform an observation (e.g. blood pressure)
  • Sometimes information about the observer is important (expertise, reliability, etc.)
  • Sometimes information about the device is important (e.g. accuracy, precision, blood pressure cuff size)
  • The observer always follows a procedure, process, or “protocol” associated with the observation
  • Sometimes the procedure is not important to document (e.g. visually examine the skin)
  • Sometimes information about the procedure is important (e.g. rules for measuring tumor size on a medical image)
  • A more formal “protocol” to conduct the observation is sometimes developed to decrease variability and minimize bias
  • Sometimes a biospecimen is collected and/or processed
  • Sometimes information about the biospecimen is important (e.g. hemolyzed blood sample)
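
The observation bullets above could be sketched as a simple record type. This is only a minimal illustration in Python, not a proposed standard: the class names, field names, and the example blood pressure values are all assumptions made for the sketch. The point it tries to show is that the observer is always present, while details about the device, procedure, and biospecimen are optional and recorded only when they matter.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Observer:
    """Who or what made the observation (hypothetical structure)."""
    kind: str                              # "person" or "device"
    name: str
    qualifications: Optional[str] = None   # only when expertise/reliability matters


@dataclass(frozen=True)
class Observation:
    """A recorded observation, ideally free of interpretation."""
    target: str                            # the patient, or a part of the patient
    name: str
    value: object
    observed_at: datetime
    observer: Observer
    unit: Optional[str] = None
    device: Optional[str] = None           # e.g. blood pressure cuff size
    procedure: Optional[str] = None        # e.g. rules for measuring tumor size
    specimen_note: Optional[str] = None    # e.g. "hemolyzed blood sample"


# Illustrative values only; none of these come from real data.
nurse = Observer(kind="person", name="caregiver-17")
bp = Observation(
    target="patient-001",
    name="systolic blood pressure",
    value=120,
    unit="mmHg",
    observed_at=datetime(2024, 1, 1, 9, 0),
    observer=nurse,
    device="adult cuff",
)
```

Note that the record holds no interpretation of the value, such as "normal" or "elevated". That judgment belongs to the assessment step.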


  • Observations are analyzed/interpreted/assessed by qualified entities
  • Often the assessor is a physician, but not always
  • Sometimes information about the entity is important (qualifications, etc.)
  • Entity could be a device (EKG machines can perform automated interpretations)
  • The results of the assessments are generally medical conditions and their properties (severity, change from previous assessment, etc.)
  • The assessment may consist of a formal adjudication process
  • Currently, we sometimes confuse observations and assessments
  • Adverse events (AEs) are not observations; rather, they are medical conditions identified following an assessment of observations. A temporal association with an intervention is a necessary component of an AE
  • An abnormal laboratory finding is not an adverse event; an assessment is necessary to make that determination; often it involves looking at other related observations
  • As an example, the observations might be: low serum sodium; previous exposure to drug X; serum osmolality, urine specific gravity, urine sodium.
  • The assessment would be a new Medical Condition (an AE) – Syndrome of Inappropriate Anti-diuretic Hormone Secretion (SIADH), possibly due to drug X
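
The SIADH example above can be sketched in Python to show the separation between observations and the conditions an assessment produces. All class and field names here are hypothetical, and the sketch deliberately contains no diagnostic logic: the assessor makes the determination, and the model only records it. The one rule it enforces is the one stated above, that an AE requires an associated intervention.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Observation:
    name: str
    finding: Optional[str] = None      # None: no finding stated in the example


@dataclass(frozen=True)
class MedicalCondition:
    name: str
    is_adverse_event: bool = False
    associated_intervention: Optional[str] = None

    def __post_init__(self):
        # An association with an intervention is necessary (not sufficient)
        # for a condition to be recorded as an adverse event.
        if self.is_adverse_event and self.associated_intervention is None:
            raise ValueError("an adverse event requires an associated intervention")


@dataclass(frozen=True)
class Assessment:
    assessor: str                        # often a physician, but not always
    observations: List[Observation]      # the evidence that was reviewed
    conditions: List[MedicalCondition]   # the result of the assessment


observations = [
    Observation("serum sodium", "low"),
    Observation("previous exposure to drug X", "yes"),
    Observation("serum osmolality"),
    Observation("urine specific gravity"),
    Observation("urine sodium"),
]

assessment = Assessment(
    assessor="physician",
    observations=observations,
    conditions=[
        MedicalCondition(
            name="SIADH",
            is_adverse_event=True,
            associated_intervention="drug X",
        )
    ],
)
```

In this sketch the low serum sodium stays an observation. Only the assessment's output, the SIADH condition, carries the adverse event flag.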


  • In healthcare, the plan is the patient care plan
  • In investigational studies, this is the protocol
  • The plan is intended to affect in some way the medical conditions identified from the assessment
  • Often it involves the administration of a medical product
  • Often the intent is to treat, but it can also be to prevent, diagnose, mitigate, cure, or even induce medical conditions

The cycle now repeats. Additional observations are then collected to assess:

  • How well the plan was executed and
  • The effect of the executed plan on the Medical Conditions
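
The plan and the repeating cycle could be sketched along the following lines. Again, this is a rough Python illustration under assumptions: the names `Intent`, `Plan`, `Encounter`, and `follow_up_targets` are invented for this sketch, and "discontinue drug X" is a hypothetical intervention used only to give the example something to follow up on.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Intent(Enum):
    TREAT = "treat"
    PREVENT = "prevent"
    DIAGNOSE = "diagnose"
    MITIGATE = "mitigate"
    CURE = "cure"
    INDUCE = "induce"


@dataclass
class Plan:
    setting: str                   # "patient care plan" or "study protocol"
    intent: Intent
    target_conditions: List[str]   # conditions identified by the assessment
    interventions: List[str] = field(default_factory=list)


@dataclass
class Encounter:
    observations: List[str]
    conditions: List[str]
    plan: Optional[Plan] = None


def follow_up_targets(previous: Encounter) -> List[str]:
    """List what the next encounter's observations need to address."""
    if previous.plan is None:
        return []
    # First: how well was the plan executed?
    targets = [f"execution of: {i}" for i in previous.plan.interventions]
    # Second: what effect did it have on the medical conditions?
    targets += [f"effect on: {c}" for c in previous.plan.target_conditions]
    return targets


first = Encounter(
    observations=["low serum sodium"],
    conditions=["SIADH"],
    plan=Plan(
        setting="patient care plan",
        intent=Intent.TREAT,
        target_conditions=["SIADH"],
        interventions=["discontinue drug X"],   # hypothetical intervention
    ),
)

print(follow_up_targets(first))
```

The design choice worth noting is that the plan points at conditions, not at raw observations, which mirrors the idea that assessment results, rather than observations, drive interventions.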

The following mind map captures these concepts and relationships. I think they hold true for all patient care settings, including clinical trials.  I suggest it form a core “backbone” of any information model involving clinical data about a patient. In my experience, this captures how clinical data are generated and used in practice.

Clinical Data Concept Map

I recently reviewed the BRIDG 4.0 model, which was balloted in HL7 this past May. I plan to discuss BRIDG in more detail in future posts, but suffice it to say that I identified deficiencies in how clinical data are modeled in BRIDG. For example, the results of assessments are modeled as additional observations, even though in clinical medicine the two are quite distinct. Assessment results lead to interventions in a care plan; observations do not. There is an ongoing ballot reconciliation process to address these and other concerns identified during the ballot. I encourage those interested to participate in those calls. We particularly need individuals with clinical subject matter expertise. Details of the teleconferences are available on the HL7 website.
