Distinguishing Clinical Observations and Medical Conditions

As the pharmaceutical industry moves quickly towards standardized clinical study data, I continue to see confusion in how clinical observations and medical conditions are represented. In clinical medicine there is a sharp distinction between these two concepts, yet in standardized study data submissions they are often lumped together.

A clinical observation is just that: someone observes a clinical symptom or sign of a patient. The observer may be the patient (in which case it's called a symptom) or someone else (in which case it's called a sign). Often a medical device is involved in the observation (blood pressure cuff, thermometer, x-ray, MRI, etc.). Often the observer follows a procedure to provide some level of consistency in making the observation. A headache, a rash, a fever, an elevated blood pressure, a low serum sodium, and a round hyperintense lesion on an MRI are all examples of clinical observations.

Then we have medical conditions, which are diseases or injuries or other conditions that interfere with well-being. The Free Medical Dictionary defines a medical condition as: "A disease, illness or injury; any physiologic, mental or psychological condition or disorder (e.g., orthopaedic; visual, speech or hearing impairments; cerebral palsy; epilepsy; muscular dystrophy; multiple sclerosis; cancer; coronary artery disease; diabetes; mental retardation; emotional or mental illness; specific learning disabilities; HIV disease; TB; drug addiction; alcoholism). A biological or psychological state which is within the range of normal human variation is not a medical condition."

This is a reasonable definition, although I disagree with the last sentence: certain temporary but normal physiologic states can be considered medical conditions. For example, I think pregnancy is a medical condition because, although it is within the range of normal human variation, it is a state that benefits from medical intervention (prenatal care, obstetrical care) to minimize complications to the mother and child.

Herein lies the major distinction between clinical observations and medical conditions. Clinical observations serve as input to an assessment (often conducted by a health care professional) to determine that a medical condition exists. Medical conditions are listed on a patient's problem list and are the focus of medical interventions. Clinical observations are not. 

Take for example an elevated blood pressure (clinical observation). Does this mean the patient has hypertension (medical condition)? Not necessarily. If the patient is obese, perhaps the wrong device was used. A normal-sized cuff on an obese patient may give falsely elevated readings. Repeating the blood pressure measurement using a larger cuff may in fact reveal a normal blood pressure. It is also well known that anxiety about visiting the doctor may temporarily raise the blood pressure, so repeated measurements over time in a more relaxed setting could demonstrate normal serial blood pressures. Only an assessment that weighs these possibilities can conclude that the observation reflects hypertension.

Adverse Events are medical conditions temporally associated with a medical intervention. Here again, an assessment is necessary to determine that one or more observations indicate the presence of an adverse event. 

Study data standards continue to mix the two. In an earlier post, I described how BRIDG has the same problem. As another example, I was recently reviewing the Therapeutic Area User's Guide (TAUG) for Multiple Sclerosis. It advocates placing past clinical observations related to the MS diagnosis in the medical history (MH) domain. However, the past medical history portion of a typical history and physical evaluation is reserved for past medical conditions, not for past clinical observations. I think they should be separate. Unfortunately, there is no good place in the SDTM to put past clinical observations.

Why keep them separate? Medical conditions have metadata that differ from those of clinical observations. For example, one may want to know which observations were used to identify the medical condition, who did the assessment, the date the medical condition began (i.e., the date of the earliest observation associated with the medical condition), and the date of diagnosis (when the medical condition was first identified, which is often later than the onset date). Was it associated with a medical intervention, and if so, which ones? Is the medical condition considered adverse? From such a "medical condition" domain, one could derive AE, CE and MH.
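To make the idea concrete, here is a minimal sketch of what a single "medical condition" record might look like, with AE and MH derived from it. The class name, field names, and both derivation rules are my own assumptions for illustration; they are not SDTM definitions, and I have left CE out because the rule for it is not spelled out above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class MedicalCondition:
    """One assessed medical condition (hypothetical structure, not an SDTM domain)."""
    name: str
    assessor: str                      # who did the assessment
    onset_date: date                   # date of the earliest associated observation
    diagnosis_date: date               # when the condition was first identified
    observation_ids: List[str] = field(default_factory=list)   # supporting observations
    interventions: List[str] = field(default_factory=list)     # associated interventions
    adverse: bool = False              # is the condition considered adverse?


def derive_ae(conditions):
    """Assumed rule: adverse conditions associated with at least one intervention."""
    return [c for c in conditions if c.adverse and c.interventions]


def derive_mh(conditions, study_start):
    """Assumed rule: conditions whose onset precedes the study start date."""
    return [c for c in conditions if c.onset_date < study_start]


# Illustrative data only.
conditions = [
    MedicalCondition("hypertension", "investigator", date(2010, 3, 1), date(2010, 6, 15),
                     observation_ids=["OBS-1", "OBS-2"]),
    MedicalCondition("rash", "investigator", date(2014, 2, 10), date(2014, 2, 12),
                     observation_ids=["OBS-9"], interventions=["study drug"], adverse=True),
]
ae_names = [c.name for c in derive_ae(conditions)]
mh_names = [c.name for c in derive_mh(conditions, study_start=date(2014, 1, 1))]
print(ae_names, mh_names)
```

The point of the sketch is only that AE and MH become views over one set of assessed conditions rather than separately maintained tables.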

I also think we need a single unambiguous way to represent all clinical observations, past and present. This is still lacking in clinical research, which results in multiple domains, and in new domains as data for more and more therapeutic areas are standardized. Key metadata for clinical observations are the date the observation was made, who made it, whether it was planned or unplanned, what procedure (if any) was used, what device (if any) was used, relevant metadata about the procedure/method/anatomic location, relevant metadata about the device, and any associated observations (e.g. diastolic BP and systolic BP). We do not need an imaging domain, as others have suggested. Imaging is simply a method to obtain an observation.
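As a rough sketch of that single representation, the record below carries the metadata listed above. The class and field names are hypothetical and the values are illustrative; the thing to notice is that a blood pressure reading and an MRI finding fit the same structure, with imaging appearing only as the procedure.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ClinicalObservation:
    """One clinical observation, past or present (hypothetical structure)."""
    obs_id: str
    test: str                          # what was observed (ideally a controlled term)
    value: str
    observed_on: date                  # date the observation was made
    observer: str                      # who made it
    planned: bool                      # planned or unplanned
    procedure: Optional[str] = None    # procedure or method used, if any
    device: Optional[str] = None       # device used, if any
    anatomic_location: Optional[str] = None
    related_ids: List[str] = field(default_factory=list)  # associated observations


# Illustrative data only.
systolic = ClinicalObservation(
    "OBS-1", "systolic blood pressure", "150 mmHg", date(2014, 5, 1),
    observer="nurse", planned=True, device="large blood pressure cuff",
    anatomic_location="left arm", related_ids=["OBS-2"],
)
lesion = ClinicalObservation(
    "OBS-3", "round hyperintense lesion", "present", date(2014, 5, 8),
    observer="radiologist", planned=False, procedure="MRI",
    device="MRI scanner", anatomic_location="brain",
)
print(systolic)
print(lesion)
```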


Study Data Exchange: The Unsustainable Path

The current state of study data exchange, based on a tabular representation of study data, is in trouble. The CFAST experience is demonstrating that the requirements for new therapeutic areas (TAs) result in an ever-increasing need for new variables and domains. The burden on industry and FDA to adapt to these changes is too great. Rapid changes in implementation guides (IGs) and the data model itself are quickly exceeding organizations' ability to keep up. As new IG versions emerge, there will be a need to update the TA user guides, not to mention validation rules, databases, etc. The resources needed to do that are formidable.

I believe the solution is to adopt a more robust, relational data model for study data exchange that is capable of incorporating new requirements easily, often by simply adding new terms to standard terminologies, rather than adding new variables and domains. The time to do this is now because the current approach is not sustainable. For example, the latest version supported by the most recently published FDA validation rules is SDTM v1.3/SDTM IG 3.1.3, yet SDTM v1.4 is already published and work is well underway on SDTM v1.5. When I was at FDA, I was the chair of the Change Control Board (CCB) for the validation rules, and I can say that it is extremely challenging to keep up with this rate of change, even if the Agency moves towards a paradigm of updating and maintaining just the business rules, as is currently planned.

Let me further illustrate the challenges of sustaining the current path with a simple analogy. Imagine that we want to exchange a person's contact information, similar to what is stored in an electronic address book. What should the exchange standard look like for these data? Here's a simple data model for contact information:

Assume that we add controlled terminology for certain concepts such as State, Zip Code, Area Code, and we have a perfectly reasonable and functional model. However, how do we handle a new requirement, such as exchanging both home and business addresses, phone numbers and email addresses? The current model doesn't support this requirement, so we update the model. We introduce new variables for home and business addresses, phone numbers, and emails. We get something like this:

Problem solved. Requirement met. We can group these data elements into logical groupings or "domains" and suddenly we start seeing the similarity to the SDTM:

However, validation rules, databases, and tools developed under the first model all need to be updated to accommodate the second model. This takes time and money.
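A small sketch of why this hurts, assuming the flat model is one column per data element. The column names and the validate function are hypothetical stand-ins for a real model and real validation rules.

```python
# Version 1 of the hypothetical flat model: one column per data element.
V1_COLUMNS = ["name", "street", "city", "state", "zip_code", "phone", "email"]

# Version 2 meets the new requirement by adding home and work variants as columns.
V2_COLUMNS = [
    "name",
    "home_street", "home_city", "home_state", "home_zip_code",
    "home_phone", "home_email",
    "work_street", "work_city", "work_state", "work_zip_code",
    "work_phone", "work_email",
]


def validate(record, allowed_columns):
    """Return the columns in `record` that the given model version does not know."""
    return sorted(set(record) - set(allowed_columns))


# Illustrative record written against version 2.
record_v2 = {"name": "Pat Smith", "home_phone": "555-0100", "work_phone": "555-0199"}

unknown_to_v1 = validate(record_v2, V1_COLUMNS)
unknown_to_v2 = validate(record_v2, V2_COLUMNS)
print(unknown_to_v1)  # ['home_phone', 'work_phone']
print(unknown_to_v2)  # []
```

A validator built for version 1 rejects perfectly good version 2 data until someone updates it, and the same is true of every database and tool downstream.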

Yet new requirements continue to emerge. Now we want to exchange mobile phone information, so we update the model yet again to add new variables and a new MB domain:

And the cycle repeats for new TAs: update the IG, the validation rules, databases, tools.

There is a better approach: adopt a more robust relational data model as shown here:

Furthermore, we introduce controlled terminology for email type, address type, and phone type (e.g. home, work, mobile), and we can accommodate all the requirements described thus far without a change in the model, simply by adding new controlled terms to these concepts. More importantly, future requirements are also incorporated easily. Let's say many of the contacts are corporate executives with summer homes and we want to capture second home information. We add a new controlled term to the Address Type concept (i.e. second home) and we're done. No changes to IGs, validation rules, databases, etc. are necessarily needed.
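Here is a minimal sketch of the relational approach, assuming one contact with any number of typed addresses, phones, and emails. All class names, field names, and term lists are hypothetical; the validation function stands in for real validation rules.

```python
from dataclasses import dataclass, field
from typing import List

# Controlled terminology (illustrative term lists).
ADDRESS_TYPES = {"home", "work"}
PHONE_TYPES = {"home", "work", "mobile"}
EMAIL_TYPES = {"home", "work"}


@dataclass
class Address:
    type: str        # term from ADDRESS_TYPES
    street: str
    city: str
    state: str
    zip_code: str


@dataclass
class Phone:
    type: str        # term from PHONE_TYPES
    number: str


@dataclass
class Email:
    type: str        # term from EMAIL_TYPES
    address: str


@dataclass
class Contact:
    name: str
    addresses: List[Address] = field(default_factory=list)
    phones: List[Phone] = field(default_factory=list)
    emails: List[Email] = field(default_factory=list)


def invalid_terms(contact):
    """Return any type terms that are not in the controlled terminology."""
    bad = [a.type for a in contact.addresses if a.type not in ADDRESS_TYPES]
    bad += [p.type for p in contact.phones if p.type not in PHONE_TYPES]
    bad += [e.type for e in contact.emails if e.type not in EMAIL_TYPES]
    return bad


# Illustrative data only.
executive = Contact(
    "Pat Smith",
    addresses=[
        Address("home", "1 Main St", "Springfield", "IL", "62701"),
        Address("second home", "9 Shore Rd", "Chatham", "MA", "02633"),
    ],
    phones=[Phone("mobile", "555-0100")],
)

before = invalid_terms(executive)   # "second home" is not yet a known term
ADDRESS_TYPES.add("second home")    # the only change: one new controlled term
after = invalid_terms(executive)
print(before, after)
```

Neither the classes nor the validation function changed to support second homes; only the terminology did.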

Does the more robust relational data model solve all our problems? No. New requirements may yet emerge that necessitate changes to the underlying model (e.g. birth date, marital status), but the goal is to design the relational model to be as flexible as possible, so that such changes are needed as infrequently as possible and the implementation burden is minimized.

Getting back to study data exchange. I believe we are in dire need of a new, more robust data model that represents all clinical observations and assessments in a single standard representation so that new clinical data requirements for additional therapeutic areas can be incorporated easily by adding new terms to a dictionary and without having to change the underlying model. The current approach is not sustainable and I foresee the entire therapeutic area standardization effort collapsing under its own weight. I think it is already starting to happen.