Common Clinical Terms expressed in OWL

One of the goals in establishing precise Aristotelian definitions for common clinical terms is to make them computable, i.e. express them in such a way that computers and information systems can reason across data and "understand" that Thing123 is a Medical Condition and Thing456 is a Symptom and can begin to infer new medical knowledge for us. The Web Ontology Language, OWL, is ideally suited for this task. I'm not an OWL expert, but I think it would be useful to explore what some of these terms look like in OWL and consider the implications of computable definitions.

How does computer-assisted reasoning (also called inferencing) work? A simple example is the property :subClass (in full, rdfs:subClassOf). The colon here is merely to remind me that this is a resource expressed somewhere on the world-wide-web. I have left out the namespace for simplicity. If you state that :Apple is a :subClass of :Fruit and :McIntosh is a :subClass of :Apple, then a computer can infer that :McIntosh is a :subClass of :Fruit (new knowledge). This is trivial inferencing of course, but OWL supports much more sophisticated inferencing capabilities, which I have only begun to appreciate.

One very important class in OWL is the owl:Restriction class, a sub-class of owl:Class. A restriction class is one whose membership is restricted based on certain properties that the individual member has. I wrote about Restriction classes in a previous post. We can use this class to make certain definitions computable. Without restriction classes, we have to manually assign members to a class. For example, the :Dog class has no meaning to a computer. We have to manually assign Fido, Spot, Rocket, and Buddy to the :Dog class. ":Fido is a :Dog" is true only because someone said it's true. But restriction classes in effect create rules that say only Things with certain features/properties are :Dogs. Once we establish these computable definitions, computers can do the assignments for us.
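To make the Apple/Fruit inference above concrete, here is a minimal sketch in plain Python rather than a real OWL reasoner. The tuple representation of triples and the function name infer_superclasses are my own assumptions for illustration; an actual reasoner does far more than this.

```python
# Asserted facts as (subject, predicate, object) triples; namespaces omitted.
asserted = {
    ("McIntosh", "subClassOf", "Apple"),
    ("Apple", "subClassOf", "Fruit"),
}

def infer_superclasses(triples):
    """Return the triples plus every subClassOf triple implied by transitivity."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        for (a, p1, b) in list(closure):
            for (c, p2, d) in list(closure):
                # a subClassOf b and b subClassOf d implies a subClassOf d
                if p1 == p2 == "subClassOf" and b == c:
                    new = (a, "subClassOf", d)
                    if new not in closure:
                        closure.add(new)
                        changed = True
    return closure

# Keep only what was inferred, not what was stated.
inferred = infer_superclasses(asserted) - asserted
print(inferred)  # {('McIntosh', 'subClassOf', 'Fruit')}
```

Nobody stated that McIntosh is a kind of Fruit; the loop derives it from the two facts that were stated.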

Let's look at an example from the clinical research domain: Persons that participate in a trial. Consider this taxonomy (i.e. superclass/subclass hierarchy). As one reads this, a member of a lower class is automatically a member of the next higher class, and so forth all the way to the top class. This makes the rdfs:subClassOf property a "transitive" property.

    -- Person

And now some working definitions (taken from various sources and documented here)

Any individual living (or previously living) Entity.
(i.e. an :Entity that is living or was previously living) 

A human being. A BiologicEntity that has species = homo sapiens. 

A :Person that undergoes/is subjected to Study-specified activities as described in the Study Protocol

And then it can follow that: 

A HumanStudySubject who satisfies all Study-specific Eligibility Criteria.

Now assume that every BiologicEntity has a property called :species and only Persons have a :species property value = homo sapiens. I can express the definition of a Person as the following in OWL

:Person  a      owl:Class ;
        owl:equivalentClass  [ a                  owl:Restriction ;
                               owl:hasValue       species:homo_sapiens ;
                               owl:onProperty     :species ] .
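As a rough illustration of what a reasoner does with this hasValue restriction, here is a small Python sketch. The individuals (Person1, Fido), the second species value, and the function name members_by_has_value are made up for the example; only the :species property and the homo_sapiens value come from the definition above.

```python
# Instance data as (subject, predicate, object) triples. Individuals are hypothetical.
data = {
    ("Person1", "species", "homo_sapiens"),
    ("Fido", "species", "canis_familiaris"),
}

def members_by_has_value(triples, on_property, has_value):
    """Individuals that satisfy a hasValue restriction on the given property."""
    return {s for (s, p, o) in triples if p == on_property and o == has_value}

# Nobody asserted "Person1 is a Person"; membership follows from the property value.
persons = members_by_has_value(data, "species", "homo_sapiens")
print(persons)  # {'Person1'}
```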

Now let's define a class called :StudyActivity, containing any protocol-specified activity belonging to a specific human study. We now define the following

study:HumanStudySubject  a      owl:Class ;
        owl:equivalentClass  [ a                   owl:Restriction ;
                               owl:someValuesFrom  study:StudyActivity ;
                               owl:onProperty      study:participatesIn ] .

This says a HumanStudySubject is any Thing that participates in at least one StudyActivity.
So wherever there is an RDF triple saying that an individual (say, :Person1) :participatesIn :StudyActivity_104, and :StudyActivity_104 is a member of the :StudyActivity class, that individual is automatically inferred to be a :HumanStudySubject.
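The someValuesFrom case differs from hasValue in that the reasoner also has to look at the type of the thing being pointed to. A minimal Python sketch of that check follows; Person2, BookClub_7, the "type" predicate name, and the function name are my own assumptions for illustration.

```python
# Instance data as (subject, predicate, object) triples.
data = {
    ("StudyActivity_104", "type", "StudyActivity"),
    ("Person1", "participatesIn", "StudyActivity_104"),
    ("Person2", "participatesIn", "BookClub_7"),  # not typed as a StudyActivity
}

def members_by_some_values_from(triples, on_property, filler_class):
    """Individuals with at least one on_property value typed as filler_class."""
    fillers = {s for (s, p, o) in triples if p == "type" and o == filler_class}
    return {s for (s, p, o) in triples if p == on_property and o in fillers}

subjects = members_by_some_values_from(data, "participatesIn", "StudyActivity")
print(subjects)  # {'Person1'}
```

Note that Person2 is simply not inferred to be a subject. Under OWL's open-world assumption that is different from concluding that Person2 is not one.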

This approach is not without some notable pitfalls. One's logic must be squeaky clean. Take this example: A Dog has Four Legs. Now if we mistakenly convert that to mean a Dog is any Thing with Four Legs (which clearly is wrong in English), you get the following OWL expression:

:Dog  a      owl:Class ;
        owl:equivalentClass  [ a                  owl:Restriction ;
                               owl:hasValue       "4"^^xsd:int ;
                               owl:onProperty     :hasLegs ] .

So somewhere in a database is the triple:  :Morris :hasLegs "4"^^xsd:int .

Well, guess what: Morris is a cat. But based on the OWL definition of a Dog, an information system will conclude that Morris is a Dog. So one must be careful which properties one selects to define membership in a restriction class.
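The root of the Morris problem is the difference between a necessary condition and a sufficient one. Here is a small Python sketch of the two readings; the function names are hypothetical, and the second function reflects my understanding that stating the restriction with rdfs:subClassOf instead of owl:equivalentClass makes four legs necessary for Dogs without making it sufficient.

```python
# Instance data as (subject, predicate, object) triples.
data = {
    ("Fido", "type", "Dog"),      # asserted by hand
    ("Fido", "hasLegs", 4),
    ("Morris", "type", "Cat"),
    ("Morris", "hasLegs", 4),
}

def dogs_under_equivalence(triples):
    """equivalentClass reading: anything with four legs IS a Dog."""
    return {s for (s, p, o) in triples if p == "hasLegs" and o == 4}

def dogs_under_subclass(triples):
    """subClassOf reading: every Dog has four legs, but four legs alone
    puts nobody into the class; only asserted Dogs are Dogs."""
    return {s for (s, p, o) in triples if p == "type" and o == "Dog"}

print(sorted(dogs_under_equivalence(data)))  # ['Fido', 'Morris']
print(sorted(dogs_under_subclass(data)))     # ['Fido']
```

With the equivalence, Morris the cat is swept into the Dog class. With the weaker subclass statement he is not, at the price that the computer no longer assigns Dogs for us.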

The more I think about this approach, the more it makes sense. How do we currently distinguish two Things with the same name, e.g. Mustang (the car vs. the horse)? Easy: by the properties that each Thing has. One is a biological entity, the other is a machine; one has four legs and a tail, the other has an engine and four wheels. It makes sense to define members of a class by describing the properties that each member must have. This principle of making definitions of clinical terms computable is a key component of less ambiguous clinical trial data. Using this same approach, we can enable computers to identify members of other useful and interesting restriction classes, such as :EligibleSubject, :EffectiveDrug, :DangerousDrug, and :PoorQualityDrug.

The possibilities can be very exciting.


Activity Rules in Clinical Trials

I remain very interested in modeling Activities in clinical trials. Activities are performed according to Rules that are specified in the Protocol. I am looking for the best way to model these rules. Here is one thought: all Activities have a Start Rule. Rules are Analyses in our mini-study ontology, since one analyzes the data from other Activities ("Prerequisite Activities") to determine if and when the target Activity can take place.

Let's start with the easiest Activity (from a modeling perspective): Obtaining Informed Consent. It has a Start Rule that says "begin at any time." The Rule is automatically satisfied by default so has a RuleOutcome always set to TRUE.   There are no preconditions other than a Subject's willingness to participate in this Activity. The Activity completes with an ActivityOutcome = InformedConsent_GRANTED.  The ActivityStatus is now Complete. The RDF looks something like this:

:Person1 :participatesIn :InformedConsent1 .
:InformedConsent1 :hasStartRule :Rule_DEFAULT .
:Rule_DEFAULT :activityOutcome "TRUE"^^xsd:string .
:InformedConsent1 :hasPerformedDate "2017-04-01"^^xsd:date ;   # the date the activity was performed
           :activityStatus :activitystatus_CO ;        # the Activity is completed
           :activityOutcome :informedconsent_GRANTED .

Let's assume it's a simple trial that has only three screening activities: DemographicDataCollection, FastingBloodSugar, and a serum PregnancyTest (if female).

We create a new Rule that says: RuleOutcome is TRUE if the prerequisite activity is complete and has a certain outcome. Generically, it looks like this:

:PrerequisiteOutcomeRule  rdfs:subClassOf :Rule ;
          :hasPrerequisite    :Activity ;
          :hasPrerequisiteStatus  :activitystatus_CO ;    # the prerequisite activity must be complete
          :hasPrerequisiteOutcome :ActivityOutcome .  # the prerequisite activity must have a certain outcome
(There are optional properties one can consider here, like :delay, which specifies how long one must wait after the prerequisite activity ends before the target activity can begin.)

The instance data looks like this:

:Person1 :participatesIn :DemographicDataCollection1 .
:DemographicDataCollection1 :hasStartRule  :PrerequisiteOutcomeRule1 .
:PrerequisiteOutcomeRule1 :hasPrerequisite :InformedConsent1;
          :hasPrerequisiteStatus :activitystatus_CO;
          :hasPrerequisiteOutcome :informedconsent_GRANTED .

Once this information is recorded, and the rule satisfied, the Activity becomes a PlannedActivity.

An embedded spin:rule can check the PrerequisiteActivity and return a value of TRUE if the conditions are met:

 :PrerequisiteOutcomeRule1 :activityOutcome "TRUE"^^xsd:string .
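I have not written the SPIN rule itself here. As a stand-in, this is a plain Python sketch of the check such a rule would need to perform against the instance data. The helper names value_of and rule_outcome are hypothetical, and the triples are simplified to bare strings.

```python
# Instance data mirroring the example, as (subject, predicate, object) triples.
triples = {
    ("InformedConsent1", "activityStatus", "activitystatus_CO"),
    ("InformedConsent1", "activityOutcome", "informedconsent_GRANTED"),
    ("PrerequisiteOutcomeRule1", "hasPrerequisite", "InformedConsent1"),
    ("PrerequisiteOutcomeRule1", "hasPrerequisiteStatus", "activitystatus_CO"),
    ("PrerequisiteOutcomeRule1", "hasPrerequisiteOutcome", "informedconsent_GRANTED"),
}

def value_of(triples, subject, predicate):
    """Return an object for subject/predicate, or None if there is none."""
    for (s, p, o) in triples:
        if s == subject and p == predicate:
            return o
    return None

def rule_outcome(triples, rule):
    """True when the prerequisite activity has the required status and outcome."""
    prereq = value_of(triples, rule, "hasPrerequisite")
    required_status = value_of(triples, rule, "hasPrerequisiteStatus")
    required_outcome = value_of(triples, rule, "hasPrerequisiteOutcome")
    if None in (prereq, required_status, required_outcome):
        return False  # an incompletely specified rule is never satisfied
    return (
        value_of(triples, prereq, "activityStatus") == required_status
        and value_of(triples, prereq, "activityOutcome") == required_outcome
    )

print(rule_outcome(triples, "PrerequisiteOutcomeRule1"))  # True
```

If informed consent had not been granted, or the consent activity were not yet complete, the same call would return False and the DemographicDataCollection could not start.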

Once the RuleOutcome is evaluated as TRUE, another spin:rule can set the scheduled date of the DemographicDataCollection equal to the date the InformedConsent was done (plus any delay specified in the Rule). The Activity now becomes a ScheduledActivity.
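The scheduling step is simple date arithmetic. A minimal Python sketch, assuming the optional delay is expressed in whole days (the parameter name delay_days and the function name are my own; the rule model only says a delay is a duration):

```python
from datetime import date, timedelta

def scheduled_date(prerequisite_performed, delay_days=0):
    """Scheduled date = date the prerequisite was performed plus any delay."""
    return prerequisite_performed + timedelta(days=delay_days)

consent_date = date(2017, 4, 1)  # the :hasPerformedDate of InformedConsent1
print(scheduled_date(consent_date))                # 2017-04-01
print(scheduled_date(consent_date, delay_days=7))  # 2017-04-08
```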

The same Rule can be applied to the FastingBloodSugar activity.

PregnancyTest requires a new rule. It can begin when the DemographicDataCollection Activity is Complete and the Outcome is :sex_Female.

Now the very cool part. These same rules can be used to check for Eligibility. Why? Because an eligibility criterion is nothing more than a Start Rule which defines when the RandomizationActivity can begin.

Similarly, we can define rules for when Visits, Elements, and Epochs can begin. The Study now becomes a graph of StudyActivities, all waiting to begin, but only those that have Rules with RuleOutcome=TRUE are ready to go next. I think this paradigm will hold for even the most complex adaptive designs. It is something worth testing.
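Here is a small Python sketch of the "graph of activities waiting to begin" idea, using the simple trial described earlier. The dictionary layout and the function name ready_activities are my own assumptions, and I have assumed that FastingBloodSugar has the same prerequisite as DemographicDataCollection.

```python
# Each activity maps to its start rule: None means "begin at any time",
# otherwise (prerequisite activity, required outcome of that prerequisite).
start_rules = {
    "InformedConsent": None,
    "DemographicDataCollection": ("InformedConsent", "informedconsent_GRANTED"),
    "FastingBloodSugar": ("InformedConsent", "informedconsent_GRANTED"),
    "PregnancyTest": ("DemographicDataCollection", "sex_Female"),
}

def ready_activities(start_rules, completed):
    """Activities not yet completed whose start rule evaluates to TRUE.

    `completed` maps each completed activity to its outcome.
    """
    ready = []
    for activity, rule in start_rules.items():
        if activity in completed:
            continue  # already done, nothing to schedule
        if rule is None or completed.get(rule[0]) == rule[1]:
            ready.append(activity)
    return sorted(ready)

print(ready_activities(start_rules, {}))
# ['InformedConsent']
print(ready_activities(start_rules, {"InformedConsent": "informedconsent_GRANTED"}))
# ['DemographicDataCollection', 'FastingBloodSugar']
```

As outcomes are recorded, calling the function again yields the next set of activities that are ready to go; a subject whose DemographicDataCollection outcome is not sex_Female never sees PregnancyTest become ready.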

Finally, an existential question. Is ObtainingInformedConsent a StudyActivity? If a HumanSubject (a Person of Interest) participatesIn the ObtainingInformedConsent Activity, but does NOT grant informed consent, is he/she considered to have participated in the Study? If so, what would their disposition be? Screen Failure doesn't seem right since no screening tests were conducted. Is ObtainingInformedConsent a ScreeningActivity? If so, then failure to obtain informed consent could be considered a screen failure.

Please share your thoughts, especially if you have other ideas on how to model activity rules.