2017-05-18

Common Clinical Terms expressed in OWL

One of the goals in establishing precise Aristotelian definitions for common clinical terms is to make them computable, i.e. express them in such a way that computers and information systems can reason across data and "understand" that Thing123 is a Medical Condition and Thing456 is a Symptom and can begin to infer new medical knowledge for us. The Web Ontology Language, OWL, is ideally suited for this task. I'm not an OWL expert, but I think it would be useful to explore what some of these terms look like in OWL and consider the implications of computable definitions.

How does computer-assisted reasoning (also called inferencing) work? A simple example is the subclass property, rdfs:subClassOf, which I will abbreviate as :subClass. The colon is merely a reminder that this is a resource identified somewhere on the World Wide Web; I have left out the namespace for simplicity. If you state that :Apple is a :subClass of :Fruit and :McIntosh is a :subClass of :Apple, then a computer can infer that :McIntosh is a :subClass of :Fruit (new knowledge). This is trivial inferencing of course, but OWL supports much more sophisticated inferencing capabilities, which I have only begun to appreciate.

One very important class in OWL is the owl:Restriction class, a subclass of owl:Class. A restriction class is one whose membership is restricted based on certain properties that each individual member has. I wrote about Restriction classes in a previous post. We can use this class to make certain definitions computable. Without restriction classes, we have to manually assign members to a class. For example, the :Dog class has no meaning to a computer. We have to manually assign Fido, Spot, Rocket, and Buddy to the :Dog class. ":Fido is a :Dog" is true only because someone said it's true. But restriction classes in effect create rules that say only Things with certain features/properties are :Dogs. Once we establish these computable definitions, computers can do the assignments for us.
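
The :Apple / :Fruit / :McIntosh subclass inference described above can be sketched in a few lines of Python. This is only a toy illustration of the idea, not how an actual OWL reasoner is implemented; the function name infer_subclasses is my own invention.

```python
# Toy illustration only: a real OWL reasoner does far more than this.
# Asserted subclass statements, written as (subclass, superclass) pairs.
asserted = {
    ("McIntosh", "Apple"),
    ("Apple", "Fruit"),
}


def infer_subclasses(pairs):
    """Return the transitive closure of a set of (subclass, superclass) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # If a is a subclass of b, and b is a subclass of d,
                # then a is also a subclass of d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Keep only the statements nobody asserted by hand.
inferred = infer_subclasses(asserted) - asserted
print(inferred)  # {('McIntosh', 'Fruit')}
```

The only statement in the result is the one that was never asserted by hand, which is the "new knowledge" in this small example.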

Let's look at an example from the clinical research domain: Persons that participate in a trial. Consider this taxonomy (i.e. superclass/subclass hierarchy). Reading from the bottom up, a member of a lower class is automatically a member of the next higher class, and so on all the way to the top class. This works because rdfs:subClassOf is a "transitive" property.

-- BiologicEntity
    -- Person
        -- HumanStudySubject
            -- EligibleSubject
            -- EnrolledSubject

And now some working definitions (taken from various sources and documented here)

BiologicEntity
Any individual living (or previously living) Entity.
(i.e. an :Entity that is living or was previously living) 

Person
A human being. A BiologicEntity that has species = Homo sapiens.

HumanStudySubject
A :Person that undergoes/is subjected to Study-specified activities as described in the Study Protocol.

And then it can follow that: 

EligibleSubject
A HumanStudySubject who satisfies all Study-specific Eligibility Criteria.

Now assume that every BiologicEntity has a property called :species and only Persons have a :species property value = Homo sapiens. I can express the definition of a Person in OWL as follows:

:Person  a      owl:Class ;
        owl:equivalentClass  [ a                  owl:Restriction ;
                               owl:hasValue       species:homo_sapiens ;
                               owl:onProperty     :species ] .
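
To make the effect of this definition concrete, here is a rough Python sketch of what a reasoner does with a hasValue restriction: it looks for any Thing whose :species property has the required value and puts it in the class. The individuals Alice and Fido, the value canis_familiaris, and the helper members_of are all made up for illustration.

```python
# Toy stand-in for a reasoner; the individuals and the dog species
# value are hypothetical and exist only for this example.
triples = [
    ("Alice", "species", "homo_sapiens"),
    ("Fido", "species", "canis_familiaris"),
]

# Person is equivalent to "any Thing whose species is homo_sapiens".
person_restriction = {"on_property": "species", "has_value": "homo_sapiens"}


def members_of(restriction, triples):
    """Return every subject with a triple matching a hasValue restriction."""
    return {
        s
        for s, p, o in triples
        if p == restriction["on_property"] and o == restriction["has_value"]
    }


persons = members_of(person_restriction, triples)
print(persons)  # {'Alice'}
```

Nobody ever stated that Alice is a Person; membership follows from the property value alone.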

Now let's define a class called :StudyActivity, containing any protocol-specified activity belonging to a specific human study. We can then define the following:

study:HumanStudySubject  a      owl:Class ;
        owl:equivalentClass  [ a                   owl:Restriction ;
                               owl:someValuesFrom  study:StudyActivity ;
                               owl:onProperty      study:participatesIn ] .

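This says a HumanStudySubject is any Thing that participates in some StudyActivity.
So anywhere there is an RDF triple saying that some person :participatesIn :StudyActivity_104, and :StudyActivity_104 is known to be a :StudyActivity, that person is automatically inferred to be a :HumanStudySubject. Note that, as written, the restriction does not require the participant to be a :Person; capturing that part of the working definition would mean intersecting the restriction with :Person.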
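
The someValuesFrom pattern can be sketched the same way. In this rough Python illustration, Subject_1, Subject_2, Picnic_7 and SocialEvent are invented names, and the helper some_values_from is only my approximation of what a reasoner checks: the property must point at something that is itself a member of the named class.

```python
# Hypothetical data: Subject_1, Subject_2, Picnic_7 and SocialEvent
# are invented purely for illustration.
triples = [
    ("StudyActivity_104", "type", "StudyActivity"),
    ("Picnic_7", "type", "SocialEvent"),
    ("Subject_1", "participatesIn", "StudyActivity_104"),
    ("Subject_2", "participatesIn", "Picnic_7"),
]


def some_values_from(on_property, filler_class, triples):
    """Return subjects linked by on_property to at least one member of filler_class."""
    # First collect everything known to be a member of the filler class.
    fillers = {s for s, p, o in triples if p == "type" and o == filler_class}
    # Then keep the subjects whose property points at one of those members.
    return {s for s, p, o in triples if p == on_property and o in fillers}


subjects = some_values_from("participatesIn", "StudyActivity", triples)
print(subjects)  # {'Subject_1'}
```

Subject_2 also participates in something, but that something is not a StudyActivity, so no inference is made.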

This approach is not without some notable pitfalls. One's logic must be squeaky clean. Take this example: A Dog has Four Legs. Now if we mistakenly convert that to mean a Dog is any Thing with Four Legs (which clearly is wrong in English), you get the following OWL expression:

:Dog  a      owl:Class ;
        owl:equivalentClass  [ a                  owl:Restriction ;
                               owl:hasValue       "4"^^xsd:int ;
                               owl:onProperty     :hasLegs ] .

So somewhere in a database is the triple:  :Morris :hasLegs "4"^^xsd:int .

Well, guess what, Morris is a Cat. But based on the OWL definition of a Dog, an information system will conclude that Morris is a Dog. So one must be careful which properties one selects to define membership in a restriction class.
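
The Morris problem comes down to the difference between a necessary condition and a necessary-and-sufficient one. In OWL, attaching the restriction with rdfs:subClassOf instead of owl:equivalentClass says only that every Dog has four legs, not that everything with four legs is a Dog. The Python below is a rough sketch of that difference; both function names are my own, and the data is the Morris/Fido example from this post.

```python
# Morris is only known to have four legs; Fido was assigned to Dog by hand.
triples = [
    ("Morris", "hasLegs", 4),
    ("Fido", "type", "Dog"),
]


def dogs_if_equivalent(triples):
    """Dog equivalentClass (hasLegs value 4): four legs is necessary AND sufficient."""
    asserted = {s for s, p, o in triples if p == "type" and o == "Dog"}
    inferred = {s for s, p, o in triples if p == "hasLegs" and o == 4}
    return asserted | inferred


def dogs_if_subclass(triples):
    """Dog subClassOf (hasLegs value 4): four legs is only necessary,
    so having four legs never makes anything a Dog."""
    return {s for s, p, o in triples if p == "type" and o == "Dog"}


print("Morris" in dogs_if_equivalent(triples))  # True: the mistaken inference
print("Morris" in dogs_if_subclass(triples))    # False
```

The price of the safer form is that the computer no longer assigns anything to :Dog for us, so it suits properties that are typical of a class but not unique to it.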

The more I think about this approach, the more it makes sense. How do we currently distinguish two Things with the same name, e.g. Mustang (the car vs. the horse)? Easy. By the properties that each Thing has. One is a biological entity, the other is a machine; one has 4 legs and a tail, the other has an engine and 4 wheels. It makes sense to define members of a class by describing the properties that each member must have. This principle of making definitions of clinical terms computable is a key component of less ambiguous clinical trial data. Using this same approach, we can enable computers to identify members of other useful and interesting restriction classes, such as :EligibleSubject, :EffectiveDrug, :DangerousDrug, :PoorQualityDrug.

The possibilities can be very exciting.





