How does computer-assisted reasoning (also called inferencing) work? A simple example is the property :subClass. The colon here is merely to remind me that this is a resource expressed somewhere on the World Wide Web; I have left out the namespace for simplicity. If you state that :Apple is a :subClass of :Fruit and :McIntosh is a :subClass of :Apple, then a computer can infer that :McIntosh is a :subClass of :Fruit (new knowledge). This is trivial inferencing of course, but OWL supports much more sophisticated inferencing capabilities that I have only begun to appreciate.
One very important class in OWL is the owl:Restriction class, a subclass of owl:Class. A restriction class is one whose membership is determined by certain properties that each individual member has. I wrote about Restriction classes in a previous post. We can use this class to make certain definitions computable.
Without restriction classes, we have to manually assign members to a class. For example, the :Dog class has no meaning to a computer. We have to manually assign Fido, Spot, Rocket, and Buddy to the :Dog class. ":Fido is a :Dog" is true only because someone said it's true. But restriction classes in effect create rules that say only Things with certain features/properties are :Dogs. Once we establish these computable definitions, computers can do the assignments for us.
Let's look at an example from the clinical research domain: Persons that participate in a trial. Consider this taxonomy (i.e. superclass/subclass hierarchy). Reading from the bottom up, a member of a lower class is automatically a member of the class above it, and so forth all the way to the top class. This works because rdfs:subClassOf is a "transitive" property.
-- BiologicEntity
   -- Person
      -- HumanStudySubject
         -- EligibleSubject
            -- EnrolledSubject
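To make the transitivity concrete, here is a minimal Python sketch. It assumes that each class in the list above is a direct subclass of the class listed before it, and it stands in for what a real reasoner would do; the dictionary and the function name are my own illustrations, not part of any OWL library.

```python
# Direct subclass links, child -> parent.
# Assumption: each class in the taxonomy is a subclass of the one listed before it.
SUBCLASS_OF = {
    "Person": "BiologicEntity",
    "HumanStudySubject": "Person",
    "EligibleSubject": "HumanStudySubject",
    "EnrolledSubject": "EligibleSubject",
}


def superclasses(cls):
    """Return every class above cls, nearest first, by following the links upward."""
    found = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.append(cls)
    return found


# Only four direct links were stated; the rest of the chain is inferred.
enrolled_ancestors = superclasses("EnrolledSubject")
```

Asking for the superclasses of EnrolledSubject walks the whole chain up to BiologicEntity, even though no one ever stated that an EnrolledSubject is a BiologicEntity.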
And now some working definitions (taken from various sources and documented here)
BiologicEntity
Any individual living (or previously living) Entity.
(i.e. an :Entity that is living or was previously living)
Person
A human being. A BiologicEntity that has species = homo sapiens.
HumanStudySubject
A :Person that undergoes/is subjected to Study-specified activities as described in the Study Protocol
And then it can follow that:
EligibleSubject
A HumanStudySubject who satisfies all Study-specific Eligibility Criteria.
Now assume that every BiologicEntity has a property called :species and only Persons have a :species property value = homo sapiens. I can express the definition of a Person as follows in OWL:
:Person a owl:Class ;
    owl:equivalentClass [ a owl:Restriction ;
        owl:hasValue species:homo_sapiens ;
        owl:onProperty :species ] .
Now let's define a class called :StudyActivity, containing any protocol-specified activity belonging to a specific human study. We now define the following:
study:HumanStudySubject a owl:Class ;
    owl:equivalentClass [ a owl:Restriction ;
        owl:someValuesFrom study:StudyActivity ;
        owl:onProperty study:participatesIn ] .
This says a HumanStudySubject is any Thing that participates in some StudyActivity.
So anywhere there is an RDF triple saying that some individual :participatesIn :StudyActivity_104 (where :StudyActivity_104 is itself a member of :StudyActivity), that individual is automatically inferred to be a :HumanStudySubject.
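Here is a rough Python sketch of that inference, just to show the mechanics. It is not a real OWL reasoner. The individuals ex:Subject_001, ex:Subject_002 and ex:BookClub are made up for illustration, as is the function name, and triples are represented as plain tuples of strings.

```python
RDF_TYPE = "rdf:type"

# A tiny hypothetical triple store: (subject, predicate, object).
triples = {
    ("ex:Subject_001", "study:participatesIn", "study:StudyActivity_104"),
    ("study:StudyActivity_104", RDF_TYPE, "study:StudyActivity"),
    # ex:BookClub is not typed as a StudyActivity, so this triple proves nothing.
    ("ex:Subject_002", "study:participatesIn", "ex:BookClub"),
}


def infer_some_values_from(triples, on_property, filler_class, target_class):
    """Return the rdf:type triples implied by a someValuesFrom restriction.

    A subject qualifies if at least one of its on_property values
    is typed as filler_class.
    """
    fillers = {s for (s, p, o) in triples if p == RDF_TYPE and o == filler_class}
    return {
        (s, RDF_TYPE, target_class)
        for (s, p, o) in triples
        if p == on_property and o in fillers
    }


inferred = infer_some_values_from(
    triples,
    on_property="study:participatesIn",
    filler_class="study:StudyActivity",
    target_class="study:HumanStudySubject",
)
```

Only ex:Subject_001 ends up typed as a study:HumanStudySubject. Note that nothing is concluded about ex:Subject_002: the sketch, like OWL itself, simply has no evidence either way.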
This approach is not without some notable pitfalls. One's logic must be squeaky clean. Take this example: A Dog has Four Legs. Now if we mistakenly convert that to mean a Dog is any Thing with Four Legs (which clearly is wrong in English), you get the following OWL expression:
:Dog a owl:Class ;
    owl:equivalentClass [ a owl:Restriction ;
        owl:hasValue "4"^^xsd:int ;
        owl:onProperty :hasLegs ] .
So somewhere in a database is the triple: :Morris :hasLegs "4"^^xsd:int .
Well, guess what, Morris is a cat. But based on the OWL definition of a Dog, an information system will conclude that Morris is a Dog. So one must be careful which properties one selects to define membership in a restriction class.
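The same mistake is easy to reproduce in a few lines of Python. This is only a sketch of the hasValue rule, not a real reasoner; Tweety is a made-up individual added for contrast, and the function name is my own.

```python
# (subject, predicate, object) triples; Tweety is hypothetical.
triples = {
    ("Fido", "hasLegs", 4),
    ("Morris", "hasLegs", 4),  # Morris is a cat, but the rule cannot know that
    ("Tweety", "hasLegs", 2),
}


def members_by_has_value(triples, on_property, value):
    """Return every subject whose on_property equals value, sorted by name."""
    return sorted(s for (s, p, o) in triples if p == on_property and o == value)


# The flawed definition: a Dog is any Thing with hasLegs = 4.
dogs = members_by_has_value(triples, "hasLegs", 4)
```

The resulting list contains both Fido and Morris. The rule did exactly what it was told; the definition was the problem.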
The more I think about this approach, the more it makes sense. How do we currently distinguish two Things with the same name, e.g. Mustang (the car vs. the horse)? Easy. By the properties that each Thing has. One is a biological entity, the other is a machine; one has 4 legs and a tail, the other has an engine and 4 wheels. It makes sense to define members of a class by describing the properties that each member must have. This principle of making definitions of clinical terms computable is a key component of less ambiguous clinical trial data. Using this same approach, we can enable computers to identify members of other useful and interesting restriction classes, such as :EligibleSubject, :EffectiveDrug, :DangerousDrug, :PoorQualityDrug.
The possibilities can be very exciting.