2015-12-01

Aristotle and How Best to Define Things

The great challenge in automating the analysis of biomedical data is that people use different words for the same thing and the same word for different things. Clear, unambiguous definitions, or semantics, are therefore critical. I wrote a bit about this in a post on the Interoperability Problem. In today's post, I discuss how best to define things. It turns out Aristotle figured this out long ago. But first I discuss the context in which these definitions matter.

Also in a previous post, I explored using BRIDG as a computable ontology. Since then, I have continued to work on various projects (both within and outside of PhUSE) to represent study information using RDF. I keep running up against the same limitation: the need for a computable study ontology based on clear, unambiguous, computable definitions. For example, in the PhUSE project to represent eligibility criteria using RDF, the criteria need to link to a subject's screening data collected in a study. Where does one link to?

It turns out that the Open Biomedical Ontologies (OBO) provides an Ontology for Biomedical Investigations (OBI). I'm in the process of evaluating this ontology to see if it meets our needs. I think it holds great promise. One advantage is that OBI, like the other ontologies that make up the OBO, is based on a single, common reference ontology called the BFO (Basic Formal Ontology). This helps establish interoperability of biomedical data from various sources that are expressed using any of the ontologies in the OBO. I'm in the middle of reading a book on the BFO and I will hopefully have more to say about it in future posts.

In the meantime, to continue our work, we have built a mini human study ontology containing only the classes and relationships needed to support these projects. For this I have turned to BRIDG to use existing classes wherever possible. Unfortunately, I keep running into problems with the way many BRIDG classes are defined. I find that the current BRIDG definitions don't easily lend themselves to an unambiguous OWL representation. We have discussed this in previous BRIDG working group meetings. It's clear to me that many BRIDG definitions need to be refined before a useful OWL representation can be developed.

Let me explain with some examples. But first, I ran across an interesting chapter in the BFO book on good practices in developing ontologies. One of them is the principle of applying "Aristotelian definitions" to concepts. I had never heard the phrase before, so here is what I understand it to mean. An Aristotelian definition is one that has the following form:

S = def. a G that D's.

Where G (for genus) is the immediate parent term of S (for species) in the ontology and D stands for differentia, which is to say D describes what it is about G that makes it an S. Ideally, the differentia itself is described in terms that are already defined in the ontology.
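
To make the S/G/D structure concrete, here is a minimal Python sketch. It is my own illustration, not something taken from the BFO book or from BRIDG: the class name AristotelianDefinition, the helper genus_mismatches and the parent_of lookup table are all hypothetical. The idea is simply that each definition records its genus and differentia, so one can check mechanically that the stated genus is the immediate parent in the hierarchy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AristotelianDefinition:
    """Hypothetical container for a definition of the form 'S = def. a G that D's'."""
    species: str      # S, the term being defined
    genus: str        # G, the immediate parent term of S
    differentia: str  # D, what distinguishes an S from other G's

    def render(self) -> str:
        # Pick "a" or "an" so the rendered sentence reads naturally.
        article = "an" if self.genus[0].lower() in "aeiou" else "a"
        return f"{self.species} = def. {article} {self.genus} that {self.differentia}."


def genus_mismatches(definitions, parent_of):
    """Return the species whose stated genus is not their parent in the hierarchy."""
    return [d.species for d in definitions if parent_of.get(d.species) != d.genus]


human = AristotelianDefinition("human", "animal", "is rational")
parent_of = {"human": "animal"}  # assumed hierarchy: child term -> immediate parent

print(human.render())
print(genus_mismatches([human], parent_of))
```

If the hierarchy instead listed some other parent for human, genus_mismatches would flag it, which is the sense in which the definitions "mirror" the hierarchy.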

Consider Aristotle's own definition of Human, as described in the BFO book:
human (S) = def. an animal (G) that is rational (D).

As the BFO book points out, "...following this Aristotelian definitional structure ensures that the set of definitions in an ontology precisely mirror the hierarchy of greater and lesser generality among its universals."

Now let's take a look at BRIDG definitions relevant to a Study Subject. The latest version 4.0 documents the following:


BiologicEntity
Any individual living (or previously living) being.

Person
A human being.


Subject
An entity of interest, either biological or otherwise.

StudySubject
A physical entity which is the primary unit of operational and/or administrative interest in a study.

For the purposes of discussion, I use a leading colon (":") to denote a resource in RDF (e.g. an OWL class or property).

It's clear to me that :Person is a :subClassOf :BiologicEntity. If a :BiologicEntity has a property :species with the value :HomoSapiens, then a reasoning engine can conclude that it is also a :Person. I see no problem there.
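
The inference described above can be sketched in a few lines of plain Python. This is only an illustration of the reasoning step, not how an OWL reasoner is implemented; in OWL itself I would expect this to be written as an equivalent-class axiom using a hasValue restriction on :species. The instance names (:subject1, :rat7) and the :RattusNorvegicus value are made up for the example.

```python
# Triples are (subject, predicate, object) tuples. All resource names are illustrative.
SUBCLASS_OF = {(":Person", ":BiologicEntity")}


def infer_types(triples):
    """Apply two rules until no new rdf:type triples appear.

    Rule 1: a :BiologicEntity whose :species is :HomoSapiens is a :Person.
    Rule 2: an instance of a subclass is also an instance of its superclass.
    """
    inferred = set(triples)
    changed = True
    while changed:
        new = set()
        for s, p, o in inferred:
            if (p == ":species" and o == ":HomoSapiens"
                    and (s, "rdf:type", ":BiologicEntity") in inferred):
                new.add((s, "rdf:type", ":Person"))
            if p == "rdf:type":
                for sub, sup in SUBCLASS_OF:
                    if o == sub:
                        new.add((s, "rdf:type", sup))
        changed = not new <= inferred  # stop once nothing new was derived
        inferred |= new
    return inferred


facts = {
    (":subject1", "rdf:type", ":BiologicEntity"),
    (":subject1", ":species", ":HomoSapiens"),
    (":rat7", "rdf:type", ":BiologicEntity"),
    (":rat7", ":species", ":RattusNorvegicus"),
}
result = infer_types(facts)
print((":subject1", "rdf:type", ":Person") in result)
print((":rat7", "rdf:type", ":Person") in result)
```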

It's also clear a BRIDG Subject may or may not be a Biologic Entity. But what does it mean for a Subject to be an entity "of interest?" Is a Person that is being considered for recruitment into a trial a Subject? If so, one can define an activity called :RecruitmentActivity and a property called :undergoes and then create an Aristotelian definition that a :Subject is an :Entity that :undergoes a :RecruitmentActivity. This makes sense for a Human Subject but not so much for animals or non-biologic things. How does one resolve this?

One way to sidestep the issue is to define a :subClassOf :Subject called :HumanSubject and create the relationships above only for the :HumanSubject class. So the Aristotelian definition becomes:

A :HumanSubject is a :Person that undergoes a :RecruitmentActivity. 

So then how is a :HumanSubject related to a :StudySubject? It's not clear, since StudySubjects can be non-human. One way to resolve this is to create a :subClassOf :StudySubject called :HumanStudySubject and then one can say that a :HumanStudySubject is also a :subClassOf :HumanSubject. More specifically, one can define a property called :participatesIn and say that a :HumanStudySubject is any :HumanSubject that :participatesIn any :Study. (:participatesIn can be defined as a sub-property of :undergoes, i.e. :undergoes any protocol-specified activity, including informed consent or screening.)
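
Here is a rough Python sketch of how these two definitions and the sub-property relationship fit together. It is an illustration under my own assumptions, not a BRIDG or OWL artifact: the function names, the types lookup, and the instances :alice, :bob, :recruit1 and :study1 are all hypothetical.

```python
# Assumed sub-property relationship from the text: :participatesIn is a kind of :undergoes.
SUBPROPERTY_OF = {":participatesIn": ":undergoes"}


def with_superproperties(statements):
    """Expand (subject, predicate, object) statements along SUBPROPERTY_OF."""
    expanded = set(statements)
    for s, p, o in statements:
        while p in SUBPROPERTY_OF:
            p = SUBPROPERTY_OF[p]
            expanded.add((s, p, o))
    return expanded


def classify(entity, types, statements):
    """Return the classes of `entity`, adding those implied by the two definitions."""
    facts = with_superproperties(statements)
    classes = set(types.get(entity, set()))

    def related(predicate, cls):
        # True if the entity is linked by `predicate` to some instance of `cls`.
        return any(s == entity and p == predicate and cls in types.get(o, set())
                   for s, p, o in facts)

    # A :HumanSubject is a :Person that :undergoes a :RecruitmentActivity.
    if ":Person" in classes and related(":undergoes", ":RecruitmentActivity"):
        classes.add(":HumanSubject")
    # A :HumanStudySubject is a :HumanSubject that :participatesIn a :Study.
    if ":HumanSubject" in classes and related(":participatesIn", ":Study"):
        classes.add(":HumanStudySubject")
    return classes


types = {
    ":alice": {":Person"},
    ":bob": {":Person"},
    ":recruit1": {":RecruitmentActivity"},
    ":study1": {":Study"},
}
statements = [
    (":alice", ":undergoes", ":recruit1"),
    (":alice", ":participatesIn", ":study1"),
    (":bob", ":undergoes", ":recruit1"),
]
print(sorted(classify(":alice", types, statements)))
print(sorted(classify(":bob", types, statements)))
```

In this sketch :bob has only been recruited, so he is classified as a :HumanSubject but not as a :HumanStudySubject.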

So for the purposes of our mini-ontology, we have the following computable, Aristotelian definitions. I'd appreciate feedback on these. 

BiologicEntity
Any individual living (or previously living) Entity.
(i.e. an :Entity (G) that is living or was previously living (D)) 

Person
A human being, i.e. a BiologicEntity (G) whose species is Homo sapiens (D).

Subject
An Entity, either biologic or otherwise, of interest for investigation in a Study.

StudySubject
An Entity which is the primary unit of operational and/or administrative interest in a study. The StudySubject undergoes (is subjected to) Study-specified activities as described in the StudyProtocol.
(i.e. a :Subject (G) that undergoes/is subjected to Study-specified activities (D) as described in the StudyProtocol).

HumanStudySubject
A StudySubject who is also a Person.

And then it can follow that: 

EligibleSubject
A HumanStudySubject who satisfies all Study-specific Eligibility Criteria.
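
As a final illustration, the EligibleSubject definition can be read as "a HumanStudySubject (G) for whom every eligibility criterion holds (D)". The Python sketch below shows that reading. The two criteria (adult, not_pregnant) and the screening field names are invented for the example; in the PhUSE work the criteria would link to the subject's actual screening data.

```python
def unmet_criteria(screening, criteria):
    """Return the names of criteria that the screening data does not satisfy."""
    return sorted(name for name, check in criteria.items() if not check(screening))


def is_eligible_subject(classes, screening, criteria):
    """An EligibleSubject is a :HumanStudySubject that satisfies all criteria."""
    return ":HumanStudySubject" in classes and not unmet_criteria(screening, criteria)


# Hypothetical study-specific criteria, each a predicate over screening data.
criteria = {
    "adult": lambda s: s["age_years"] >= 18,
    "not_pregnant": lambda s: not s["pregnant"],
}
subject_classes = {":Person", ":HumanSubject", ":HumanStudySubject"}

print(is_eligible_subject(subject_classes, {"age_years": 34, "pregnant": False}, criteria))
print(unmet_criteria({"age_years": 16, "pregnant": False}, criteria))
```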