Everyone involved in planning, conducting, and analyzing
clinical trials wants quality data, but what do we mean by “quality” data in
clinical trials? Here is a definition I ran across that I particularly
like. I don’t know who wrote it; if you know, please leave a comment:
In clinical trials,
quality data support the objectives and analyses described in the protocol and
accurately reflect a subject’s experience related to those objectives.
In the time I have worked helping to standardize clinical
trial data, I continue to run across the misconception that standardizing the
data somehow makes it higher quality data. It is important to understand that
standardized data does NOT equal quality data. However,
standardized data makes it easier to assess data quality.
Let’s explore these two statements further.
Quality data is all about what is collected, available for analysis, reported, and, from FDA’s
perspective, submitted for regulatory review. What gets collected, analyzed, and reported depends on good science and
regulatory policy.
High quality data are the result of many factors:
- Good protocol design
- Good study execution
- Good data collection and management processes
- Qualified research staff
- Others ….
Standardized data is all about how the data are structured to make them more useful. In the SDTM, this
means standard domain names, standard file names, standard tabular structures, standard
terminology, and standard data types.
Well-standardized data are the result of good:
- Data standards
- Understanding of the data standards
- Implementation of the data standards
- Conformance to the data standards
Here are some examples of poorly standardized, high quality
data:
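As a hypothetical illustration (the subject values, column names, and codings below are invented for this post), data like this can be entirely accurate while following no standard at all:

```python
# Hypothetical illustration: the values are clinically plausible and
# accurate, but nothing about the structure is standardized.
poorly_standardized = [
    {"patient": "001", "age_at_enrollment": "34 yrs", "gender": "Female"},
    {"patient": "002", "age_at_enrollment": "57 yrs", "gender": "M"},
    {"patient": "003", "age_at_enrollment": "41",     "gender": "male"},
]
# Every value may be correct, yet a reusable program cannot reliably
# interpret "34 yrs" vs "41", or "Female" vs "M" vs "male".
```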
Determining quality here is a slow, manual process: it requires
visual inspection of the data. One cannot write a reusable software program to
automatically check that a subject’s age is reasonable.
Here are some examples of well-standardized but poor
quality data:
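Again as a hypothetical illustration (the subject identifiers and values are invented), here are SDTM-style Demographics (DM) records with standard variable names, units, and types, but with values that cannot be right:

```python
# Hypothetical illustration: standard structure, poor quality values.
well_standardized = [
    {"USUBJID": "STUDY1-001", "AGE": None, "AGEU": "YEARS"},  # missing age
    {"USUBJID": "STUDY1-002", "AGE": 427,  "AGEU": "YEARS"},  # implausible age
    {"USUBJID": "STUDY1-003", "AGE": 34,   "AGEU": "YEARS"},  # fine
]
```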
Because the data are standardized, we can now automate
quality checks. We can write a computer program to check data quality for us:
- If AGE is missing, then generate a quality alert report
- If AGE > 120 and AGEU = YEARS, then generate a quality alert report
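Here is a minimal sketch of what such a program might look like in Python, run against the hypothetical well_standardized records above (the function name and alert format are my own):

```python
def age_quality_alerts(records):
    """Apply the two AGE rules above; return (subject, problem) pairs."""
    alerts = []
    for rec in records:
        if rec.get("AGE") is None:
            alerts.append((rec["USUBJID"], "AGE is missing"))
        elif rec["AGEU"] == "YEARS" and rec["AGE"] > 120:
            alerts.append((rec["USUBJID"], f"AGE = {rec['AGE']} YEARS exceeds 120"))
    return alerts

for subject, problem in age_quality_alerts(well_standardized):
    print(f"QUALITY ALERT {subject}: {problem}")
# QUALITY ALERT STUDY1-001: AGE is missing
# QUALITY ALERT STUDY1-002: AGE = 427 YEARS exceeds 120
```

Because every study that follows the SDTM names these variables the same way, this one program works unchanged across studies.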
Also, these quality checks can be built into the electronic
data capture (EDC) instrument to improve the data collection processes that directly
affect quality.
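A hypothetical entry-time edit check might look like this (the function and messages are invented; a real EDC system would express such rules in its own configuration language):

```python
def validate_age_entry(raw_value):
    """Hypothetical EDC edit check: return an error message to show the
    person entering the data, or None to accept the value."""
    if raw_value.strip() == "":
        return "Age is required."
    try:
        age = int(raw_value)
    except ValueError:
        return "Age must be a whole number."
    if age < 0 or age > 120:
        return "Age must be between 0 and 120 years."
    return None

print(validate_age_entry(""))     # Age is required.
print(validate_age_entry("427"))  # Age must be between 0 and 120 years.
print(validate_age_entry("34"))   # None (value accepted)
```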
We are surrounded by data quality checks at the point of
data collection. We’ve become used to them. How often do you fill out an online
form and try to submit it, only to be alerted that a particular data field is
not filled out properly?
[Image: a data quality check on usps.com]
We need more quality checks in clinical trial data collection
processes.
So what about validation rules? There are two types:
conformance rules (how well the data conform to the data standards) and quality
checks (sometimes also called business rules).
Conformance rules depend on the standard. If the standard
changes, the rules may change.
Quality checks are independent of the data standards. They make
sense whether the data are standardized or not.
An age less than 0 is a data quality problem whether we are dealing
with legacy data or standardized data. Standardized data do, however, enable automated data quality checks. This is a huge benefit.
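To make the distinction concrete, here is a small Python sketch of one rule of each type. The AGEU unit list reflects the CDISC controlled terminology as I understand it and is illustrative only:

```python
# Conformance rule: tied to the standard's controlled terminology.
# If CDISC changed the AGEU codelist, this rule would change with it.
VALID_AGEU = {"YEARS", "MONTHS", "WEEKS", "DAYS", "HOURS"}

def conformance_issues(rec):
    if rec.get("AGEU") not in VALID_AGEU:
        yield f"AGEU '{rec.get('AGEU')}' is not in controlled terminology"

# Quality check: true no matter how the column is named or which
# standard (if any) the data follow.
def quality_issues(rec, age_key="AGE"):
    age = rec.get(age_key)
    if age is not None and age < 0:
        yield f"{age_key} = {age} is less than 0"

record = {"USUBJID": "STUDY1-004", "AGE": -3, "AGEU": "Years"}
print(list(conformance_issues(record)))  # AGEU 'Years' not in terminology
print(list(quality_issues(record)))      # AGE = -3 is less than 0
```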
Conformance rules are best managed by the standards
development organization that creates and maintains the standard; they understand
what it means to be standards-conformant. Quality checks are best maintained by
the users of the data; they understand the implications when data are missing or of poor
quality.
In the case of SDTM submissions to FDA, the current set of validation rules contains a mixture of conformance rules and quality checks. It
makes sense to move to a governance model where CDISC manages the conformance
rules and FDA manages the quality checks.