It's been two years since I wrote about quality data in clinical trials. As I re-read that post now, I agree with most of what I said, but it's time to update my thinking based on experience gained since then with study data validation processes.
I made the point that there are two types of validation rules: conformance rules (to data standards) and business rules, a.k.a. data quality checks. I had suggested that conformance rules are best managed by the standards development organization, but in fact sponsors and FDA support multiple standards (SDTM, MedDRA, CDISC Terminology, WHO Drug Dictionary), so it falls to FDA to manage the collective set of conformance rules across this broad data standards landscape.
The division between conformance rules and business rules remains important because they serve different functions. Ensuring conformance to standards enables automation; ensuring data quality enables meeting the study objectives. One can assess data quality on legacy data, but it is a slow, manual process. Standardized data enable automated data quality checks that can more easily uncover serious data content issues that would impede analysis and review.
As a former FDA reviewer and a big proponent of data standards, I can honestly say that FDA reviewers care very little about data standards issues. All they care about is that the data be sufficiently standardized that they can run standard automated analyses on them. The analyses drive the level of standardization the organization needs, and those analyses include the automated data quality checks. One cannot determine whether AGE < 0 (a data quality check) if AGE is called something else or is located in the wrong domain (a conformance issue).
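The AGE example can be sketched in a few lines of code. This is a minimal, hypothetical illustration, not any actual FDA or CDISC validation tool: the record layout, error messages, and function names are invented, with AGE standing in for the SDTM Demographics variable. The point it shows is that the quality check can only run once the conformance check has passed.

```python
# Hypothetical sketch: a quality check depends on prior conformance.
# Record layout and messages are invented for illustration only.

def check_conformance(record: dict) -> list[str]:
    """Conformance rule: the standard AGE variable must be present."""
    errors = []
    if "AGE" not in record:
        errors.append("CONF: required variable AGE not found in DM")
    return errors

def check_quality(record: dict) -> list[str]:
    """Business rule (data quality): AGE, when present, must not be negative."""
    errors = []
    if record.get("AGE") is not None and record["AGE"] < 0:
        errors.append("QUAL: AGE < 0")
    return errors

def validate(record: dict) -> list[str]:
    # Quality checks run only after conformance passes: a record that
    # stores age under a nonstandard name cannot be quality-checked at all.
    conf_errors = check_conformance(record)
    if conf_errors:
        return conf_errors
    return check_quality(record)

print(validate({"SUBJAGE": 34}))  # nonstandard name: conformance failure
print(validate({"AGE": -2}))      # conforms, but fails the quality check
print(validate({"AGE": 34}))      # passes both
```

The design choice mirrors the argument of this post: conformance errors short-circuit validation entirely, because downstream checks are meaningless until the data are standardized.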
It's like driving a car. You want to get from point A to point B quickly (data quality issues); you don't really care what's under the hood (standards conformance issues). That is for mechanics (or data analysts) to worry about.
FDA now has a robust service to assess study "Data Fitness" (described as whether data are fit for use). Data Fitness combines both conformance and business rules. Because the two are not split, the reviewer is left to deal with data conformance issues, which they care little about since there is generally a manual workaround, alongside the data quality issues, which matter most to them and have the biggest impact on the review. Combining the two is a mistake. I believe Data Fitness as a concept should be retired and the service split into two: Standards Conformance and Data Quality. The Data Quality assessment should only be performed on data that have passed the minimum level of conformance the organization needs. If a study fails conformance testing, it wasn't standardized properly, and those errors need to be corrected. In the new era requiring the use of data standards, FDA reviewers should not be burdened with data that do not pass a minimum level of conformance validation.
Consider this hypothetical scenario to drive home my point. FDA requires sponsors to submit a study report supporting the safety and effectiveness of a drug. The report must be submitted digitally using the PDF standard. The report arrives, and the file cannot be opened in the review tool (i.e., Acrobat) because of 10 errors in the PDF implementation (not realistic in this day and age, but possible nonetheless). Those 10 errors are provided in a validation report to the reviewer for action. The reviewer doesn't care about the technical details of implementing PDF; they want a file that opens and is readable in Acrobat. All can agree that the reviewer should not be burdened with evaluating and managing standards non-conformance issues.
Replace study report with study data, and PDF with SDTM, and this scenario is exactly what is happening today. Yet somehow the practice remains acceptable. Why? Because there are other "tools" (akin to simple text editors in the document world) that allow reviewers to work with non-conformant data, albeit at much reduced efficiency. These workarounds for non-standard study data are all too prevalent and accepted. With time this needs to change, so that Sponsor and FDA alike can take full advantage of standardized data.
My future state for standardized study data submissions looks like this: study data arrive and undergo standards conformance validation using pass/fail criteria. Data that pass go to the reviewer, and the regulatory review clock starts. Data that fail are returned for correction. (The conformance rules are publicly available, so conformance errors can be identified and corrected before submission.) During the filing period, automated data quality checks are performed, and that report goes to the reviewer. Deficiencies result in possible information requests. Serious data quality deficiencies may form the basis of a refuse-to-file action.
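The triage logic in that future state can be summarized in a short sketch. Again, this is a hypothetical illustration of the proposed process, not any existing FDA system; the function name, dispositions, and the idea of a per-finding "serious" flag are my own invention for clarity.

```python
# Hypothetical sketch of the proposed two-stage submission pipeline:
# conformance is a pass/fail gate, and quality findings are triaged
# only for studies that pass the gate. All names here are invented.

def triage_submission(conformance_errors: list, quality_findings: list) -> tuple:
    """Return the disposition of a study data submission."""
    if conformance_errors:
        # Fail the gate: data return to the sponsor; review clock never starts.
        return ("returned for correction", conformance_errors)
    if any(f["serious"] for f in quality_findings):
        # Serious content issues may support a refuse-to-file action.
        return ("possible refuse to file", quality_findings)
    if quality_findings:
        # Lesser deficiencies become information requests during filing.
        return ("information requests", quality_findings)
    return ("proceeds to review", [])

print(triage_submission(["AGE missing from DM"], []))
print(triage_submission([], [{"desc": "AGE < 0", "serious": True}]))
print(triage_submission([], []))
```

Note that conformance errors are checked first and end the process immediately, which is the whole point of splitting the services: a reviewer never sees quality findings for a study that failed conformance.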
Finally, let's retire the term "Data Fitness" in favor of what we really mean: Standards Conformance or Data Quality. Let's not muddle these two important issues any longer.