Content Interoperability: Achieving Clinical Data Quality Success

by Lin Wan, Ph.D., Chief Technology Officer, Stella Technology


Over the past decade, many of the national efforts around interoperability have focused on messaging and transport, and rightly so: that infrastructure needed (and still needs) to be in place before clinical information can flow easily. True, Meaningful Use and initiatives such as the Sequoia Project or the EHR|HIE Interoperability Workgroup (which merged with HIMSS and became the ConCert by HIMSS™ vendor certification program) did address payload interoperability through their content specifications (such as the Continuity of Care Document, or more constrained versions of it). However, those efforts did not solve a major pain point that continues to plague any healthcare organization aggregating data for health information exchange, population health management, quality measures or other analytics-related needs: how good, and how meaningful, is the data being exchanged?

Western New York’s clinical information exchange, HEALTHeLINK™, has overcome the clinical data quality challenge by analyzing the content of the information flowing through its clinical network.

We have found that data quality can be defined as a multi-layer pyramid, with each layer more complex than the one below it:

  1. The bottom layer is the syntax layer, which checks the data’s structure and format conformance against standard specifications (e.g. HL7 v2.5 or C32 CCD).
  2. Next, the completeness layer checks for the presence of required data elements, e.g. names, addresses and vital signs.
  3. The standardization/semantic interoperability layer ensures that the data has been coded properly, e.g. local lab codes mapped to LOINC, standardized postal codes and standardized gender codes.
  4. Standardization is followed by validity checks that verify whether a given data value is allowed, flagging, for example, invalid ZIP codes, alphanumeric values where numeric ones are expected, or a negative age.
  5. The next layer validates the data’s uniqueness: is an entry a duplicate, or is it a legitimately separate entry (e.g. multiple prescriptions for the same medication from different facilities)?
  6. The consistency layer identifies contradictions within the data set, e.g. two different data sources contradicting each other on a patient’s smoking status, or a child recorded as having a procedure normally performed on older adults.
  7. Lastly, the top and most advanced layer of the pyramid, accuracy, identifies what a given data value should be, based on the analysis and understanding of other or previously collected data. Accuracy is measured against controlled data samples to assess whether the values fall within the expected results.
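
To make the middle layers of the pyramid more concrete, here is a minimal Python sketch of what completeness, standardization, validity, uniqueness and consistency checks might look like. It is an illustration only, not HEALTHeLINK's or Stella Technology's implementation: the field names (`patient_id`, `zip`, `smoking_status`, etc.), the gender code map, the ZIP pattern and all function names are assumptions made up for this example. The syntax and accuracy layers are left out, since they depend on a message parser and on controlled reference samples respectively.

```python
import re

# Hypothetical field names and code sets, chosen only for illustration.
REQUIRED_FIELDS = ("patient_id", "name", "zip", "gender", "age")
GENDER_MAP = {"m": "M", "male": "M", "f": "F", "female": "F"}  # local -> standard
ZIP_PATTERN = re.compile(r"^\d{5}(-\d{4})?$")  # US ZIP or ZIP+4 format


def check_completeness(record):
    """Layer 2: report required fields that are missing or empty."""
    return [f"missing {field}" for field in REQUIRED_FIELDS
            if record.get(field) in (None, "")]


def check_standardization(record):
    """Layer 3: report values that cannot be mapped to a standard code."""
    raw = record.get("gender")
    if raw in (None, ""):
        return []  # absence is a completeness issue, not a coding issue
    if str(raw).strip().lower() not in GENDER_MAP:
        return [f"unmapped gender code {raw!r}"]
    return []


def check_validity(record):
    """Layer 4: report values that are present but not allowed."""
    issues = []
    zip_code = record.get("zip")
    if zip_code not in (None, "") and not ZIP_PATTERN.match(str(zip_code)):
        issues.append(f"invalid ZIP {zip_code!r}")
    age = record.get("age")
    if age not in (None, "") and (not isinstance(age, int) or age < 0):
        issues.append(f"invalid age {age!r}")
    return issues


def run_record_checks(record):
    """Run the single-record layers from the bottom of the pyramid upward."""
    return (check_completeness(record)
            + check_standardization(record)
            + check_validity(record))


def find_duplicates(prescriptions):
    """Layer 5: flag repeated (patient, medication, facility) entries.

    The same medication from a different facility is treated as a
    legitimately separate entry, so facility is part of the key.
    """
    seen, duplicates = set(), []
    for rx in prescriptions:
        key = (rx["patient_id"], rx["medication"], rx["facility"])
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates


def find_contradictions(observations, field="smoking_status"):
    """Layer 6: list patients whose data sources disagree on a field."""
    values = {}
    for obs in observations:
        values.setdefault(obs["patient_id"], set()).add(obs[field])
    return sorted(pid for pid, seen in values.items() if len(seen) > 1)


if __name__ == "__main__":
    sample = {"patient_id": "p1", "name": "", "zip": "1420",
              "gender": "X", "age": -3}
    for issue in run_record_checks(sample):
        print(issue)
```

Keeping each layer in its own function mirrors the pyramid: a feed can be scored layer by layer, and the more expensive cross-record checks (uniqueness, consistency) can be run only on data that has already passed the cheaper single-record checks.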


Stella Technology is one of the presenters at the HIMSS16 Interoperability Showcase Theatre. Please come visit us and let us know your thoughts on the challenges your organization is having around clinical data quality.

About the author

Lin Wan, Ph.D., Chief Technology Officer at Stella Technology, is a seasoned architect and standards expert with over 18 years of experience in healthcare software development and proven success leading the design, development and deployment of interoperable healthcare solutions.