iSpeak Blog

Data Quality Equation Theory

Christopher H. White

The Data Quality Equation

Regulatory agencies around the globe are focused on assuring patient safety and product quality. If we focus on the latter, many regulations and guidance documents identify the elementary expectations for achieving it. Product quality is derived from quality data that supports, or gives evidence of, the quality of the product. Recent regulatory observations show industry that there can be severe penalties for lacking the data quality that leads to product quality.

What is Data Quality, and how do we achieve it? Data Integrity provides data we can trust, and it is the foundation of the Data Quality Equation. Data Management is the process by which we create, control, manage and utilize our data and maintain its integrity. The combination of Data Integrity and Data Management results in Data Quality. In other words, Data Quality depends on both Data Integrity and Data Management. Consequently, Data Quality can be represented as:

Data Integrity + Data Management = Data Quality

Another aspect of this equation to consider is that better Data Integrity and/or better Data Management produces better Data Quality. For example, if a more efficient means of Data Management can be created that eliminates risk and adds value to the process, you can in turn realize fewer mistakes and higher Data Quality. Conversely, Data Integrity without Data Management leaves your data uncontrolled, and Data Management without Data Integrity lacks the elements necessary for Data Quality.

Data Integrity can be expressed in ALCOA+ elements, where the acronym stands for Attributable, Legible, Contemporaneous, Original, Accurate,¹ plus Complete, Consistent, Enduring and Available.² Data Integrity is achieved when all ALCOA+ elements are present. The expectations for each ALCOA+ element are described below:

Attributable data must be linked to the individual who created the data.
Legible data is clear, concise, and readable. Changes to legible data must not hide or obscure the original record.
Contemporaneous data must include the date and time of its measurement or action. For all electronic data, contemporaneous data must also include metadata related to the action or event.
Original data must be the original record or a certified copy.
Accurate data is correct throughout the system’s lifecycle and always indicates the same value with its correct meaning.
Complete data includes all data from actions taken to obtain the final result. Complete data includes all metadata generated for each action taken, including audit trails.
Consistent data shall be created in a manner that can be repeated, following a logical sequence based on the method or procedure. Consistency can be defined throughout the lifecycle of your data.
Enduring data must be protected from loss, damage and/or alteration and must be available throughout the defined retention period.
Available data is readily retrievable throughout the lifecycle of the system or the appropriate retention period. Data must be available in human-readable form.
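
As a rough illustration of how some of these expectations might be checked in software, the sketch below flags ALCOA+ gaps in a single record. The `Record` class, its field names, and the `alcoa_gaps` function are all hypothetical and not part of any regulation or system. Only the elements that can be inferred from metadata are covered; Legible, Accurate and Consistent still require procedural or human review.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical data record; every field name is an assumption."""
    value: float
    created_by: str = ""                       # who created the data
    recorded_at: Optional[datetime] = None     # when it was measured
    is_original: bool = False                  # original or certified copy
    audit_trail: List[str] = field(default_factory=list)
    retain_until: Optional[datetime] = None    # end of retention period


def alcoa_gaps(record: Record, now: datetime) -> List[str]:
    """Return the ALCOA+ elements that this record fails to evidence."""
    gaps: List[str] = []
    if not record.created_by.strip():
        gaps.append("Attributable")
    # A missing or future timestamp cannot be contemporaneous.
    if record.recorded_at is None or record.recorded_at > now:
        gaps.append("Contemporaneous")
    if not record.is_original:
        gaps.append("Original")
    if not record.audit_trail:
        gaps.append("Complete")
    # Without a defined retention period, endurance cannot be assured.
    if record.retain_until is None:
        gaps.append("Enduring/Available")
    return gaps


now = datetime(2024, 1, 15, tzinfo=timezone.utc)
good = Record(
    value=7.2,
    created_by="analyst_01",
    recorded_at=datetime(2024, 1, 10, tzinfo=timezone.utc),
    is_original=True,
    audit_trail=["created"],
    retain_until=datetime(2034, 1, 10, tzinfo=timezone.utc),
)
bare = Record(value=7.2)

print(alcoa_gaps(good, now))  # []
print(alcoa_gaps(bare, now))  # all five checked elements are reported
```

An empty list here means only that no gap was detected by these metadata checks; it does not by itself demonstrate Data Integrity.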

Data Management comprises the process controls that assure the data is controlled and that Data Integrity, once established, is maintained.
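
One possible example of such a process control, offered only as a sketch, is a value that can be amended but never overwritten: each change is appended to a history that records who made it, when, and why, so the original record stays visible. The `ControlledValue` and `Change` classes below are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Change:
    """One immutable audit-trail entry (hypothetical structure)."""
    old_value: float
    new_value: float
    changed_by: str
    changed_at: datetime
    reason: str


@dataclass
class ControlledValue:
    """A value whose changes are appended rather than overwritten."""
    original: float
    history: List[Change] = field(default_factory=list)

    @property
    def current(self) -> float:
        # The latest amendment wins; the original is never modified.
        return self.history[-1].new_value if self.history else self.original

    def amend(self, new_value: float, changed_by: str,
              reason: str, changed_at: datetime) -> None:
        # Refuse changes that could not be attributed or justified later.
        if not changed_by.strip() or not reason.strip():
            raise ValueError("a change needs an author and a reason")
        self.history.append(
            Change(self.current, new_value, changed_by, changed_at, reason)
        )


result = ControlledValue(original=7.2)
result.amend(7.4, "analyst_01", "transcription error",
             datetime(2024, 1, 11, tzinfo=timezone.utc))

print(result.original)      # 7.2
print(result.current)       # 7.4
print(len(result.history))  # 1
```

In a real system this control would sit alongside access restrictions, backups and procedures; the sketch shows only the idea that a change must not hide or obscure the original record.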


Data Quality is the sum of Data Integrity and Data Management. The appropriate starting point is to understand and apply the elements of Data Integrity and to implement efficient Data Management; only then can Data Quality be achieved and relied upon.

Special thanks to Mark E. Newton, QA, Global Quality Laboratories for Eli Lilly and Company, for reviewing this blog.

____________________________________


  • ¹ “ALCOA” is an acronym for the terms Attributable, Legible, Contemporaneous, Original, and Accurate. See Woollen, S. W. (2010, Summer). Data Quality and the Origin of ALCOA. Newsletter of the Southern Regional Chapter Society of Quality Assurance. Retrieved from
  • ² ALCOA+ adds Complete, Consistent, Enduring, and Available. See GCP Inspectors Working Group (GCP IWG). (2010, June). Reflection paper on expectations for electronic source data and data transcribed to electronic data collection tools in clinical trials. Retrieved from