Guide to Data Quality Management #1 – The 9 Dimensions of Data Quality

Data Quality refers to an organization’s ability to maintain the quality of its data over time. If we were to take some data professionals at their word, improving Data Quality is the panacea for all our business woes and should therefore be the top priority.

At Zeenea, we believe this view should be nuanced: Data Quality is one means among others of limiting the uncertainties that stand in the way of meeting corporate objectives.

In this series of articles, we will go over everything data professionals need to know about Data Quality Management (DQM).

Some definitions of Data Quality

Asking Data Analysts or Data Engineers for a definition of Data Quality will yield very different answers, even within the same company and among similar profiles. Some, for example, will focus on the uniqueness of data, while others will prefer to reference standardization. You may well have your own interpretation.

The ISO 9000:2015 standard defines quality as the “degree to which a set of inherent characteristics of an object fulfils requirements”.

DAMA International (The Global Data Management Community), a leading international association that brings together business and technical data management professionals, adapts this definition to a data context: “Data Quality is the degree to which the data dimensions meet requirements.”

The dimensional approach to Data Quality

From an operational perspective, Data Quality translates into what we call Data Quality dimensions, each of which relates to a specific aspect of quality.

The four dimensions most often used are completeness, accuracy, validity, and availability. The literature describes many more dimensions and a variety of criteria for assessing Data Quality. There is, however, no consensus on what these dimensions actually are.

For example, DAMA enumerates sixty dimensions, whereas most Data Quality Management (DQM) software vendors offer five or six.

 

The nine dimensions of Data Quality

At Zeenea, we believe that the ideal compromise is to take into account nine Data Quality dimensions: completeness, accuracy, validity, uniqueness, consistency, timeliness, traceability, clarity, and availability.

We will illustrate these nine dimensions and the different concepts we refer to in this publication with a straightforward example:

Arthur is in charge of sending marketing campaigns to clients and prospects to present his company’s latest offers. He encounters, however, certain difficulties:

  • Arthur sometimes sends communications to the same people several times,

  • The emails provided in his CRM are often invalid,

  • Prospects and clients do not always receive the right content,

  • Some information pertaining to the prospects is obsolete,

  • Some clients receive emails that address them with the wrong gender title,

  • Each client or prospect has two addresses, but it is difficult to understand what each one relates to,

  • He doesn’t know the origin of some of the data he is using or how he can access its source.
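
To make the link between these difficulties and the dimensions more concrete, here is a minimal Python sketch of how three of them could be checked automatically: missing emails (completeness), malformed emails (validity), and repeated contacts (uniqueness). The records, field names, and helper functions below are hypothetical and invented for illustration; they are not Arthur's actual data, and the email pattern is deliberately loose rather than a full validation rule.

```python
import re
from collections import Counter

# Hypothetical CRM records, invented for illustration only.
contacts = [
    {"id": 1, "name": "Lisa Smith", "email": "lisa.smith@example.com"},
    {"id": 2, "name": "Lisa Smith", "email": "lisa.smith@example.com"},
    {"id": 3, "name": "Marc Dupont", "email": "marc.dupont(at)example"},
    {"id": 4, "name": "Anna Jones", "email": None},
]

# Loose shape check (something@something.tld); an assumption, not a standard.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def missing_emails(records):
    """Completeness: ids of records that have no email value."""
    return [r["id"] for r in records if not r["email"]]


def invalid_emails(records):
    """Validity: ids of records whose email is present but malformed."""
    return [
        r["id"]
        for r in records
        if r["email"] and not EMAIL_PATTERN.fullmatch(r["email"])
    ]


def duplicate_emails(records):
    """Uniqueness: emails that appear on more than one record."""
    counts = Counter(r["email"].lower() for r in records if r["email"])
    return sorted(email for email, n in counts.items() if n > 1)


print(missing_emails(contacts))    # [4]
print(invalid_emails(contacts))    # [3]
print(duplicate_emails(contacts))  # ['lisa.smith@example.com']
```

Each function returns the offending records rather than a single pass/fail flag, which makes it easier to turn a dimension into a measurable rate (for example, the share of contacts without a usable email).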

Below is the data Arthur has at hand for his sales efforts. We will use it to illustrate each of the nine dimensions of Data Quality: