A Guide for Data Quality (DQ) and 6 Data Quality Dimensions

With data, the product we deliver is an instance of a record. We can refer to a data record as a unit product. A table is nothing but a type of unit. All the records delivered during a period are the total units.

 

Total Units (Total Records): 10,000

Assume that, of these 10,000 records or units, 200 records are defective.

 

Total Units with Defects (Failed Records): 200

Defects Per Unit is calculated by dividing Total Units with Defects by Total Units.

Defects Per Unit (DPU) = Failed Records / Total Records = 200 / 10,000 = 0.02

We need to determine the possible opportunities for failure in a unit or, in our case, a data record.

There are many ways to do this. Here, each record has 20 attributes (columns), any of which can hold a defective value.
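As a rough illustration of counting opportunities and defects at the column level, the sketch below treats every column of a record as one opportunity for failure. The sample records, the three-column layout (instead of 20) and the `is_defective` rule are all assumptions made for this example. A real pipeline would apply its own validation rules.

```python
# Hypothetical sample: each record is a dict of column name -> value.
# The example in the text has 20 columns; 3 are used here to keep it short.
records = [
    {"id": 1, "name": "Ada", "email": "ada@example.com"},
    {"id": 2, "name": None, "email": "bob@example.com"},
    {"id": 3, "name": "Cy", "email": ""},
]


def is_defective(value):
    """Assumed rule: a value is defective if it is missing or an empty string."""
    return value is None or value == ""


# Every column in a record is one opportunity for failure.
opportunities_per_record = len(records[0])

# A record fails if at least one of its values is defective.
failed_records = sum(
    1 for record in records if any(is_defective(v) for v in record.values())
)

# Total number of defective values across all records.
total_defects = sum(
    1 for record in records for v in record.values() if is_defective(v)
)

print(opportunities_per_record, failed_records, total_defects)
```

Counting failed records and counting defective values can give different numbers when a single record has several bad columns, so it helps to decide up front which of the two a metric is based on.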

Opportunities for Failure per Unit (Record): 20

Defects Per Opportunity is the total number of defects within a sample period divided by the total number of opportunities for a defect. Here, each failed record is counted as one defect.

Defects Per Opportunity (DPO) = Defects / (Units * Opportunities) = 200 / (10,000 * 20) = 0.001 = 0.1%

We need to know the capability of data engineering to produce defect-free records.

Yield = 1 - DPO = 1 - 0.001 = 0.999 = 99.9%
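The three metrics above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `dq_metrics` and its parameter names are made up for this example, and it follows the worked example in counting each failed record as a single defect.

```python
def dq_metrics(total_records, failed_records, opportunities_per_record):
    """Return DPU, DPO and yield, counting each failed record as one defect."""
    if total_records <= 0 or opportunities_per_record <= 0:
        raise ValueError(
            "total_records and opportunities_per_record must be positive"
        )
    # Defects Per Unit: failed records divided by total records.
    dpu = failed_records / total_records
    # Defects Per Opportunity: defects divided by (units * opportunities).
    dpo = failed_records / (total_records * opportunities_per_record)
    # Yield: the share of opportunities that produced no defect.
    return {"dpu": dpu, "dpo": dpo, "yield": 1 - dpo}


metrics = dq_metrics(
    total_records=10_000,
    failed_records=200,
    opportunities_per_record=20,
)
print(f"DPU:   {metrics['dpu']:.2f}")    # 0.02
print(f"DPO:   {metrics['dpo']:.1%}")    # 0.1%
print(f"Yield: {metrics['yield']:.1%}")  # 99.9%
```

Keeping the three inputs as explicit parameters makes it easy to recompute the metrics for another table or delivery period by changing only the counts.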