The Impact of Poor Data Quality (and How to Fix It) – DATAVERSITY
Poor data quality can cause significant damage to a business. It can lead to poor customer relations, inaccurate analytics, and bad decisions, all of which harm business performance.
The sources of poor data quality may seem like a minor issue, but the damage is easily magnified as errors repeat, vary, and accumulate.
Missing or erroneous details in email communications can leave customers feeling insulted. An accumulation of errors in data used for research will almost always lead to skewed conclusions. Skewed conclusions combined with accidentally insulted customers is a recipe for lost profits.
Good-quality data is not only helpful, but is also necessary for managing projects, controlling finances, assessing performance, and delivering services efficiently. Although data quality is considered important on a superficial level, it is often treated as a low priority. By giving data quality a high priority, the business can benefit from improved sales forecasts, more pleasant customer experiences, and better business intelligence.
Business intelligence is only as good as the data supporting it.
The Sources of Poor Data Quality
The reasons for collecting and storing data of poor quality are pretty basic. Generally, the problems have to do with translating data from one format to another, but there are other sources, as well. The basic reasons for having poor-quality data are:
Data integration issues: Conversion errors can happen when data is collected from a variety of databases that don’t integrate with the organization’s database. Converting one data format into another often leads to mistakes. CSV files, as a simple example, separate values with commas, so exporting a spreadsheet to CSV can split a single field into several pieces if the field itself contains a comma and isn’t properly quoted. Conversion issues can become even more complicated if data taken from an older legacy system is converted for storage in a NoSQL system.
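The comma problem can be demonstrated in a few lines. This is an illustrative sketch (the record and its field values are invented): a naive comma-join corrupts a field that contains a comma, while Python’s standard csv module quotes such fields so the record survives a round trip.

```python
import csv
import io

# An invented record whose company-name field itself contains a comma.
row = ["Rocal, Ltd.", "Joe Jones", "2024-05-01"]

# Naive conversion: joining with commas corrupts the record on re-read.
naive = ",".join(row)
broken = naive.split(",")  # 4 fields instead of 3 -- the name was split

# Proper conversion: the csv module quotes fields containing the delimiter.
buf = io.StringIO()
csv.writer(buf).writerow(row)
restored = next(csv.reader(io.StringIO(buf.getvalue())))

print(len(broken))      # 4
print(restored == row)  # True
```

The same principle applies to any delimited export: the writer must escape or quote the delimiter wherever it appears inside a value.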
Data-capturing inconsistencies: Data capture is about taking the information stored on a document and transforming it into data for computer storage. This allows employees to retrieve, search, organize, and store documents quickly and efficiently. An organization having two or more departments that use different formatting processes should be concerned with data quality because of variations and inaccuracies. For example, one department may list the customer’s name as Rocal Ltd., while another lists it under the business owner’s name, Joe Jones.
Poor data migration: This commonly happens when data is moved from a legacy system to a new database, or to the cloud. Moving data into a new system comes with some risks. Some of the data’s values can be missing or irregular. If the data isn’t of good quality upfront, new problems can arise, such as data corruption and missing data.
Data decay: Data decay describes the deterioration of data quality, typically in the marketing and sales departments. Data decay is often an expression of old, outdated information. (For example, approximately 40% of email users will change their email addresses every two years. Has your organization developed a response to that issue?) Some have referred to a database crash as a form of data decay.
Data duplication: Duplicated data may skew business intelligence. It is possible for problems to develop if duplicated data is used for statistical purposes. Also, if duplicated data is correct in one location, and missing parts in another, problems can arise.
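One common remedy for duplication is to merge records under a normalized comparison key, so near-duplicates collapse into a single entry and the most complete copy wins. A minimal sketch, using invented customer records and a deliberately crude normalization rule:

```python
# Invented records: the same company captured two ways, one copy
# complete and one missing the phone number.
records = [
    {"name": "Rocal Ltd.", "phone": "555-0101"},
    {"name": "rocal ltd",  "phone": None},
    {"name": "Acme Corp.", "phone": "555-0199"},
]

def normalize(name):
    """Reduce a name to a comparison key: lowercase, alphanumerics only."""
    return "".join(ch for ch in name.lower() if ch.isalnum())

# Merge duplicates, keeping any non-empty value from either copy.
merged = {}
for rec in records:
    key = normalize(rec["name"])
    if key not in merged:
        merged[key] = dict(rec)
    else:
        for field, value in rec.items():
            merged[key][field] = merged[key][field] or value

print(len(merged))                               # 2 unique companies
print(merged[normalize("Rocal Ltd.")]["phone"])  # 555-0101
```

Production-grade matching usually needs fuzzier rules (abbreviations, transposed words, typos), but the pattern of normalize-then-merge stays the same.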
The Consequences of Poor-Quality Data
The consequences of using data of poor quality can range from minor to disastrous. It can result in lost income, employees quitting in frustration, and even painful monetary fines. There are several potential consequences:
A loss of income: Losing clients and potential clients will have a negative impact on revenue. In addition, Gartner research has found that many businesses lose thousands of dollars each year as a result of “lost productivity” stemming from poor-quality data, and estimates that poor data quality costs organizations an average of $15 million per year.
Lost income can be caused both directly and indirectly by poor-quality data:
- Inaccurate personal information, such as mailing addresses, may result in products being shipped to the wrong place
- Incorrect client information may lead to the loss of customers
- Misleading product information may cause complaints and a damaged reputation
Inaccurate analytics: Data analysis or predictive analytics, when based on incomplete and inaccurate data, risks misinterpretation and poor decision-making. Duplicated data, missing fields, etc., can produce an analysis that should not be trusted.
Fines for privacy invasions: Europe, California, Canada, and Brazil have made significant efforts to protect the privacy of their citizens, and the United States protects its citizens’ medical information. If personal data is collected or used illegally, the business can expect a hefty fine. Other issues that result in fines include exposing people’s personal data to criminals through minimal, poor, or no security, failing to notify authorities (or customers) of data breaches affecting customers, and collecting personal data on children.
Reduced efficiency: The majority of internal business processes require reliable data to function optimally. If the business’s data is incomplete or inaccurate, staff will have to spend time researching the correct information and fixing errors manually. The time these corrections take drags down the business’s efficiency and profitability. This can be especially damaging if some departments follow a data silo philosophy, which makes corrections difficult.
Missed opportunities: Without high-quality data to base your decisions on, your business will miss important opportunities. For example, poor data may mean you miss out on:
- Market trends
- Customer insights
- Product improvements
Inaccurate data can also reduce lead generation, making it more difficult to target promising prospects.
Reputational damage: This often goes hand in hand with lost income and damages the business’s growth. Customers will quickly lose trust in a business if they believe it is mishandling their personal data. (Mishandling personal data can also result in fines.) Customers often share bad experiences online in an effort to warn others of a dangerous situation. Issues that might raise red flags for customers are:
- Inaccurate or missing product specifications
- Incorrect customer information
- Multiple marketing emails sent to the same recipient by accident
- Poor data management
Wasted time, wasted money: Cleaning up data after it has been stored is much more time-consuming than using standardized behaviors to collect and process data. Businesses can easily lose profits if their sales teams waste time with bad leads created from low-quality data. Marketing departments can launch expensive marketing campaigns that fail because they target the wrong demographics.
The destruction of morale: Low-quality data can destroy the morale of sales and marketing teams. They can waste hours and hours searching for new opportunities based on faulty data. If sales staff (or researchers) have to manually identify and correct poor data – time-consuming and tedious – it can damage faith in the organization’s leadership.
Three Basic Methods for Obtaining High-Quality Data
For a business to operate at its best, it needs high-quality data. This helps in making informed, intelligent decisions, and increases efficiency to maximize profits. These three basic methods can be used to improve data quality:
1. Develop a workplace culture that supports the gathering and storage of high-quality data. The culture must support intelligent data rules designed to standardize data formats and eliminate data duplication. Poor-quality data is usually the result of a lack of standardized guidelines and procedures. Implementing rules for handling data helps to ensure high-quality data. Some useful guidelines are:
- Develop a standardized naming process and use consistent formats for such things as times, dates, and addresses. This helps in locating data and minimizes duplication.
- Treat fields (the blank spaces on forms) individually. This helps you identify the critical fields for data completeness and apply appropriate rules to ensure they’re filled out.
- Establish clearly defined responsibilities to individuals for the data (data ownership).
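The formatting rule above can be enforced in code. A minimal sketch of date standardization, assuming ISO 8601 (YYYY-MM-DD) as a hypothetical house format and an illustrative, non-exhaustive list of accepted input variants:

```python
from datetime import datetime

# Assumed house standard: dates stored as ISO 8601 (YYYY-MM-DD).
# The accepted input variants below are illustrative, not exhaustive.
INPUT_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d %b %Y"]

def standardize_date(raw):
    """Parse a date written in any accepted variant; return the ISO form."""
    for fmt in INPUT_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    # Reject anything unrecognized rather than storing it inconsistently.
    raise ValueError(f"unrecognized date format: {raw!r}")

print(standardize_date("03/15/2024"))   # 2024-03-15
print(standardize_date("15 Mar 2024"))  # 2024-03-15
```

The same normalize-at-the-door pattern applies to names, phone numbers, and addresses: accept known variants, convert them to one canonical form, and reject the rest for manual review.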
2. Audit and clean the data. It’s important to check the quality of a business’s data on a regular basis. Conducting a data audit reveals problems. Continuous, real-time data maintenance eliminates the use of stagnant and decayed data, and ensures the marketing and sales teams are working with useful data.
3. Apply the following five principles when working with data:
- Accuracy: This focuses on how the data matches up with reality. The better the match with reality, the more accurate the data. (Is the client’s name spelled correctly? Is the mailing address correct?)
- Completeness: Are there blank spaces in the form instead of the needed information?
- Consistency: Differences between copies of the same data are inconsistencies. The correct data should be identified, and the faulty data should be eliminated.
- Uniqueness: The data should reflect the real world, and should not have a duplicate copy. For example, if Rocal Ltd. and its owner, Joe Jones, are both used for billing, one must be eliminated.
- Timeliness: Data has to be current with regard to the business’s needs. There should be a system in place that flags data after it has reached a certain age. Additionally, authorized users should be able to update or change data manually, as needed.
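The completeness and timeliness principles lend themselves to an automated check. A sketch, in which the required fields and the two-year age threshold are assumptions for illustration rather than universal rules:

```python
from datetime import date, timedelta

# Assumed policy: flag records not verified within roughly two years.
MAX_AGE = timedelta(days=730)

# Illustrative critical fields; a real schema would define its own.
REQUIRED_FIELDS = ["name", "email", "last_verified"]

def quality_flags(record, today):
    """Return a list of quality problems found in one record."""
    flags = []
    # Completeness: every critical field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            flags.append(f"missing {field}")
    # Timeliness: flag stale records for manual re-verification.
    verified = record.get("last_verified")
    if verified and today - verified > MAX_AGE:
        flags.append("stale: needs re-verification")
    return flags

record = {"name": "Rocal Ltd.", "email": "", "last_verified": date(2020, 1, 1)}
print(quality_flags(record, today=date(2024, 1, 1)))
# ['missing email', 'stale: needs re-verification']
```

Running such a check continuously, rather than during occasional cleanups, is what turns a one-off audit into the real-time maintenance described in step 2.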