Data quality should be treated as an ongoing improvement program that is managed throughout the data lifecycle.
The Data Management Association (DAMA) defines common characteristics (dimensions) of data quality as:
- accuracy
- completeness
- consistency
- integrity
- reasonability
- timeliness
- uniqueness/deduplication
- validity
Data quality management is a continuous process that involves managing data from its initial creation to its potential destruction. The quality of your agency's data should always be fit for purpose. You can support this by establishing a data quality strategy that facilitates proactive monitoring and management of data quality. For example, you can embed data quality assessments in data migration activities.
A data quality strategy should link to your broader data and information governance environment, including your information governance framework.
Data quality assessment
A good data quality strategy defines appropriate standards, requirements and specifications for data quality controls. This includes developing data dimensions relevant to your business needs to monitor, measure, and report on quality levels of your data.
Data quality assessment tells you how well your data meets your stakeholders' requirements and helps you prioritise remediation of high-value datasets.
Data quality is assessed by measuring specific dimensions of your data.
These dimensions provide:
- a vocabulary for defining data requirements
- a way to determine data quality assessment results
- a metric for ongoing measurement and improvement.
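As a minimal sketch of how dimensions can become metrics, the following Python example scores three of the dimensions listed earlier (completeness, uniqueness and validity) as simple ratios. The sample records, field names and the four-digit postcode rule are assumptions made for illustration and do not come from any particular standard.

```python
import re

# Hypothetical sample records; field names are illustrative only.
records = [
    {"id": "A1", "email": "kim@example.org", "postcode": "4000"},
    {"id": "A2", "email": "", "postcode": "4101"},
    {"id": "A2", "email": "lee@example", "postcode": "41O1"},
    {"id": "A4", "email": "sam@example.org", "postcode": None},
]

def completeness(rows, field):
    """Share of rows where the field is present and non-empty."""
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)

def uniqueness(rows, field):
    """Share of distinct values among all values of the field."""
    values = [r.get(field) for r in rows]
    return len(set(values)) / len(values)

def validity(rows, field, pattern):
    """Share of non-empty values that fully match the expected format."""
    values = [r[field] for r in rows if r.get(field) not in (None, "")]
    if not values:
        return 0.0
    matches = sum(1 for v in values if re.fullmatch(pattern, v))
    return matches / len(values)

# Assumed rule for this sketch: a postcode is exactly four digits.
scores = {
    "email completeness": completeness(records, "email"),
    "id uniqueness": uniqueness(records, "id"),
    "postcode validity": validity(records, "postcode", r"\d{4}"),
}

for name, score in scores.items():
    print(f"{name}: {score:.0%}")
```

Scores like these can be tracked over time and compared against thresholds that your agency sets for each dataset.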
Several sets of dimensions can be used to assess data quality, such as:
- the common dimensions of data quality from DAMA's Data Management Body of Knowledge (DMBOK)
- the Australian Bureau of Statistics (ABS) dimensions, supported by ABS guidance on assessing the quality of statistical data against them
- ISO 8000-110:2021, part of the ISO 8000 series of international standards for data quality and enterprise master data, which you can use to inform your agency's data quality standards.
Data quality tools
Tools can be used as a guide to understand the different dimensions of data quality and to generate data quality statements. An example is the NSW Government data quality reporting tool, which can generate data quality statements in various document formats.
Tools that automate data profiling and cleansing are also available and can help your agency improve the quality of large volumes of data.
These tools can:
- profile, clean and monitor data quality over time
- assist in the validation of data
- provide statistics on your agency's data
- help to identify patterns and provide direction on future data remediation.
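The kind of statistics a profiling tool reports can be sketched in a few lines of Python. This is an illustration only, not the behaviour of any specific product; the `profile` function and the sample rows are hypothetical.

```python
from collections import Counter

def profile(rows):
    """Return, per field, the number of missing values, the number of
    distinct values and the most common value."""
    fields = sorted({key for row in rows for key in row})
    report = {}
    for field in fields:
        values = [row.get(field) for row in rows]
        present = [v for v in values if v not in (None, "")]
        counts = Counter(present)
        report[field] = {
            "missing": len(values) - len(present),
            "distinct": len(counts),
            "most_common": counts.most_common(1)[0][0] if counts else None,
        }
    return report

# Hypothetical sample data for illustration.
sample = [
    {"agency": "Health", "status": "open"},
    {"agency": "Health", "status": ""},
    {"agency": "Transport", "status": "open"},
    {"agency": "Health"},
]

summary = profile(sample)
for field, stats in summary.items():
    print(field, stats)
```

A high count of missing values in one field, or an unexpectedly large number of distinct values, is the sort of pattern that can point to where remediation is needed.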
TIP: Data remediation can be achieved with extract, transform, load (ETL) software, which processes data according to business rules and transforms it into the required format.
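The idea behind rule-based remediation can be sketched in Python as follows. The business rules used here (dates arrive as DD/MM/YYYY and are stored as YYYY-MM-DD, state codes are upper case, extra whitespace in names is collapsed) are assumptions chosen for illustration, and the function names are hypothetical rather than part of any ETL product.

```python
from datetime import datetime

def to_iso_date(value):
    """Assumed rule: dates arrive as DD/MM/YYYY and are stored as YYYY-MM-DD."""
    return datetime.strptime(value.strip(), "%d/%m/%Y").date().isoformat()

def transform(row):
    """Apply the assumed business rules to one extracted row."""
    return {
        "name": " ".join(row["name"].split()),   # collapse extra whitespace
        "state": row["state"].strip().upper(),   # normalise state codes
        "created": to_iso_date(row["created"]),  # normalise the date format
    }

def run_etl(extracted):
    """Transform each row; rows that break a rule are set aside for review."""
    loaded, rejected = [], []
    for row in extracted:
        try:
            loaded.append(transform(row))
        except (KeyError, ValueError, AttributeError) as exc:
            rejected.append({"row": row, "reason": str(exc)})
    return loaded, rejected

# Hypothetical extracted rows; the second has a date in an unexpected format.
source = [
    {"name": "  Jo   Smith ", "state": "qld ", "created": "03/07/2021"},
    {"name": "Ali Khan", "state": "nsw", "created": "2021-07-03"},
]

loaded, rejected = run_etl(source)
print(loaded)
print(len(rejected), "row(s) rejected")
```

Setting rejected rows aside, rather than silently dropping or guessing at them, keeps a record of what still needs manual remediation.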
Poor data quality
Common culprits for poor data quality | Outcome |
---|---|
Incorrect data entry validation | Invalid data is entered into the database |
Change in business rules | New rules are not correctly propagated throughout existing data |
Changes to the source data structure | Third parties implement changes without notifying downstream users, or business rules are not updated on systems after changes are notified |
Requirement for uniqueness of instances | Incorrect identifiers are created |
Incorrect business rules being applied to data | Loss of data |
Incorrect temporal information | Difficulty identifying the latest version of information and data, resulting in duplication |
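To illustrate the last row of the table, the sketch below shows how reliable temporal information makes it possible to identify the latest version of each record and remove older duplicates. It assumes every record carries an identifier and an ISO 8601 timestamp; the `id` and `updated` field names and the sample data are hypothetical.

```python
from datetime import datetime

def latest_versions(rows, key="id", stamp="updated"):
    """Keep one row per key: the row with the most recent timestamp."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or (
            datetime.fromisoformat(row[stamp])
            > datetime.fromisoformat(current[stamp])
        ):
            latest[row[key]] = row
    return list(latest.values())

# Hypothetical records; R1 appears twice with different timestamps.
versions = [
    {"id": "R1", "updated": "2023-01-05T10:00:00", "title": "Draft"},
    {"id": "R2", "updated": "2023-02-01T09:30:00", "title": "Policy"},
    {"id": "R1", "updated": "2023-03-12T14:15:00", "title": "Final"},
]

deduplicated = latest_versions(versions)
print(deduplicated)
```

If the timestamps themselves are missing or wrong, this approach cannot decide which version to keep, which is why incorrect temporal information leads to duplication.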
Data quality and metadata
Good metadata is essential to understanding and assessing the quality of your data. Data quality assessments determine whether your data meets the expectations of its consumers, and metadata plays a key role in clarifying those expectations. For example, you can look at a record's metadata to see whether it meets format requirements or has been updated according to business rules.
Metadata can also be used to record data quality assessments. This means metadata repositories can be used for storing and sharing data quality assessment results across your organisation.
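One way to picture an assessment result stored as metadata is a small structured record that can be serialised and shared. The sketch below is an assumption about what such a record might contain; the `QualityAssessment` class, its fields and the sample values are hypothetical and do not follow any specific metadata standard.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class QualityAssessment:
    """Hypothetical record of one data quality assessment."""
    dataset: str
    dimension: str
    score: float        # measured value between 0.0 and 1.0
    threshold: float    # minimum acceptable value
    assessed_on: str    # ISO 8601 date
    notes: str = ""

def to_metadata_entry(assessment):
    """Convert an assessment to a dictionary suitable for a metadata store."""
    entry = asdict(assessment)
    entry["meets_threshold"] = assessment.score >= assessment.threshold
    return entry

# Illustrative values only.
assessment = QualityAssessment(
    dataset="licence-register",
    dimension="completeness",
    score=0.93,
    threshold=0.95,
    assessed_on="2024-05-01",
    notes="Postcode missing for some legacy records.",
)

entry = to_metadata_entry(assessment)
metadata_json = json.dumps(entry, indent=2)
print(metadata_json)
```

Storing results in a consistent structure like this makes it easier to compare assessments across datasets and over time.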
Your metadata and data quality teams can work closely together to develop these processes. Their combined expertise can ensure that business rules, measurements or issues related to data quality are documented, developed and managed as per your agency's data strategy.