Data Quality: The Foundation of Any Data Migration Project

Why data migration?

Data migration becomes necessary when an organization adopts a new application but wants to keep its business history and associated details intact. It is the exercise of transferring data from one system to another without compromising data consistency or the business value of the application.

Data migration is a crucial operation within an enterprise, and failure can be catastrophic. An inefficient data migration process can limit an organization’s ability to move forward with new technologies. Many companies that migrate without a proper tool incur significant expense and gain little or no value from the effort.

Scenario 1: Say Team A uses tool X as its collaboration platform for automated software delivery, and Team B uses tool Y for some other purpose. They want to transfer data from one tool to the other, yet each team will continue with its own tool after the migration is complete. Because the teams work in two separate tools, the tools need to stay in sync to increase efficiency and reduce delivery time, and both tools need easy access to the work items the teams use. This requires bidirectional synchronization of artifacts between the two systems, even after the migration.

Scenario 2: Say Team A migrates data from tool X to tool Y, which is managed by Team B. In a large enterprise, migration typically cannot take place overnight, so both tools need to run in parallel during the transition period. Once the migration is complete, tool X is retired.

In both scenarios, maintaining data quality is a difficult task.

Challenges in data migration

There are several challenges in data migration projects. Some of them are as follows:

Importance of data quality: As with most data integration efforts, maintaining data quality is one of the biggest challenges in data migration. Data quality issues are amplified when data moves from a legacy system with poor data quality to a newer application that has a rich feature set and a strict data model. This calls for a lot of planning before the migration begins. Poor-quality data adds significant cost to post-migration maintenance, with issues ranging from poor business intelligence to delays or disruption in business processes.

What is data quality?

There is no formal definition of data quality; data is simply judged to be ‘good’ or ‘bad’. Quality data fulfills the underlying need to extract information from the data in use, so data quality is measured in the context in which the data is used.

Issues with poor quality of data  

  • Slippage of the project timeline, because reconciling migrated data needs additional man-hours
  • Loss of credibility of the migration system with the team and the customer
  • Customer dissatisfaction due to erroneous or duplicate data
  • Compliance problems
  • System integration issues

How to solve data quality problems

There is no unique or straightforward solution to data quality problems, but you can reduce them over time through experience and a few rules of thumb. Whether you use proprietary or commercial tools or develop your own migration tool, the following concepts will help.

  • Apply data cleansing/data scrubbing

This exercise requires identifying the quality data to be moved while keeping its business value the same across systems. The migration system therefore needs a very intuitive query system and interface for filtering data.
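As a rough illustration of filtering before the move, the sketch below splits source records into accepted and rejected sets by applying a list of rules. The record fields (`title`, `status`) and the two rules are hypothetical; a real project would derive its rules from the business requirements.

```python
def select_records(records, rules):
    """Split records into those that pass every rule and those that fail any."""
    accepted, rejected = [], []
    for record in records:
        if all(rule(record) for rule in rules):
            accepted.append(record)
        else:
            rejected.append(record)
    return accepted, rejected


# Hypothetical cleansing rules, for illustration only.
def has_title(record):
    title = record.get("title")
    return isinstance(title, str) and title.strip() != ""


def is_not_obsolete(record):
    return record.get("status") != "obsolete"


source_records = [
    {"id": 1, "title": "Login fails", "status": "open"},
    {"id": 2, "title": "  ", "status": "open"},             # blank title
    {"id": 3, "title": "Old report", "status": "obsolete"},  # no longer needed
]

accepted, rejected = select_records(source_records, [has_title, is_not_obsolete])
print([r["id"] for r in accepted])  # [1]
print([r["id"] for r in rejected])  # [2, 3]
```

Keeping the rejected records, rather than silently dropping them, leaves a list that the business owner can review before the migration runs.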

  • Correct data based on their data types

Differences in data types can be handled by mapping each source type to the closest type available in the target tool, so that data integrity is maintained.
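One way to picture the closest-type idea is an explicit mapping table. The sketch below is only an assumption about how such a mapping might look: the type names (`integer`, `rich_text`, `number`, and so on) are invented for the example and do not belong to any particular tool.

```python
from datetime import date, datetime

# Hypothetical mapping from source field types to the nearest target types.
TYPE_MAP = {
    "integer": "number",
    "float": "number",
    "rich_text": "text",
    "datetime": "date",
}


def convert_value(value, source_type):
    """Convert a source value to the closest type the target supports."""
    target_type = TYPE_MAP.get(source_type)
    if target_type is None:
        # Fail loudly rather than guess: an unmapped type is a planning gap.
        raise ValueError(f"no target type mapped for {source_type!r}")
    if value is None:
        return None
    if target_type == "number":
        return float(value)
    if target_type == "text":
        return str(value)
    # target_type == "date": lossy, the time of day is dropped.
    return value.date() if isinstance(value, datetime) else value


print(convert_value(3, "integer"))                            # 3.0
print(convert_value(datetime(2020, 1, 2, 10, 30), "datetime"))  # 2020-01-02
```

Approximations such as `datetime` to `date` lose information, so each lossy mapping is worth recording and agreeing on before the migration starts.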

  • Correct data formats

If the source tool supports complex data formats (e.g. sub-records) but the target tool does not, the migration application has to transform the complex data into a form the target can accept.
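Flattening is one possible transformation for sub-records. The sketch below assumes the sub-records arrive as nested dictionaries and that the target accepts flat fields with dotted names; both are assumptions made for the example.

```python
def flatten_record(record, parent_key="", separator="."):
    """Flatten nested sub-records into a single level of dotted field names."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{separator}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into the sub-record, carrying the path so far.
            flat.update(flatten_record(value, full_key, separator))
        else:
            flat[full_key] = value
    return flat


nested = {
    "id": 7,
    "reporter": {"name": "A. User", "contact": {"email": "a.user@example.com"}},
}
flat = flatten_record(nested)
print(flat)
# {'id': 7, 'reporter.name': 'A. User', 'reporter.contact.email': 'a.user@example.com'}
```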

  • Remove duplicate records

Typically, when a migration system runs across a series of servers in parallel, data duplication is common. The serious problem arises once duplicates have been identified and need to be removed: deciding which copy to keep and which to remove, or whether to delete everything and re-migrate, is a major decision. A proven data migration tool will never allow duplicate data into the system.
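A common way to keep duplicates out in the first place is to make each write idempotent by keying it on the record's identity in the source system. The sketch below uses an in-memory class, `TargetStore`, as a hypothetical stand-in for the target tool; a real target would usually enforce the same rule with a unique constraint or an equivalent lookup.

```python
import threading


class TargetStore:
    """In-memory stand-in for the target tool (hypothetical, for illustration)."""

    def __init__(self):
        self._records = {}
        self._lock = threading.Lock()

    def upsert(self, source_system, source_id, payload):
        """Write a record keyed by its source identity.

        Returns True if a new record was created, False if an existing
        record with the same source identity was overwritten.
        """
        key = (source_system, source_id)
        with self._lock:
            created = key not in self._records
            self._records[key] = payload
        return created

    def count(self):
        with self._lock:
            return len(self._records)


store = TargetStore()
# Two parallel workers submitting the same source record is simulated
# here by two consecutive calls with the same source identity.
first = store.upsert("tool-x", "BUG-101", {"title": "Login fails"})
second = store.upsert("tool-x", "BUG-101", {"title": "Login fails"})
print(first, second, store.count())  # True False 1
```

Because the second write replaces the first instead of adding a row, re-running a failed batch does not create duplicates.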

  • Define hierarchies between data sets

It is important that, once migrated, the interrelationships between data sets and subsets are maintained in the destination tool. Ensure that attachments, comments, and links also move to the new tool along with the proprietary data.
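Preserving relationships usually means rewriting references from source IDs to the IDs assigned by the target. The sketch below assumes a single `parent_id` field and a ready-made `id_map` from source to target IDs; both are hypothetical simplifications. It also reports references whose parent was not migrated, so they can be investigated instead of being lost silently.

```python
def remap_links(records, id_map):
    """Rewrite IDs and parent references from source IDs to target IDs.

    Returns the remapped records and a list of (record_id, parent_id)
    pairs whose parent is missing from id_map.
    """
    remapped, dangling = [], []
    for record in records:
        new_record = dict(record)
        new_record["id"] = id_map[record["id"]]
        parent = record.get("parent_id")
        if parent is not None:
            if parent in id_map:
                new_record["parent_id"] = id_map[parent]
            else:
                # The parent was never migrated: flag it and clear the link.
                dangling.append((record["id"], parent))
                new_record["parent_id"] = None
        remapped.append(new_record)
    return remapped, dangling


source_items = [
    {"id": "S1", "parent_id": None},
    {"id": "S2", "parent_id": "S1"},
    {"id": "S3", "parent_id": "S9"},  # S9 is not part of this migration
]
id_map = {"S1": "T10", "S2": "T11", "S3": "T12"}

remapped, dangling = remap_links(source_items, id_map)
print(remapped[1])  # {'id': 'T11', 'parent_id': 'T10'}
print(dangling)     # [('S3', 'S9')]
```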

Conclusion

Maintaining data quality is of utmost importance, as any business runs on data. Maintaining the overall quality of a data migration system is more difficult than migrating the data itself. In fact, quality assurance starts well before the actual migration. The earlier it begins, the lower your risk of failure. There are different types of risk, and different methods to minimize them.

Data migration quality assurance methods can be divided into two groups: testing techniques and project management-based techniques. The former verify that the migration process follows the right path, while the latter help you prepare better for the migration project.
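As one example of a testing technique, a reconciliation check compares what was in the source with what arrived in the target. The sketch below is a minimal version under stated assumptions: both sides are lists of dictionaries that share an `id` key, and the records are compared in the same shape (in practice the source side would first go through the same transformations as the migration). The function names are hypothetical.

```python
import hashlib
import json


def fingerprint(record):
    """Return a stable hash of a record's content."""
    canonical = json.dumps(record, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def reconcile(source, target, key="id"):
    """Report records that are missing, unexpected, or changed in the target."""
    source_by_key = {r[key]: fingerprint(r) for r in source}
    target_by_key = {r[key]: fingerprint(r) for r in target}
    shared = source_by_key.keys() & target_by_key.keys()
    return {
        "missing": sorted(source_by_key.keys() - target_by_key.keys()),
        "unexpected": sorted(target_by_key.keys() - source_by_key.keys()),
        "changed": sorted(k for k in shared if source_by_key[k] != target_by_key[k]),
    }


source_rows = [{"id": 1, "title": "a"}, {"id": 2, "title": "b"}, {"id": 3, "title": "c"}]
target_rows = [{"id": 1, "title": "a"}, {"id": 2, "title": "B"}, {"id": 4, "title": "d"}]

report = reconcile(source_rows, target_rows)
print(report)  # {'missing': [3], 'unexpected': [4], 'changed': [2]}
```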

Joydeep Datta serves as Kovair Omnibus Technical Architect, responsible for all aspects of the Omnibus platform. He brings to this role an extensive background in software integration technologies and software development tools, and has more than 12 years of experience. He is responsible for designing and implementing enterprise-grade products and specializes in process and data integration.