Design a system which improves the quality of the data.

Medium
Companies: Google, Amazon, Uber

Imagine you're working on a data-intensive application. Over time, the quality of the data deteriorates for various reasons: incorrect user input, data migration errors, inconsistencies introduced when integrating with external systems, or just plain old data decay. This dirty data leads to inaccurate reports and faulty decision-making, and ultimately erodes trust in the system.

Your task is to design a system that proactively identifies and improves data quality. This system should be able to:

  • Define Data Quality Rules: Allow users to define rules that capture acceptable data quality standards. For example, "email address must be in a valid format," "age must be a positive integer," "product price cannot be negative," or more complex rules such as "shipping address must be within the service area".
  • Data Validation: Scan existing data against these defined rules and identify violations.
  • Data Correction Suggestions: Propose potential corrections for data quality issues. This could be simple suggestions like "trim leading/trailing spaces" or more complex ones based on domain knowledge (e.g., suggesting a valid zip code based on the city and state).
  • Data Correction Application: Apply corrections to the data, either automatically (with appropriate safeguards) or manually by a user.
  • Reporting & Monitoring: Provide reports on data quality metrics (e.g., percentage of records passing validation, types of violations observed).
  • Extensibility: Allow for easy addition of new data quality rules and correction strategies.
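One way to picture how rule definition, validation, and correction suggestions could fit together is the minimal sketch below. It is an illustration under stated assumptions, not a prescribed solution: the names `Rule`, `Violation`, `validate`, and `apply_suggestions` are hypothetical, records are assumed to be plain dictionaries, and the email pattern is a deliberately loose approximation rather than a full address validator.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, Optional

Record = dict[str, Any]  # assumption: each record is a flat dict


@dataclass(frozen=True)
class Rule:
    """A hypothetical rule: a named check on one field, plus an optional fixer."""
    name: str
    field: str
    check: Callable[[Any], bool]
    suggest: Optional[Callable[[Any], Any]] = None  # returns a fix or None


@dataclass(frozen=True)
class Violation:
    record_index: int
    rule: str
    field: str
    value: Any
    suggestion: Any = None


# Loose approximation of an email format; not a complete validator.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_email(value: Any) -> bool:
    return isinstance(value, str) and EMAIL_RE.fullmatch(value) is not None


def trim_if_valid(value: Any) -> Optional[str]:
    """Suggest the trimmed value only when trimming makes it valid."""
    if isinstance(value, str) and is_email(value.strip()):
        return value.strip()
    return None


def is_positive_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


# New rules are added by appending to this list; the engine does not change.
RULES = [
    Rule("valid_email", "email", is_email, trim_if_valid),
    Rule("positive_age", "age", is_positive_int),
    Rule("non_negative_price", "price",
         lambda v: isinstance(v, (int, float)) and v >= 0),
]


def validate(records: list[Record], rules: list[Rule]) -> list[Violation]:
    """Scan every record against every rule and collect violations."""
    violations = []
    for index, record in enumerate(records):
        for rule in rules:
            value = record.get(rule.field)
            if not rule.check(value):
                fix = rule.suggest(value) if rule.suggest else None
                violations.append(
                    Violation(index, rule.name, rule.field, value, fix))
    return violations


def apply_suggestions(records: list[Record],
                      violations: list[Violation]) -> list[Record]:
    """Return corrected copies; the original records are left untouched."""
    fixed = [dict(record) for record in records]
    for violation in violations:
        if violation.suggestion is not None:
            fixed[violation.record_index][violation.field] = violation.suggestion
    return fixed


records = [
    {"email": " ann@example.com ", "age": 30, "price": 9.5},
    {"email": "bob@example.com", "age": -4, "price": 2.0},
]
found = validate(records, RULES)
corrected = apply_suggestions(records, found)
```

In this sketch the padded email receives a trim suggestion, while the negative age is reported without one because no safe automatic fix is known. Returning corrected copies instead of mutating in place is one possible safeguard: a reviewer can compare the two versions before anything is written back.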

This is an exercise in building a resilient and adaptable data quality framework. The focus is on the core engine for defining, validating, and correcting data. Data source integration and external system interactions are assumed to be handled elsewhere and can be mocked.
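Since data source integration can be mocked, the reporting side of the engine can be exercised against an in-memory source. The sketch below is one possible shape for that, with hypothetical names throughout (`mock_source`, `CHECKS`, `quality_report`) and an assumed record layout; it computes the two example metrics mentioned above, the percentage of records passing validation and the count of violations by type.

```python
from collections import Counter
from typing import Any, Callable, Iterable, Optional

Record = dict[str, Any]  # assumption: each record is a flat dict
Check = Callable[[Record], bool]


def mock_source() -> list[Record]:
    """Stand-in for a real data source; the rows are made up for illustration."""
    return [
        {"id": 1, "age": 34, "price": 19.99},
        {"id": 2, "age": -1, "price": 5.00},
        {"id": 3, "age": 27, "price": -3.50},
        {"id": 4, "age": 0, "price": -1.00},
    ]


# Hypothetical record-level checks, keyed by rule name.
CHECKS: dict[str, Check] = {
    "positive_age": lambda r: isinstance(r.get("age"), int) and r["age"] > 0,
    "non_negative_price": lambda r: (
        isinstance(r.get("price"), (int, float)) and r["price"] >= 0
    ),
}


def quality_report(records: Iterable[Record],
                   checks: dict[str, Check]) -> dict[str, Any]:
    """Summarise pass rate and violation counts per rule."""
    total = 0
    passing = 0
    by_rule: Counter = Counter()
    for record in records:
        total += 1
        failed = [name for name, check in checks.items() if not check(record)]
        by_rule.update(failed)
        if not failed:
            passing += 1
    # With no records there is no meaningful pass rate, so report None.
    pass_rate: Optional[float] = 100.0 * passing / total if total else None
    return {
        "total": total,
        "passing": passing,
        "pass_rate_pct": pass_rate,
        "violations_by_rule": dict(by_rule),
    }


report = quality_report(mock_source(), CHECKS)
```

Because `quality_report` accepts any iterable of records, the mocked source could later be swapped for a real reader without changing the reporting logic.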

Requirements

Think like an Architect

Before revealing the requirements, imagine you're in the interview right now. "How would you clarify the scope with your interviewer?"
