Abstract
Data quality is fundamental to keeping an organization's information accurate, consistent, and reliable, especially in master data management (MDM). A key challenge is integrating data from multiple sources, each with its own schema and format, which leads to inconsistencies and makes it difficult to create a unified view of the data. Automated data mapping and schema matching address this challenge by aligning data structures across systems. Using intelligent algorithms and machine learning models, these techniques automate the identification of relationships between data fields, reducing the manual effort and human error that mapping typically involves, which matters most for large, complex datasets. Organizations can therefore map and integrate data from multiple sources more quickly and with more consistent results. Because data ends up consistently structured across systems, data quality improves, redundancies and discrepancies are easier to eliminate, and it becomes easier to maintain a single, reliable source of truth for critical business information, which in turn supports better decision-making and operational efficiency. As these methods evolve, they also offer a scalable approach to managing data across diverse platforms. These advances are changing how businesses handle data integration, helping data remain a trusted asset that supports decision-making and business growth.
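To make the idea of automatically identifying relationships between data fields concrete, the following is a minimal sketch of name-based schema matching. It is not the method of any particular tool: it swaps in plain string similarity from Python's standard `difflib` module for the machine learning models mentioned above. The field names, the `match_schemas` helper, and the 0.6 threshold are all hypothetical, chosen only for illustration.

```python
from difflib import SequenceMatcher


def normalize(name):
    # Lowercase and drop separators so "Cust_Name" and "custName" compare closely.
    return "".join(ch for ch in name.lower() if ch.isalnum())


def similarity(a, b):
    # Ratio in [0, 1]; higher means the normalized field names are more alike.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_schemas(source_fields, target_fields, threshold=0.6):
    """Greedy one-to-one matching of source fields to target fields.

    The threshold is an illustrative assumption, not a recommended value.
    """
    # Score every source/target pair, best candidates first.
    candidates = sorted(
        ((similarity(s, t), s, t) for s in source_fields for t in target_fields),
        reverse=True,
    )
    mapping, used_targets = {}, set()
    for score, s, t in candidates:
        if score < threshold:
            break  # remaining pairs are too dissimilar to propose
        if s in mapping or t in used_targets:
            continue  # keep the mapping one-to-one
        mapping[s] = t
        used_targets.add(t)
    return mapping


# Hypothetical field lists from two systems holding customer master data.
crm_fields = ["cust_id", "CustName", "email_addr"]
erp_fields = ["customer_id", "customer_name", "email_address", "created_at"]

mapping = match_schemas(crm_fields, erp_fields)
for source, target in mapping.items():
    print(f"{source} -> {target}")
```

In practice, a proposed mapping like this would usually be reviewed by a data steward, and name similarity would be combined with other evidence such as data types and value distributions.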