Electronic data exists everywhere, in many formats and in ever-increasing volumes. Integrating information from different data sources is crucial to almost every modern enterprise, because combined data often reveals information that no single source provides. However, integrating information is not an easy task; it is one of the most time-consuming problems in data management.
To make integration easier, the relationships between data sources in an integration system are typically specified declaratively. These declarative mappings are therefore primary components of an information integration system. Researchers at UC Santa Cruz have embarked on a rigorous study of several fundamental operators for manipulating such mappings. They are also developing a novel system that would allow an integration designer to use data examples to understand, debug, refine, and choose among alternative declarative mappings between data sources. The fundamental operators are important for three purposes: optimizing access across data sources, evolving mappings as the underlying data sources evolve, and reusing existing mappings when new data sources are integrated.
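As a rough illustration of what a declarative mapping and an operator on mappings can look like, the following minimal sketch assumes the simplest possible kind of mapping: a table that says which source field feeds each target field. The schemas, field names, and the choice of composition as the example operator are all hypothetical and are not taken from the project itself; mappings in real integration systems are generally far more expressive than field renaming.

```python
from typing import Dict, List

Row = Dict[str, str]
# Assumption: a mapping is declared as data, target field -> source field.
Mapping = Dict[str, str]


def apply_mapping(mapping: Mapping, rows: List[Row]) -> List[Row]:
    """Translate source rows into target rows as the declaration states."""
    return [
        {target: row[source] for target, source in mapping.items()}
        for row in rows
    ]


def compose(first: Mapping, second: Mapping) -> Mapping:
    """Build one mapping equivalent to applying `first`, then `second`."""
    return {target: first[middle] for target, middle in second.items()}


# Hypothetical schemas: an HR source, a directory, and a badge printer.
hr_to_directory: Mapping = {"full_name": "name", "office": "dept"}
directory_to_badge: Mapping = {"label": "full_name"}

# A small data example that a designer could inspect to check the mapping.
hr_rows: List[Row] = [{"name": "Ada", "dept": "Research", "salary": "100"}]

directory_rows = apply_mapping(hr_to_directory, hr_rows)

# Reuse the two existing mappings to obtain a direct HR-to-badge mapping.
hr_to_badge = compose(hr_to_directory, directory_to_badge)
badge_rows = apply_mapping(hr_to_badge, hr_rows)
```

Because the mapping is plain data rather than hand-written transformation code, it can be inspected, combined with other mappings, and checked against small data examples such as `hr_rows`. This is the kind of manipulation the operators described above are meant to support, in a much more general setting.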
Next steps: As part of an ongoing project, Professor Wang-Chiew Tan and collaborators will continue working to make information integration easier to design and understand by studying novel frameworks and algorithms for manipulating mappings.