Similarity Flooding: A Versatile Graph Matching Algorithm and Its Application to Schema Matching
Matching elements of two data schemas or two data in- stances plays a key role in data warehousing, e-business, or even biochemical applications. In this paper we present a matching algorithm based on a fixpoint computation that is usable across different scenarios. The algorithm takes two graphs (schemas, catalogs, or other data structures)as input, and produces as output a mapping between corre- sponding nodes of the graphs. Depending on the matching goal, a subset of the mapping is chosen using filters. After our algorithm runs, we expect a human to check and if nec- essary adjust the results. As a matter of fact, we evaluate the 'accuracy' of the algorithm by counting the number of needed adjustments. We conducted a user study, in which our accuracy metric was used to estimate the labor sav- ings that the users could obtain by utilizing our algorithm to obtain an initial matching. Finally, we illustrate how our matching algorithm is deployed as one of several high-level operators in an implemented testbed for managing informa- tion models and mappings.