Matching results in perspective

This example compares the different flavors of semantic matching. Assume we want to compare the two extracts of university course catalogs shown in Figure 1. Such a comparison is needed, for example, when a student transfers from one university to another and the latter has to decide which courses from the former to recognize.

Figure 1: Two example course catalogs to be compared.

A “default” semantic matcher looks for semantic similarities between every node of one tree and every node of the other, and returns mappings of three types: equal, more general, or less general. As Figure 2 shows, the result is the set of all mappings that exist between the nodes of the two trees.

Figure 2: “Default” semantic matching result.
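To make the shape of this output concrete, here is a minimal sketch in Python. It is not how the actual matcher computes relations: each node's meaning is approximated by a hand-written set of topics (an assumption made purely for illustration), and the names `relation`, `default_match`, and the toy catalogs are all hypothetical.

```python
from itertools import product

def relation(left_topics, right_topics):
    """Return '=', '>' (more general), '<' (less general), or None."""
    if left_topics == right_topics:
        return "="
    if left_topics > right_topics:   # proper superset: left covers more
        return ">"
    if left_topics < right_topics:   # proper subset: left covers less
        return "<"
    return None                      # no relation of the three types

def default_match(left, right):
    """Compare every left node with every right node and keep all relations."""
    result = []
    for (l, l_topics), (r, r_topics) in product(left.items(), right.items()):
        rel = relation(l_topics, r_topics)
        if rel is not None:
            result.append((l, rel, r))
    return result

# Hypothetical toy catalogs: node label -> set of topics it covers.
left = {"Courses": {"history", "geophysics"}, "History": {"history"}}
right = {"Course": {"history", "geophysics"}, "Geophysics": {"geophysics"}}

all_mappings = default_match(left, right)
for mapping in all_mappings:
    print(mapping)
```

Even in this toy case the root node collects one mapping per related node in the other tree, which is why the full result grows quickly.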

A smarter approach, especially if the mappings are to be visualized by people, is to return only the most important mappings for each node. For example, looking at all the mappings in Figure 2 for the “Courses” root node of the left tree, we can see that “Courses” is equal to “Course” (the root node of the right tree). All the other mappings from the “Courses” root node to nodes in the right tree (apart from the root node) are of type more general. This suggests that if two nodes are semantically equal, the child nodes of one are more specific than the other, just as they are more specific than their own parent. As Figure 2 shows, this property holds for all the child nodes of any two nodes related by the equal relation.

The minimal mappings version of semantic matching does exactly this: it collapses the links and returns only the most important mappings, that is, those that cannot be inferred from other mappings. For the exact set of rules used to collapse and expand mappings, see the minimal mappings page.

Figure 3: Minimal mappings result.

The output of the minimal mappings algorithm can be seen in Figure 3. Note how the set of mappings is drastically reduced in comparison to the set returned by the “default” semantic matcher in Figure 2. This reduced set is more human-readable and in general corresponds to what a person would expect to see as the result of a semantic matcher.
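The idea of dropping mappings that can be inferred from others can be sketched as follows. This is a simplified illustration and not the exact rule set of the minimal mappings page: it only uses the observation that a child node is more specific than its parent, so a more general or less general link is dropped when it follows from another link plus the tree structure. The trees, labels, and function names are hypothetical.

```python
def ancestors_or_self(node, parent):
    """Collect the node and all its ancestors; `parent` maps child -> parent."""
    found = set()
    while node is not None:
        found.add(node)
        node = parent[node]
    return found

def implied_by(mapping, other, left_parent, right_parent):
    """True if `mapping` follows from `other` and the two tree structures."""
    (c, rel, d), (a, other_rel, b) = mapping, other
    if mapping == other:
        return False
    if rel == ">" and other_rel in (">", "="):
        # c is above a, a is at least as general as b, b is above d
        return (c in ancestors_or_self(a, left_parent)
                and b in ancestors_or_self(d, right_parent))
    if rel == "<" and other_rel in ("<", "="):
        # c is below a, a is at most as general as b, b is below d
        return (a in ancestors_or_self(c, left_parent)
                and d in ancestors_or_self(b, right_parent))
    return False  # equality links are never dropped in this sketch

def minimal(mappings, left_parent, right_parent):
    """Keep only the mappings that no other mapping implies."""
    return [m for m in mappings
            if not any(implied_by(m, o, left_parent, right_parent)
                       for o in mappings)]

# Hypothetical toy trees (child -> parent) and a full "default" result.
left_parent = {"Courses": None, "History": "Courses"}
right_parent = {"Course": None, "History courses": "Course",
                "Ancient history": "History courses"}
full = [
    ("Courses", "=", "Course"),
    ("Courses", ">", "History courses"),
    ("Courses", ">", "Ancient history"),
    ("History", "<", "Course"),
    ("History", "=", "History courses"),
    ("History", ">", "Ancient history"),
]
reduced = minimal(full, left_parent, right_parent)
print(reduced)
```

In this toy input only the two equality links survive, since every other link can be rebuilt from them and the parent-child structure.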

The last flavor of semantic matching is Structure Preserving Semantic Matching (SPSM). This matcher is specifically designed to compare functions, such as web service descriptions. SPSM can be especially useful for facilitating automatic web service composition, since it returns a set of possible mappings between the functions and their parameters.

Figure 4: Result of Structure Preserving Semantic Matching (SPSM).

The output of SPSM is presented in Figure 4. Note how the set of mappings is also reduced in comparison to the “default” semantic matcher results in Figure 2. Comparing the results of SPSM with the minimal mappings results in Figure 3, we can see that the structural properties of the algorithm are preserved (see SPSM). Namely:

  1. Only one correspondence per node is returned. The node “History of Americas” in the right tree has only one mapping, to the node “America history”. In this case the strongest relation (equality) was chosen over the other mapping (less general) with “Latin America History”.
  2. Leaf nodes are matched to leaf nodes and internal nodes are matched to internal nodes. Comparing the minimal mappings and SPSM results for the node “Earth and Atmospheric Sciences” in the left tree, we can see that in Figure 3 (minimal mappings) “Earth and Atmospheric Sciences”, which is a leaf node, is mapped to “Earth Sciences”, which is an internal node. In contrast, in Figure 4 the mapping returned by SPSM is between “Earth and Atmospheric Sciences” and “Geophysics”, both of which are leaf nodes (note that the “Geophysics” leaf node is a child of the “Earth Sciences” node that appears in the mapping returned by the minimal mappings algorithm). The rationale is that a leaf node represents a parameter of a function, while an internal node corresponds to a function call. A parameter contains a value, and a value cannot be passed to a function call. Parameters (leaf nodes) should therefore only be matched to other parameters, in order to avoid losing parameter values.
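
The two properties above can be illustrated with a small filter over candidate mappings. This is only a sketch of the constraints, not the SPSM algorithm itself: it greedily prefers equality over the other relations and assumes a hypothetical tree shape around the node labels mentioned in the text. The relation types given for “Earth and Atmospheric Sciences” are also assumptions, since the text does not state them.

```python
# Lower number = stronger relation; equality wins over the other two.
STRENGTH = {"=": 0, "<": 1, ">": 1}

def is_leaf(node, parent):
    """A node is a leaf if no other node names it as its parent."""
    return node not in set(parent.values())

def structure_preserving(mappings, left_parent, right_parent):
    """Greedy filter: same node kind on both sides, one mapping per node."""
    used_left, used_right, kept = set(), set(), []
    # sorted() is stable, so ties keep their original order
    for l, rel, r in sorted(mappings, key=lambda m: STRENGTH[m[1]]):
        if is_leaf(l, left_parent) != is_leaf(r, right_parent):
            continue  # never pair a leaf with an internal node
        if l in used_left or r in used_right:
            continue  # each node takes part in at most one mapping
        kept.append((l, rel, r))
        used_left.add(l)
        used_right.add(r)
    return kept

# Assumed tree shapes (child -> parent); only the labels come from the text.
left_parent = {"Courses": None, "America history": "Courses",
               "Latin America History": "Courses",
               "Earth and Atmospheric Sciences": "Courses"}
right_parent = {"Course": None, "History of Americas": "Course",
                "Earth Sciences": "Course", "Geophysics": "Earth Sciences"}
candidates = [
    ("Courses", "=", "Course"),
    ("Latin America History", "<", "History of Americas"),
    ("America history", "=", "History of Americas"),
    ("Earth and Atmospheric Sciences", ">", "Earth Sciences"),  # assumed relation
    ("Earth and Atmospheric Sciences", ">", "Geophysics"),      # assumed relation
]
spsm_like = structure_preserving(candidates, left_parent, right_parent)
print(spsm_like)
```

With these inputs the leaf-to-internal link to “Earth Sciences” is rejected, and “History of Americas” keeps only its equality link.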