Conditions let you specify restrictions on individual items in a query. To place restrictions across different items we use constraints. Constraints compare one vertex or edge in the query to another vertex or edge.
QGraph provides two types of constraints:
Identity constraints compare the identity of the corresponding database objects, for example, making sure that the same object does not match multiple query elements.
Attribute constraints compare attribute values for the corresponding database objects, for example, requiring that objects have the same attribute value.
Both types of constraints are described in more detail below.
Because constraints involve more than one query element, they apply to the query as a whole rather than to a specific vertex or edge. We therefore draw constraints separate from any query element. Figure 5.1 shows an example query with an identity constraint that requires that the object matching vertex A must be different from the object matching vertex C.
Identity constraints are commonly used to ensure that the same database element does not match two different query elements. Consider the database fragment shown in Figure 5.2. This database represents interconnected web pages.
Like many long web pages, page3.html links to itself, creating a loop. Notice also that this database does not have any link attributes. Neither QGraph nor Proximity requires the use of attributes for either objects or links in a database.
Figure 5.3 shows a query designed to find the cluster of pages (star) linked from each web page in the database.
With no constraints, we get the following results when the query is run on the database fragment shown in Figure 5.2.
The subgraph with page3.html as the core_page shows how Proximity used the link from page3.html to itself to match the query. As we saw in Chapter 2, Query Basics, QGraph does not require that distinct query elements be matched by distinct database elements. Because this object matches both the core_page and the linked_page vertices in the query, it appears twice in the query results, once for each corresponding query vertex.
To eliminate such duplicated elements, we use an identity constraint that specifies that the object that matches the core_page vertex must not be the same as the object that matches the linked_page vertex.
The revised query with the identity constraint is shown in Figure 5.5.
The query’s constraint, core_page < > linked_page, prohibits matching the same object to both the core_page and linked_page vertices.
The results of executing this query on the database fragment shown in Figure 5.2 are shown below:
The subgraph with page3.html as the core_page is no longer included in the query results. The constraint ensures that the same object cannot match both of the query’s vertices.
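The effect of this identity constraint can be sketched in a few lines of Python. This is only an illustration, not how Proximity executes queries: the link table is hypothetical (only the self-link on page3.html comes from the text), pages are identified by name for simplicity, and the function name find_stars and its distinct flag are made up for this example.

```python
# Hypothetical link table of (source page, target page) pairs.
# Only the page3.html self-link is taken from the text; the rest is invented.
links = [
    ("page1.html", "page2.html"),
    ("page1.html", "page3.html"),
    ("page2.html", "page3.html"),
    ("page3.html", "page3.html"),  # the self-link (loop)
]


def find_stars(links, distinct=False):
    """Group linked pages under each core page.

    With distinct=True, mimic the constraint core_page < > linked_page
    by skipping pairs in which both vertices match the same object.
    """
    stars = {}
    for core_page, linked_page in links:
        if distinct and core_page == linked_page:
            continue  # same object would match both query vertices
        stars.setdefault(core_page, []).append(linked_page)
    return stars


unconstrained = find_stars(links)
constrained = find_stars(links, distinct=True)

print(unconstrained)  # page3.html appears as a core page linked to itself
print(constrained)    # page3.html no longer appears as a core page
```

In this made-up data page3.html links only to itself, so once the constraint is applied it has no remaining linked pages and drops out of the results as a core page.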
Another common use for constraints is to remove equivalent subgraphs from a query’s results. Consider the fragment of a genealogy database shown below:
A query (without constraints) that finds both parents of an individual is shown in Figure 5.8.
As we saw before, a query without constraints can return matches that include repeated elements as well as equivalent subgraphs where the same objects match different vertices, but with the order reversed. We call such subgraphs mirror matches because one often looks like the mirror image of the other when graphed.
The results of executing this query on the database fragment in Figure 5.7 are shown below:
The top two subgraphs are mirror matches; they differ only in how Tony Curtis and Janet Leigh correspond to the parent1 and parent2 vertices.
The bottom two subgraphs contain repeated elements. In one, Tony Curtis matches both the parent1 and parent2 vertices; in the other, Janet Leigh matches both vertices.
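To see where these four subgraphs come from, the following sketch enumerates every way two parent objects can be assigned to the parent1 and parent2 vertices when nothing restricts the assignment. The object IDs 1 and 2 are assumptions made for illustration (the text gives names, not IDs), and the sketch does not reflect Proximity's actual matching machinery.

```python
from itertools import product

# Hypothetical object IDs for the two parent objects.
parents = {1: "Tony Curtis", 2: "Janet Leigh"}

# Without constraints, parent1 and parent2 are matched independently,
# so every ordered pair of parent objects is a match.
matches = list(product(parents, repeat=2))

# Same object on both vertices: repeated elements.
repeated = [(p1, p2) for p1, p2 in matches if p1 == p2]
# Different objects: each pair also shows up in reversed order (mirror matches).
mirrors = [(p1, p2) for p1, p2 in matches if p1 != p2]

for p1, p2 in matches:
    print(f"parent1={parents[p1]}, parent2={parents[p2]}")
```

Two objects and two vertices give four assignments: two with a repeated object and two that are mirror images of each other.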
As described before, we can add an identity constraint to the query to remove the subgraphs containing duplicated elements:
The constraint, parent1 < > parent2, ensures that the same object cannot match both vertices.
The inequality constraint parent1 < > parent2 eliminates the subgraphs where the same object matches both the parent1 and parent2 vertices, but the results still include a mirror match. These two subgraphs include the same database objects and links but match them to different query elements.
To remove the mirror match from the results, we modify the constraint as shown below.
The new constraint, parent1 < parent2, enforces an ordering on the object IDs matching these vertices. Because each object has a unique ID, this ensures that different subgraphs do not contain the same objects in a different order. The results of the modified query can be seen below:
As you can see, this container includes just a single subgraph. The revised constraint removed the mirror match.
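The difference between the two constraints can be sketched by filtering a list of candidate matches. The IDs below are hypothetical, and apply_constraint is a made-up helper for this illustration rather than part of Proximity.

```python
import operator

# Candidate matches as (parent1 ID, parent2 ID) pairs, using hypothetical
# object IDs 1 and 2 for the two parents.
matches = [(1, 1), (1, 2), (2, 1), (2, 2)]


def apply_constraint(matches, op):
    """Keep only the matches whose two IDs satisfy the comparison op."""
    return [(p1, p2) for p1, p2 in matches if op(p1, p2)]


# parent1 < > parent2: removes repeated elements, keeps both mirror matches.
not_equal = apply_constraint(matches, operator.ne)

# parent1 < parent2: the ID ordering keeps only one of each mirrored pair.
ordered = apply_constraint(matches, operator.lt)

print(not_equal)
print(ordered)
```

Because the IDs are unique, exactly one of the two orderings of any pair of distinct objects satisfies the less-than comparison, so only one subgraph of each mirrored pair survives.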
This constraint works because the underlying database provides a unique ID for each object and link. Proximity provides unique IDs; therefore, Proximity queries can take advantage of such constraints to eliminate mirror matches.