Chapter 5. Constraints

Table of Contents

Identity Constraints
Attribute Constraints
Multiple Constraints
Evaluating Constraints
Constraints and Annotations
Constraints on annotated elements
Constraints involving negated and optional elements
Implementation in Proximity
Constraint elements
Annotation restrictions
Multiple constraints
Summary

Conditions let you specify restrictions on individual items in a query. To place restrictions across different items we use constraints. Constraints compare one vertex or edge in the query to another vertex or edge.

QGraph provides two types of constraints:

Both types of constraints are described in more detail below.

Because constraints involve more than one query element, they apply to the query as a whole rather than to a specific vertex or edge. We therefore draw constraints separate from any query element. Figure 5.1 shows an example query with an identity constraint that requires that the object matching vertex A must be different from the object matching vertex C.

Example query with an identity constraint

Figure 5.1. Example query with an identity constraint


Identity Constraints

Identity constraints are commonly used to ensure that the same database element does not match two different query elements. Consider the database fragment shown in Figure 5.2. This database represents interconnected web pages.

Database fragment [Const_DB01.xml]

Figure 5.2. Database fragment [Const_DB01.xml]


Like many long web pages, page3.html links to itself, creating a loop. Notice, also, that this database does not have any link attributes. Neither QGraph nor Proximity requires the use of attributes for either objects or links in a database.

Figure 5.3 shows a query designed to find the cluster of pages (star) linked from each web page in the database.

1-dimensional star query [Const_DB01_Q01.qg2.xml]

Figure 5.3. 1-dimensional star query [Const_DB01_Q01.qg2.xml]


With no constraints, we get the following results when the query is run on the database fragment shown in Figure 5.2.

Query results

Figure 5.4. Query results


The subgraph with page3.html as the core_page shows how Proximity used the link from page3.html to itself to match the query. As we saw in Chapter 2, Query Basics, QGraph does not require that distinct query elements be matched by distinct database elements. Because this object matches both the core_page and the linked_page vertices in the query, it appears twice in the query results, once for each corresponding query vertex.

To eliminate such duplicated elements, we use an identity constraint that specifies that the object that matches the core_page vertex must not be the same as the object that matches the linked_page vertex.

The revised query with the identity constraint is shown in Figure 5.5.

Query with identity constraint [Const_DB01_Q02.qg2.xml]

Figure 5.5. Query with identity constraint [Const_DB01_Q02.qg2.xml]


The query’s constraint, core_page < > linked_page, prohibits matching the same object to both the core_page and linked_page vertices. The results of executing this query on the database fragment shown in Figure 5.2 are shown below:

Query results

Figure 5.6. Query results


The subgraph with page3.html as the core_page is no longer included in the query results. The constraint ensures that the same object cannot match both of the query’s vertices.

Another common use for constraints is to remove equivalent subgraphs from a query’s results. Consider the fragment of a genealogy database shown below:

Database fragment [Const_DB02.xml]

Figure 5.7. Database fragment [Const_DB02.xml]


A query (without constraints) that finds both parents of an individual is shown in Figure 5.8.

Query [Const_DB02_Q01.qg2.xml]

Figure 5.8. Query [Const_DB02_Q01.qg2.xml]


As we saw before, a query without constraints can return matches that include repeated elements as well as equivalent subgraphs where the same objects match different vertices, but with the order reversed. We call such subgraphs mirror matches because one often looks like the mirror image of the other when graphed.

The results of executing this query on the database fragment in Figure 5.7 are shown below:

Query results

Figure 5.9. Query results


The top two subgraphs are mirror matches; they differ only in how Tony Curtis and Janet Leigh correspond to the parent1 and parent2 vertices. The bottom two subgraphs contain repeated elements. In one Tony Curtis matches both the parent1 and parent2 vertices; in the other, Janet Leigh matches both vertices.

As described before, we can add an identity constraint to the query to remove the subgraphs containing duplicated elements:

Query [Const_DB02_Q02.qg2.xml]

Figure 5.10. Query [Const_DB02_Q02.qg2.xml]


The constraint, parent1 < > parent2, ensures that the same object cannot match both vertices.

Query results

Figure 5.11. Query results


The inequality constraint parent1 < > parent2 eliminates the subgraphs where the same object matches both the parent1 and parent2 vertices, but the results still include a mirror match. These two subgraphs include the same database objects and links but matches them to different query elements. To remove the mirror match from the results, we modify the constraint as shown below.

Query [Const_DB02_Q03.qg2.xml]

Figure 5.12. Query [Const_DB02_Q03.qg2.xml]


The new constraint, parent1 < parent2, enforces an ordering on the object IDs matching these vertices. Because each object has a unique ID, this ensures that different subgraphs do not contain the same objects in a different order. The results of the modified query can be seen below:

Query results

Figure 5.13. Query results


As you can see, this container includes just a single subgraph. The revised constraint removed the mirror match.

This constraint works because the underlying database provides a unique ID for each object and link. Proximity provides unique IDs; therefore, Proximity queries can take advantage of such constraints to eliminate mirror matches.