Understanding Multiple Annotations

It’s important to understand how vertex and link annotations combine to create the desired query. This section examines several specific examples to see how these annotations work together.

Recall that the annotation [1..] means that the query matches if the database contains one or more of the annotated elements. We’ve seen that placing this annotation on both a vertex and its adjacent edge is a common way to find simple clusters (stars) around a central object in the database. But what happens if one or both of these annotations have a different value?

We consider such queries in the context of finding clusters of actors and the movies they’ve appeared in. We know that actors usually appear in more than one movie, and that actors occasionally play multiple roles in a single movie. The database fragment in Figure 4.17 illustrates the type of data we might find in a target database for such queries. As we can see from this fragment, Peter Sellers, Alec Guinness, and Dennis Price have all played multiple roles in a single movie. Alec Guinness played three roles in Kind Hearts and Coronets, Dennis Price played two roles in the same movie, and Peter Sellers played multiple roles in each of two movies, Dr. Strangelove and The Mouse That Roared. Both Peter Sellers and Alec Guinness also appear in other movies where they played only a single role, but this fragment doesn’t tell us whether Dennis Price ever did the same.

Figure 4.17. Database fragment [Annot_DB03.xml]


As we’ve seen before, we can use a query like that shown in Figure 4.18 to find actors connected to at least two movies.

Figure 4.18. Requiring multiple object matches [Annot_DB03_Q01.qg2.xml]


The vertex annotation of [2..] tells the query to match clusters with actors as core objects, where those actor objects are linked to at least two movies. Our use of the standard [1..] edge annotation groups multiple links connecting a particular actor and movie in the same subgraph, as shown in Figure 4.19.

Figure 4.19. Query results


Two subgraphs are returned, one with Peter Sellers as the core actor object and one with Alec Guinness as the core actor. Both of these actors are linked to two or more movies. The subgraphs surrounding the actors George C. Scott and Dennis Price are not included in the query results because each of these actors is linked to only a single movie.
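The counting behind this first query can be mimicked in a few lines of Python. This is only an illustrative sketch of the annotation logic, not the query engine itself. The `ROLE_LINKS` dictionary and the function name are hypothetical, and the entries marked as assumed (the exact role counts for Peter Sellers, and the movie and role count for George C. Scott) are not stated in the text.

```python
# Number of role links per (actor, movie) pair, taken from the prose
# description of the fragment. Entries marked "assumed" are placeholders.
ROLE_LINKS = {
    ("Peter Sellers", "Dr. Strangelove"): 2,        # "multiple" roles; count assumed
    ("Peter Sellers", "The Mouse That Roared"): 2,  # "multiple" roles; count assumed
    ("Peter Sellers", "The Pink Panther"): 1,
    ("Peter Sellers", "The Ladykillers"): 1,
    ("Alec Guinness", "Kind Hearts and Coronets"): 3,
    ("Alec Guinness", "The Ladykillers"): 1,
    ("Dennis Price", "Kind Hearts and Coronets"): 2,
    ("George C. Scott", "Dr. Strangelove"): 1,      # movie and count assumed
}


def actors_with_min_movies(role_links, min_movies=2):
    """Mimic a [2..] movie vertex combined with a [1..] edge."""
    clusters = {}
    for (actor, movie), n_links in role_links.items():
        if n_links >= 1:  # edge [1..]: any number of links keeps the movie
            clusters.setdefault(actor, {})[movie] = n_links
    # vertex [2..]: keep only actors linked to at least min_movies movies
    return {a: m for a, m in clusters.items() if len(m) >= min_movies}


result = actors_with_min_movies(ROLE_LINKS)
print(sorted(result))  # ['Alec Guinness', 'Peter Sellers']
```

Each surviving cluster keeps all of its role links grouped under the movie they connect to, which mirrors how the [1..] edge annotation groups multiple links in one subgraph.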

In contrast, if we swap these annotations, placing [1..] on the movie vertex and [2..] on the edge, we create the query shown in Figure 4.20. Matching subgraphs must now connect each movie to its actor by at least two role links.

Figure 4.20. Requiring multiple link matches [Annot_DB03_Q02.qg2.xml]


This query identifies subgraphs containing an actor and any movies in which that actor played at least two roles. Any movies in which the actor played only a single role are not included in the query results, shown in Figure 4.21. The [1..] annotation on the movie vertex groups the matching movies for a specific actor into a single subgraph.

Figure 4.21. Query results


The new annotations eliminate the movies The Pink Panther and The Ladykillers from the subgraph containing Peter Sellers, and The Ladykillers from the subgraph containing Alec Guinness, because the actors played only one role in these movies. And because he played two roles in Kind Hearts and Coronets, the subgraph containing Dennis Price is now included in the results. This subgraph was omitted from the first query’s results because the first query required that each actor be linked to two or more movies, but this database only includes information about one movie in which Dennis Price acted.
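The effect of moving the [2..] annotation onto the edge can be sketched the same way. Again, this is a hypothetical illustration rather than the engine's implementation: the `ROLE_LINKS` data restates the fragment as described in the text, and the entries marked as assumed are not given there.

```python
# Number of role links per (actor, movie) pair; "assumed" entries are
# placeholders that the text does not specify.
ROLE_LINKS = {
    ("Peter Sellers", "Dr. Strangelove"): 2,        # count assumed
    ("Peter Sellers", "The Mouse That Roared"): 2,  # count assumed
    ("Peter Sellers", "The Pink Panther"): 1,
    ("Peter Sellers", "The Ladykillers"): 1,
    ("Alec Guinness", "Kind Hearts and Coronets"): 3,
    ("Alec Guinness", "The Ladykillers"): 1,
    ("Dennis Price", "Kind Hearts and Coronets"): 2,
    ("George C. Scott", "Dr. Strangelove"): 1,      # movie and count assumed
}


def movies_with_min_links(role_links, min_links=2):
    """Mimic a [1..] movie vertex combined with a [2..] edge."""
    clusters = {}
    for (actor, movie), n_links in role_links.items():
        if n_links >= min_links:  # edge [2..]: drop single-role movies
            clusters.setdefault(actor, []).append(movie)
    # vertex [1..]: an actor appears only if at least one movie survived,
    # which is already guaranteed by how the dictionary is built
    return clusters


result = movies_with_min_links(ROLE_LINKS)
for actor, movies in result.items():
    print(actor, "->", movies)
```

Under these assumptions, Dennis Price now appears with Kind Hearts and Coronets, while the single-role movies drop out of the other two clusters.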

If we change the query again to include a numeric annotation of [2..] for both the movie vertex and its adjacent edge, as shown in Figure 4.22, then matching subgraphs must include at least two movie objects connected to an actor object, and each of those movie objects must be connected to its corresponding actor by at least two distinct role links.

Figure 4.22. Requiring multiple object and link matches [Annot_DB03_Q03.qg2.xml]


As we can see in Figure 4.23, this query finds only a single subgraph that satisfies the query’s requirements. Only one actor in this database fragment, Peter Sellers, is linked to at least two movies in which he played two or more roles.

Figure 4.23. Query results
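All three queries in this section differ only in the lower bounds placed on the movie vertex and on its edge, so they can be sketched with one function that takes both bounds. This is an illustrative model of how the two annotations combine, not a description of how the query engine evaluates them. The function name, the `LINKS` list, and the entries marked as assumed are hypothetical. Here each role link is stored as its own (actor, movie) pair, so repeated pairs stand for multiple roles.

```python
from collections import Counter

# One entry per role link; repeated pairs represent multiple roles.
LINKS = (
    [("Peter Sellers", "Dr. Strangelove")] * 2           # count assumed
    + [("Peter Sellers", "The Mouse That Roared")] * 2   # count assumed
    + [("Peter Sellers", "The Pink Panther")]
    + [("Peter Sellers", "The Ladykillers")]
    + [("Alec Guinness", "Kind Hearts and Coronets")] * 3
    + [("Alec Guinness", "The Ladykillers")]
    + [("Dennis Price", "Kind Hearts and Coronets")] * 2
    + [("George C. Scott", "Dr. Strangelove")]           # movie and count assumed
)


def match_clusters(links, min_movies, min_links):
    """Apply the edge bound first, then count the surviving movies."""
    links_per_pair = Counter(links)
    kept = {}
    for (actor, movie), n_links in links_per_pair.items():
        if n_links >= min_links:       # edge annotation [min_links..]
            kept.setdefault(actor, set()).add(movie)
    # vertex annotation [min_movies..] counts only movies that passed the edge test
    return {a: ms for a, ms in kept.items() if len(ms) >= min_movies}


q01 = match_clusters(LINKS, min_movies=2, min_links=1)
q02 = match_clusters(LINKS, min_movies=1, min_links=2)
q03 = match_clusters(LINKS, min_movies=2, min_links=2)
print(sorted(q01), sorted(q02), sorted(q03), sep="\n")
```

The order of the two tests matters in this sketch: the vertex bound counts only the movies that already satisfy the edge bound, which is why Alec Guinness matches the first two queries but not the third.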