It’s important to understand how vertex and link annotations combine to create the desired query. This section examines several specific examples to see how these annotations work together.
Recall that the annotation [1..] means that the query matches if the database contains one or more of the annotated elements. We’ve seen the common usage of annotating a vertex and its adjacent edge with [1..] to find simple clusters (stars) around a central object in the database. But what happens if one or both of these annotations have a different value?
We consider such queries in the context of finding clusters of actors and the movies they’ve appeared in. We know that actors usually appear in more than one movie, and that actors occasionally play multiple roles in a single movie. The database fragment in Figure 4.17 illustrates the type of data we might find in a target database for such queries. As we can see from this fragment, Peter Sellers, Alec Guinness, and Dennis Price have all played multiple roles in a single movie. Alec Guinness played three roles in Kind Hearts and Coronets, Dennis Price played two roles in Kind Hearts and Coronets, and Peter Sellers played multiple roles in two movies, Dr. Strangelove and The Mouse That Roared. Both Peter Sellers and Alec Guinness have also appeared in other movies where they played only a single role, but this fragment doesn’t tell us whether Dennis Price ever did the same.
As we’ve seen before, we can use a query like that shown in Figure 4.18 to find actors connected to at least two movies.
The vertex annotation of [2..] tells the query to match clusters with actors as core objects, where those actor objects are linked to at least two movies. Our use of the standard [1..] edge annotation groups multiple links connecting a particular actor and movie in the same subgraph, as shown in Figure 4.19.
Two subgraphs are returned, one with Peter Sellers as the core actor object and one with Alec Guinness as the core actor. Both of these actors are linked to two or more movies. The subgraphs surrounding the actors George C. Scott and Dennis Price are not included in the query results because each of these actors is linked to only a single movie.
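To make the effect of the [2..] vertex annotation concrete, the following is a minimal Python sketch of the matching behavior described above. It is not the query engine’s implementation. The names `ROLE_LINKS` and `actors_with_min_movies` are hypothetical, the count of two role links for each of Peter Sellers’s multiple-role movies is an assumed stand-in for "multiple," and the George C. Scott entry is an assumption about the fragment.

```python
from collections import Counter, defaultdict

# Hypothetical encoding of the database fragment: one (actor, movie) pair
# per role link. Counts of 2 for Peter Sellers are assumed stand-ins for
# "multiple roles"; the George C. Scott entry is also an assumption.
ROLE_LINKS = (
    [("Peter Sellers", "Dr. Strangelove")] * 2
    + [("Peter Sellers", "The Mouse That Roared")] * 2
    + [("Peter Sellers", "The Pink Panther")]
    + [("Peter Sellers", "The Ladykillers")]
    + [("Alec Guinness", "Kind Hearts and Coronets")] * 3
    + [("Alec Guinness", "The Ladykillers")]
    + [("Dennis Price", "Kind Hearts and Coronets")] * 2
    + [("George C. Scott", "Dr. Strangelove")]
)


def actors_with_min_movies(links, min_movies=2):
    """Mimic a [2..] movie vertex with a [1..] edge annotation."""
    # The [1..] edge annotation groups all role links between one actor
    # and one movie, so we count links per (actor, movie) pair.
    roles = Counter(links)
    stars = defaultdict(dict)
    for (actor, movie), role_count in roles.items():
        stars[actor][movie] = role_count
    # The [2..] vertex annotation keeps only actors linked to enough movies.
    return {a: m for a, m in stars.items() if len(m) >= min_movies}


result = actors_with_min_movies(ROLE_LINKS)
print(sorted(result))  # ['Alec Guinness', 'Peter Sellers']
```

Note that the single-role movies remain in each returned subgraph: the vertex annotation counts movies, not roles.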
In contrast, if we swap these annotations, placing [2..] on the edge and [1..] on the movie vertex, to create the query shown in Figure 4.20, we now must find subgraphs where each movie is connected to an actor by at least two role links.
This query identifies subgraphs containing an actor and any movies in which that actor played at least two roles. Any movies in which the actor played only a single role are not included in the query results, shown in Figure 4.21. The [1..] annotation on the movie vertex groups the matching movies for a specific actor into a single subgraph.
The new annotation results in the elimination of the movies The Pink Panther and The Ladykillers from the subgraph containing Peter Sellers, and the elimination of The Ladykillers from the subgraph containing Alec Guinness. Each actor played only one role in these movies. And because he played two roles in Kind Hearts and Coronets, the subgraph containing Dennis Price is now included in the results. This subgraph was omitted from the first query’s results because the first query required that each actor be linked to two or more movies, but this database fragment includes information about only one movie in which Dennis Price acted.
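The same behavior can be sketched in Python for the swapped annotations. This is an illustration under stated assumptions rather than the query engine’s implementation: `ROLE_LINKS` and `movies_with_min_roles` are hypothetical names, the count of two role links for each of Peter Sellers’s multiple-role movies is assumed, and the George C. Scott entry is an assumption about the fragment.

```python
from collections import Counter, defaultdict

# Hypothetical encoding of the fragment: one (actor, movie) pair per role link.
# Counts of 2 for Peter Sellers and the George C. Scott entry are assumptions.
ROLE_LINKS = (
    [("Peter Sellers", "Dr. Strangelove")] * 2
    + [("Peter Sellers", "The Mouse That Roared")] * 2
    + [("Peter Sellers", "The Pink Panther")]
    + [("Peter Sellers", "The Ladykillers")]
    + [("Alec Guinness", "Kind Hearts and Coronets")] * 3
    + [("Alec Guinness", "The Ladykillers")]
    + [("Dennis Price", "Kind Hearts and Coronets")] * 2
    + [("George C. Scott", "Dr. Strangelove")]
)


def movies_with_min_roles(links, min_roles=2):
    """Mimic a [2..] edge annotation with a [1..] movie vertex."""
    roles = Counter(links)  # (actor, movie) -> number of role links
    stars = defaultdict(dict)
    for (actor, movie), role_count in roles.items():
        # The [2..] edge annotation drops movies with too few role links.
        if role_count >= min_roles:
            stars[actor][movie] = role_count
    # The [1..] vertex annotation needs at least one surviving movie,
    # which holds for every actor still present in `stars`.
    return dict(stars)


result = movies_with_min_roles(ROLE_LINKS)
print(sorted(result))  # ['Alec Guinness', 'Dennis Price', 'Peter Sellers']
```

Here the filter acts on each actor and movie pair before any grouping, which is why Dennis Price appears even though he is linked to only one movie.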
If we change the query again to include a numeric annotation of [2..] for both the movie vertex and its adjacent edge, as shown in Figure 4.22, then matching subgraphs must include at least two movie objects connected to an actor object, and each of those movie objects must be connected to its corresponding actor by at least two distinct role links.
As we can see in Figure 4.23, this query finds only a single subgraph that satisfies the query’s requirements. Only one actor in this database fragment, Peter Sellers, is linked to at least two movies in which he played two or more roles.
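One way to compare the three annotation combinations side by side is to treat the two lower bounds as parameters. The sketch below is a simplified model of the behavior described in this section, not the query engine’s implementation. The names `ROLE_LINKS`, `match_stars`, and `QUERIES` are hypothetical, the count of two role links for each of Peter Sellers’s multiple-role movies is assumed, and the George C. Scott entry is an assumption about the fragment.

```python
from collections import Counter, defaultdict

# Hypothetical encoding of the fragment: one (actor, movie) pair per role link.
# Counts of 2 for Peter Sellers and the George C. Scott entry are assumptions.
ROLE_LINKS = (
    [("Peter Sellers", "Dr. Strangelove")] * 2
    + [("Peter Sellers", "The Mouse That Roared")] * 2
    + [("Peter Sellers", "The Pink Panther")]
    + [("Peter Sellers", "The Ladykillers")]
    + [("Alec Guinness", "Kind Hearts and Coronets")] * 3
    + [("Alec Guinness", "The Ladykillers")]
    + [("Dennis Price", "Kind Hearts and Coronets")] * 2
    + [("George C. Scott", "Dr. Strangelove")]
)


def match_stars(links, min_movies, min_roles):
    """Return {actor: {movie: role_count}} for a [min_movies..] vertex
    annotation combined with a [min_roles..] edge annotation."""
    roles = Counter(links)
    stars = defaultdict(dict)
    for (actor, movie), role_count in roles.items():
        if role_count >= min_roles:      # edge annotation
            stars[actor][movie] = role_count
    # Vertex annotation: count only movies that passed the edge test.
    return {a: m for a, m in stars.items() if len(m) >= min_movies}


# (vertex lower bound, edge lower bound) for the three queries discussed.
QUERIES = {
    "vertex [2..], edge [1..]": (2, 1),
    "vertex [1..], edge [2..]": (1, 2),
    "vertex [2..], edge [2..]": (2, 2),
}

for label, (min_movies, min_roles) in QUERIES.items():
    print(label, sorted(match_stars(ROLE_LINKS, min_movies, min_roles)))
```

In this model the edge test is applied first and the vertex count considers only the movies that survive it, which matches the requirement that each counted movie be connected by at least two role links.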