Proximity’s data representation and modeling techniques provide several advantages over traditional methods:
Relational models. Conventional tools cannot exploit the relational structure of data sets. Analysts must encode that structure by hand as propositional features, rather than having the algorithm automatically search over all such features. In addition, propositional encoding makes it impossible to adjust for relational characteristics of the data such as autocorrelation and degree disparity.
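To make the cost of propositional encoding concrete, the following is a minimal sketch in plain Python. It is not Proximity code: the toy movie and actor data, the `propositionalize` function, and the two chosen aggregates are all hypothetical. It flattens each movie's linked actors into a fixed-width feature row, as an analyst would have to do for a conventional tool.

```python
from statistics import mean

# Hypothetical toy data: each movie is linked to a varying number of actors.
actor_age = {"a1": 30, "a2": 50, "a3": 40, "a4": 40}
movie_actors = {
    "m1": ["a1", "a2"],
    "m2": ["a3"],
    "m3": ["a1", "a2", "a3", "a4"],
}

def propositionalize(movie_actors, actor_age):
    """Flatten each movie's linked actors into one fixed-width feature row.

    The analyst has to pick the aggregates (here mean and max) in advance.
    """
    rows = {}
    for movie, actors in movie_actors.items():
        ages = [actor_age[a] for a in actors]
        rows[movie] = {
            "mean_actor_age": mean(ages),
            "max_actor_age": max(ages),
        }
    return rows

rows = propositionalize(movie_actors, actor_age)

# Degree (number of linked actors) is known in the graph but absent from the rows.
degree = {movie: len(actors) for movie, actors in movie_actors.items()}
```

In this toy data, m1 and m3 end up with identical feature rows even though one has two linked actors and the other has four. Once the data is flattened this way, a learner has no information left with which to account for the difference in degree.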
Graph query language. Conventional query languages such as SQL make it difficult to retrieve arbitrary subgraphs. Instead, users are limited to retrieving individual records or constructing new records that summarize relational structure. QGraph makes it easy to retrieve and examine arbitrary portions of the graph and thus eases the process of relational knowledge discovery.
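The difference between returning a summary record and returning a subgraph can be sketched in plain Python. This is not QGraph syntax and does not use any Proximity API. The toy graph, the `match_star` function, and the object and link type names are hypothetical, chosen only to illustrate a query whose result is itself a piece of the graph.

```python
from collections import defaultdict

# Hypothetical toy graph: typed objects plus typed, directed links.
objects = {
    "m1": "movie", "m2": "movie",
    "a1": "actor", "a2": "actor", "a3": "actor",
    "s1": "studio",
}
links = [
    ("a1", "m1", "acted_in"),
    ("a2", "m1", "acted_in"),
    ("a2", "m2", "acted_in"),
    ("a3", "m2", "acted_in"),
    ("s1", "m1", "produced"),
]

def match_star(objects, links, center_type, link_type):
    """Return one subgraph per center object.

    Each subgraph holds the center, every neighbor that reaches it through
    an incoming link of link_type, and the matching links themselves.
    """
    incoming = defaultdict(list)
    for src, dst, kind in links:
        if kind == link_type and objects[dst] == center_type:
            incoming[dst].append((src, dst, kind))
    subgraphs = {}
    for center, edges in incoming.items():
        nodes = {center} | {src for src, _, _ in edges}
        subgraphs[center] = {"objects": nodes, "links": edges}
    return subgraphs

subgraphs = match_star(objects, links, "movie", "acted_in")
```

Each result keeps the matched objects and links intact, so later steps can inspect them directly. A query that returned only a per-movie count of actors would discard that structure.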
Flexible data representation. In a conventional relational database, transforming the schema of a database is a difficult and time-consuming process. A Proximity database does not have a fixed schema. Instead, QGraph queries are used to define the schema for a particular analysis. This flexibility can substantially improve the ability to discover knowledge in relational data [Jensen and Neville, KDD, 2002].
Efficient scaling. In a traditional database system, increasing the number of attributes on an object decreases the number of records that can be paged into memory at once. In Proximity, each attribute is stored in its own table. Although most operations then require a join, MonetDB makes such joins very efficient. As a result, an analyst can create hundreds or even thousands of attributes with little or no impact on query speed.
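The one-table-per-attribute layout can be sketched as follows. This example uses Python's built-in sqlite3 module as a stand-in for MonetDB, purely so that it runs anywhere. The table names (`objects`, `attr_year`, `attr_genre`) and the `add_attribute` helper are hypothetical and do not reflect Proximity's actual storage layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO objects VALUES (?)", [(1,), (2,), (3,)])

def add_attribute(conn, name, values):
    """Store one attribute in its own (id, value) table.

    `name` is interpolated into the SQL text, so it must be a trusted
    identifier. Existing tables are left untouched.
    """
    conn.execute(f"CREATE TABLE attr_{name} (id INTEGER, value)")
    conn.executemany(f"INSERT INTO attr_{name} VALUES (?, ?)", values.items())

add_attribute(conn, "year", {1: 1999, 2: 2001, 3: 1999})
add_attribute(conn, "genre", {1: "drama", 2: "comedy"})  # object 3 has no genre

# Combining two attributes requires a join, but the query reads only the
# attribute tables it names, however many other attributes exist.
rows = conn.execute(
    "SELECT y.id, g.value FROM attr_year AS y "
    "JOIN attr_genre AS g ON g.id = y.id "
    "WHERE y.value = 1999 ORDER BY y.id"
).fetchall()
```

Adding a further attribute creates one more narrow table and leaves the existing ones unchanged. Queries that do not mention the new attribute never read it.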