Knowledge Discovery Laboratory
Frequently Asked Questions about PROXIMITY

The following are some common questions about PROXIMITY. If your questions aren't answered here, feel free to ask them directly at proximity-support@kdl.cs.umass.edu.

  • Where did PROXIMITY come from?

    PROXIMITY is the result of over five years of active research in Relational Knowledge Discovery (see our publications for behind-the-GUI specifics of this research). PROXIMITY provides the latest RKD techniques to researchers and analysts in an easy-to-use, open-source environment.

  • How do I get my data into PROXIMITY?

    There are three approaches currently available:

    1. The standard approach uses an XML file that contains sections for objects, links, attributes, and collections of objects and links. This approach is fast to load and scales to large databases. The PROXIMITY distribution includes a script for converting standard tabular data to PROXIMITY's import XML format. Please see the tutorial for more information on this approach.
       
    2. PROXIMITY also provides a very fast utility for importing data from text files stored in a specific format. More information about the text import and export utilities is available in the tutorial.
       
    3. For small databases you can use our "Smart ASCII" text file format. This feature is currently undocumented, but may be documented later if there is sufficient interest.
       
  • Why not use an SQL database as the back-end?

    We implemented earlier versions of PROXIMITY on top of both SQL and object-oriented databases and found them too slow, because of assumptions built into those databases about how data will be accessed. By switching to the vertical (column-oriented) database MonetDB, we increased performance by at least one and sometimes two orders of magnitude.

  • How do I share PROXIMITY databases?

    There are three ways:

    1. If you are sharing between computers running MonetDB on the exact same platform (e.g., Linux on two 64-bit platforms, Mac OS X on two PPC or Intel platforms), you can tar (or zip) the corresponding directory in MonetDB's “dbfarm.” For example:
       
      	  $ cd /usr/local/Monet/var/MonetDB4/dbfarm/ 
      	  $ tar czf testdb.tgz testdb/
      	  
    2. You can export your data to XML or to text and then re-import (see the tutorial for details).
    3. Repeat the process that you used to get your data into PROXIMITY initially.
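
    As a rough sketch of option 1, the shell functions below pack a database directory on one machine and unpack it into the dbfarm on another. The names pack_db and unpack_db are hypothetical helpers, not part of PROXIMITY or MonetDB, and the overwrite check is an added precaution rather than documented behavior. The dbfarm path in the usage comments is the one from the example above and may differ on your installation.

```shell
#!/bin/sh
# Hypothetical helpers: only tar and the dbfarm layout come from the FAQ.

# pack_db DBFARM DBNAME ARCHIVE
# Archive one database directory; -C makes tar store "DBNAME/..." relative paths.
pack_db() {
    tar czf "$3" -C "$1" "$2"
}

# unpack_db ARCHIVE DBFARM
# Extract the archive into the receiving dbfarm, refusing to overwrite
# an existing database with the same top-level directory name.
unpack_db() {
    # The first archive entry is the database directory, e.g. "testdb/".
    top=$(tar tzf "$1" | sed -n '1{s,/.*,,;p;}')
    if [ -e "$2/$top" ]; then
        echo "refusing to overwrite $2/$top" >&2
        return 1
    fi
    tar xzf "$1" -C "$2"
}

# Usage (paths are assumptions based on the example above):
#   pack_db   /usr/local/Monet/var/MonetDB4/dbfarm testdb testdb.tgz
#   unpack_db testdb.tgz /usr/local/Monet/var/MonetDB4/dbfarm
```

    Both machines must still run MonetDB on the same platform, as described in option 1.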

  • Why is PROXIMITY written in Java?

    We love Java; it's supported on multiple platforms, makes us productive, and has excellent libraries available.

  • What software-development process does KDL use to produce PROXIMITY?

    Since the start of 2004, we've been adapting Extreme Programming (XP) to the development of PROXIMITY. We've found both the process and the product significantly improved: we are more agile in adapting to changes resulting from our research, the work is less stressful, the code is clearer and simpler with fewer defects, and we are generally happier. We've been able to adapt most of the XP practices for planning, designing, coding, and testing to our research environment (see http://www.extremeprogramming.org/rules.html), and we're still experimenting to improve the process.

    To those interested in getting started in XP, we've found the following helpful:

  • How do I compile PROXIMITY?

    The distribution comes with a jar file that includes all the compiled classes needed to use PROXIMITY. You need to recompile only if you change these classes. To facilitate recompilation, the distribution comes with an Ant build script (build.xml), which includes a target for compiling. We recommend Ant version 1.6 or later. Use Ant's -p option to see all targets.

  • How do I run the unit tests?

    Use the run-tests Ant target in the build file. The tests take between 8 and 20 minutes to run, depending on your machine and OS.
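
    The Ant steps from this answer and the previous one can be sketched as a small shell function. The function name run_prox_tests and its directory argument are hypothetical; only build.xml, the -p option, and the run-tests target come from this FAQ. The compile target is not named here, so the sketch lists the targets rather than guessing one.

```shell
#!/bin/sh
# run_prox_tests DIR
# Hypothetical wrapper; DIR is assumed to be the directory holding build.xml.
run_prox_tests() {
    build_file="$1/build.xml"
    if [ ! -f "$build_file" ]; then
        echo "no build.xml in $1" >&2
        return 1
    fi
    # Print the installed Ant version (the FAQ recommends 1.6 or later).
    ant -version
    # List every target in the build file, including the compile target.
    ant -f "$build_file" -p
    # Run the unit tests; expect roughly 8 to 20 minutes.
    ant -f "$build_file" run-tests
}
```

    Call it with the directory of your PROXIMITY distribution, wherever you unpacked it.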

  • Why aren't there more/fancier visualization tools? When will more model visualization tools be available?

    We added an RPT viewer in PROXIMITY 3.1 and an RDN viewer in PROXIMITY 4.0. The latest version, 4.3, includes a new and more detailed RPT viewer, and a powerful tool to visualize the entire database and subgraphs. We believe these tools are adequate for most needs, although there is of course room for improvement. Visualizing large graphs is a research area in its own right and is not our principal focus. Contributions of visualization tools would be very welcome! (Contact us at proximity@kdl.cs.umass.edu.)

  • Why is getting a single item's attributes so slow?

    By their nature, the techniques we need for efficient relational knowledge discovery are column-centric, and we've chosen a database technology (MonetDB) that optimizes such access. However, the trade-off is just what you're experiencing—row-centric operations such as getting all attribute values for a single item are slow. Basically, MonetDB has to load an entire attribute table into memory to access only a few rows, repeating this for every attribute. Thankfully, these kinds of operations occur mostly in the GUI and not in heavy data-mining operations. In addition, we are looking at ways to speed up some row-centric operations.
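
    To illustrate the trade-off with a toy model (this is not MonetDB code and says nothing about its internals), assume each attribute lives in its own two-column text file of item IDs and values, loosely mimicking a vertical layout. The .attr file layout and the function names item_attrs and attr_mean are invented for this sketch.

```shell
#!/bin/sh
# Toy vertical layout (an assumption for illustration only):
#   DIR/<attribute>.attr holds lines of the form "<item-id> <value>".

# item_attrs DIR ID
# Row-centric: to collect one item's attributes, every attribute file
# must be scanned in full, one after another.
item_attrs() {
    for f in "$1"/*.attr; do
        name=$(basename "$f" .attr)
        awk -v id="$2" -v name="$name" '$1 == id { print name "=" $2 }' "$f"
    done
}

# attr_mean DIR ATTRIBUTE
# Column-centric: an aggregate over one attribute reads only that one file.
attr_mean() {
    awk '{ s += $2 } END { if (NR) print s / NR }' "$1/$2.attr"
}
```

    In this model, the cost of item_attrs grows with the number of attributes and the size of each attribute file, while attr_mean reads a single file no matter how many attributes exist.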

  • What causes the “!OS: Permission denied” MonetDB error on Windows XP?

    This error has been reported by Windows XP users. The error output looks like this:

      201537    DEBUG org:    commit;
      201911    WARN  org: Mserver: !OS: Permission denied
      201911    ERROR jdbc.JDBCImporter: !ERROR: BBPsync: rename(bat\BACKUP\,bat\DELETE_ME\) failed.
      !ERROR: BBPsync: rename(bat\BACKUP\,bat\DELETE_ME\) failed.
    	  at kdl.prox3.monet.MonetStream.readLine(MonetStream.java:201)
    	  at kdl.prox3.monet.MonetStream.readLine(MonetStream.java:178)
    	  at kdl.prox3.monet.Connection.executeCommand(Connection.java:251)
    	  at kdl.prox3.dbmgr.ProxDBMgr.commit(ProxDBMgr.java:158)
    	  at kdl.prox3.db.ProxDB.commit(ProxDB.java:125)
    	  at org.my.prox.logic.importer.Importer.commit(Importer.java:132)
    	  at org.my.prox.logic.importer.jdbc.JDBCImporter.commit(JDBCImporter.java:107)
    	  at org.my.prox.logic.importer.jdbc.JDBCImporter.convertLinks(JDBCImporter.java:340)
    	  at org.my.prox.logic.importer.jdbc.JDBCImporter.start(JDBCImporter.java:116)
    	  at org.my.prox.apps.Convert.<init>(Convert.java:59)
    	  at org.my.prox.apps.Convert.main(Convert.java:104)
      201926    DEBUG org:    commit;
      201942    WARN  org: Mserver: !OS: Permission denied
              

    The error is caused by the Microsoft Indexing Service. To avoid it, turn off the Indexing Service for all files and directories under the MonetDB var directory, and tune the Indexing Service to perform “lazy” indexing so that, in general, it does not attempt to index things like MonetDB temporary files.

    We have reported this problem to the MonetDB developers, and have passed along a user-supplied recommendation that MonetDB temporary files be placed in a standard Windows temporary directory or in a non-indexed application-data directory.

  • When will the newer relational models I hear you are developing be available?

    We will add new models to the open-source code as they become ready and announce each addition on our "What's New" page. You can also sign up for our PROXIMITY announcements mailing list.

  • How should I cite the use of PROXIMITY in publications?

    Published accounts of analyses conducted with PROXIMITY should acknowledge use of the software either in an "Acknowledgments" section or in a footnote or endnote. Such acknowledgments should contain the wording: "Portions of this analysis were conducted using Proximity, an open-source software environment developed by the Knowledge Discovery Laboratory at the University of Massachusetts Amherst (http://kdl.cs.umass.edu/proximity/)."
