AllegroGraph 2.2 Java Tutorial and Reference

Table of Contents

Installation

Windows

Linux

Mac OS X

Solaris

FreeBSD

Updater

Introduction

The big picture

Accessing AllegroGraph from Java

Preparing the Triple Store

Testing the interface

Stopping the AllegroGraph Server Application

Tutorial

Connecting Java to the Triple Store

Buffered Operations

Simple Database Operations

Optimization notes

The OpenRDF Model

More complex queries using Prolog

More complex queries using SPARQL

How to use text indexing from Java

Reference

The AllegroGraph server application

Setting the location of the AllegroGraph Server application

AllegroGraph Java sources

Installation

Windows

There are 32 and 64-bit versions of AllegroGraph available for Windows. Select the appropriate one for the version of Windows you are running. The 32-bit version of AllegroGraph will run on the 64-bit version of Windows, but the 64-bit version of AllegroGraph will not run on the 32-bit version of Windows.

After downloading the installer executable, just run it. It will guide you through the installation. After AllegroGraph is installed, you can run it by selecting the "Start AllegroGraph Free Java Edition Server" item on the Start | Programs | AllegroGraph Free Java Edition menu. This will start the server with default ports, and wait for connections from your Java client.

After installation, we recommend you read the "Documentation for AllegroGraph Free Java Edition", also on the Start menu in the same place as the above link.

Linux

There are two Linux versions of AllegroGraph: 32-bit (x86) and 64-bit (x86-64). Select the appropriate one for the version of Linux you are running. The 32-bit version of AllegroGraph may run on 64-bit Linux, but the 64-bit version of AllegroGraph will not run on the 32-bit version of Linux.

On Linux, AllegroGraph is distributed as an RPM. After installation, the files are located in the /usr/lib/agraph/ directory. The server executable is /usr/lib/agraph/AllegroGraphJavaServer.

You will likely need root permissions to install the RPM package. If you do not, then you can install into an alternate directory by doing the following:

First, make a directory for your private RPM database and initialize it:

% mkdir $HOME/rpmdb  
% rpm --dbpath $HOME/rpmdb --initdb 

Then, make a directory for the installation of AllegroGraph and install it into this directory:

% mkdir $HOME/agraph  
% rpm -ivh --nodeps --prefix $HOME/agraph --dbpath $HOME/rpmdb agraph-2.2-1.x86_64.rpm 

After the final command above is finished, you will have a directory $HOME/agraph/agraph/ containing AllegroGraph.

After installation, we recommend you read the Java tutorial below.

Mac OS X

There is one Mac OS X version of AllegroGraph 2.x:

A 64-bit Intel port is underway. If you require PPC support, please contact Franz for the most up-to-date information.

On Mac OS X, AllegroGraph is distributed as a DMG file. After installation, the files are located in the /Applications/AllegroGraph directory. The server is called AllegroGraphJavaServer.

After installation, we recommend you read the Java tutorial below.

Solaris

There are two Solaris versions of AllegroGraph: 64-bit SPARC and 64-bit AMD64. Select the appropriate one for the version of Solaris you are running.

On Solaris, AllegroGraph is distributed as a compressed tar archive. You may extract the files anywhere you wish. The tar archive contains a directory named agraph-version, where version is the version of AllegroGraph you downloaded. The server executable in this directory is called AllegroGraphJavaServer.

After installation, we recommend you read the Java tutorial below.

FreeBSD

On FreeBSD, AllegroGraph is distributed as a compressed tar archive. You may extract the files anywhere you wish. The tar archive contains a directory named agraph-version, where version is the version of AllegroGraph you downloaded. The server executable in this directory is called AllegroGraphJavaServer.

After installation, we recommend you read the Java tutorial below.

Updater

From time to time we release updates or fixes to the Java Edition. When we do, you can find descriptions of them on this page: http://agraph.franz.com/support/patches/log/.

In the AllegroGraph installation directory there is an application called updater, which you can run to download the latest patches. When you do, the output of the program will look something like this:

peep% ./updater  
;; Connecting to http://www.franz.com/ftp/pub/patches/.  
;; Reading CRC cache...done.  
;; Checking for new update.fasl.  
;; Retrieving list of available patches.  
;; Checking which patches need to be downloaded.  
downloading: update/agraph/2.2/DESCRIPTIONS  
downloading: update/agraph/2.2/pfo013.001 (compressed)  
;; Creating CRC cache (please wait)...done.  
  
***** NOTE: You must restart the AllegroGraph server for the newly  
	downloaded updates to be installed.  
  
peep% 

In the above case, the line of interest is the one that references update/agraph/2.2/pfo013.001.

Remember to restart your server after running the updater. Updates are loaded at server start time only.

Introduction

This document introduces AllegroGraph. It assumes that you are somewhat familiar with RDF (Resource Description Framework), RDFS (RDF Schema), and OWL (Web Ontology Language). If you are not very familiar with RDF, RDFS, and OWL, we suggest that you start with A Semantic Web Primer by Grigoris Antoniou and Frank van Harmelen (2001, Cambridge MA, MIT press; available, e.g. from www.amazon.com). It is a very gentle introduction to these new technologies. For a quick introduction, see these Wikipedia entries: OWL, RDF, and RDFS.

The big picture

AllegroGraph is a pure triple store that you can use both for storing RDFS/OWL triples and as an on-disk graph database.

Accessing AllegroGraph from Java

The Java API to the AllegroGraph triple store allows Java applications to access and manipulate triple store databases.

This tutorial introduces some of the Java AllegroGraph API objects and methods in simple examples. The full documentation of the Java API is here.

Preparing the Triple Store

The Java API to the AllegroGraph Triple Store is a client-server implementation where the Java application is the client. In the Java-only edition of AllegroGraph, there are two distinct modes of operation possible:

Starting the AllegroGraph server from a Java application

In this mode of operation the Java application calls the startServer() method in the AllegroGraphConnection class. The only preparation needed for this mode of operation is to know where the AllegroGraph server executable was installed.

The Java application can specify the location of the server executable explicitly with a call to setDefaultCommand() or setCommand().

The Java application may also be started with a property setting for the property com.franz.ag.exec.

The most convenient mode is to set a user or system Java Preferences value with the utility in the main() method of the AllegroGraphConnection class. See the section Setting the location of the AllegroGraph Server application for full details. A Preferences setting persists from one session to the next and needs to be set only once in an installation.
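As a sketch of the property-based option above: a Java system property is normally supplied with the -D option when the application is started, for example java -Dcom.franz.ag.exec=/usr/lib/agraph/AllegroGraphJavaServer followed by the usual arguments. The small program below only shows how an application could inspect the value it was started with. The class name, method name, and fallback string are hypothetical, and the sketch does not show how the AllegroGraph client itself consumes the property.

```java
public class ExecPropertySketch {
    // Property name taken from this document.
    static final String EXEC_PROPERTY = "com.franz.ag.exec";

    // Returns the configured server executable, or the given fallback
    // when the property was not set on the command line.
    static String serverExecutable(String fallback) {
        return System.getProperty(EXEC_PROPERTY, fallback);
    }

    public static void main(String[] args) {
        // "(not set)" is an illustrative placeholder, not an AllegroGraph default.
        System.out.println(EXEC_PROPERTY + " = " + serverExecutable("(not set)"));
    }
}
```

Running this program with and without the -D option is a quick way to confirm that the setting reaches the Java process.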

Starting the AllegroGraph server as a separate application

The AllegroGraph server application is started from its installation location. The startup parameters specify the port numbers. The Java application must use these same parameters to connect to the server.

The section The AllegroGraph server application describes the AllegroGraph server application in detail.

Testing the interface

We include a sample program, AGExample.java, in the AllegroGraph distribution. This program may be used to verify the installation and to demonstrate that the connection between Java and AllegroGraph is working. Furthermore, the source code provides examples of using Java AllegroGraph. Please take a look at AGExample.java.

Before you run the client Java program, it must be told where to find one important file: com.franz.agraph-2-2-5.jar, which resides in the AllegroGraph installation directory.

The full pathname to this file can be included in the Java classpath, or the file may be copied to a more convenient location. When using Eclipse, it may be specified as a library in the project properties.
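The command examples in the sections that follow differ between Windows and Unix mainly in the classpath separator (";" versus ":"). As a minimal sketch, the standard constant java.io.File.pathSeparator yields the right separator for the platform the program runs on. The class and method names here are illustrative only, and the jar is assumed to be in the current directory.

```java
import java.io.File;

public class ClasspathSketch {
    // Joins the current directory and the given jar with the platform's
    // classpath separator (";" on Windows, ":" on Unix-like systems).
    static String classpath(String jarPath) {
        return "." + File.pathSeparator + jarPath;
    }

    public static void main(String[] args) {
        // The jar name comes from this document; adjust the path to your installation.
        String cp = classpath("com.franz.agraph-2-2-5.jar");
        System.out.println("java -cp " + cp + " AGExample -d /tmp/ag/ -n tst");
    }
}
```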

Testing the interface on Windows

The first step is to start the AllegroGraph server by selecting the AllegroGraph server item on the AllegroGraph Start Menu entry (or double-click on AllegroGraphJavaServer in the AllegroGraph installation directory).

The second step is to open a command window in the folder where AllegroGraph was installed.

At this point, the following command will start the sample application, but it will terminate immediately with an error message because the program needs the location of the database work area:

java -cp .;com.franz.agraph-2-2-5.jar AGExample 

The full command line parameters of the sample program are described in a comment in the program source. The most important argument is "-d", a required argument which specifies an existing directory to hold the database files:

java -cp .;com.franz.agraph-2-2-5.jar AGExample -d /tmp/ag/ -n tst 

Other command examples:

Load the Wilbur example OWL ontology:

java -cp .;com.franz.agraph-2-2-5.jar AGExample -d /tmp/ag/ -n tst -r wilburwine.rdf 

The above command assumes you are in the AllegroGraph installation directory, as wilburwine.rdf is distributed with AllegroGraph.

Load the ntriples version of Wilbur OWL ontology:

java -cp .;com.franz.agraph-2-2-5.jar AGExample -d /tmp/ag/ -n tst -t wilburwine.ntriples 

NOTE: when a large data file is specified, there may be a delay before the sample program shows any output.

Testing the interface on Linux and Unix

The first step is to open a shell in the AllegroGraph installation directory.

The second step is to start the AllegroGraphJavaServer executable. You may want to put it into the background and redirect the output from the program to a file.

At this point, the following command will start the sample application, but it will terminate immediately with an error message because the program needs the location of the database work area:

java -cp '.:com.franz.agraph-2-2-5.jar' AGExample 

The full command line parameters of the sample program are described in a comment in the program source. The most important argument is "-d", a required argument which specifies an existing directory to hold the database files:

java -cp .:com.franz.agraph-2-2-5.jar AGExample -d /tmp/ag/ -n tst 

Other command examples:

Load the Wilbur example OWL ontology:

java -cp .:com.franz.agraph-2-2-5.jar AGExample -d /tmp/ag/ -n tst -r wilburwine.rdf 

The above command assumes you are in the AllegroGraph installation directory, as wilburwine.rdf is distributed with AllegroGraph.

Load the ntriples version of Wilbur OWL ontology:

java -cp .:com.franz.agraph-2-2-5.jar AGExample -d /tmp/ag/ -n tst -t wilburwine.ntriples 

NOTE: when a large data file is specified, there may be a delay before the sample program shows any output.

More advanced uses of the sample application

The sample application tests several other command-line arguments that modify the behavior of the application. These arguments are described in comments in the source code.

The application can also start the server if the "-x" argument is added to the command.

Stopping the AllegroGraph Server Application

Once the AllegroGraph server application is running, it can be terminated in several ways:

We supply a small Java application that stops the AllegroGraph server. The application is run with a command such as the following.

On Windows:

java -cp .;com.franz.agraph-2-2-5.jar AGStop [-p port] [-h host] 

On Unix:

java -cp '.:com.franz.agraph-2-2-5.jar' AGStop [-p port] [-h host] 

Tutorial

Connecting Java to the Triple Store

The first thing you might have noticed reading through the test program, AGExample.java, is that each Java application must connect to the server before any part of the API can be used. Connect to the server by creating a new instance of the class AllegroGraphConnection. The AllegroGraphConnection class implements methods open(), create(), and others that open databases and return instances of the class AllegroGraph. Each open database is represented by a new instance of the class AllegroGraph.

If the Java application disconnects from the server, all AllegroGraph instances become invalid and must be discarded.

Buffered Operations

The communication between the Java application and the AllegroGraph server takes place through a socket. In order to minimize the delays that may be imposed by operating system overheads, it is good practice to operate on many data items in each interaction between the Java client application and the server.

We facilitate this buffering by providing array operations for most of the database accessors. The array operations create or retrieve many database elements in a single interaction and therefore are much more time efficient.

Simple Database Operations

Opening a database

A database is opened by creating an AllegroGraph instance.

AllegroGraphConnection sv = new AllegroGraphConnection();  
sv.enable();  
AllegroGraph ts = sv.create("test", "/s/ja/temp/"); 

The database is closed with the closeDatabase() method. Once the database is closed, the AllegroGraph instance should be discarded since it cannot be used for further interactions.

To re-open a database, create a new AllegroGraph instance.

Creating triples

Triples can be created one at a time by naming the components with strings in ntriples syntax.

ts.addStatement("<http://www.franz.com/things#Dog>",  
		"<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>",  
		"<http://www.w3.org/2002/07/owl#Class>"); 

The application can also save the details of the newly created triple by creating a new Triple instance with the newTriple() method.

Triple tr2 = ts.newTriple(  
               "<http://www.franz.com/things#Dog>",  
               "<http://www.w3.org/2000/01/rdf-schema#subClassOf>",  
               "<http://www.franz.com/things#Mammal>"); 

When many triples are created, it is more efficient to buffer the operation by grouping the triple components into arrays. The following statement creates three triples from corresponding elements of the arrays.

ts.addStatements(  
new String[]{  
    "<http://www.franz.com/things#Cat>",  
    "<http://www.franz.com/things#Giraffe>",  
    "<http://www.franz.com/things#Lion>" },  
new String[]{  
    "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>",  
    "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>",  
    "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>" },  
new String[]{  
    "<http://www.w3.org/2002/07/owl#Class>",  
    "<http://www.w3.org/2002/07/owl#Class>",  
    "<http://www.w3.org/2002/07/owl#Class>" }  
);                             

When an array consists of identical elements, it can be shortened to a single element. The following statement creates three triples where the predicate and object components are identical.

ts.addStatements(  
    new String[]{  
        "<http://www.franz.com/things#Cat>",  
        "<http://www.franz.com/things#Giraffe>",  
        "<http://www.franz.com/things#Lion>" },  
    new String[]{"<http://www.w3.org/2000/01/rdf-schema#subClassOf>"},  
    new String[]{"<http://www.franz.com/things#Mammal>"}  
); 
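The pairing rule described above can be pictured with a small stand-alone sketch. It expands three parallel arrays into explicit subject/predicate/object rows, reusing the sole element of any single-element array. This only illustrates the rule as stated in this section; it is not the AllegroGraph implementation, the class and method names are hypothetical, and the short strings stand in for full ntriples terms.

```java
import java.util.ArrayList;
import java.util.List;

public class ParallelArraysSketch {
    // Picks element i, or the sole element when the array has length 1.
    // Arrays of any other mismatched length are not covered by this sketch
    // and would raise ArrayIndexOutOfBoundsException.
    static String pick(String[] parts, int i) {
        return parts.length == 1 ? parts[0] : parts[i];
    }

    // Expands the three component arrays into one row per triple.
    static List<String[]> expand(String[] subjects, String[] predicates, String[] objects) {
        int n = Math.max(subjects.length, Math.max(predicates.length, objects.length));
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            rows.add(new String[]{ pick(subjects, i), pick(predicates, i), pick(objects, i) });
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String[]> rows = expand(
            new String[]{ "Cat", "Giraffe", "Lion" },
            new String[]{ "subClassOf" },
            new String[]{ "Mammal" });
        for (String[] row : rows) {
            System.out.println(row[0] + " " + row[1] + " " + row[2]);
        }
    }
}
```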

Querying for triples

Triples are retrieved from the database with a Cursor instance. The Cursor instance can iterate through all the triples in the search result. The following statement will retrieve the four triples about subclasses of the "Mammal" class created earlier.

String wild = null;  
Cursor cc = ts.getStatements(  
    wild,  
    "<http://www.w3.org/2000/01/rdf-schema#subClassOf>",  
    "<http://www.franz.com/things#Mammal>" ); 

When a Cursor instance is created, it is not positioned at a result. The step() method advances the Cursor instance to the first or next result. When a Cursor has been advanced, the returned value is true. When a Cursor is exhausted, the returned value is false.

Triple tr = null;  
if ( cc.step() ) tr = cc.getTriple(); 

When the Cursor is positioned at a result, we can retrieve the component of interest without creating a Triple instance.

Value s = cc.getSubject(); 

We can also retrieve several results in one operation. The following statement retrieves an array of at most 6 elements:

Triple[] trc = cc.step(6);  
int n = trc.length; 
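The step() protocol described above lends itself to a simple while loop. Because the AllegroGraph classes are not available outside an installation, the sketch below uses a hypothetical stand-in class, ListCursor, that mimics only the documented behavior: it starts before the first result, and step() returns true while a next result exists. It is not the AllegroGraph Cursor class, and the current() accessor is an assumption made for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CursorLoopSketch {
    // Hypothetical stand-in that mimics the documented step() contract.
    static class ListCursor {
        private final List<String> results;
        private int position = -1; // a new cursor is not positioned at a result

        ListCursor(List<String> results) {
            this.results = results;
        }

        // Advances to the first or next result; false once exhausted.
        boolean step() {
            if (position + 1 >= results.size()) {
                return false;
            }
            position++;
            return true;
        }

        // Returns the result the cursor is currently positioned at.
        String current() {
            return results.get(position);
        }
    }

    // Visits every result in order by stepping until step() returns false.
    static List<String> drain(ListCursor cursor) {
        List<String> seen = new ArrayList<>();
        while (cursor.step()) {
            seen.add(cursor.current());
        }
        return seen;
    }

    public static void main(String[] args) {
        ListCursor cursor = new ListCursor(Arrays.asList("triple-1", "triple-2", "triple-3"));
        for (String result : drain(cursor)) {
            System.out.println(result);
        }
    }
}
```

With a real Cursor instance the same loop shape applies, with getTriple() or getSubject() called inside the loop body.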

Optimization notes

Maximum Index Chunk Size parameter

This parameter, settable and gettable by the setChunkSize() and getChunkSize() methods, controls the maximum number of records that are sorted at a time during index merging. (Indexing is performed by calling the indexAll() or indexTriples() methods.)

The initial value of this parameter is believed to be good for machines with 1-2GB of RAM. If your computer has significantly more memory than this, you might improve indexing performance by using larger values (e.g., double the initial value or more).

Expected Unique Resources parameter

This parameter, settable and gettable by the setDefaultExpectedResources() and getDefaultExpectedResources() methods, controls the default value for the expected number of unique resources in a new triple store. This number is the expected number of distinct URIs and literals in the triple store database. If the number is too small, performance may suffer during database creation. A rough rule of thumb is to specify a number that is one third of the number of triples.
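The rule of thumb above amounts to a single division. The following sketch works it through for an assumed load of 30 million triples; the triple count and the class and method names are illustrative only. The resulting number is the kind of value one would pass to setDefaultExpectedResources(), whose exact parameter type is not stated in this document and is therefore not called here.

```java
public class ExpectedResourcesSketch {
    // Rough rule of thumb from this section: one third of the number of triples.
    static long estimateExpectedResources(long tripleCount) {
        return tripleCount / 3;
    }

    public static void main(String[] args) {
        long triples = 30_000_000L; // assumed size of the data to be loaded
        System.out.println("Expected unique resources: " + estimateExpectedResources(triples));
    }
}
```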

The OpenRDF Model

We implement most of the interfaces in the OpenRDF model defined at http://openrdf.org/.

The current implementation does not implement the interface Graph.

More complex queries using Prolog

AllegroGraph includes a Prolog implementation that may be used to search a database. The select() and selectValues() methods allow searches that return triples or database nodes and literals.

ValueObject[][] v =  
    ts.selectValues  
        ("(?x ?y ?z) " +  
         "  (and (q ?x " +  
         "      !http://www.w3.org/1999/02/22-rdf-syntax-ns#type " +  
         "      ?y) " +  
         "   (q ?y " +  
         "      !http://www.w3.org/2000/01/rdf-schema#subClassOf " +  
         "   ?z))",  
        new Object[0], ""); 

The result v will be an array of sub-arrays. Each sub-array represents one successful match of the query. Each sub-array will be of length 3: the first element in the sub-array will be the binding of the variable ?x, the second ?y and the third ?z.

It may also be desirable to substitute values from the Java application into the query string. This can be done by simply concatenating the required strings, but we do allow a more convenient option.

URI typePred = ts.addURI("http://www.w3.org/1999/02/22-rdf-syntax-ns#type");  
URI classPred = ts.addURI("http://www.w3.org/2000/01/rdf-schema#subClassOf");  
ValueObject[][] w = ts.selectValues  
    ("(?x ?y ?z) (and (q ?x ?a ?y) (q ?y ?b ?z))",  
     new Object[]{ typePred, classPred },  
     "?a ?b"); 

This query returns the same result as the previous example, but we have substituted values from the program into the query.

A query can return a mixture of nodes, literals and triples. The query

URI typePred = ts.addURI("http://www.w3.org/1999/02/22-rdf-syntax-ns#type");  
URI classPred = ts.addURI("http://www.w3.org/2000/01/rdf-schema#subClassOf");  
ValueObject[][] w = ts.selectValues  
    ("(?x ?y ?z ?t ?u) (and (q ?x ?a ?y ? ?t) (q ?y ?b ?z ? ?u))",  
     new Object[]{ typePred, classPred },  
     "?a ?b"); 

returns an array where each sub-array is of length 5. The fourth and fifth elements in the sub-array are the triples that satisfied the query. The lone question marks in the pattern skip the graph position of each triple to allow unification with the triple ids.

If all the results of interest are triples, a select() method can be used to return a Cursor instance. The Cursor instance is an iterator that returns the triples in order.

Cursor tv = ts.select  
              ("(?t ?u) (and (q ?x " +  
			      "  !http://www.w3.org/1999/02/22-rdf-syntax-ns#type " +  
			      "	?y ? ?t) " +  
			   "  (q ?y " +  
                 "   !http://www.w3.org/2000/01/rdf-schema#subClassOf " +  
			     "	?z ? ?u))",  
                new Object[0], ""); 

The cursor in variable tv will return triples t1, u1, t2, u2,... where t1 is the triple matching ?t in the first match of the query, and u1 is the triple matching ?u in the first match of the query.

If the result variables include variables that are not bound to triples, those variables are ignored. Thus the query

Cursor tw = ts.select  
              ("(?t ?x ?u) (and (q ?x " +  
		                 "  !http://www.w3.org/1999/02/22-rdf-syntax-ns#type " +  
				 "  ?y ? ?t) " +  
				" (q ?y " +  
                   " !http://www.w3.org/2000/01/rdf-schema#subClassOf " +  
				  "  ?z ? ?u))",  
                new Object[0], ""); 

returns exactly the same value as the previous query. Additional select() methods are provided to allow data to be substituted into the query.

More complex queries using SPARQL

AllegroGraph includes a SPARQL implementation that may be used to search a database. The methods twinqlAsk(), twinqlSelect(), twinqlFind(), and twinqlQuery() allow searches that return a true/false result, an array of objects, a Cursor instance, or a result serialized into an XML string.

For notes on twinql's conformance to the W3C specification please see this document.

How to use text indexing from Java

If you want to know how this all works it is worthwhile to look at the tutorial after this section. The Javadocs also describe all the main methods.

The main methods:

public Cursor getFreetextStatements(String pattern) 

will return a cursor of all the triples that match pattern.

The input pattern for getFreetextStatements is described in the JavaDocs but here is a summary of the syntax for the input patterns.

_pattern_ -> _string-pattern_ | _composite-pattern_  
_string-pattern_ -> _string_ | _phrase-string_  
_string_ -> _char_"  
_char_ -> *?*    -- denotes a wild card that matches any single character  
_char_ -> *\**   -- denotes a wild card that matches any sequence of characters  
_char_ -> _any_  -- most other characters denote themselves  
_phrase-string_ -> `'this is a phrase'`   no wild cards allowed  
_composite-pattern_ -> (and _pattern_\*) | (or _pattern_\*)  
  
  
public ValueObject[] getFreetextUniqueSubjects(String pattern) 

will return an array of ValueObject instances containing all the unique triple subjects that match pattern.

public String[] getFreetextPredicates() 

returns a string array of the predicates that you registered for freetext indexing.

public void registerFreetextPredicate(Object predicate) 

registers a predicate for indexing. Freetext indexing predicates must be registered before any triples are added to the triple store. We will relax this constraint in future versions.

Reference

The AllegroGraph server application

The AllegroGraph server application is started with the command

AllegroGraphJavaServer -port port -port2 port2  
                   -limit limit -users users -lease lease  
                   -log logfile -init initfile  
                   -exres expected-resources -index index-chunk-size  
                   -quiet -verbose -debug -standalone -stop  
		   -nojava  
		   -http http-port  -hinit http-init-file 

where all the arguments are optional. The arguments may be supplied to modify behavior as follows:

On UNIX platforms the default behavior (without the -debug command line argument) is to run in the background, as a daemon. In this case, it is strongly recommended that you use the -log command line argument to specify a log file for the various messages printed by the server.

On Windows, the server runs in a window that is normally minimized.

Setting the location of the AllegroGraph Server application

The main() method of the AllegroGraphConnection class is a utility that sets the Java Preferences value used by the subsequent application.

java -cp '.:com.franz.agraph-2-2-5.jar' com.franz.ag.AllegroGraphConnection [-user uuu] [-system sss] 

If the method is run without any arguments, it simply lists the current settings on the console.

The -user argument sets a user preference; the -system argument sets a system preference.

Setting a system preference normally requires administrator permission.

The value of each argument is the absolute pathname of the AllegroGraphJavaServer executable distributed with AllegroGraph.

We have not tested Preferences settings with all possible Java and OS combinations. On Windows XP, both user and system preferences are set reliably with Java 1.4.2 and Java 5. On Linux (Fedora 5), Java 5 sets user preferences but GNU Java 1.4.2 did not.

AllegroGraph Java sources

The Java code for the AllegroGraph Java API is open source under the terms of the Mozilla Public License Version 1.1. The source code is distributed with AllegroGraph and is installed with the other AllegroGraph files. The main source files are in agsrc-2-2-5.jar. The file agsrctbc-2-2-5.jar contains additional classes that are used by TopBraidComposer.