Saturday, May 30, 2009

Using Neo4J to load and query OWL ontologies

I've written previously about modeling, storing and navigating through ontologies (you can see them here, here, here and here). These were all based on ideas on how I could improve upon ontology systems I had previously encountered at work. As I have no formal background in Semantic Web Programming, most of these implementations were based on tools that I was already familiar with or wanted to get familiar with.

I recently bought a book on Semantic Web Programming (see my review on Amazon here), and I must say it opened up a whole new world for me. Among other things, the book has very good coverage of Jena, a Semantic Web framework for Java, something I had been meaning to take a look at for a while.

Somewhat unrelated, I also came across Neo4J, a graph database, and it seemed to be a good fit as a data store for an ontology. Prior to this, the ontologies I had seen were stored in a relational database, loaded into an in-memory graph, and then serialized to disk using Java serialization for use by applications. This means that the serialized version is a point-in-time snapshot rather than a live copy of the ontology. Depending on how frequently the ontology is updated, this may not be a big deal. But if the ontology is stored in a graph database to begin with, the backend can keep updating the database and the application always sees the current ontology. That makes things much cleaner, in my opinion.

So I decided to take the OWL file for a sample Wine and Food ontology, and parse it using Jena, then load it into the Neo graph database, and run a few queries against it, to familiarize myself with the Jena and Neo APIs. This post is a result of that effort.

Load Phase

The code for the data loader is shown below. It uses Jena to parse the wine.rdf and food.rdf files and write their contents into a Neo graph database. Jena parses each file into a collection of Statement objects and exposes an iterator over them. Each statement is a (subject, predicate, object) triple, whose parts correspond to the start node, relationship and end node in a graph database.

In keeping with the best practices described in the Neo4J Guide (PDF), I also added a pseudo-node representing the start node (also known as the reference node) of the graph, and a pseudo-node for each OWL file. The reference node points to the OWL file pseudo-nodes, and each file node points to the nodes from the statements extracted from that file.

To look up a node by name, I used Neo's LuceneIndexService to index each node on its name property. I also wanted to assign a weight to each relationship, so I added a weight property to every relationship.
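Before the full loader, here is a minimal sketch of the layout described above, using plain Java collections as a stand-in for the graph database. Nothing in it is Neo4J or Jena API: the class GraphLayoutSketch, its methods and the string-based nodes are all hypothetical names made up for illustration. It only shows how a reference node, a per-file pseudo-node, a name lookup table and weighted relationships fit together.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the graph layout; none of this is Neo4J API.
final class GraphLayoutSketch {

  /** A directed edge (relationship name, end node) carrying a weight. */
  static final class Edge {
    final String relation;
    final String end;
    final float weight;

    Edge(String relation, String end, float weight) {
      this.relation = relation;
      this.end = end;
      this.weight = weight;
    }
  }

  // Plays the role of the name index: node name -> outgoing edges.
  private final Map<String, List<Edge>> nodesByName = new LinkedHashMap<>();
  private final String refNodeName;

  GraphLayoutSketch(String refNodeName) {
    this.refNodeName = refNodeName;
    nodesByName.put(refNodeName, new ArrayList<>());
  }

  /** Adds a pseudo-node for one OWL file and hangs it off the reference node. */
  void addFile(String ontologyName) {
    connect(refNodeName, "CATEGORIZED_AS", ontologyName, 0.0F);
  }

  /** Stores one triple: both ends are attached to the file node, then to each other. */
  void loadTriple(String fileNode, String subject, String predicate, String object) {
    if (!neighbors(fileNode, "CONTAINS").contains(subject)) {
      connect(fileNode, "CONTAINS", subject, 0.0F); // pseudo-relationship, weight 0
    }
    if (!neighbors(fileNode, "CONTAINS").contains(object)) {
      connect(fileNode, "CONTAINS", object, 0.0F);
    }
    connect(subject, predicate, object, 1.0F); // real relationship, weight 1
  }

  /** Names of the nodes reachable from the named node over one relationship type. */
  List<String> neighbors(String name, String relation) {
    List<String> result = new ArrayList<>();
    List<Edge> edges = nodesByName.get(name);
    if (edges == null) {
      return result; // unknown node: no neighbors
    }
    for (Edge edge : edges) {
      if (edge.relation.equals(relation)) {
        result.add(edge.end);
      }
    }
    return result;
  }

  // Registers both nodes in the lookup table and records the edge on the start node.
  private void connect(String start, String relation, String end, float weight) {
    nodesByName.computeIfAbsent(end, k -> new ArrayList<>());
    nodesByName.computeIfAbsent(start, k -> new ArrayList<>())
        .add(new Edge(relation, end, weight));
  }
}
```

The zero weight on the pseudo-relationships makes it easy to tell navigation scaffolding apart from relationships that came out of the OWL file. The actual loader follows.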

// Source: src/main/java/net/sf/jtmt/ontology/graph/loaders/Owl2NeoLoader.java
package net.sf.jtmt.ontology.graph.loaders;

import net.sf.jtmt.ontology.graph.OntologyRelationshipType;

import org.apache.commons.lang.StringUtils;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.neo4j.api.core.Direction;
import org.neo4j.api.core.EmbeddedNeo;
import org.neo4j.api.core.NeoService;
import org.neo4j.api.core.NotFoundException;
import org.neo4j.api.core.Relationship;
import org.neo4j.api.core.Transaction;
import org.neo4j.util.index.IndexService;
import org.neo4j.util.index.LuceneIndexService;

import com.hp.hpl.jena.graph.Node;
import com.hp.hpl.jena.graph.Node_URI;
import com.hp.hpl.jena.graph.Triple;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Statement;
import com.hp.hpl.jena.rdf.model.StmtIterator;

/**
 * Parses an OWL RDF file and populates a graph database directly.
 */
public class Owl2NeoLoader {

  private static final String FIELD_ENTITY_NAME = "name";
  private static final String FIELD_ENTITY_TYPE = "type";
  private static final String FIELD_RELATIONSHIP_NAME = "name";
  private static final String FIELD_RELATIONSHIP_WEIGHT = "weight";
  
  private final Log log = LogFactory.getLog(getClass());
  
  private String filePath;
  private String dbPath;
  private String ontologyName;
  private String refNodeName;
  
  public void setFilePath(String filePath) {
    this.filePath = filePath;
  }
  
  public void setDbPath(String dbPath) {
    this.dbPath = dbPath;
  }
  
  public void setOntologyName(String ontologyName) {
    this.ontologyName = ontologyName;
  }
  
  public void setRefNodeName(String refNodeName) {
    this.refNodeName = refNodeName;
  }
  
  public void load() throws Exception {
    NeoService neoService = null;
    IndexService indexService = null;
    try {
      // set up an embedded instance of neo database
      neoService = new EmbeddedNeo(dbPath);
      // set up index service for looking up node by name
      indexService = new LuceneIndexService(neoService);
      // set up top-level pseudo nodes for navigation
      org.neo4j.api.core.Node refNode = getReferenceNode(neoService);
      org.neo4j.api.core.Node fileNode = getFileNode(neoService, refNode);
      // parse the owl rdf file
      Model model = ModelFactory.createDefaultModel();
      model.read("file://" + filePath);
      // iterate through all triples in the file, and set up corresponding
      // nodes in the neo database.
      StmtIterator it = model.listStatements();
      while (it.hasNext()) {
        Statement st = it.next();
        Triple triple = st.asTriple();
        insertIntoDb(neoService, indexService, fileNode, triple);
      }
    } finally {
      if (indexService != null) {
        indexService.shutdown();
      }
      if (neoService != null) {
        neoService.shutdown();
      }
    }
  }

  /**
   * Get the reference node if already available, otherwise create it.
   * @param neoService the reference to the Neo service.
   * @return a Neo4j Node object reference to the reference node.
   * @throws Exception if thrown.
   */
  private org.neo4j.api.core.Node getReferenceNode(NeoService neoService) 
      throws Exception { 
    org.neo4j.api.core.Node refNode = null;
    Transaction tx = neoService.beginTx(); 
    try {
      refNode = neoService.getReferenceNode();
      if (! refNode.hasProperty(FIELD_ENTITY_NAME)) {
        refNode.setProperty(FIELD_ENTITY_NAME, refNodeName);
        refNode.setProperty(FIELD_ENTITY_TYPE, "Thing");
      }
      tx.success();
    } catch (NotFoundException e) {
      tx.failure();
      throw e;
    } finally {
      tx.finish();
    }
    return refNode;
  }

  /**
   * Creates a single node for the file. This method is called once
   * per file, and the node should not exist in the Neo4j database.
   * So there is no need to check for existence of the node. Once
   * the node is created, it is connected to the reference node.
   * @param neoService the reference to the Neo service.
   * @param refNode the reference to the reference node.
   * @return the "file" node representing the entry-point into the
   * entities described by the current OWL file.
   * @throws Exception if thrown.
   */
  private org.neo4j.api.core.Node getFileNode(NeoService neoService,
      org.neo4j.api.core.Node refNode) throws Exception {
    org.neo4j.api.core.Node fileNode = null;
    Transaction tx = neoService.beginTx();
    try {
      fileNode = neoService.createNode();
      fileNode.setProperty(FIELD_ENTITY_NAME, ontologyName);
      fileNode.setProperty(FIELD_ENTITY_TYPE, "Class");
      Relationship rel = refNode.createRelationshipTo(
        fileNode, OntologyRelationshipType.CATEGORIZED_AS);
      logTriple(refNode, 
        OntologyRelationshipType.CATEGORIZED_AS, fileNode);
      rel.setProperty(
        FIELD_RELATIONSHIP_NAME, 
        OntologyRelationshipType.CATEGORIZED_AS.name());
      rel.setProperty(FIELD_RELATIONSHIP_WEIGHT, 0.0F);
      tx.success();
    } catch (Exception e) {
      tx.failure();
      throw e;
    } finally {
      tx.finish();
    }
    return fileNode;
  }

  /**
   * Inserts selected entities and relationships from Triples extracted
   * from the OWL document by the Jena parser. Only entities which have
   * a non-blank node for the subject and object are used. Further, only
   * relationship types listed in OntologyRelationshipTypes enum are 
   * considered. In addition, if the enum specifies that certain 
   * relationship types have an inverse, the inverse relation is also
   * created here.
   * @param neoService a reference to the Neo service.
   * @param indexService a reference to the Index service (for looking
   * up Nodes by name).
   * @param fileNode a reference to the Node that is an entry point into
   * this ontology. This node will connect to both the subject and object 
   * nodes of the selected triples via a CONTAINS relationship. 
   * @param triple a reference to the Triple extracted by the Jena parser.
   * @throws Exception if thrown.
   */
  private void insertIntoDb(NeoService neoService, 
      IndexService indexService,
      org.neo4j.api.core.Node fileNode, 
      Triple triple) throws Exception {
    Node subject = triple.getSubject();
    Node predicate = triple.getPredicate();
    Node object = triple.getObject();
    if ((subject instanceof Node_URI) &&
        (object instanceof Node_URI)) {
      // get or create the subject and object nodes
      org.neo4j.api.core.Node subjectNode = 
        getEntityNode(neoService, indexService, subject);
      org.neo4j.api.core.Node objectNode =
        getEntityNode(neoService, indexService, object);
      if (subjectNode == null || objectNode == null) {
        return;
      }
      Transaction tx = neoService.beginTx();
      try {
        // hook up both nodes to the fileNode
        if (! isConnected(neoService, fileNode, 
            OntologyRelationshipType.CONTAINS, 
            Direction.OUTGOING, subjectNode)) {
          logTriple(fileNode, 
            OntologyRelationshipType.CONTAINS, subjectNode);
          Relationship rel = fileNode.createRelationshipTo(
            subjectNode, OntologyRelationshipType.CONTAINS);
          rel.setProperty(FIELD_RELATIONSHIP_NAME, 
            OntologyRelationshipType.CONTAINS.name());
          rel.setProperty(FIELD_RELATIONSHIP_WEIGHT, 0.0F);
        }
        if (! isConnected(neoService, fileNode, 
            OntologyRelationshipType.CONTAINS, 
            Direction.OUTGOING, objectNode)) {
          logTriple(fileNode, 
            OntologyRelationshipType.CONTAINS, objectNode);
          Relationship rel = fileNode.createRelationshipTo(
            objectNode, OntologyRelationshipType.CONTAINS);
          rel.setProperty(
            FIELD_RELATIONSHIP_NAME, 
            OntologyRelationshipType.CONTAINS.name());
          rel.setProperty(FIELD_RELATIONSHIP_WEIGHT, 0.0F);
        }
        // hook up subject and object via predicate
        OntologyRelationshipType type = 
          OntologyRelationshipType.fromName(predicate.getLocalName());
        if (type != null) {
          logTriple(subjectNode, type, objectNode);
          Relationship rel = subjectNode.createRelationshipTo(
              objectNode, type);
          rel.setProperty(FIELD_RELATIONSHIP_NAME, type.name());
          rel.setProperty(FIELD_RELATIONSHIP_WEIGHT, 1.0F);
        }
        // create reverse relationship
        OntologyRelationshipType inverseType = 
          OntologyRelationshipType.inverseOf(predicate.getLocalName());
        if (inverseType != null) {
          logTriple(objectNode, inverseType, subjectNode);
          Relationship inverseRel = objectNode.createRelationshipTo(
            subjectNode, inverseType);
          inverseRel.setProperty(
            FIELD_RELATIONSHIP_NAME, inverseType.name());
          inverseRel.setProperty(FIELD_RELATIONSHIP_WEIGHT, 1.0F);
        }
        tx.success();
      } catch (Exception e) {
        tx.failure();
        throw e;
      } finally {
        tx.finish();
      }
    } else {
      return;
    }
  }

  /**
   * Loops through the relationships and returns true if the source
   * and target nodes are connected using the specified relationship
   * type and direction.
   * @param neoService a reference to the NeoService.
   * @param sourceNode the source Node object.
   * @param relationshipType the type of relationship.
   * @param direction the direction of the relationship.
   * @param targetNode the target Node object.
   * @return true or false.
   * @throws Exception if thrown.
   */
  private boolean isConnected(NeoService neoService, 
      org.neo4j.api.core.Node sourceNode,
      OntologyRelationshipType relationshipType, Direction direction,
      org.neo4j.api.core.Node targetNode) throws Exception {
    boolean isConnected = false;
    Transaction tx = neoService.beginTx();
    try {
      for (Relationship rel : sourceNode.getRelationships(
          relationshipType, direction)) {
        org.neo4j.api.core.Node endNode = rel.getEndNode();
        if (endNode.getProperty(FIELD_ENTITY_NAME).equals(
            targetNode.getProperty(FIELD_ENTITY_NAME))) {
          isConnected = true;
          break;
        }
      }
      tx.success();
    } catch (Exception e) {
      tx.failure();
      throw e;
    } finally {
      tx.finish();
    }
    return isConnected;
  }

  private org.neo4j.api.core.Node getEntityNode(NeoService neoService,
      IndexService indexService, Node entity) throws Exception {
    String uri = ((Node_URI) entity).getURI();
    if (uri.indexOf('#') == -1) {
      return null;
    }
    String[] parts = StringUtils.split(uri, "#");
    String type = parts[0].substring(0, parts[0].lastIndexOf('/'));
    Transaction tx = neoService.beginTx();
    try {
      org.neo4j.api.core.Node entityNode = 
        indexService.getSingleNode(FIELD_ENTITY_NAME, parts[1]);
      if (entityNode == null) {
        entityNode = neoService.createNode();
        entityNode.setProperty(FIELD_ENTITY_NAME, parts[1]);
        entityNode.setProperty(FIELD_ENTITY_TYPE, type);
        indexService.index(entityNode, FIELD_ENTITY_NAME, parts[1]);
      }
      tx.success();
      return entityNode;
    } catch (Exception e) {
      tx.failure();
      throw e;
    } finally {
      tx.finish();
    }
  }
  
  /**
   * Convenience method to log the triple when it is inserted into the
   * database.
   * @param sourceNode the subject of the triple.
   * @param ontologyRelationshipType the predicate of the triple.
   * @param targetNode the object of the triple.
   */
  private void logTriple(org.neo4j.api.core.Node sourceNode, 
      OntologyRelationshipType ontologyRelationshipType, 
      org.neo4j.api.core.Node targetNode) {
    log.info("(" + sourceNode.getProperty(FIELD_ENTITY_NAME) +
      "," + ontologyRelationshipType.name() + 
      "," + targetNode.getProperty(FIELD_ENTITY_NAME) + ")");
  }
}
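The string handling in getEntityNode() is easy to miss, so here is a small standalone sketch that approximates it with the standard library only. The class EntityUriSketch and the example.org URI in the usage note are made up for illustration. Unlike the loader, this sketch also tolerates a base URI without a slash.

```java
// Hypothetical helper approximating the URI handling in getEntityNode(); stdlib only.
final class EntityUriSketch {

  /** Returns {type, name} for a URI with a fragment, or null if it has no '#'. */
  static String[] splitEntityUri(String uri) {
    int hash = uri.indexOf('#');
    if (hash == -1) {
      return null; // the loader skips URIs without a fragment
    }
    String base = uri.substring(0, hash);   // everything before the '#'
    String name = uri.substring(hash + 1);  // the fragment becomes the node name
    int slash = base.lastIndexOf('/');
    // The type is the base URI with its last path segment removed.
    String type = (slash == -1) ? base : base.substring(0, slash);
    return new String[] {type, name};
  }
}
```

For a made-up URI such as http://example.org/ontologies/wine#LoireRegion, this gives the name LoireRegion and the type http://example.org/ontologies. Note that the last path segment (wine) is dropped from the type, and that nodes are indexed by the fragment alone, so two entities with the same local name in different files resolve to the same node.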

The relationship types are listed in the OntologyRelationshipType enum below. The types were found manually by first parsing the Statement objects and finding unique relationships. So it is likely that this enum will need to be expanded if other OWL files need to be parsed.

In addition, I also added in inverse relationships which are not available in the OWL file. Here is the code for OntologyRelationshipType.java.

// Source: src/main/java/net/sf/jtmt/ontology/graph/OntologyRelationshipType.java
package net.sf.jtmt.ontology.graph;

import org.neo4j.api.core.RelationshipType;

/**
 * Relationships exposed by the taxonomy.
 */
public enum OntologyRelationshipType implements RelationshipType {
  CATEGORIZED_AS(null, null),  // pseudo-rel
  CONTAINS(null, null),        // pseudo-rel
  ADJACENT_REGION("adjacentRegion", "adjacentRegion"),
  HAS_VINTAGE_YEAR("hasVintageYear", "isVintageYearOf"),
  LOCATED_IN("locatedIn", "regionContains"),
  MADE_FROM_GRAPE("madeFromGrape", "mainIngredient"),
  HAS_FLAVOR("hasFlavor", "isFlavorOf"),
  HAS_COLOR("hasColor", "isColorOf"),
  HAS_SUGAR("hasSugar", "isSugarContentOf"),
  HAS_BODY("hasBody", "isBodyOf"),
  HAS_MAKER("hasMaker", "madeBy"),
  IS_INSTANCE_OF("type", "hasInstance"),
  SUBCLASS_OF("subClassOf", "superClassOf"),
  DISJOINT_WITH("disjointWith", "disjointWith"),
  DIFFERENT_FROM("differentFrom", "differentFrom"),
  DOMAIN("domain", null),
  IS_VINTAGE_YEAR_OF("isVintageYearOf", "hasVintageYear"),
  REGION_CONTAINS("regionContains", "locatedIn"),
  MAIN_INGREDIENT("mainIngredient", "madeFromGrape"),
  IS_FLAVOR_OF("isFlavorOf", "hasFlavor"),
  IS_COLOR_OF("isColorOf", "hasColor"),
  IS_SUGAR_CONTENT_OF("isSugarContentOf", "hasSugar"),
  IS_BODY_OF("isBodyOf", "hasBody"),
  MADE_BY("madeBy", "hasMaker"),
  HAS_INSTANCE("hasInstance", "type"),
  SUPERCLASS_OF("superClassOf", "subClassOf");

  private String name;
  private String inverseName;
  
  OntologyRelationshipType(String name, String inverseName) {
    this.name = name;
    this.inverseName = inverseName;
  }
   
  public static OntologyRelationshipType fromName(String name) {
    for (OntologyRelationshipType type : values()) {
      if (name.equals(type.name)) {
        return type;
      }
    }
    return null;
  }
  
  public static OntologyRelationshipType inverseOf(String name) {
    OntologyRelationshipType rel = fromName(name);
    if (rel != null && rel.inverseName != null) {
      return fromName(rel.inverseName);
    } else {
      return null;
    }
  }
}
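To make the lookup behavior concrete, here is a trimmed copy of the enum without the Neo4J dependency. RelTypeSketch is a hypothetical name, and only a handful of the entries are reproduced; the two lookup methods follow the same logic as above.

```java
// Trimmed, hypothetical copy of the enum (no Neo4J dependency) to show the lookups.
enum RelTypeSketch {
  CONTAINS(null, null), // pseudo-relationship: never matches a predicate name
  LOCATED_IN("locatedIn", "regionContains"),
  REGION_CONTAINS("regionContains", "locatedIn"),
  ADJACENT_REGION("adjacentRegion", "adjacentRegion"), // symmetric: its own inverse
  DOMAIN("domain", null); // no inverse defined

  private final String name;
  private final String inverseName;

  RelTypeSketch(String name, String inverseName) {
    this.name = name;
    this.inverseName = inverseName;
  }

  /** Maps a predicate's local name to a relationship type, or null if unknown. */
  static RelTypeSketch fromName(String name) {
    for (RelTypeSketch type : values()) {
      if (name.equals(type.name)) {
        return type;
      }
    }
    return null;
  }

  /** Returns the inverse type for a predicate's local name, or null if there is none. */
  static RelTypeSketch inverseOf(String name) {
    RelTypeSketch rel = fromName(name);
    if (rel != null && rel.inverseName != null) {
      return fromName(rel.inverseName);
    }
    return null;
  }
}
```

With these entries, inverseOf("locatedIn") yields REGION_CONTAINS, inverseOf("adjacentRegion") yields ADJACENT_REGION itself, and both inverseOf("domain") and fromName of an unlisted predicate yield null, which is the case the loader uses to skip a relationship.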

The loader operates on a single OWL file at a time. To run it, I use the following JUnit test class.

// Source: src/test/java/net/sf/jtmt/ontology/graph/Owl2NeoLoaderTest.java
package net.sf.jtmt.ontology.graph;

import net.sf.jtmt.ontology.graph.loaders.Owl2NeoLoader;

import org.apache.commons.io.FilenameUtils;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.junit.Test;

/**
 * Test case for Owl2NeoLoader.
 */
public class Owl2NeoLoaderTest {

  private static final String ROOT_NAME = "ConsumableThing";
  
  private final Log log = LogFactory.getLog(getClass());
  
  private static final String[][] SUB_ONTOLOGIES = new String[][] {
    new String[] {"wine.rdf", "Wine"},
    new String[] {"food.rdf", "EdibleThing"}
  };
  
  @Test
  public void testLoading() throws Exception {
    for (String[] subOntology : SUB_ONTOLOGIES) {
      log.info("Now processing " + subOntology[0]);
      Owl2NeoLoader loader = new Owl2NeoLoader();
      loader.setRefNodeName(ROOT_NAME);
      loader.setFilePath(FilenameUtils.concat(
        "/home/sujit/src/jtmt/src/main/resources", subOntology[0]));
      loader.setDbPath("/tmp/neodb");
      loader.setOntologyName(subOntology[1]);
      loader.load();
    }
  }
}

The loader also prints out the triples as it writes them. A partial log (minus the date/time/source data) is shown below.

...
(CorbansPrivateBinSauvignonBlanc,HAS_SUGAR,Dry)
(Dry,IS_SUGAR_CONTENT_OF,CorbansPrivateBinSauvignonBlanc)
(Wine,CONTAINS,Corbans)
(CorbansPrivateBinSauvignonBlanc,HAS_MAKER,Corbans)
(Corbans,MADE_BY,CorbansPrivateBinSauvignonBlanc)
(Wine,CONTAINS,NewZealandRegion)
...

Query Phase

To test out the loading, I used the same queries that I ran previously with JGraphT against a Prevayler-backed in-memory graph. I decided to build a query class which encapsulates the Neo4J query code. Here is the code for the query component.

// Source: src/main/java/net/sf/jtmt/ontology/graph/NeoOntologyNavigator.java
package net.sf.jtmt.ontology.graph;

import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

import org.apache.commons.collections15.MultiMap;
import org.apache.commons.collections15.multimap.MultiHashMap;
import org.neo4j.api.core.Direction;
import org.neo4j.api.core.EmbeddedNeo;
import org.neo4j.api.core.NeoService;
import org.neo4j.api.core.Node;
import org.neo4j.api.core.Relationship;
import org.neo4j.api.core.ReturnableEvaluator;
import org.neo4j.api.core.StopEvaluator;
import org.neo4j.api.core.Transaction;
import org.neo4j.api.core.Traverser;
import org.neo4j.api.core.Traverser.Order;
import org.neo4j.util.index.IndexService;
import org.neo4j.util.index.LuceneIndexService;

/**
 * Provides methods to locate nodes and find neighbors in the Neo
 * graph database.
 */
public class NeoOntologyNavigator {

  public static final String FIELD_ENTITY_NAME = "name";
  public static final String FIELD_RELATIONSHIP_NAME = "name";
  public static final String FIELD_RELATIONSHIP_WEIGHT = "weight";

  private class WeightedNode {
    public Node node;
    public Float weight;
    public WeightedNode(Node node, Float weight) {
      this.node = node;
      this.weight = weight;
    }
  };
  
  private String neoDbPath;
  
  private NeoService neoService;
  private IndexService indexService;
  
  /**
   * Ctor for NeoOntologyNavigator
   * @param dbPath the path to the neo database.
   */
  public NeoOntologyNavigator(String dbPath) {
    super();
    this.neoDbPath = dbPath;
  }
  
  /**
   * The init() method should be called by client after instantiation.
   */
  public void init() {
    this.neoService = new EmbeddedNeo(neoDbPath);
    this.indexService = new LuceneIndexService(neoService);
  }
  
  /**
   * The destroy() method should be called by client on shutdown.
   */
  public void destroy() {
    indexService.shutdown();
    neoService.shutdown();
  }
  
  /**
   * Gets the reference to the named Node. Returns null if the node
   * is not found in the database.
   * @param nodeName the name of the node to lookup.
   * @return the reference to the Node, or null if not found.
   * @throws Exception if thrown.
   */
  public Node getByName(String nodeName) throws Exception {
    Transaction tx = neoService.beginTx();
    try {
      Node node = indexService.getSingleNode(FIELD_ENTITY_NAME, nodeName);
      tx.success();
      return node;
    } catch (Exception e) {
      tx.failure();
      throw(e);
    } finally {
      tx.finish();
    }
  }

  /**
   * Return a Map of relationship names to a List of nodes connected
   * by that relationship. The keys are sorted by name, and the list
   * of node values are sorted by the incoming relation weights.
   * @param node the root Node.
   * @return a Map of String to Node List of neighbors.
   */
  public Map<String,List<Node>> getAllNeighbors(Node node)
      throws Exception {
    MultiMap<String,WeightedNode> neighbors = 
      new MultiHashMap<String,WeightedNode>();
    Transaction tx = neoService.beginTx();
    try {
      String nodeName = (String) node.getProperty(FIELD_ENTITY_NAME);
      for (Relationship relationship : node.getRelationships()) {
        String relName = 
          (String) relationship.getProperty(FIELD_RELATIONSHIP_NAME);
        Float relWeight = 
          (Float) relationship.getProperty(FIELD_RELATIONSHIP_WEIGHT);
        if (relWeight == 0.0F) {
          continue;
        }
        Node neighborNode = relationship.getEndNode();
        // if self-loop, ignore
        String neighborNodeName = 
          (String) neighborNode.getProperty(FIELD_ENTITY_NAME);
        if (nodeName.equals(neighborNodeName)) {
          continue;
        }
        neighbors.put(relName, new WeightedNode(neighborNode, relWeight));
      }
      tx.success();
    } catch (Exception e) {
      tx.failure();
      throw e;
    } finally {
      tx.finish();
    }
    // sort each collection of weighted nodes
    for (String relName : neighbors.keySet()) {
      List<WeightedNode> nodes = 
        (List<WeightedNode>) neighbors.get(relName);
      Collections.sort(nodes, new Comparator<WeightedNode>() {
        public int compare(WeightedNode w1, WeightedNode w2) {
          return w2.weight.compareTo(w1.weight);
        }
      });
    }
    // finally sort the keys and upcast WeightedNodes to Nodes
    SortedMap<String,List<Node>> neighborMap = 
      new TreeMap<String,List<Node>>();
    for (String relName : neighbors.keySet()) {
      Collection<WeightedNode> weightedNodes = neighbors.get(relName);
      List<Node> nodes = new ArrayList<Node>();
      for (WeightedNode weightedNode : weightedNodes) {
        nodes.add(weightedNode.node);
      }
      neighborMap.put(relName, nodes);
    }
    return neighborMap;
  }
  
  /**
   * Returns a List of neighbor nodes that is reachable from the specified
   * Node. No ordering is done (since the Traverser framework does not seem
   * to allow this type of traversal, and we want to use the Traverser here).
   * @param node reference to the base node.
   * @param type the relationship type.
   * @return a List of neighbor nodes.
   */
  public List<Node> getNeighborsRelatedBy(Node node,
      OntologyRelationshipType type) throws Exception {
    List<Node> neighbors = new ArrayList<Node>();
    Transaction tx = neoService.beginTx();
    try {
      Traverser traverser = node.traverse(
        Order.BREADTH_FIRST, 
        StopEvaluator.DEPTH_ONE, 
        ReturnableEvaluator.ALL_BUT_START_NODE, 
        type, 
        Direction.OUTGOING);
      for (Iterator<Node> it = traverser.iterator(); it.hasNext();) {
        Node neighbor = it.next();
        neighbors.add(neighbor);
      }
      tx.success();
    } catch (Exception e) {
      tx.failure();
      throw(e);
    } finally {
      // finish() ends the transaction; calling success() here left it open
      tx.finish();
    }
    return neighbors;
  }
}
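The grouping and ordering done by getAllNeighbors() can be seen in isolation in the following sketch, which uses plain Java collections instead of Neo4J nodes and relationships. The names NeighborSortSketch and WeightedName are hypothetical, and the weights other than 0 and 1 in the usage note are invented purely to make the ordering visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the grouping logic in getAllNeighbors(); no Neo4J types.
final class NeighborSortSketch {

  /** One outgoing relationship: its type name, the neighbor's name and its weight. */
  static final class WeightedName {
    final String relation;
    final String name;
    final float weight;

    WeightedName(String relation, String name, float weight) {
      this.relation = relation;
      this.name = name;
      this.weight = weight;
    }
  }

  /** Groups neighbors by relationship name (sorted), each group by descending weight. */
  static Map<String, List<String>> group(List<WeightedName> edges) {
    Map<String, List<WeightedName>> byRelation = new TreeMap<>();
    for (WeightedName edge : edges) {
      if (edge.weight == 0.0F) {
        continue; // zero weight marks a pseudo-relationship, so skip it
      }
      byRelation.computeIfAbsent(edge.relation, k -> new ArrayList<>()).add(edge);
    }
    Map<String, List<String>> result = new TreeMap<>();
    for (Map.Entry<String, List<WeightedName>> entry : byRelation.entrySet()) {
      List<WeightedName> sorted = entry.getValue();
      // Heaviest first; List.sort is stable, so ties keep their insertion order.
      sorted.sort((a, b) -> Float.compare(b.weight, a.weight));
      List<String> names = new ArrayList<>();
      for (WeightedName weighted : sorted) {
        names.add(weighted.name);
      }
      result.put(entry.getKey(), names);
    }
    return result;
  }
}
```

Given HAS_FLAVOR edges to Moderate (weight 0.5) and Strong (weight 2.0), the HAS_FLAVOR group comes back as Strong, then Moderate, and any CONTAINS edge with weight 0 is left out. Since the loader assigns every real relationship a weight of 1.0, the ordering within a group only starts to matter once the weights are differentiated.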

The query client is represented by the JUnit class shown below. Notice that the query client operates at the abstraction level of an application, i.e., apart from the Node type, there is no Neo4J code in the "client code".

// Source: src/test/java/net/sf/jtmt/ontology/graph/NeoOntologyNavigatorTest.java
package net.sf.jtmt.ontology.graph;

import java.util.List;
import java.util.Map;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;
import org.neo4j.api.core.Node;

/**
 * Test case for NeoDb Navigator.
 */
public class NeoOntologyNavigatorTest {
  
  private final Log log = LogFactory.getLog(getClass());
  private static final String NEODB_PATH = "/tmp/neodb";
  private static NeoOntologyNavigator navigator;
  
  @BeforeClass
  public static void setupBeforeClass() throws Exception {
    navigator = new NeoOntologyNavigator(NEODB_PATH);
    navigator.init();
  }
  
  @AfterClass
  public static void teardownAfterClass() throws Exception {
    navigator.destroy();
  }
  
  @Test
  public void testWhereIsLoireRegion() throws Exception {
    log.info("query> where is LoireRegion?");
    Node loireRegionNode = navigator.getByName("LoireRegion");
    if (loireRegionNode != null) {
      List<Node> locations = navigator.getNeighborsRelatedBy(
        loireRegionNode, OntologyRelationshipType.LOCATED_IN);
      for (Node location : locations) {
        log.info(
          location.getProperty(NeoOntologyNavigator.FIELD_ENTITY_NAME));
      }
    }
  }
  
  @Test
  public void testWhatRegionsAreInUsRegion() throws Exception {
    log.info("query> what regions are in USRegion?");
    Node usRegion = navigator.getByName("USRegion");
    if (usRegion != null) {
      List<Node> locations = navigator.getNeighborsRelatedBy(
        usRegion, OntologyRelationshipType.REGION_CONTAINS);
      for (Node location : locations) {
        log.info(
          location.getProperty(NeoOntologyNavigator.FIELD_ENTITY_NAME));
      }
    }
  }
  
  @Test
  public void testWhatAreSweetWines() throws Exception {
    log.info("query> what are Sweet wines?");
    Node sweetNode = navigator.getByName("Sweet");
    if (sweetNode != null) {
      List<Node> sweetWines = navigator.getNeighborsRelatedBy(
        sweetNode, OntologyRelationshipType.IS_SUGAR_CONTENT_OF);
      for (Node sweetWine : sweetWines) {
        log.info(
          sweetWine.getProperty(NeoOntologyNavigator.FIELD_ENTITY_NAME));
      }
    }
  }

  @Test
  public void testShowNeighborsForAReislingWine() throws Exception {
    log.info("query> show neighbors for SchlossVolradTrochenbierenausleseRiesling");
    Node rieslingNode = 
      navigator.getByName("SchlossVolradTrochenbierenausleseRiesling");
    Map<String,List<Node>> neighbors = 
      navigator.getAllNeighbors(rieslingNode);
    for (String relType : neighbors.keySet()) {
      log.info("--- " + relType + " ---");
      List<Node> relatedNodes = neighbors.get(relType);
      for (Node relatedNode : relatedNodes) {
        log.info(
          relatedNode.getProperty(NeoOntologyNavigator.FIELD_ENTITY_NAME));
      }
    }
  }
}

The output of the queries is shown below. As you can see, the first three are similar to the MySQL/Prevayler/JGraphT version described in my earlier posts. The last one is a dump of a named node, which may be useful if we want to build a browsing tool.

query> where is LoireRegion?
FrenchRegion

query> what regions are in USRegion?
TexasRegion
CaliforniaRegion

query> what are Sweet wines?
WhitehallLanePrimavera
SchlossVolradTrochenbierenausleseRiesling
SchlossRothermelTrochenbierenausleseRiesling

query> show neighbors for SchlossVolradTrochenbierenausleseRiesling?
--- HAS_BODY ---
Full
--- HAS_FLAVOR ---
Moderate
--- HAS_MAKER ---
SchlossVolrad
--- HAS_SUGAR ---
Sweet
--- IS_INSTANCE_OF ---
SweetRiesling
--- LOCATED_IN ---
GermanyRegion

I have barely scratched the surface of the Jena API with this, but I think I have exercised quite a bit of the Neo4J API, and I was quite impressed with the latter. One thing I would have liked to have is support for weighted relationships in the Traverser mechanism, so I could sort the relationships by weight, in case of multiple relationships.

My dataset is too small for me to form any opinion about performance and stability, but now that I am familiar with the API, I plan to use Neo4J to hold a (much) larger dataset to see how it compares against our current architecture of RDBMS and serialized graph.