Saturday, September 08, 2007

Spring loaded Jackrabbit

So far I haven't been very enthusiastic about Jackrabbit, and yet I keep writing about it. My lack of enthusiasm stems from the fact that it would be quite an effort to move our existing content, which is stored as a combination of flat files, database tables and Lucene indexes, to any content repository, and to keep up with the steady flow of new content we are licensing. We also have tools and gadgets which require more granular access than that provided through Jackrabbit's standard query API.

However, of late, almost everything I do seems to be driven by whether I can apply it readily, which, in retrospect, seems a bit short-sighted. This was driven home to me recently when I was asked to implement an idea I had suggested (and built a proof of concept for, for my own understanding) about a year ago. What seems impractical today may not be so a year from now, so it may be worth spending time on some technology today in the hope that the knowledge will be useful down the line. In fact, that's one reason I started this blog in the first place. And there is no doubt that Jackrabbit is cool technology, and while there are still warts, I expect it to mature enough to justify production-quality use by the time I am ready to use it.

That said, one of the things which makes a particular piece of software attractive to me is its ability to be integrated with the Spring Framework, simply because I find Spring's IoC/dependency injection useful and hence tend to use it everywhere, from web applications to standalone Java projects. The Spring Modules project has built code to integrate Spring with various other popular software and APIs, and one of them is JCR. Within the springmodules-jcr project, there is support for Jackrabbit and Jeceira, another open source CMS based on the JCR specifications.

Based upon an InfoQ article "Integrating Java Content Repository and Spring", written by Costin Leau, one of the developers on the Spring Modules project, I decided to rewrite my Content Loader and Retriever implementations that I described in my blog post two weeks ago, to use JcrTemplate and JcrCallback provided by springmodules-jcr, as well as let Spring build up my Repository and other objects using dependency injection.

First, the applicationContext.xml, so you know how it's all set up:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:util="http://www.springframework.org/schema/util"
       xsi:schemaLocation="
       http://www.springframework.org/schema/beans 
       http://www.springframework.org/schema/beans/spring-beans-2.0.xsd
       http://www.springframework.org/schema/util 
       http://www.springframework.org/schema/util/spring-util-2.0.xsd">

  <bean id="repository" class="org.springmodules.jcr.jackrabbit.RepositoryFactoryBean">
    <property name="configuration" value="classpath:repository.xml"/>
    <property name="homeDir" value="file:/tmp/repository"/>
  </bean>
  
  <bean id="jcrSessionFactory" class="org.springmodules.jcr.JcrSessionFactory">
    <property name="repository" ref="repository"/>
    <property name="credentials">
      <bean class="javax.jcr.SimpleCredentials">
        <constructor-arg index="0" value="user"/>
        <constructor-arg index="1">
          <bean factory-bean="password" factory-method="toCharArray"/>
        </constructor-arg>
      </bean>
    </property>
  </bean>
  
  <bean id="password" class="java.lang.String">
    <constructor-arg index="0" value="password"/>
  </bean>
  
  <bean id="jcrTemplate" class="org.springmodules.jcr.JcrTemplate">
    <property name="sessionFactory" ref="jcrSessionFactory"/>
    <property name="allowCreate" value="true"/>
  </bean>

  <bean id="fileFinder" class="com.mycompany.myapp.FileFinder">
    <property name="filter" value=".xml"/>
  </bean>
  
  <bean id="someRandomDocumentParser" 
      class="com.mycompany.myapp.SomeRandomDocumentParser"/>
  
  <bean id="myRandomContentLoader" class="com.mycompany.myapp.ContentLoader2">
    <property name="fileFinder" ref="fileFinder"/>
    <property name="jcrTemplate" ref="jcrTemplate"/>
    <property name="contentSource" value="myRandomContentSource"/>
    <property name="parser" ref="someRandomDocumentParser"/>
    <property name="sourceDirectory" value="/path/to/my/random/content"/>
  </bean>
  
  <bean id="myRandomContentRetriever" class="com.mycompany.myapp.ContentRetriever2">
    <property name="jcrTemplate" ref="jcrTemplate"/>
  </bean>
    
</beans>    

The ContentLoader2.java is a version of ContentLoader.java which uses the springmodules-jcr API to work with Jackrabbit:

package com.mycompany.myapp;

import java.io.File;
import java.io.IOException;
import java.util.List;

import javax.jcr.Node;
import javax.jcr.PathNotFoundException;
import javax.jcr.RepositoryException;
import javax.jcr.Session;

import org.apache.log4j.Logger;
import org.springframework.beans.factory.annotation.Required;
import org.springmodules.jcr.JcrCallback;
import org.springmodules.jcr.JcrTemplate;

public class ContentLoader2 {

  private static final Logger LOGGER = Logger.getLogger(ContentLoader2.class);
  
  private FileFinder fileFinder;
  private String sourceDirectory;
  private String contentSource;
  private IParser parser;
  private JcrTemplate jcrTemplate;
  
  @Required
  public void setFileFinder(FileFinder fileFinder) {
    this.fileFinder = fileFinder;
  }

  @Required
  public void setJcrTemplate(JcrTemplate jcrTemplate) {
    this.jcrTemplate = jcrTemplate;
  }

  @Required
  public void setContentSource(String contentSource) {
    this.contentSource = contentSource;
  }

  @Required
  public void setParser(IParser parser) {
    this.parser = parser;
  }

  @Required
  public void setSourceDirectory(String sourceDirectory) {
    this.sourceDirectory = sourceDirectory;
  }

  public void load() throws Exception {
    jcrTemplate.execute(new JcrCallback() {
      public Object doInJcr(Session session) throws IOException, RepositoryException {
        try {
          Node contentSourceNode = getFreshContentSourceNode(session, contentSource);
          List<File> filesFound = fileFinder.find(sourceDirectory);
          for (File fileFound : filesFound) {
            DataHolder dataHolder = parser.parse(fileFound);
            if (dataHolder == null) {
              continue;
            }
            LOGGER.info("Parsing file:" + fileFound);
            Node contentNode = contentSourceNode.addNode("content");
            for (String propertyKey : dataHolder.getPropertyKeys()) {
              String value = dataHolder.getProperty(propertyKey);
              LOGGER.debug("Setting property " + propertyKey + "=" + value);
              contentNode.setProperty(propertyKey, value);
            }
            session.save();
          }
        } catch (Exception e) {
          throw new IOException("Exception parsing and storing file", e);
        }
        // doInJcr() must return an Object; the loader has no result to hand back
        return null;
      }
    });
  }

  /**
   * Our policy is to do a fresh load each time, so we want to remove the contentSource
   * node from our repository first, then create a new one.
   * @param session the Repository Session.
   * @param contentSourceName the name of the content source.
   * @return a content source node. This is a top level element of the repository,
   * right under the repository root node.
   * @throws Exception if one is thrown.
   */
  private Node getFreshContentSourceNode(Session session, String contentSourceName) throws Exception {
    Node root = session.getRootNode();
    Node contentSourceNode = null;
    try {
      contentSourceNode = root.getNode(contentSourceName);
      if (contentSourceNode != null) {
        contentSourceNode.remove();
      }
    } catch (PathNotFoundException e) {
      LOGGER.info("Path for content source: " + contentSourceName + " not found, creating");
    }
    contentSourceNode = root.addNode(contentSourceName);
    return contentSourceNode;
  }
}
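The loader relies on three collaborators that are not shown in this post: DataHolder, IParser and FileFinder. The following is only a minimal sketch of what they might look like, inferred from the way ContentLoader2 and ContentRetriever2 call them. The method bodies, the recursive directory walk and the suffix-based filter are all assumptions of mine, not the actual implementations.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins: the real DataHolder, IParser and FileFinder are not shown in this post.

/** A simple bag of String properties, matching the calls made by the loader and retriever. */
class DataHolder {
  private final Map<String, String> properties = new LinkedHashMap<String, String>();

  public void setProperty(String key, String value) {
    properties.put(key, value);
  }

  public String getProperty(String key) {
    return properties.get(key);
  }

  public Set<String> getPropertyKeys() {
    return properties.keySet();
  }
}

/** Turns one source file into a DataHolder, or returns null to have the loader skip the file. */
interface IParser {
  DataHolder parse(File file) throws Exception;
}

/** Recursively collects files whose names end with the configured filter suffix (assumed behavior). */
class FileFinder {
  private String filter = "";

  public void setFilter(String filter) {
    this.filter = filter;
  }

  public List<File> find(String directory) {
    List<File> found = new ArrayList<File>();
    collect(new File(directory), found);
    Collections.sort(found); // stable ordering makes repeated loads predictable
    return found;
  }

  private void collect(File dir, List<File> found) {
    File[] children = dir.listFiles();
    if (children == null) {
      return; // not a directory, or not readable
    }
    for (File child : children) {
      if (child.isDirectory()) {
        collect(child, found);
      } else if (child.getName().endsWith(filter)) {
        found.add(child);
      }
    }
  }
}
```

With stand-ins like these, the fileFinder bean's filter property of ".xml" would restrict the load to XML files under the source directory.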

The ContentRetriever2.java, like the ContentLoader2.java, is a version of the original ContentRetriever.java file that works with Jackrabbit using the springmodules-jcr API:

package com.mycompany.myapp;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.jcr.Node;
import javax.jcr.NodeIterator;
import javax.jcr.Property;
import javax.jcr.PropertyIterator;
import javax.jcr.RepositoryException;
import javax.jcr.Session;
import javax.jcr.query.Query;
import javax.jcr.query.QueryManager;
import javax.jcr.query.QueryResult;

import org.springmodules.jcr.JcrCallback;
import org.springmodules.jcr.JcrTemplate;

public class ContentRetriever2 {

  private JcrTemplate jcrTemplate;

  public void setJcrTemplate(JcrTemplate jcrTemplate) {
    this.jcrTemplate = jcrTemplate;
  }

  @SuppressWarnings("unchecked")
  public List<DataHolder> findAllByContentSource(final String contentSource) {
    return (List<DataHolder>) jcrTemplate.execute(new JcrCallback() {
      public Object doInJcr(Session session) throws IOException, RepositoryException {
        List<DataHolder> contents = new ArrayList<DataHolder>();
        Node contentSourceNode = session.getRootNode().getNode(contentSource);
        NodeIterator ni = contentSourceNode.getNodes();
        while (ni.hasNext()) {
          Node contentNode = ni.nextNode();
          String contentId = contentNode.getProperty("contentId").getValue().getString();
          contents.add(getContent(contentSource, contentId));
        }
        return contents;
      }
    });
  }
  
  public DataHolder getContent(final String contentSource, final String contentId) {
    return (DataHolder) jcrTemplate.execute(new JcrCallback() {
      public Object doInJcr(Session session) throws IOException, RepositoryException {
        DataHolder dataHolder = new DataHolder();
        QueryManager queryManager = session.getWorkspace().getQueryManager();
        Query query = queryManager.createQuery("//" + contentSource + 
          "/content[@contentId='" + contentId + "']", Query.XPATH);
        QueryResult result = query.execute();
        NodeIterator ni = result.getNodes();
        while (ni.hasNext()) {
          Node contentNode = ni.nextNode();
          PropertyIterator pi = contentNode.getProperties();
          dataHolder.setProperty("contentSource", contentSource);
          while (pi.hasNext()) {
            Property prop = pi.nextProperty();
            dataHolder.setProperty(prop.getName(), prop.getValue().getString());  
          }
          break;
        }
        return dataHolder;
      }
    });
  }
  
  public DataHolder getContentByUrl(final String contentSource, final String url) {
    return (DataHolder) jcrTemplate.execute(new JcrCallback() {
      public Object doInJcr(Session session) throws IOException, RepositoryException {
        DataHolder dataHolder = null;
        QueryManager queryManager = session.getWorkspace().getQueryManager();
        Query query = queryManager.createQuery("//" + contentSource + 
          "/content[@cfUrl='" + url + "']", Query.XPATH);
        QueryResult result = query.execute();
        NodeIterator ni = result.getNodes();
        while (ni.hasNext()) {
          Node contentNode = ni.nextNode();
          String contentId = contentNode.getProperty("contentId").getValue().getString();
          dataHolder = getContent(contentSource, contentId);
          break;
        }
        return dataHolder;
      }
    });
  }
}
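One thing to watch in the retriever is that getContent() and getContentByUrl() build their XPath strings by plain concatenation, so a value containing a single quote (a URL, for instance) would produce a malformed query. Below is a minimal sketch of a quoting helper. XPathStrings and its methods are hypothetical names of mine, not part of JCR or springmodules-jcr, and the sketch assumes the XPath 2.0 convention of doubling an embedded apostrophe, which is what I understand the JCR 1.0 XPath dialect to follow. It does not attempt to encode the node name in the path.

```java
/** Hypothetical helper for building XPath query strings; not part of any JCR API. */
final class XPathStrings {

  private XPathStrings() {
    // static helpers only
  }

  /** Wraps a value in single quotes, doubling any single quote embedded in it. */
  static String literal(String value) {
    return "'" + value.replace("'", "''") + "'";
  }

  /**
   * Builds the same shape of query string as the retriever does,
   * but with the property value quoted through literal().
   */
  static String contentQuery(String contentSource, String property, String value) {
    return "//" + contentSource + "/content[@" + property + "=" + literal(value) + "]";
  }
}
```

In the retriever, the string passed to createQuery() could then be produced by XPathStrings.contentQuery(contentSource, "cfUrl", url) instead of the inline concatenation.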

If you compare the code above to the code in my older post, there is not much difference. The old code is now encapsulated inside a JcrCallback anonymous inner class implementation, which is passed to the JcrTemplate.execute() method. The other thing that has changed is that I no longer build the JCR Repository and Session objects in my code. There are also no Repository.login() calls in my code, because Spring has already logged me in. However, one of the most important differences is the absence of checked Exceptions being thrown from the code. JcrTemplate converts the checked RepositoryException and IOException raised from the calls to JCR code into unchecked ones.
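To make the template and callback idea concrete, here is a stripped-down sketch of the pattern in plain Java. It is not how springmodules-jcr is actually implemented. MiniTemplate, MiniCallback and MiniDataAccessException are hypothetical names, the "session" is just a generic object, and the real JcrTemplate also takes care of obtaining and releasing the JCR Session, which is left out here.

```java
import java.io.IOException;

/** Hypothetical callback, playing the role that JcrCallback plays above. */
interface MiniCallback<S, T> {
  T doInSession(S session) throws IOException;
}

/** Unchecked wrapper, standing in for the unchecked exceptions the template throws. */
class MiniDataAccessException extends RuntimeException {
  MiniDataAccessException(String message, Throwable cause) {
    super(message, cause);
  }
}

/** Hypothetical template, playing the role that JcrTemplate plays above. */
class MiniTemplate<S> {
  private final S session;

  MiniTemplate(S session) {
    this.session = session;
  }

  /** Runs the callback against the session and translates checked exceptions. */
  <T> T execute(MiniCallback<S, T> callback) {
    try {
      return callback.doInSession(session);
    } catch (IOException e) {
      // callers of execute() never have to declare or catch IOException
      throw new MiniDataAccessException("I/O failure inside callback", e);
    }
  }
}
```

The caller supplies only the work to be done with the session, as an anonymous inner class, and deals with failures (if at all) as unchecked exceptions. This is the same division of labor that the loader and retriever get from JcrTemplate.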

There is obviously a lot about Jackrabbit, JCR and springmodules-jcr that I don't know yet. From my limited knowledge, it looks like a product with lots of promise, even though I don't think it's useful to me right now. I plan to keep looking at it, and over the next few weeks, write about the features I think will be useful to me if I ever end up setting one up in a real environment.

6 comments (moderated to prevent spam):

Anonymous said...

Hi,
I am trying to fetch the list of all contents using
Query query = queryManager.createQuery("//" + contentSource +
"/content", Query.XPATH);

It doesn't work; it returns 0 nodes. Do you know what might be the reason?

Sujit Pal said...

Hi, I've only worked briefly with Jackrabbit and it's been a while, but I looked at the code again, and noticed that to get by content source (see findAllByContentSource), I've used node traversal, so I'm guessing that XPath queries may not have worked for me. I suspect that the //foo moves the record pointer to the specific node in the tree, and the [@bar="baz..."] part is the actual filter used by the iterator, so the QueryManager may (possibly for performance reasons) interpret the absence of a filter as a negation rather than no filtering. But I am only guessing here. I think it may be worth asking on the Jackrabbit mailing list, and if you get an answer, I would appreciate you posting it here.

nickman said...

FYI: Spring-Modules went inactive but was forked and is now here: https://github.com/astubbs/spring-modules

Sujit Pal said...

Thanks Nickman.

Goutham P N said...

while working with the above code am getting the following error:


org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'repository' defined in ServletContext resource [/WEB-INF/dispatcher-servlet.xml]: Invocation of init method failed; nested exception is java.lang.NoClassDefFoundError: org/apache/jackrabbit/util/Text

Caused by: java.lang.NoClassDefFoundError: org/apache/jackrabbit/util/Text

Sujit Pal said...

Hi Goutham, this points to a classpath error, you are probably missing some of Jackrabbit's JARs in your application. It's been a while since I used Jackrabbit (we ended up going with Drupal and customizing the publishing aspect to talk to a Java publisher over XMLRPC instead). A quick way to check if you have the relevant jar is to use findjar.