Saturday, May 05, 2007

Generic commons-collections

The Apache commons-collections library provides many interesting data structures that are quite useful and huge time-savers. Before I started using them, I would routinely build MultiMap or BidiMap implementations in my application code. I did not even know that there was a thing such as a Bag, something I used quite recently. In other words, a great library which I love and have used very heavily over the last few years.

However, ever since Java 1.5 came out with its support for Generics and generic Collection data structures in java.util, I find myself using the generic form to declare and work with these data structures. In addition to the obvious advantages of type-safety and compile-time checking that generics bring to the code, I find the self-documenting feature of generic code very attractive. So for example, if you wanted to create a Map with BigDecimal keys and some application object as the value, you could simply state:

public static final Map<BigDecimal,AppObject> myMap = new HashMap<BigDecimal,AppObject>();
...
public Map<BigDecimal,AppObject> getMyMap() {
  ...
}
rather than:
public static final Map myMap = new HashMap(); // key=BigDecimal,value=AppObject
...
/**
 * @return a Map of {BigDecimal,AppObject}
 */
public Map getMyMap() {
  ...
}
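
To make the type-safety point concrete, here is a minimal sketch using only java.util. The class name RawVersusGenericMap and the AppObject stand-in are hypothetical and exist only for this illustration. The raw Map accepts a wrongly typed key without complaint and fails only when the key is cast back, while the generic Map rejects the same mistake at compile time.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; AppObject is a stand-in for an application type.
class RawVersusGenericMap {

  static class AppObject {
    final String name;
    AppObject(String name) { this.name = name; }
  }

  // Raw map: the compiler accepts any key, so the mistake surfaces only
  // when the key is read back and cast.
  @SuppressWarnings({"rawtypes", "unchecked"})
  static boolean rawMapFailsAtRuntime() {
    Map raw = new HashMap();
    raw.put("1.5", new AppObject("a")); // String key where a BigDecimal was intended
    try {
      for (Object key : raw.keySet()) {
        BigDecimal k = (BigDecimal) key; // throws ClassCastException
      }
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  // Generic map: key and value types are part of the declaration.
  static String genericMapLookup() {
    Map<BigDecimal, AppObject> typed = new HashMap<BigDecimal, AppObject>();
    typed.put(new BigDecimal("1.5"), new AppObject("a"));
    // typed.put("1.5", new AppObject("a")); // would not compile
    return typed.get(new BigDecimal("1.5")).name; // no cast needed
  }

  public static void main(String[] args) {
    System.out.println("raw map failed at runtime: " + rawMapFailsAtRuntime());
    System.out.println("generic lookup: " + genericMapLookup());
  }
}
```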

Since commons-collections is non-generic (as of the 3.2 version), and the decision to make these classes generic is being impacted by backward compatibility considerations, it is likely that code using commons-collections will have to continue to mix generic and non-generic calls for the foreseeable future. This is not a huge deal, since the code will still work, but it does negate the advantages of using generics in Java code.

The collections15 project from Larvalabs provides a generic version of the commons-collections project. This seems to be a fairly well kept secret, since I stumbled upon it accidentally when browsing through the commons-collections mailing list, looking for when (and if) there will be a generic version available for this library.

This post describes a small example that converts application code from using a TransformIterator from commons-collections to using its generic replacement TransformIterator<I,O> from the collections15 project. Hopefully, it will illustrate how easy the conversion is, and how much more readable and type-safe the code becomes as a result.

My example creates a dynamic SQL query with an IN clause. For example, to find all employees with first name 'Bob' in Engineering, Finance and Sales, we could write a query which looks something like this:

select * from employee
  where first_name = 'Bob'
  and dept_name in ('Engineering','Finance','Sales')
  order by last_name;

The list of departments is provided to the method generating the dynamic SQL string as a List<String>. So the original code looked like this:

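import java.util.List;

import org.apache.commons.collections.Transformer;
import org.apache.commons.collections.iterators.TransformIterator;
import org.apache.commons.lang.StringEscapeUtils;
import org.apache.commons.lang.StringUtils;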
...
public class MyClass {
  ...
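  private TransformIterator quotingIterator = new TransformIterator();
  // Instance initializer block: statements cannot appear directly in a class body.
  {
    // Wraps each department name in single quotes after escaping it for SQL.
    quotingIterator.setTransformer(new Transformer() {
      public Object transform(Object input) {
        return "'" + StringEscapeUtils.escapeSql((String) input) + "'";
      }
    });
  }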
  ...
  private String buildSql(String firstName, List<String> departments) {
    quotingIterator.setIterator(departments.iterator());
    String sql = "select * from employee where first_name = " + 
      "'" + StringEscapeUtils.escapeSql(firstName) + "' " +
      "and dept_name in (" +
      StringUtils.join(quotingIterator, ',') + 
      ") order by last_name";
    return sql;
  }
}

The generic version of the code is as simple as dropping in the TransformIterator replacement from the collections15 project. To do this, I needed to comment out the following dependency from my Maven2 pom.xml file and add the new declaration,

    <!--
    <dependency>
      <groupId>commons-collections</groupId>
      <artifactId>commons-collections</artifactId>
      <version>3.2</version>
      <scope>compile</scope>
    </dependency>
    -->
    <dependency>
      <groupId>net.sourceforge.collections</groupId>
      <artifactId>collections-generic</artifactId>
      <version>4.01</version>
      <scope>compile</scope>
    </dependency>

then regenerate my Eclipse project files using mvn eclipse:eclipse. This downloads the collections15 jar file to my local repository and makes it visible to my project. The new code now looks like this:

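import java.util.List;

import org.apache.commons.collections15.Transformer;
import org.apache.commons.collections15.iterators.TransformIterator;
import org.apache.commons.lang.StringEscapeUtils;
import org.apache.commons.lang.StringUtils;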
...
public class MyClass {
  ...
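  private TransformIterator<String,String> quotingIterator = new TransformIterator<String,String>();
  // Instance initializer block: statements cannot appear directly in a class body.
  {
    // The input is already typed as String, so no cast is needed.
    quotingIterator.setTransformer(new Transformer<String,String>() {
      public String transform(String input) {
        return "'" + StringEscapeUtils.escapeSql(input) + "'";
      }
    });
  }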
  ...
  private String buildSql(String firstName, List<String> departments) {
    quotingIterator.setIterator(departments.iterator());
    String sql = "select * from employee where first_name = " + 
      "'" + StringEscapeUtils.escapeSql(firstName) + "' " +
      "and dept_name in (" +
      StringUtils.join(quotingIterator, ',') + 
      ") order by last_name";
    return sql;
  }
}

As you can see, the buildSql() method itself is unchanged. However, the Transformer in the generified code explicitly accepts a String and returns a String in its transform() method, so passing anything else is a compile-time error. In the non-generified code, the cast inside transform() would instead throw a ClassCastException at runtime if the input were not a String.
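
The difference can be sketched without either library on the classpath. The two interfaces below are hypothetical stand-ins that only mirror the shape of the non-generic and generic Transformer; they are not the library types themselves, and the SQL escaping step is left out to keep the sketch self-contained.

```java
// Hypothetical demo class with stand-in interfaces, not the library types.
class TransformerTypeSafety {

  // Shape of the pre-generics interface: everything is Object.
  interface ObjectTransformer {
    Object transform(Object input);
  }

  // Shape of the generic interface: input and output types are declared.
  interface TypedTransformer<I, O> {
    O transform(I input);
  }

  static final ObjectTransformer RAW_QUOTE = new ObjectTransformer() {
    public Object transform(Object input) {
      return "'" + (String) input + "'"; // cast is checked only at runtime
    }
  };

  static final TypedTransformer<String, String> TYPED_QUOTE =
      new TypedTransformer<String, String>() {
        public String transform(String input) {
          return "'" + input + "'"; // no cast needed
        }
      };

  // Passing a non-String to the raw transformer compiles but fails when run.
  static boolean rawCallFailsAtRuntime() {
    try {
      RAW_QUOTE.transform(Integer.valueOf(42));
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("raw call failed at runtime: " + rawCallFailsAtRuntime());
    // TYPED_QUOTE.transform(Integer.valueOf(42)); // would not compile
    System.out.println(TYPED_QUOTE.transform("Sales"));
  }
}
```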

Also, to get started, it may just be a matter of changing the imports from org.apache.commons.collections to org.apache.commons.collections15 using a sed script on all the source code. This will allow us to get rid of the compile time dependency on the commons-collections jar, and then we could, at our convenience, generify the code using the new classes from the collections15 project.
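
I have not written that sed script here, but the substitution it would perform can be sketched in Java instead. The ImportRewriter class below is hypothetical, and it assumes the sub-package layout under collections15 matches the original, which is worth verifying for the classes you actually use. The pattern requires a dot right after "collections", so text that already refers to collections15 is left alone and the rewrite can be run repeatedly.

```java
import java.util.regex.Pattern;

// Hypothetical helper that rewrites package references in Java source text.
class ImportRewriter {

  // Matches the old package prefix followed by a dot; "collections15." does
  // not match because "1" follows "collections" there.
  private static final Pattern OLD_PACKAGE =
      Pattern.compile("\\borg\\.apache\\.commons\\.collections\\.");

  static String rewrite(String source) {
    return OLD_PACKAGE.matcher(source)
        .replaceAll("org.apache.commons.collections15.");
  }

  public static void main(String[] args) {
    String before = "import org.apache.commons.collections.iterators.TransformIterator;";
    System.out.println(rewrite(before));
  }
}
```

Applying rewrite() to the contents of each source file, and writing the result back, would have the same effect as the sed approach.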

In this particular application, no third-party or framework code uses commons-collections, but it would be short-sighted to assume that this is generally the case. In fact, because the library is so useful, I suspect that my case is the odd one here. However, dependencies from framework or third-party code will be runtime dependencies, which can be satisfied by linking in the Apache commons-collections jar at runtime in addition to the collections15 jar file. Maven2 allows us to do this quite easily, by setting up the former as a runtime dependency and the latter as a compile time dependency.
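
As a sketch of that setup, the pom.xml fragment below reuses the coordinates and versions shown earlier in this post and changes only the scope of the commons-collections dependency. I have not needed this configuration in this application, so treat it as an illustration rather than a tested setup.

```xml
    <!-- Needed only by third-party code at runtime, not by our own source -->
    <dependency>
      <groupId>commons-collections</groupId>
      <artifactId>commons-collections</artifactId>
      <version>3.2</version>
      <scope>runtime</scope>
    </dependency>
    <!-- Generic replacement that our own code compiles against -->
    <dependency>
      <groupId>net.sourceforge.collections</groupId>
      <artifactId>collections-generic</artifactId>
      <version>4.01</version>
      <scope>compile</scope>
    </dependency>
```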
