Thursday, June 29, 2006

Algorithms for Java (Programmers)

I recently finished reading Robert Sedgewick's two-volume book, Algorithms in Java, and I must say that it is very informative and well-written. The book is in two volumes and five parts - the first volume, consisting of parts 1-4, deals with basic data structures such as List, Stack, Queue, Tree, Set (and all their variations, some of which I had read about earlier and some of which were totally new) and various sorting and searching algorithms built using these structures. The second volume (part 5) deals with Graphs and algorithms for searching and traversing these Graphs.

But what does Java programming have to do with understanding data structures, you ask? We, who have our Data Structure implementations built for us in the form of the Java collection classes, and who (rightly or wrongly) believe that if a data structure cannot be implemented in terms of a Map or a List, then it is probably not worth implementing at all? Why should we have to learn about different sorting techniques when we already have Collections.sort(), which requires us only to provide it with elements that implement the Comparable interface? Besides, business application programming rarely needs data structures more complicated than Maps, Lists (and occasionally Sets) anyway.

I would argue that Java programmers tend not to use advanced data structures and algorithms because they don't know them. The reason that they don't know them is that they almost never need to think in terms of implementing data structures, and they are too lazy (or pragmatic) to waste time learning something they would never need. The question is usually what structure to use, rather than how to implement and then use a structure. Business problems that would be most elegantly addressed by using one of these advanced data structures tend to get addressed with a one-off hack custom-built for the purpose. Not a bad thing in itself, since the business problem got solved. However, we just invented a wheel which may have been invented multiple times, and there may be better wheels out there that we could have chosen had we just looked.

The Java collections classes are quite good, but they contain two glaring omissions - Trees and Graphs and their various implementations. A possible reason could be that there are too many variations to implement and support. Or it could be because, given their complexity, they are less likely to be used, and hence were not considered useful enough to include in the collection classes. However, this means that, when you have a business problem that requires a Tree or Graph, you are pretty much on your own. The approach I have taken in the past is to come up with a custom implementation that is generally not very reusable.
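To make the "on your own" part concrete, here is a minimal sketch of the kind of general-purpose tree one ends up writing by hand. The class name TreeNode and its methods are hypothetical, invented for illustration; they come neither from the JDK nor from the book, and the sketch uses generics and the diamond operator from later Java versions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical general-purpose tree node; not part of the JDK or of the book. */
class TreeNode<T> {
    private final T value;
    private final List<TreeNode<T>> children = new ArrayList<>();

    TreeNode(T value) {
        this.value = value;
    }

    /** Adds a child holding the given value and returns the new child node. */
    TreeNode<T> addChild(T childValue) {
        TreeNode<T> child = new TreeNode<>(childValue);
        children.add(child);
        return child;
    }

    /** Collects values in depth-first pre-order (node first, then its children). */
    List<T> preOrder() {
        List<T> out = new ArrayList<>();
        collect(this, out);
        return out;
    }

    private static <T> void collect(TreeNode<T> node, List<T> out) {
        out.add(node.value);
        for (TreeNode<T> child : node.children) {
            collect(child, out);
        }
    }
}
```

Even something this small tends to get rewritten per project, which is the reusability problem described above.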

Although I am sold on the need for programmers to have a solid understanding of formal data structures and algorithms, I did not feel like I was getting what I was really looking for out of Professor Sedgewick's books. Don't get me wrong. I feel like I really learned a lot from these books. There were data structures I did not know about that I know of now as a result of the reading. There were algorithms that I just did not understand (or understand the need for) when I was in college, which I do now after reading these books. Based on my newfound knowledge, I can see these being applied to various aspects of my work.

So what exactly am I looking for in a book of this sort? First, I am looking for applicability. A lot of the data structures and algorithms in the book are ones that a Java programmer will never use, or never have to use. A C or C++ programmer would probably find these data structures and algorithms more useful. The general feeling I get is that these books are a quick rehash of the C and C++ versions (by the same author), with the C/C++ code cut out and replaced with similar looking code in Java.

Not to say that one should not know about them, but the book could have been a lot smaller and easier to read if the "non-essential" (to Java programmers) data structures and algorithms were not covered in so much depth. It would have been helpful, on the other hand, if the author highlighted what algorithm is used within the Sun JDK, and in what situations one would not want to use that algorithm and prefer a custom one instead.

The next thing I am looking for is the general coding style. Notwithstanding the language similarities between C/C++ and Java, the objectives, programmer mindset and therefore the coding style of these languages are quite different. Java by design strives harder for readability than C/C++. The examples look like C/C++ code that has been modified to use Java syntax instead.

As for the specific style, I prefer the K&R style myself, but have used the Allman style before. This book uses the GNU style, which I don't think is very popular among Java programmers, so I found the code a little difficult to read. The K&R style is definitely the most popular style used in Java today, as evidenced by the default style in the Eclipse IDE, and it is just as compact in terms of vertical whitespace as the GNU style, so it may help if the code were presented in that manner. This blog entry describes some popular coding styles in C/C++/Java like languages.

Another code style issue is the cryptic C-like variable naming in the code examples. Having gotten used to Spring's descriptive and rather long variable and class names, I find long variable names easier to map in my head when reading through a program.

But if Java programmers are too lazy (or pragmatic) to learn what they would never use, what happens when they are faced with implementing a structure or algorithm that is not available in pre-built form? Obviously books such as these, with their wealth of working code examples, can help. However, what would help even more would be if these examples were implemented as real working ADTs (Abstract Data Types) and perhaps contributed to some place like the Apache Commons Collections package. That way, the book could become even more readable and concentrate only on the usage scenarios where a particular ADT implementation should be used and why.

I realize that much of this is heretical in a sense, and I apologize if I have offended the data structure and algorithm purists among you. My objective in this article was to highlight how excellent books such as this can be made even more useful for professional Java programmers. Java programmers generally don't care about how to implement data structures as much as C/C++ programmers do. Having been spoilt by the availability of built-in ADTs in the base JDK, they are more concerned with the question of which one to use in what situation.

Saturday, June 10, 2006

RoR style URLs with Spring MVC

One of the things that I find impressive about Ruby on Rails (RoR) is the simplicity of URLs used in RoR applications, and how they map back to the Controllers and View components. So in the RoR world, a URL of the form:

http://localhost:8080/app/entity/action/1234

means that the request would be forwarded by the web-application named "app" listening on port number 8080 on localhost to the Ruby EntityController class, which would then invoke the action() method on it with "1234" as an argument, then forward the request to the view component at app/entity/action.rhtml for presentation under the webserver's docroot.

This is, of course, both good and bad. It is good because it makes the application easier to understand and debug, both for the user and developer, and removes the need for some configuration, which can be a point of failure. It is bad for applications relying on security through obscurity, because malicious users can understand your application better too, so an application facing the outside world must itself be more security-conscious.

I have been thinking about how to do this using the Spring MVC framework, and it turns out to be quite simple. Here is how I did it. The flow in Spring is identical to the flow in RoR. However, I have changed the URL structure somewhat, to mimic what most Java developers and users are used to (including myself). So here is the new URL structure:

http://localhost:8080/app/entity/action.do?id=1234

would send the request to the DispatcherServlet configured in the "app" web application, which will forward it to the EntityController and call its action() method. The action() method would optionally consume the parameter id from the request. A lot of people associate the .do suffix with Struts, but I think it's a nice convention to indicate that the URL points to some kind of Controller component, as opposed to static content indicated by .html, for example.

The DispatcherServlet is configured within the application's web.xml file. Currently, it is configured to respond to URLs ending with the .do suffix. The reference to the Spring Application Context, which contains references to beans that will be used by the DispatcherServlet, is set up by the ContextLoaderListener, as shown below:

<web-app>
  <listener>
    <listener-class>org.springframework.web.context.ContextLoaderListener</listener-class>
  </listener>
                                                                                
  <servlet>
    <servlet-name>app</servlet-name>
    <servlet-class>org.springframework.web.servlet.DispatcherServlet</servlet-class>
    <load-on-startup>1</load-on-startup>
  </servlet>
  <servlet-mapping>
    <servlet-name>app</servlet-name>
    <url-pattern>*.do</url-pattern>
  </servlet-mapping>
                                                                                
</web-app>

The DispatcherServlet looks for a file called ${servlet.name}-servlet.xml (in our case app-servlet.xml), which typically contains references for the beans that the DispatcherServlet needs. I like to keep it in applicationContext-*.properties in my classpath WEB-INF/classes to make the application unit test friendly and include it from app-servlet.xml. There are three beans that really need to be customized to support RoR style URLs - the HandlerMapping to route the incoming URL, the MethodNameResolver to get the method name to invoke on the Controller which was routed to, and the ViewResolver to do the actual presentation. We also create an abstract class ActiveController which has some convenience methods and extends the Spring MultiActionController, but that is just to enforce that all Controllers in such applications are MultiActionControllers.

The HandlerMapping: ActiveControllerUrlHandlerMapping

This is a drop in replacement for standard handler mappings such as SimpleUrlHandlerMapping. Unlike the SimpleUrlHandlerMapping, which reads its mapping configuration once at startup, the ActiveControllerUrlHandlerMapping computes the controller, method and view names each time it is passed a request. It does this by parsing the request URI and pulling out the entity name and adding a "Controller" suffix. It will look this bean up in the ApplicationContext and complain if it cannot find it, so it is important to remember to configure each Controller according to the pattern ${entityName}Controller. Since it is parsing the URL anyway at this stage, it also computes and validates the method name and the view names to use, and sticks them into request attributes.

The requirement to have the Controller bean reference named in a certain way is not there in a RoR app, since Ruby is an interpreted language, so a Controller and its methods become visible as soon as you drop the new code in the docroot. We could probably set this up to auto-detect a Controller as soon as it becomes visible in a certain package, but the approach would not be totally platform agnostic until Java comes out with a Package.getClasses() method.

The configuration for the handlerMapping looks like this:

  <bean id="handlerMapping" class="cnwk.prozac.utils.controllers.ActiveControllerUrlHandlerMapping">
     <property name="defaultHandler" ref="defaultHandler" />
   </bean>
  <bean id="defaultHandler" class="cnwk.prozac.utils.controllers.ActiveControllerDefaultHandler" /> 

Notice that there is no explicit URL pattern to controller mappings here. The defaultHandler is not strictly necessary, but can help to return user friendly results if the ActiveControllerUrlHandlerMapping does not find a valid controller or method to go to. In such cases, rather than throw a 404, it executes the ${defaultHandler}.info() method.

Here's the code for the ActiveControllerUrlHandlerMapping:

import java.lang.reflect.Method;
 
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
 
import org.apache.commons.beanutils.MethodUtils;
import org.apache.log4j.Logger;
import org.springframework.web.servlet.ModelAndView;
import org.springframework.web.servlet.handler.AbstractUrlHandlerMapping;
 
/**
 * Sets up mappings between the URL pattern and the corresponding Controller
 * beans. The convention for the URL is as follows:
 * <pre>
 * http://host:port/${webAppName}/${entityName}/${methodName}?(${arg}=${value})*
 * </pre>
 * If no controller bean is found in the application context, then the
 * lookupHandler method returns null.
 * The defaultHandler can be set as it is a property of AbstractHandlerMapping.
 */
public class ActiveControllerUrlHandlerMapping extends AbstractUrlHandlerMapping {
     
    private static final Logger log = Logger.getLogger(ActiveControllerUrlHandlerMapping.class);
     
    public ActiveControllerUrlHandlerMapping() {
        super();
    }
 
    /**
     * Returns a configured ActiveController bean that the URL resolves to.
     * If the URL is malformed, or the ActiveController for the specified URL
     * is not configured, or if the handling method is not available in the
     * resolved ActiveController instance, the ActiveControllerDefaultHandler
     * is returned, with the appropriate error message in the request attribute.     
     * @param request the HttpServletRequest object.
     * @return the ActiveController for this request.
     */
    @Override
    protected Object getHandlerInternal(HttpServletRequest request) {
        String urlPath = request.getRequestURI();
        String[] entityAndMethod = parseEntityAndMethodNamesFromUrl(urlPath);
        if (entityAndMethod[0] == null && entityAndMethod[1] == null) {
            request.setAttribute(ActiveController.ATTR_ERROR_MESSAGE,
                "Malformed URL:[" + urlPath + "], must be of the form " +
                "/${webapp}/${entity}/${method}?[${arg}=${value}&...]");
            return null;
        }
        String requestedController = entityAndMethod[0] + "Controller";
        // getBean() throws for an unknown bean name instead of returning null,
        // so test for the bean's presence before fetching it
        if (!getApplicationContext().containsBean(requestedController)) {
            request.setAttribute(ActiveController.ATTR_ERROR_MESSAGE,
                "The ActiveController instance " + requestedController + 
                " is not configured in the ApplicationContext");
            return null;
        }
        Object handler = getApplicationContext().getBean(requestedController);
        if (!(handler instanceof ActiveController)) {
            request.setAttribute(ActiveController.ATTR_ERROR_MESSAGE,
                "The bean " + requestedController + " is not an ActiveController");
            return null;
        } else {
            Method requestedMethod = MethodUtils.getAccessibleMethod(
                handler.getClass(), entityAndMethod[1],
                new Class[] {HttpServletRequest.class, HttpServletResponse.class});
            if (requestedMethod == null) {
                request.setAttribute(ActiveController.ATTR_ERROR_MESSAGE,
                    "The method " + entityAndMethod[1] + 
                    "(HttpServletRequest, HttpServletResponse):ModelAndView is not defined in " + 
                    requestedController);
                return null;
            } else {
                String returnTypeClassName = requestedMethod.getReturnType().getName();
                if (!(ModelAndView.class.getName().equals(returnTypeClassName))) {
                    request.setAttribute(ActiveController.ATTR_ERROR_MESSAGE,
                        "The method " + entityAndMethod[1] + " has incorrect return type " + 
                        returnTypeClassName + 
                        ", should be org.springframework.web.servlet.ModelAndView, " +
                        "check your code");
                    return null;
                }
            }
        }
        request.setAttribute(ActiveController.ATTR_VIEW_NAME, entityAndMethod[0] + 
            "/" + entityAndMethod[1]);
        request.setAttribute(ActiveController.ATTR_METHOD_NAME, entityAndMethod[1]);
        return handler;
    }
     
    /**
     * Returns the entity and method names from the URL.
     * @param urlPath
     * @return
     */
    private String[] parseEntityAndMethodNamesFromUrl(String urlPath) {
        String[] parts = urlPath.split("[\\/|\\&]");
        if (parts.length < 4) {
            return new String[] {null, null};
        }
        String[] entityAndMethodNames = new String[2];
        entityAndMethodNames[0] = parts[2];
        if (parts[3].indexOf(".") > -1) { // remove any trailing suffix
            entityAndMethodNames[1] = parts[3].split("\\.")[0];
        } else {
            entityAndMethodNames[1] = parts[3];
        }
        return entityAndMethodNames;
    }
}
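To see what the parsing step produces, here is a small standalone trace of the same split logic, with the Spring and servlet dependencies removed. The class name UrlParseDemo and the sample URLs are made up for illustration; only the body of parse() mirrors parseEntityAndMethodNamesFromUrl() above.

```java
import java.util.Arrays;

/** Standalone trace of the URL parsing used by the handler mapping above. */
class UrlParseDemo {

    /** Same logic as parseEntityAndMethodNamesFromUrl, made static for the demo. */
    static String[] parse(String urlPath) {
        // "/app/person/list.do" splits into ["", "app", "person", "list.do"]
        String[] parts = urlPath.split("[\\/|\\&]");
        if (parts.length < 4) {
            return new String[] {null, null};
        }
        String[] entityAndMethodNames = new String[2];
        entityAndMethodNames[0] = parts[2];
        if (parts[3].indexOf(".") > -1) { // remove any trailing suffix
            entityAndMethodNames[1] = parts[3].split("\\.")[0];
        } else {
            entityAndMethodNames[1] = parts[3];
        }
        return entityAndMethodNames;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("/app/person/list.do"))); // [person, list]
        System.out.println(Arrays.toString(parse("/app/person")));         // [null, null]
    }
}
```

The leading empty string in the split result is why the entity name sits at index 2 and the method name at index 3, and why anything shorter than four parts is treated as malformed.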

The MethodNameResolver: ActiveControllerMethodNameResolver

The ActiveControllerMethodNameResolver simply retrieves the value of the request attribute that was set by the HandlerMapping bean when it parsed and validated the incoming URL. If there was an error parsing the URL, the Spring framework will pass it off to the defaultHandler and the method attribute would be null in the incoming request. So all this bean does is to check if the method attribute is null, and if so, set it to the string "info".

Here is how it is configured:

  <bean id="methodNameResolver" class="cnwk.prozac.utils.controllers.ActiveControllerMethodNameResolver" />

and here is the code for the ActiveControllerMethodNameResolver

import javax.servlet.http.HttpServletRequest;
 
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.web.servlet.mvc.multiaction.AbstractUrlMethodNameResolver;
import org.springframework.web.servlet.mvc.multiaction.MethodNameResolver;
import org.springframework.web.servlet.mvc.multiaction.NoSuchRequestHandlingMethodException;
 
/**
 * Resolves the method name for the specified controller. If the method does
 * not exist, then it returns an error.
 */
public class ActiveControllerMethodNameResolver implements MethodNameResolver {
     
    private static Log log = LogFactory.getLog(ActiveControllerMethodNameResolver.class);
     
    public ActiveControllerMethodNameResolver() {
        super();
    }
 
    /**
     * Returns the method name that will be executed.
     * @see org.springframework.web.servlet.mvc.multiaction.MethodNameResolver#getHandlerMethodName(javax.servlet.http.HttpServletRequest)
     */
    public String getHandlerMethodName(HttpServletRequest request) 
            throws NoSuchRequestHandlingMethodException {
        // check to see if it's already populated by ActiveControllerUrlHandlerMapping
        String methodName = (String) request.getAttribute(ActiveController.ATTR_METHOD_NAME);
        if (methodName == null) {
            // nothing was resolved upstream, so fall back to the default info() method
            return "info";
        }
        return methodName;
    }
}

The ViewResolver: InternalResourceViewResolver

For the ViewResolver, we just use one of the standard view resolvers provided with Spring. The HandlerMapping already populates the view name as ${entityName}/${methodName}, and the ViewResolver is configured with a prefix and suffix as shown below:

  <bean id="viewResolver" class="org.springframework.web.servlet.view.InternalResourceViewResolver">
    <property name="prefix" value="/" />
    <property name="suffix" value=".jsp" />
    <property name="viewClass" value="org.springframework.web.servlet.view.JstlView" />
  </bean>

So in this case, a view name of the form "person/list" would be resolved by the JSP file person/list.jsp under the web application's docroot.
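The resolution itself is just string concatenation of prefix, view name and suffix. The sketch below only illustrates that idea for the configuration above; the class ViewPathDemo and its methods are hypothetical and are not Spring's implementation.

```java
/** Simplified illustration of prefix + view name + suffix resolution. */
class ViewPathDemo {

    /** Builds the view name the way the HandlerMapping does: entity/method. */
    static String viewName(String entityName, String methodName) {
        return entityName + "/" + methodName;
    }

    /** Wraps the view name with the resolver's configured prefix and suffix. */
    static String resolve(String prefix, String viewName, String suffix) {
        return prefix + viewName + suffix;
    }

    public static void main(String[] args) {
        String view = viewName("person", "list");
        System.out.println(resolve("/", view, ".jsp")); // prints /person/list.jsp
    }
}
```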

The Controller superclass: ActiveController

Finally, since all our Controllers need to be MultiActionControllers to avail of these RoR style URL mappings, we enforce this by requiring that all our Controllers subclass the ActiveController class. If they do not, the HandlerMapping would refuse to resolve the URLs. The ActiveController does provide one useful method, getDefaultViewName(), which pulls the view name attribute off the request. Here is the configuration.

  <bean id="activeController" class="cnwk.prozac.utils.controllers.ActiveController">
    <property name="methodNameResolver" ref="methodNameResolver" />
  </bean>

Notice that it needs a reference to the methodNameResolver, which is our ActiveControllerMethodNameResolver. The code is here:

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
 
import org.apache.log4j.Logger;
import org.springframework.context.ApplicationContextException;
import org.springframework.web.servlet.ModelAndView;
import org.springframework.web.servlet.mvc.multiaction.MultiActionController;
 
/**
 * Base class for our RoR URL handling controllers.
 */
public class ActiveController extends MultiActionController {
 
    private final static Logger log = Logger.getLogger(ActiveController.class);
     
    public static final String ATTR_REQUEST_OBJECT = "_request";
    public static final String ATTR_ERROR_MESSAGE = "_error";
    public static final String ATTR_METHOD_NAME = "_methodName";
    public static final String ATTR_VIEW_NAME = "_viewName";
    public static final String DEFAULT_METHOD_NAME = "info";
     
    /**
     * Default ctor.
     * @throws ApplicationContextException if one is thrown.
     */
    public ActiveController() throws ApplicationContextException {
        super();
    }
 
    /**
     * Alternate ctor.
     * @param delegate the Object to delegate to.
     * @throws ApplicationContextException if one is thrown.
     */
    public ActiveController(Object delegate) throws ApplicationContextException {
        super(delegate);
    }
 
    /**
     * The default info() method which is called whenever a method cannot be
     * resolved. The info() method contains information about the URL sent, any
     * error messages, and (optionally) other information useful for debugging.
     * @param request a HttpServletRequest object.
     * @param response a HttpServletResponse object.
     * @return a ModelAndView
     * @throws Exception if one is thrown during processing.
     */
    public ModelAndView info(HttpServletRequest request, HttpServletResponse response) 
            throws Exception {
        ModelAndView mav = new ModelAndView();
        mav.addObject(ActiveController.ATTR_REQUEST_OBJECT, request);
        mav.setViewName(getDefaultViewName(request));
        return mav;
    }
     
    /**
     * Returns the method name from the request. The method name is populated
     * by the ActiveControllerUrlHandlerMapping. The method name is the view
     * name in our "convention over configuration" world. The exact mechanics
     * of view resolution (eg whether it goes to a JSP or a Tile) is controlled
     * by the ViewResolver class injected into the DispatcherServlet.
     * @param request the HttpServletRequest object.
     * @return the view name to forward to.
     */
    protected String getDefaultViewName(HttpServletRequest request) {
        return (String) request.getAttribute(ActiveController.ATTR_VIEW_NAME);
    }
}

Usage

Once the HandlerMapping, MethodNameResolver and the ViewResolver are configured, and these classes added to the system classpath, Controller classes will need to extend the ActiveController, and provide methods with the following signature:

public ModelAndView ${methodName}(HttpServletRequest request, 
  HttpServletResponse response) throws Exception;

and once the Controller class itself is configured as per the convention ${entityName}Controller, these methods would be automatically accessible to the web application at the URL /${webapp.name}/${entityName}/${methodName}.do.

In the course of doing some googling, I also found that the author of the blog MemeStorm has written an article here - Convention over Configuration in Spring MVC, which describes an approach very similar to mine. The blog also contains many more articles on Spring MVC, so it's a good place to look for more insights about Spring.

Saturday, June 03, 2006

Ravin' about Maven 2

I was starting a new (personal) project, so I thought it may be a good idea to set it up using Maven2. Up until now, I have been a happy user of Ant. The first I heard about Maven was from Michael Neale, one of the core developers of the JBoss Rules (formerly Drools) project, who liked it because he did not have to manually write build.xml files anymore. More recently, my ex-colleague Debasish Ghosh, author of the blog "Ruminations of a Programmer" and someone whose opinions I hold in high regard, blogged about how Maven helps with Project Geometry.

Apart from this, Maven (and Maven2) has many cool features. Maven2 is also a very large, complex and poorly documented piece of software, which therefore has a very steep (almost vertical) learning curve. It also requires you to be connected to the Internet most of the time. I realize that given the abundance of articles about Maven in the Java press lately, I may sound like another Java dev who has seen the light or drunk the Maven Kool-Aid (depending on your point of view), but until the day before yesterday, when I finally got everything working, I was cursing about these deficiencies myself.

In a nutshell, the advantages I see with Maven2 are as follows. Some of them can be addressed by other tools, which do not require you to completely restructure your build system as Maven does. But the nice thing about Maven2 is that these are all supplied by a single package, so if you spend the time to understand how Maven2 works, you should be able to leverage its built in support for third party build tools to produce better quality software.

  • Standardized Directory Structure
  • Simplified Build Process
  • No bundling of third-party JAR files
  • Transitive Dependency handling
  • Built in reporting
  • Support for creating project website
  • Support for APT (Almost Plain Text) format for user documentation
  • Automatic generation of IDE artefacts

Standardized Directory Structure

Maven2 comes with a set of built in project archetypes (or templates), which auto-generate different types of project directory structures for you. Because each of these directory structures is standard, Maven "knows" where to find the .java files when it has to run a "compile" goal (similar to Ant's targets). Unfortunately, the standard directory structure for web applications (with the Java sources embedded within the application itself) is not one of them. Maven2 has the concept of one artefact for one application, so the kind of web application I am used to would produce a JAR file for the Java stuff and a WAR file for the webapp itself. To set up this kind of application, we need to set up a multi-module project under Maven2, with one standalone Java project for the Java code, and a simple web application project for the JSP and HTML pages, with the standalone Java project as a dependency. This turned out to be a mixed blessing, since IDEs such as Eclipse are better at running Java compiles than parsing and validating XML pages, so having them in separate projects actually speeds up clean Java compiles.

Simplified build process

Once the standard directory structure is set up, you need to build a POM (Project Object Model) with a pom.xml file. The pom.xml file will tell Maven about the project, such as the type of project, the library dependencies, the compiler level, etc, based on which Maven2 will now "know" how to run various standard goals. The Maven2 team calls this style declarative, in contrast to Ant's imperative model, where you need to tell Ant how to do a compile, test and so on. To do non-standard things, there are ways to enhance Maven's build phases with enhanced goals, which seems fairly simple to do, but requires more understanding than I have of the Maven POM. The only benefit I see is that I don't have to create a build.xml file for each project, and that project targets are standardized across applications - not a major benefit, since I can always look up the available targets using "ant -projecthelp" and I usually copy and paste content from build.xml files from other projects when I create a new one anyway.

No bundling of third-party JARs

I do web application development with Spring and Hibernate, which come with a large number of JAR files in their standard distributions. Additionally, I may also use other miscellaneous JAR files, such as log4j and the jakarta-commons libraries, so each project contains quite a lot of libraries. Maven2 downloads these JAR files, based on the contents of your POM, to a central local repository (under $HOME/.m2/repository), so it can be shared across multiple projects. However, there are other solutions, such as Savant, which does the same thing, without requiring you to give up your Ant build files.
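As a rough illustration of why the JAR files end up nested so deeply in the shared repository, here is a sketch of how a dependency's coordinates map to a path under $HOME/.m2/repository in the Maven2 layout. The class LocalRepoPath is hypothetical, and the coordinates in main() are sample values chosen for the example rather than ones taken from my project.

```java
/** Sketch of the Maven2 repository layout: group/artifact/version/artifact-version.jar. */
class LocalRepoPath {

    /** Returns the JAR's path relative to the repository root. */
    static String jarPath(String groupId, String artifactId, String version) {
        // dots in the groupId become directory separators
        return groupId.replace('.', '/') + "/" + artifactId + "/" + version
            + "/" + artifactId + "-" + version + ".jar";
    }

    public static void main(String[] args) {
        // sample coordinates, for illustration only
        System.out.println(jarPath("org.springframework", "spring", "1.2.8"));
        // prints org/springframework/spring/1.2.8/spring-1.2.8.jar
    }
}
```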

Transitive Dependency handling

This simply means that if I need the Spring JAR file in my webapp, I specify Spring as my dependency, and not, for example, the AOP Alliance JAR file which Spring depends on. Maven2 will consult the Spring POM in the remote repository, figure out the dependencies that Spring needs, and download them too. This was not supported in Maven 1.x, which was one of my motivations for starting with Maven2 rather than Maven 1.x, even though Maven 1.x is better documented. In my opinion, this is a major benefit, and the old way of copying everything that I needed manually is like doing software installs on Linux using RPM rather than RedHat's Yum or Debian's APT. But this feature is not unique to Maven. I know that at least Savant supports this feature.
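The idea can be sketched as a walk over a dependency graph: start from what the project declares, then keep following each dependency's own declarations until nothing new turns up. This is only an illustration of the concept, not Maven's actual resolution algorithm (which also deals with versions, scopes and conflicts). The class TransitiveDeps and the tiny dependency map in main() are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Conceptual sketch of transitive dependency resolution. */
class TransitiveDeps {

    /** Returns the direct and indirect dependencies of root, in discovery order. */
    static Set<String> resolve(String root, Map<String, List<String>> declared) {
        Set<String> resolved = new LinkedHashSet<>();
        Deque<String> pending =
            new ArrayDeque<>(declared.getOrDefault(root, Collections.emptyList()));
        while (!pending.isEmpty()) {
            String next = pending.removeFirst();
            // add() returns false for an already-seen artifact, which also guards against cycles
            if (resolved.add(next)) {
                pending.addAll(declared.getOrDefault(next, Collections.emptyList()));
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        // illustrative declarations only: the webapp names Spring, Spring names its own needs
        Map<String, List<String>> declared = new HashMap<>();
        declared.put("my-webapp", Arrays.asList("spring"));
        declared.put("spring", Arrays.asList("aopalliance", "commons-logging"));
        System.out.println(resolve("my-webapp", declared));
        // prints [spring, aopalliance, commons-logging]
    }
}
```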

Built in reporting

Maven2 has an open plugin architecture, which allows different tool vendors to write plugins which Maven can download and use. Maven supports a large number of standard reports that get generated when you generate your project website. These reports can provide quite a lot of information about your project that can be useful to developers, such as the test coverage report. I have not run any of the reports myself yet, but I liked what I saw in the documentation.

Support for creating project website

Building a standard project website is just a matter of running the mvn site command. It creates a functional and standard site that can be used to store various project artifacts and provide information about the project to the end users. Beats having to build a project website manually.

Support for APT format for user documentation

This is another big thing that attracted me to use Maven2. It supports a Wiki-like formatting syntax called APT (Almost Plain Text) that can be formatted to HTML automatically. This can be very useful for creating user documentation. In the past, I have used MS-Word documents, HTML and Docbook XML documents (both coded by hand) for writing documentation. Lately, I have been using the corporate wiki to write system and user documentation, and the time savings are quite awesome. APT is similar, except that you will not need the corporate wiki infrastructure to host your documentation, it is part of your project website itself. Another feature I have not used yet, but one that I will definitely use.

Automatic generation of IDE artefacts

And of course, the nicest thing for Java IDE users, who are probably 90% of Java programmers today: Maven has goals for generating IDE artifacts for Eclipse, IDEA, NetBeans and even EMACS JDEE. This is pretty much a requirement if you are working with a Maven repository, since the JAR files are nested too deep in the repository directory structure, and selecting each of them to set up your IDE build path is going to be difficult and time consuming. However, with the IDE artifact generation goals, you can create the artifacts with a single command, and you will have a fully configured project when you step into your IDE to work. I have tried this with Eclipse, and it works great.

However, there are things that I did not like about Maven2. Here is a list.

  • Need Continuous Internet Connection
  • All or Nothing approach
  • Steep Learning Curve and Scarce Documentation
  • New features of Maven2 not available to Maven 1.x

Need Continuous Internet Connection

A colleague notes - "The Network is NOT the Computer, it's the Network", but it's amazing how many modern software programs seem to assume that the user is always going to be on the Internet. Maven2 is one such. When you download Maven2, you get only a small bootstrap component, which allows it to start itself and to download whatever plugin is required for the goal you requested. Once a plugin is downloaded, you can use it when you are disconnected. But during the initial setup, you have to stay connected to the Internet. This was a problem for me, since this type of work (picking up new frameworks and build tools and learning them) usually happens on my laptop during my commute, when I do not have an Internet connection. I suspect a large number of Maven2 users are also not always connected to the Internet. A separate download or Maven2 goal that allows all available or most commonly used plugins to be downloaded in one go would be very helpful.

All or Nothing Approach

Maven2 is a huge and very featureful piece of software, but it is not possible to phase it into an organization by only using specific modules. One popular module, in my opinion, would be the APT to HTML translator. Believe it or not, switching build tools which are used by all developers in a corporate environment, especially when you transition from something as intuitive as Ant to Maven, is no trivial task, and is unlikely to happen, except perhaps within small teams which are free to set their own standards, and open-source projects.

Steep Learning Curve and Scarce Documentation

Maven2 is a totally new way of doing your builds. It makes easy things trivial, and hard things easy, but only if you know how - and it's quite a challenge to figure out how the first time. Some things are quite non-intuitive. For example, to make Maven2 work with Java 1.5, I had to add the following snippet to my POM:

  <build>
    <plugins>
      <!-- Compile with Java 1.5 source and target levels -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.5</source>
          <target>1.5</target>
        </configuration>
      </plugin>
    </plugins>
  </build>

Don't know about you, but this is not something I would have thought up myself. There are other little things too, which are hard to figure out the first time, but you will totally get them once you know how.

Like other Apache projects, finding documentation for solutions to problems on the Maven2 website is difficult, but the articles are good and quite detailed. As far as books go, I did not find "Maven - A Developer's Notebook" by Vincent Massol very useful, since it deals with Maven 1.x, and Maven2 is quite different. A better choice is Mergere's free e-book "Better Builds with Maven" by Vincent Massol and Jason Van Zyl, which works great as a tutorial and a general background reference on Maven2. Download it, you will need it. You will still have to look up discussion forums on the Internet, but at least it's a start.

No backporting of new features

This is probably just a temporary issue, but there is a general feeling among the Maven user community that Maven2 is unfinished. Yet, Maven2 offers many useful features, such as Transitive Dependency handling, which are never going to be backported to Maven 1.x. Which means that users who want to use these features have to stay on the bleeding edge and face the possibility of builds being broken due to bugs in Maven2 code. At the moment this is annoying, but is likely to go away as Maven2 gets more mature.

So, to set up Maven, get yourself hooked up to the Internet and stay on it until you are done with all the goals you are likely to execute. Resist the temptation to do local installs from JARs on your hard disk if you cannot find them on ibiblio or other mirrors. Maven2 needs both the JAR files and the POM file for that third-party project to resolve dependencies. And enjoy yourself coding while Maven2 takes care of your build once you are past the learning curve.