Saturday, July 29, 2006

AJAX Component with DWR and Velocity

I have been meaning to give the DWR (Direct Web Remoting) AJAX toolkit a shot for some time now. I consider myself a first-generation AJAX programmer (as someone who has used XMLHttpRequest directly), but since I don't know any of the newer AJAX toolkits such as Prototype and Dojo, I just don't get no respect from the hotshot AJAX types (that was a joke, BTW). Apart from that, considering that any session remotely related to AJAX played to overflowing crowds at this year's JavaOne, this was something I should have looked at quite some time ago. But since I don't do much front-end development, it was not something I needed to know, so I let it slide. I finally got around to looking at DWR, and in this article I describe an AJAX component built with DWR and Velocity that can be served up within a portal-style page.

The component provides CRUD (Create, Retrieve, Update and Delete) functionality for a business object. The component is modelled as a state machine. The state diagram is shown below, where the nodes are views provided by the component, and the edges are the operations that are permitted on it.

DWR works by creating Javascript proxies for Java beans that are available to the servlet context. Methods can be called on the proxies in Javascript just as if they were regular Java objects. Each Javascript method call needs to provide an additional callback method parameter, which defines what to do with the result once it is available from the backend. This is because the calls are asynchronous (the A in AJAX) and the Javascript method call does not wait for the backend to respond. The callback method typically parses the return value of the method call and pops it into a span tag in the page.

In the example, I have used a BookReview bean (since I had some test data from one of my previous projects) but this strategy can be extended to allow any object to be exposed. Also, in my example, the service that provides data to the state machine is a local JDBC service, but could just as well have been a client talking to a remote webservice to get data.

I used Spring for the MVC framework, integrating DWR with it based on the instructions in Bram Smeets's weblog and the DWR-Spring integration page on the DWR site. Unlike Bram Smeets's example, however, where Javascript pulls apart the bean returned from the backend service and populates the span tag, I went with the approach of using Velocity templates on the server to create HTML snippets and return the HTML, which the callback function then pops into the span tag. You would probably guess that I am no hotshot Javascript coder (and you would be right), but there is more to this choice than just avoiding Javascript. The reasons are, in no particular order:

  • Avoid having to write any more Javascript than absolutely necessary.
  • It is easier to unit test at the Java layer than the Javascript layer.
  • Java has had better tool support than Javascript (although that's changing).
  • Ability to cache Velocity templates on the server for performance.

Why Velocity? Well, since we are bypassing the standard request-response cycle using DWR, I could not use JSPs, since JSPs need to have a pageContext populated by the controller at the end of the request-response cycle. The other option was to have generated the HTML directly in the service classes using System.out.println() calls, but that would have taken us back to the dark ages of web programming. Velocity templates provide a clean separation of the view from the model without forcing us to participate in the HTTP request-response cycle. In the case of the webservice setup, the templates can live on the front end application, and the component look and feel can be tweaked without any changes to the webservice client. Even in the case of the basic setup, having templates is more maintainable, since we can change the presentation without affecting the underlying service layer.

Configuration

The configuration is based on information in Bram Smeets's weblog and the DWR-Spring integration pages, so there is nothing new here; I am including it for completeness. Listed below are the contents of web.xml, dwr.xml and the Spring comp-servlet.xml.

<!DOCTYPE web-app PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN" "http://java.sun.com/dtd/web-app_2_3.dtd" >
<!-- WEB-INF/web.xml -->

<web-app>
  <display-name>DWR/Velocity Component Test</display-name>

  <listener>
    <listener-class>org.springframework.web.context.ContextLoaderListener</listener-class>
  </listener>

  <servlet>
    <servlet-name>comp</servlet-name>
    <servlet-class>org.springframework.web.servlet.DispatcherServlet</servlet-class>
    <load-on-startup>1</load-on-startup>
  </servlet>

  <servlet>
    <servlet-name>dwr-invoker</servlet-name>
    <servlet-class>uk.ltd.getahead.dwr.DWRServlet</servlet-class>
    <init-param>
      <param-name>debug</param-name>
      <param-value>true</param-value>
    </init-param>
  </servlet>

  <servlet-mapping>
    <servlet-name>comp</servlet-name>
    <url-pattern>*.do</url-pattern>
  </servlet-mapping>

  <servlet-mapping>
    <servlet-name>dwr-invoker</servlet-name>
    <url-pattern>/dwr/*</url-pattern>
  </servlet-mapping>

</web-app>
<!DOCTYPE dwr PUBLIC "-//GetAhead Limited//DTD Direct Web Remoting 1.0//EN" "http://www.getahead.ltd.uk/dwr/dwr10.dtd">
<!-- WEB-INF/dwr.xml -->
<dwr>
  <allow>
    <create creator="new" javascript="JDate">
      <param name="class" value="java.util.Date" />
    </create>
    <create creator="spring" javascript="BookReviewService">
      <param name="beanName" value="bookReviewService" />
      <param name="location" value="classpath:comp-servlet.xml" />
    </create>
  </allow>
</dwr>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd" >
<!-- WEB-INF/comp-servlet.xml -->
<beans>

  <bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
    <property name="driverClassName" value="com.mysql.jdbc.Driver" />
    <property name="url" value="jdbc:mysql://localhost:3306/bookshelfdb" />
    <property name="username" value="root" />
    <property name="password" value="mysql" />
  </bean>

  <bean id="bookReviewService" class="org.component.services.BookReviewService">
    <property name="dataSource" ref="dataSource" />
  </bean>

  <bean id="bookReviewController" class="org.component.controllers.BookReviewController">
    <property name="service" ref="bookReviewService" />
  </bean>

  <bean id="simpleUrlHandlerMapping" class="org.springframework.web.servlet.handler.SimpleUrlHandlerMapping">
    <property name="mappings">
      <props>
        <prop key="main.do">bookReviewController</prop>
      </props>
    </property>
  </bean>

</beans>

The Service

The code for the BookReviewService class is shown below. It consists of a set of public methods that the Javascript proxy can call, all of which return a String. The mergeContent() method takes a bean and a template name and renders the bean into the template. The BookReview bean is a simple JavaBean holder of properties, and the BookReviewCollection is a wrapper over a List<BookReview> which also contains the current page number and the total number of pages that can be displayed. In the interests of keeping this blog post to a manageable size, neither of these beans are shown, but they are trivial to implement.
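
Since the two beans are not listed in this post, here is a minimal sketch of what they might look like. The property names are inferred from the setters the service calls and from the properties the templates read; everything else is an assumption. The collection holds a BookReview[] because that is what the service passes to setReviews(). In the real project each would be a public class in its own file under org.component.beans; they are package-private here only so the sketch fits in one file.

```java
/** Sketch of the BookReview bean: a plain holder for the four properties the service sets. */
class BookReview {
    private Long id;
    private String bookTitle;
    private String reviewer;
    private String reviewText;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }

    public String getBookTitle() { return bookTitle; }
    public void setBookTitle(String bookTitle) { this.bookTitle = bookTitle; }

    public String getReviewer() { return reviewer; }
    public void setReviewer(String reviewer) { this.reviewer = reviewer; }

    public String getReviewText() { return reviewText; }
    public void setReviewText(String reviewText) { this.reviewText = reviewText; }
}

/** Sketch of the collection bean: one page of reviews plus the paging state. */
class BookReviewCollection {
    private int currentPage;                          // zero-based page being shown
    private int lastPage;                             // page count, as computed by the service
    private BookReview[] reviews = new BookReview[0]; // never null, so #foreach is safe

    public int getCurrentPage() { return currentPage; }
    public void setCurrentPage(int currentPage) { this.currentPage = currentPage; }

    public int getLastPage() { return lastPage; }
    public void setLastPage(int lastPage) { this.lastPage = lastPage; }

    public BookReview[] getReviews() { return reviews; }
    public void setReviews(BookReview[] reviews) { this.reviews = reviews; }
}
```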

// BookReviewService.java
package org.component.services;

import java.io.StringWriter;
import java.util.List;
import java.util.Map;

import javax.sql.DataSource;

import org.apache.commons.lang.StringUtils;
import org.apache.log4j.Logger;
import org.apache.velocity.Template;
import org.apache.velocity.VelocityContext;
import org.apache.velocity.app.Velocity;
import org.component.beans.BookReview;
import org.component.beans.BookReviewCollection;
import org.springframework.jdbc.core.ColumnMapRowMapper;
import org.springframework.jdbc.core.JdbcTemplate;

public class BookReviewService {

    public static final String DEFAULT_ORDER_BY = "name";

    private static final Logger log = Logger.getLogger(BookReviewService.class);
    private static final int NUM_ROWS_PER_PAGE = 5;
    private static final String TEMPLATE_DIR = "src/main/resources/templates";

    private DataSource dataSource;

    public BookReviewService() {
        super();
    }

    public void setDataSource(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public String getAllReviews(int page, String orderBy, boolean isOrderAscending) {
        log.debug("getAllReviews(page=" + page + ", orderBy=" + orderBy + ", isOrderAscending=" + isOrderAscending + ")");
        return getAllReviewsAndMergeToTemplate(page, orderBy, isOrderAscending, "all_reviews");
    }

    public String getAllReviewsAndMergeToTemplate(int page, String orderBy, boolean isOrderAscending, String templateFile) {
        log.debug("getAllReviewsAndMergeToTemplate(page=" + page + ", orderBy=" + orderBy + ", isOrderAscending=" + isOrderAscending + ", templateFile=" + templateFile + ")");
        JdbcTemplate jt = new JdbcTemplate(dataSource);
        String limitStr = String.valueOf(page * NUM_ROWS_PER_PAGE) + "," + String.valueOf(NUM_ROWS_PER_PAGE);
        if (orderBy == null) {
            orderBy = "name";
        }
        List list = jt.queryForList(
            "select id, name, author, review from books order by " +
            (StringUtils.isEmpty(orderBy) ? DEFAULT_ORDER_BY : orderBy) +
            (isOrderAscending ? " ASC" : " DESC") +
            " limit " + limitStr, new Object[0]);
        BookReview[] reviews = new BookReview[list.size()];
        for (int i = 0; i < reviews.length; i++) {
            Map row = (Map) list.get(i);
            reviews[i] = new BookReview();
            reviews[i].setId((Long) row.get("id"));
            reviews[i].setBookTitle((String) row.get("name"));
            reviews[i].setReviewer((String) row.get("author"));
            reviews[i].setReviewText((String) row.get("review"));
        }
        int numReviews = jt.queryForInt("select count(*) from books");
        int lastPage = (int) Math.ceil((double) numReviews / NUM_ROWS_PER_PAGE);
        BookReviewCollection collection = new BookReviewCollection();
        collection.setCurrentPage(page);
        collection.setLastPage(lastPage);
        collection.setReviews(reviews);
        return mergeContent(collection, templateFile);
    }

    public String getReview(int id) {
        log.debug("getReview(id=" + id + ")");
        BookReview review = getBookReview(id);
        return mergeContent(review, "single_review");
    }

    public String addOrEditReviewForm(int id, String bookTitle, String reviewer, String text) {
        log.debug("addOrEditReviewForm(id=" + id + ", bookTitle=" + bookTitle + ", reviewer=" + reviewer + ", text=" + text + ")");
        BookReview review = new BookReview();
        review.setId((long) id);
        review.setBookTitle(bookTitle);
        review.setReviewer(reviewer);
        review.setReviewText(text);
        return mergeContent(review, "add_edit_review");
    }

    public String previewReview(int id, String bookTitle, String reviewer, String text) {
        log.debug("previewReview(id=" + id + ", bookTitle=" + bookTitle + ", reviewer=" + reviewer + ", text=" + text + ")");
        BookReview review = new BookReview();
        review.setId((long) id);
        review.setBookTitle(bookTitle);
        review.setReviewer(reviewer);
        review.setReviewText(text);
        return mergeContent(review, "preview_review");
    }

    public String saveReview(int id, String bookTitle, String reviewer, String text) {
        log.debug("saveReview(id=" + id + ", bookTitle=" + bookTitle + ", reviewer=" + reviewer + ", text=" + text + ")");
        JdbcTemplate jt = new JdbcTemplate(dataSource);
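        if (id == 0) {
            // id 0 marks a new review (assuming books.id is an AUTO_INCREMENT column)
            jt.update("insert into books (id, name, author, review) values (0, ?, ?, ?)",
                new Object[] {bookTitle, reviewer, text});
        } else {
            // existing review: persist the edited fields
            jt.update("update books set name=?, author=?, review=? where id=?",
                new Object[] {bookTitle, reviewer, text, new Long(id)});
        }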
        return getAllReviews(0, DEFAULT_ORDER_BY, true);
    }

    public String deleteReview(int id) {
        log.debug("deleteReview(id=" + id + ")");
        JdbcTemplate jt = new JdbcTemplate(dataSource);
        if (id != 0) {
            jt.update("delete from books where id=?", new Object[] {new Long(id)});
        }
        return getAllReviews(0, DEFAULT_ORDER_BY, true);
    }

    // for package access by test class
    protected BookReview getBookReview(int id) {
        BookReview review = new BookReview();
        if (id != 0) {
            JdbcTemplate jt = new JdbcTemplate(dataSource);
            Map row = (Map) jt.queryForObject("select id, name, author, review from books where id=?", new Object[] {id}, new ColumnMapRowMapper());
            review.setId((Long) row.get("id"));
            review.setBookTitle((String) row.get("name"));
            review.setReviewer((String) row.get("author"));
            review.setReviewText((String) row.get("review"));
        } else {
            review.setId(0L);
        }
        return review;
    }

    private String mergeContent(Object bean, String templateFile) {
        try {
            Velocity.init();
            VelocityContext vc = new VelocityContext();
            vc.put("bean", bean);
            Template t = Velocity.getTemplate(TEMPLATE_DIR + "/" + templateFile + ".vm");
            StringWriter writer = new StringWriter();
            t.merge(vc, writer);
            writer.flush();
            writer.close();
            return writer.getBuffer().toString();
        } catch (Exception e) {
            log.error("Error merging content", e);
            return "";
        }
    }
}
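
One thing to watch in the service above: orderBy arrives from the browser and is concatenated straight into the SQL string, so it is worth restricting it to known column names. Below is a minimal sketch of how that, together with the paging arithmetic, could be pulled into a helper. ReviewQuery and its method names are hypothetical and not part of the original code; the column names come from the queries above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper, not part of the original BookReviewService. */
class ReviewQuery {
    static final int ROWS_PER_PAGE = 5;
    static final String DEFAULT_ORDER_BY = "name";

    // Columns a caller may sort by; anything else falls back to the default.
    private static final Set<String> SORTABLE =
        new HashSet<String>(Arrays.asList("id", "name", "author"));

    /** Builds the "order by" clause, ignoring any column that is not whitelisted. */
    static String orderByClause(String orderBy, boolean ascending) {
        String column = (orderBy != null && SORTABLE.contains(orderBy))
            ? orderBy : DEFAULT_ORDER_BY;
        return " order by " + column + (ascending ? " ASC" : " DESC");
    }

    /** Builds a MySQL "limit offset,count" clause for a zero-based page number. */
    static String limitClause(int page) {
        int safePage = Math.max(page, 0);
        return " limit " + (safePage * ROWS_PER_PAGE) + "," + ROWS_PER_PAGE;
    }

    /** Number of pages needed to show numRows rows (integer ceiling division). */
    static int pageCount(int numRows) {
        return (numRows + ROWS_PER_PAGE - 1) / ROWS_PER_PAGE;
    }
}
```

With these, the select in getAllReviewsAndMergeToTemplate() could be assembled as the fixed column list plus orderByClause(...) plus limitClause(page), and the value passed to setLastPage() would be pageCount(numReviews).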

The Velocity Templates

The Velocity templates have a 1:1 correspondence with the nodes in our state diagram above. The main.vm template represents the component as it will first appear when the containing page is invoked. Notice the span tag named "component". This is where all the subsequent content pulled from method calls on the BookReviewService bean will be put.

## main.vm
<!--
  test page: http://localhost:8080/smart-component/dwr/index.html
  this page: http://localhost:8080/smart-component/test.html
-->
<html>
  <head>
    <title>BookReviews</title>
    <script type="text/javascript" src="/smart-component/dwr/interface/BookReviewService.js"></script>
    <script type="text/javascript" src="/smart-component/dwr/engine.js"></script>
    <script type="text/javascript" src="/smart-component/dwr/util.js"></script>
  </head>
  <body>
    <script type="text/javascript">
    var callback = function(contents) {
        document.getElementById('component').innerHTML = contents;
    }
    </script>
    <span id="component">
      <table cellspacing="2" cellpadding="2" border="1">
        <tr>
          <td><b>Book Title</b></td>
          <td><b>Reviewer</b></td>
          <td><b>Review</b></td>
          <td><b>Edit</b></td>
          <td><b>Delete</b></td>
        </tr>
#foreach ($review in ${bean.reviews})
        <tr>
          <td>${review.bookTitle}</td>
          <td>${review.reviewer}</td>
          <td>${review.reviewText}</td>
          <td><input type="button" name="edit" value="Edit" onClick="BookReviewService.addOrEditReviewForm('${review.id}', '${review.bookTitle}', '${review.reviewer}', '${review.reviewText}', callback);" /></td>
          <td><input type="button" name="delete" value="Delete" onClick="BookReviewService.deleteReview('${review.id}', callback);" /></td>
        </tr>
#end
      </table>
      <input type="button" name="add" value="Add Review" onClick="BookReviewService.addOrEditReviewForm('0', '', '', '', callback);" />
      &nbsp;|
## lastPage holds the page count, so the last valid zero-based page is lastPage - 1
#if (${bean.currentPage} > 0)
#set ($prevPage = ${bean.currentPage} - 1)
      &nbsp;
      <input type="button" name="prevPage" value="Previous Page" onClick="BookReviewService.getAllReviews('${prevPage}', '', 'true', callback);" />
#end
#set ($nextPage = ${bean.currentPage} + 1)
#if (${nextPage} < ${bean.lastPage})
      &nbsp;
      <input type="button" name="nextPage" value="Next Page" onClick="BookReviewService.getAllReviews('${nextPage}', '', 'true', callback);" />
#end
    </span>
  </body>
</html>

The other pages are all_reviews.vm, add_edit_review.vm, single_review.vm and preview_review.vm. The all_reviews.vm contains the template for the list view, the add_edit_review.vm is the form template, and the single_review.vm and preview_review.vm are templates for the single book review view and the preview view (before saving). One thing to notice in the add_edit_review.vm is that it is not enclosed in a form tag. Enclosing the form in a form tag will make it do a request on submit, which we don't want.

## all_reviews.vm
<table cellspacing="2" cellpadding="2" border="1">
  <tr>
    <td><b>Book Title</b></td>
    <td><b>Reviewer</b></td>
    <td><b>Review</b></td>
    <td><b>Edit</b></td>
    <td><b>Delete</b></td>
  </tr>
#foreach ($review in ${bean.reviews})
  <tr>
    <td>${review.bookTitle}</td>
    <td>${review.reviewer}</td>
    <td>${review.reviewText}</td>
    <td><input type="button" name="edit" value="Edit" onClick="BookReviewService.addOrEditReviewForm('${review.id}', '${review.bookTitle}', '${review.reviewer}', '${review.reviewText}', callback);" /></td>
    <td><input type="button" name="delete" value="Delete" onClick="BookReviewService.deleteReview('${review.id}', callback);" /></td>
  </tr>
#end
</table>
<input type="button" name="add" value="Add Review" onClick="BookReviewService.addOrEditReviewForm('0', '', '', '', callback)" />
&nbsp;|
## lastPage holds the page count, so the last valid zero-based page is lastPage - 1
#if (${bean.currentPage} > 0)
#set ($prevPage = ${bean.currentPage} - 1)
&nbsp;
<input type="button" name="prevPage" value="Previous Page" onClick="BookReviewService.getAllReviews('${prevPage}', '', 'true', callback);" />
#end
#set ($nextPage = ${bean.currentPage} + 1)
#if (${nextPage} < ${bean.lastPage})
&nbsp;
<input type="button" name="nextPage" value="Next Page" onClick="BookReviewService.getAllReviews('${nextPage}', '', 'true', callback);" />
#end
## add_edit_review.vm
<input id="add_edit.id" type="hidden" name="id" value="$!{bean.id}" />
<table cellspacing="3" cellpadding="0" border="0">
  <tr>
    <td><b>Title:</b></td>
    <td><input id="add_edit.bookTitle" type="text" name="name" value="$!{bean.bookTitle}" /></td>
  </tr>
  <tr>
    <td><b>Your name:</b></td>
    <td><input id="add_edit.reviewer" type="text" name="author" value="$!{bean.reviewer}" /></td>
  </tr>
  <tr><td colspan="2"><b>Comment</b></td></tr>
  <tr>
    <td colspan="2"><textarea id="add_edit.reviewText" cols="80" rows="10" name="text">$!{bean.reviewText}</textarea></td>
  </tr>
</table>
<input type="button" name="preview" value="Preview" onClick="BookReviewService.previewReview(document.getElementById('add_edit.id').value, document.getElementById('add_edit.bookTitle').value, document.getElementById('add_edit.reviewer').value, document.getElementById('add_edit.reviewText').value, callback);" />
&nbsp;
<input type="button" name="cancel" value="Cancel" onClick="BookReviewService.getAllReviews('0', '', 'true', callback);" />
## single_review.vm
<table>
  <tr>
    <td><b>Title:</b>&nbsp;${bean.bookTitle}</td>
  </tr>
  <tr>
    <td><b>Reviewed by:</b>&nbsp;${bean.reviewer}</td>
  </tr>
  <tr>
    <td><b>Review:</b>&nbsp;${bean.reviewText}</td>
  </tr>
</table>
<input type="button" name="edit" value="Edit" onClick="BookReviewService.addOrEditReviewForm('${bean.id}', '${bean.bookTitle}', '${bean.reviewer}', '${bean.reviewText}', callback);" />&nbsp;
<input type="button" name="delete" value="Delete" onClick="BookReviewService.deleteReview('${bean.id}', callback);" />&nbsp;
<input type="button" name="list" value="Back to List" onClick="BookReviewService.getAllReviews('0', '', 'true', callback);" />
## preview_review.vm
<table>
  <tr>
    <td><b>Title:</b></td>
    <td>${bean.bookTitle}</td>
  </tr>
  <tr>
    <td><b>Reviewed by:</b></td>
    <td>${bean.reviewer}</td>
  </tr>
  <tr>
    <td colspan="2">${bean.reviewText}</td>
  </tr>
  <tr>
  </tr>
</table>
<input type="button" name="save" value="Save" onClick="BookReviewService.saveReview('${bean.id}', '${bean.bookTitle}', '${bean.reviewer}', '${bean.reviewText}', callback);" />
&nbsp;
<input type="button" name="edit" value="Edit" onClick="BookReviewService.addOrEditReviewForm('${bean.id}', '${bean.bookTitle}', '${bean.reviewer}', '${bean.reviewText}', callback);" />
&nbsp;
<input type="button" name="cancel" value="Cancel" onClick="BookReviewService.getAllReviews('0', '', 'true', callback);" />
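
A caveat about the templates above: bean values are interpolated directly into HTML and into single-quoted Javascript strings inside onClick attributes, so a title such as O'Reilly or a review containing a < character would break the generated markup. A minimal sketch of the kind of escaping that could be applied before the values reach the template follows. TemplateEscaper and its methods are hypothetical names, not part of the original code or of Velocity.

```java
/** Hypothetical escaping helpers; not part of the original templates or service. */
class TemplateEscaper {

    /** Escapes the characters that are significant in HTML text and attribute values. */
    static String html(String s) {
        if (s == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#39;");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Escapes a value for use inside a quoted Javascript string literal. */
    static String js(String s) {
        if (s == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\'': sb.append("\\'");  break;
                case '"':  sb.append("\\\""); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    /**
     * For values placed in a Javascript string inside an HTML attribute (the onClick
     * case): the browser decodes the HTML first, then runs the script, so the value
     * is Javascript-escaped first and the result is then HTML-escaped.
     */
    static String jsInAttribute(String s) {
        return html(js(s));
    }
}
```

Values shown as table text would go through html(), and values passed as arguments in the onClick handlers through jsInAttribute().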

Bootstrapping the Component

Since Javascript is event based, there has to be some event to start it up. I tried starting the list view with an onLoad event, but that was getting very confusing, since I could not populate the same span tag for all subsequent events. So I decided to bootstrap the component with a standard Spring Controller. So when you type this URL into your browser,

http://localhost:8080/comp/main.do

the main.vm template is used to provide an initial listing of the BookReview objects in the database. The code for the Controller is straightforward: it just invokes the BookReviewService.getAllReviewsAndMergeToTemplate() method and writes the result directly to the ServletOutputStream.

package org.component.controllers;

import java.io.OutputStream;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.log4j.Logger;
import org.component.services.BookReviewService;
import org.springframework.web.servlet.ModelAndView;
import org.springframework.web.servlet.mvc.Controller;

public class BookReviewController implements Controller {

    private static final Logger log = Logger.getLogger(BookReviewController.class);

    private BookReviewService service;

    public BookReviewController() {
        super();
    }

    public void setService(BookReviewService service) {
        this.service = service;
    }

    public ModelAndView handleRequest(HttpServletRequest request, HttpServletResponse response) throws Exception {
        String htmlOutput = service.getAllReviewsAndMergeToTemplate(0, BookReviewService.DEFAULT_ORDER_BY, true, "main");
        OutputStream ostream = response.getOutputStream();
        ostream.write(htmlOutput.getBytes());
        ostream.flush();
        ostream.close();
        return null;
    }
}

Possible DWR Bug

I could not make method calls on onClick events on links work with DWR and Firefox 1.5. It looks like it may be a bug in DWR since the Javascript error message points to engine.js, a DWR supplied file. That is why the templates have so many buttons, since onClick events are triggered correctly if the link is replaced with a button. If anybody has made it work, please let me know.

Conclusion

The combination of Velocity templates to generate HTML on the server and DWR clients to consume it makes for very readable and maintainable code, compared to using raw XMLHttpRequest calls from Javascript. AJAX is definitely here to stay, and opens up lots of possibilities for partitioning application functionality.

Friday, July 21, 2006

Python XML Viewer for Linux

I recently needed to view an XML file I had generated, to verify that the code worked right. Normally I would just view the file with Firefox using the file:// protocol, and Firefox would show me the document tree with its default XML rendering. But this file was quite large (about 800MB) and Firefox was not able to complete loading the file. It did not crash, but the keyboard and mouse became unresponsive and I had to manually kill Firefox from the command prompt.

So I figured that since Firefox is a general-purpose browser, it was probably too much to ask it to render such large files, and I would have better luck with software optimized to render only XML, in other words, an XML viewer. So I did a quick search for "XMLViewer" on Google, but came up empty in terms of software I could actually use. There appear to be many more XML viewers for Windows than for Linux. The only ones I came up with were KXMLViewer and gxmlviewer.

The screenshots for KXMLViewer look nice but the functionality was not exactly what I wanted. Gxmlviewer was tested on RedHat 7.1 and has probably not been updated since. I was unable to either install the RPM or build from source using the downloads on Fedora Core 3. I could probably have done it if I had tried a little harder, but I decided to pass.

Having failed to find a tool that did what I wanted, I started wondering whether I should just build it myself. Since I wanted to parse large XML files, a SAX parser seemed the obvious choice. As a matter of fact, one of the reasons Firefox hung was that it was trying to slurp in all 800MB of data before rendering it. It needs to do that because it gives you controls to expand and collapse elements. That functionality would have been nice to have, but I definitely did not need it. All I wanted was something that would format the XML (which was written for compactness) into something I could read. A SAX handler that formats its input into a nicely indented document tree is practically one of the first examples you encounter when you read about SAX parsing, so building such a tool would be trivial.

Navigation, i.e. moving up or down one or more pages, is handled simply by piping the output through the Linux utility less. The [SPACE] key moves the formatted output forward a page, and [CTRL-B] moves it back. To call the tool, specify the file name and the number of spaces per indent level.

$ xmlcat.py my_really_large_xml_file.xml 4 | less

I wrote xmlcat.py in Python. Here is the code.

#!/usr/bin/python
# A simple SAX Parser to view large XML files as a nicely formatted XML
# document tree. Pipe the output through less and move forward and backward
# using [SPACE] and [CTRL-B] respectively. Standard less keyboard commands
# will also work.
#
import string
import sys
from xml.sax import make_parser
from xml.sax.handler import ContentHandler

class PrettyPrintingContentHandler(ContentHandler):
    """ Subclass of the SAX ContentHandler to print document tree """

    def __init__(self, indent):
        """ Ctor """
        self.indent = indent
        self.level = 0
        self.chars = ''

    def startElement(self, name, attrs):
        """ Set the level and print opening tag with attributes """
        self.level = self.level + 1
        attrString = ""
        qnames = attrs.getQNames()
        for i in range(0, len(qnames)):
            attrString = attrString + " " + qnames[i] + "=\"" + attrs.getValueByQName(qnames[i]) + "\""
        print self.tab(self.level) + "<" + string.rstrip(name) + attrString + ">"

    def endElement(self, name):
        """ Print the characters and the closing tag """
        if (len(string.strip(self.chars)) > 0):
            print self.tab(self.level + 1) + string.rstrip(self.chars)
        self.chars = ''
        print self.tab(self.level) + "</" + string.rstrip(name) + ">"
        self.level = self.level - 1

    def characters(self, c):
        """ Accumulate characters, ignore whitespace """
        if (len(string.strip(c)) > 0):
            self.chars = self.chars + c

    def tab(self, n):
        """ Return the indentation for an element at nesting level n """
        # level 1 (the root) is flush left; each deeper level adds `indent` spaces
        return " " * (int(self.indent) * (n - 1))

def usage():
    """ Print the usage """
    print "Usage: xmlcat.py xml_file indent | less"
    print "Use [SPACE] and [CTRL-B] to move forward and backward"
    sys.exit(-1)

def main():
    """ Check the arguments, instantiate the parser and parse """
    if (len(sys.argv) != 3):
        usage()
    file = sys.argv[1]
    indent = sys.argv[2]
    parser = make_parser()
    prettyPrintingContentHandler = PrettyPrintingContentHandler(indent)
    parser.setContentHandler(prettyPrintingContentHandler)
    parser.parse(file)

if __name__ == "__main__":
    main()

As you can see, the code is quite trivial and based in large part on this DevShed article, but I am including it here anyway. It took me all of half an hour to write and test, so it's not rocket science, but it may save you half an hour if you are looking for a similar tool and stumble upon this page.
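
For readers more at home in Java, the same streaming idea can be sketched with the SAX classes that ship with the JDK. This is only an illustration of the technique, not a port of xmlcat.py: IndentingHandler and format() are hypothetical names, and attribute values are printed as parsed rather than re-escaped, since the output is meant for reading, not for feeding back into a parser.

```java
import java.io.PrintWriter;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical SAX handler that writes each tag and text run on its own indented line. */
class IndentingHandler extends DefaultHandler {
    private final PrintWriter out;
    private final int indent;                               // spaces per nesting level
    private final StringBuilder text = new StringBuilder(); // text seen since the last tag
    private int level = 0;

    IndentingHandler(PrintWriter out, int indent) {
        this.out = out;
        this.indent = indent;
    }

    private void line(int depth, String s) {
        for (int i = 0; i < depth * indent; i++) {
            out.print(' ');
        }
        out.print(s);
        out.print('\n');
    }

    /** Prints any non-blank text accumulated so far, then clears the buffer. */
    private void flushText() {
        String t = text.toString().trim();
        if (t.length() > 0) {
            line(level, t);
        }
        text.setLength(0);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        flushText();
        StringBuilder tag = new StringBuilder("<").append(qName);
        for (int i = 0; i < attrs.getLength(); i++) {
            tag.append(' ').append(attrs.getQName(i))
               .append("=\"").append(attrs.getValue(i)).append('"');
        }
        line(level, tag.append('>').toString());
        level++;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        level--;
        line(level, "</" + qName + ">");
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one text node in several chunks, so just accumulate.
        text.append(ch, start, length);
    }

    /** Convenience wrapper for small inputs; a real tool would stream a file to stdout. */
    static String format(String xml, int indent) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new IndentingHandler(out, indent));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        out.flush();
        return buffer.toString();
    }
}
```

Because the handler only keeps the text of the element currently being read, memory use stays flat regardless of file size, which is the whole point of using SAX here.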

Saturday, July 15, 2006

Decoupling with DynaBeans

A producer-consumer design is a fairly common pattern in business application programming. A producer module produces some data which the consumer module uses. Decoupling the two modules means that data must flow from the producer to the consumer, and the data produced by the producer must be parseable by the consumer.

A naive implementation may attempt to get around the problem of parseable data by interleaving the producer and consumer code in the same module. This is not very maintainable: the two modules have to be developed in parallel, end up tightly coupled, and are hard to extend.

For implementations where the problem space is well defined, the data could be passed as a JavaBean. However, if we are designing an engine where we are able to plug in various implementations of the Producer and Consumer to produce unique data flows, then a logical extension would be to provide various implementations of a bean interface to pass the data between them. In this case, the data bean interface would be simply a marker interface, since there is unlikely to be commonality between data bean implementations to justify specifying methods that they must implement.

However, we know that all implementations are going to be simple data carriers. The Producer will know how to build the bean and the Consumer will know how to parse it. A logical choice in such situations is to specify that all beans connecting the various Producer and Consumer implementations in our engine should be DynaBeans.

DynaBeans are a unique data structure and are provided by the Jakarta Commons BeanUtils project. Like object implementations in some scripting languages, the DynaBean "object" is really a HashMap. A DynaBean is an instance of a DynaClass, which is created from an array of DynaProperties. The base DynaBean will not allow getting and setting properties which are not declared in the DynaClass. However, you can get this behavior with implementations of MutableDynaClass, such as LazyDynaClass, where DynaProperties can be added as needed. The ResultSetDynaClass and RowSetDynaClass are useful DynaClass implementations that can be used with SQL ResultSets and RowSets.
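
To make the "really a HashMap" idea concrete, here is a minimal sketch of the two behaviors using only java.util. StrictBean and LazyBean are hypothetical stand-ins written for this illustration; they are not the BeanUtils classes and do not reproduce the BeanUtils API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a basic DynaBean: a map restricted to declared property names. */
class StrictBean {
    private final Set<String> declared;
    private final Map<String, Object> values = new HashMap<String, Object>();

    StrictBean(String... propertyNames) {
        this.declared = new HashSet<String>(Arrays.asList(propertyNames));
    }

    void set(String name, Object value) {
        requireDeclared(name);
        values.put(name, value);
    }

    Object get(String name) {
        requireDeclared(name);
        return values.get(name);
    }

    private void requireDeclared(String name) {
        if (!declared.contains(name)) {
            throw new IllegalArgumentException("Undeclared property: " + name);
        }
    }
}

/** Hypothetical stand-in for a lazy bean: any property name is accepted on first use. */
class LazyBean {
    // LinkedHashMap keeps properties in the order they were first set.
    private final Map<String, Object> values = new LinkedHashMap<String, Object>();

    void set(String name, Object value) {
        values.put(name, value);
    }

    Object get(String name) {
        return values.get(name);
    }

    Set<String> propertyNames() {
        return values.keySet();
    }
}
```

The strict variant corresponds to a bean whose property set is fixed when its class is created, and the lazy variant to one that grows as properties are set.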

Assume that the Producer in our system is running an SQL query, and the Consumer is writing out an XML representation of the results. The snippets of code below illustrate how DynaBeans can be used to pass the data from the Producer to the Consumer. Note that there is no messy unrolling of the ResultSet using ResultSetMetaData and getObject() calls.

public class Producer {
    ...
    public void produce() {
        ...
        // a Connection hands out PreparedStatements via prepareStatement()
        PreparedStatement ps = conn.prepareStatement(sql);
        ResultSet rs = ps.executeQuery();
        ResultSetDynaClass rsdc = new ResultSetDynaClass(rs);
        for (Iterator it = rsdc.iterator(); it.hasNext();) {
            DynaBean row = (DynaBean) it.next();
            consumer.consume(row);
        }
        rs.close();
        ...
    }
}

public class Consumer {
    ...
    public void consume(DynaBean dynaBean) {
        DynaProperty[] props = dynaBean.getDynaClass().getDynaProperties();
        for (int i = 0; i < props.length; i++) {
            String key = props[i].getName();
            String value = (dynaBean.get(key) == null ? "" : dynaBean.get(key).toString());
            ostream.write(("<" + key + ">").getBytes());
            ostream.write(value.getBytes());
            ostream.write(("</" + key + ">\n").getBytes());
        }
    }
    ...
}
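
One thing the Consumer above glosses over is that it writes values unescaped and in the platform default charset. Here is a rough sketch of the same loop with both handled, using a plain Map as a stand-in for the DynaBean row so that it runs without the BeanUtils jar. The XmlRowWriter class and its method names are made up for this example, and it assumes the column names are already valid XML element names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: a Map stands in for one DynaBean row.
class XmlRowWriter {

    // Escape the characters that are special inside XML element content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Render one row as a sequence of <key>value</key> lines; nulls become empty.
    static String toXml(Map<String, Object> row) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            String value = (e.getValue() == null ? "" : e.getValue().toString());
            sb.append("<").append(e.getKey()).append(">")
              .append(escape(value))
              .append("</").append(e.getKey()).append(">\n");
        }
        return sb.toString();
    }

    // Write the row to the stream with an explicit charset.
    static void writeRow(Map<String, Object> row, OutputStream out) throws IOException {
        out.write(toXml(row).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("title", "Tom & Jerry");
        row.put("rating", null);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeRow(row, out);
        System.out.print(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

With a real DynaBean, the loop would iterate over getDynaClass().getDynaProperties() as in the Consumer above, and only the escaping and charset handling would carry over.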

There are situations where we want to do more processing than run a single SQL query. For example, we could get a foreign key to another table out of the resultset, use it to fetch additional data, and add these new columns to our result. So effectively we will have to add new "member variables" to our DynaBean. For that, we need a bean backed by a MutableDynaClass such as LazyDynaClass. The following snippet illustrates this.

    PreparedStatement ps = conn.prepareStatement(sqlQuery);
    ResultSet rs = ps.executeQuery();
    ResultSetDynaClass rsdc = new ResultSetDynaClass(rs);
    // create our output DynaClass, which contains one more property
    // than the DynaClass built from the ResultSet.
    DynaProperty[] props = rsdc.getDynaProperties();
    LazyDynaClass odc = new LazyDynaClass();
    for (int i = 0; i < props.length; i++) {
        odc.add(props[i].getName(), props[i].getType());
    }
    odc.add("extraProperty", String.class);
    // iterate through the ResultSet
    for (Iterator it = rsdc.iterator(); it.hasNext();) {
        DynaBean row = (DynaBean) it.next();
        // clone the data into the mutable DynaBean
        DynaBean orow = odc.newInstance();
        for (int i = 0; i < props.length; i++) {
            orow.set(props[i].getName(), row.get(props[i].getName()));
        }
        orow.set("extraProperty", "foo");
        // do something with orow
        ...
    }
    ps.close();

DynaBeans offer an advantage over plain JavaBeans. JavaBeans would work if we knew exactly what our Producer produced. However, populating the JavaBean in the Producer means writing and calling a bunch of setXXX() methods, and parsing it in the Consumer means doing the same with a bunch of getXXX() methods, which is error-prone and high-maintenance. We could use introspection on both ends to build and parse the bean, but populating and doing key lookups off a HashMap is faster than introspection.

In some cases the data format may not be fully known in advance, such as when the Producer generates its data from an SQL query that is supplied to it at runtime. In such cases, the obvious data structure would be a simple HashMap. DynaBeans provide stronger typing than that by requiring that any property being set or read be declared in the corresponding DynaClass.

DynaBeans provide a middle ground between the strong type checking of concrete bean implementations and the flexibility of a HashMap. There are some convenience implementations of DynaBean which can help reduce the amount of repetitive code you have to write. I never had much use for DynaBeans before this, but for this particular application, they seemed tailor-made for the job.

On a completely unrelated note, this is the first blog entry I am writing using Opera 9. I recently downloaded and started using it, after I needed a second browser application on my Linux workstation for some work I was doing. I had been using Firefox before this, and I am too new to Opera to be able to compare, but so far, most of the features I use in Firefox seem to be available. Opera does seem to have more keyboard shortcuts.