Friday, May 26, 2006

Java Data Structure: A Generic Tree

A few weeks ago, I was talking to another Java programmer who was once a C++ programmer. The conversation gravitated towards data structures, and about how C++ programmers need to consider the best data structure for their particular application and then figure out the best and most optimal way to build it, while Java programmers typically just choose the most optimal data structure for their application from the standard java.util library. In my opinion, this is both a good thing and a bad thing for Java programmers. Its good because you dont have to worry about implementing a Vector for each project (or remember which third party library to include or purchase), and its bad, because it makes you lazy. So lazy, in fact, that when the time comes to implement a data structure that is not available in the Java standard library, or cannot be synthesized by combining Maps and Lists, Java programmers are often at a loss for how to proceed.

This conversation turned out to be prescient, since a few days later, I found myself confronted with programming a business object that just had to be mapped to a Tree data structure. It was a Task object, which can contain other Tasks, which can contain other Tasks, and so on. Since there is no built in Tree data structure in the java.utils package, I had to create my own. This article explains the design and provides the code for a generic Tree structure.

The basic Tree Data Structure

First off, the Tree class itself should be a container class, and should contain elements of type Node. The Node element is a thin wrapper over the actual data content, to which we will assign the generic type T. So our Tree<T> will contain a collection of Node<T> objects, where T is the actual data that we want to represent in our tree structure. The Node is represented as an element of type T, and a List<T> of children of type Node<T>. We use a List object to hold all the children, because we are dealing with an N-ary Tree, ie, we cannot tell in advance how many children a particular Node will have. Here is a diagram showing the logical tree and the corresponding data structure.

The code for the Tree and Node objects follow:

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
/**
 * Represents a Tree of Objects of generic type T. The Tree is represented as
 * a single rootElement which points to a List<Node<T>> of children. There is
 * no restriction on the number of children that a particular node may have.
 * This Tree provides a method to serialize the Tree into a List by doing a
 * pre-order traversal. It has several methods to allow easy updation of Nodes
 * in the Tree.
 */
public class Tree<T> {
 
    private Node<T> rootElement;
     
    /**
     * Default ctor.
     */
    public Tree() {
        super();
    }
 
    /**
     * Return the root Node of the tree.
     * @return the root element.
     */
    public Node<T> getRootElement() {
        return this.rootElement;
    }
 
    /**
     * Set the root Element for the tree.
     * @param rootElement the root element to set.
     */
    public void setRootElement(Node<T> rootElement) {
        this.rootElement = rootElement;
    }
     
    /**
     * Returns the Tree<T> as a List of Node<T> objects. The elements of the
     * List are generated from a pre-order traversal of the tree.
     * @return a List<Node<T>>.
     */
    public List<Node<T>> toList() {
        List<Node<T>> list = new ArrayList<Node<T>>();
        walk(rootElement, list);
        return list;
    }
     
    /**
     * Returns a String representation of the Tree. The elements are generated
     * from a pre-order traversal of the Tree.
     * @return the String representation of the Tree.
     */
    public String toString() {
        return toList().toString();
    }
     
    /**
     * Walks the Tree in pre-order style. This is a recursive method, and is
     * called from the toList() method with the root element as the first
     * argument. It appends to the second argument, which is passed by reference     * as it recurses down the tree.
     * @param element the starting element.
     * @param list the output of the walk.
     */
    private void walk(Node<T> element, List<Node<T>> list) {
        list.add(element);
        for (Node<T> data : element.getChildren()) {
            walk(data, list);
        }
    }
}

/**
 * Represents a node of the Tree<T> class. The Node<T> is also a container, and
 * can be thought of as instrumentation to determine the location of the type T
 * in the Tree<T>.
 */
public class Node<T> {
 
    public T data;
    public List<Node<T>> children;
 
    /**
     * Default ctor.
     */
    public Node() {
        super();
    }
 
    /**
     * Convenience ctor to create a Node<T> with an instance of T.
     * @param data an instance of T.
     */
    public Node(T data) {
        this();
        setData(data);
    }
     
    /**
     * Return the children of Node<T>. The Tree<T> is represented by a single
     * root Node<T> whose children are represented by a List<Node<T>>. Each of
     * these Node<T> elements in the List can have children. The getChildren()
     * method will return the children of a Node<T>.
     * @return the children of Node<T>
     */
    public List<Node<T>> getChildren() {
        if (this.children == null) {
            return new ArrayList<Node<T>>();
        }
        return this.children;
    }
 
    /**
     * Sets the children of a Node<T> object. See docs for getChildren() for
     * more information.
     * @param children the List<Node<T>> to set.
     */
    public void setChildren(List<Node<T>> children) {
        this.children = children;
    }
 
    /**
     * Returns the number of immediate children of this Node<T>.
     * @return the number of immediate children.
     */
    public int getNumberOfChildren() {
        if (children == null) {
            return 0;
        }
        return children.size();
    }
     
    /**
     * Adds a child to the list of children for this Node<T>. The addition of
     * the first child will create a new List<Node<T>>.
     * @param child a Node<T> object to set.
     */
    public void addChild(Node<T> child) {
        if (children == null) {
            children = new ArrayList<Node<T>>();
        }
        children.add(child);
    }
     
    /**
     * Inserts a Node<T> at the specified position in the child list. Will     * throw an ArrayIndexOutOfBoundsException if the index does not exist.
     * @param index the position to insert at.
     * @param child the Node<T> object to insert.
     * @throws IndexOutOfBoundsException if thrown.
     */
    public void insertChildAt(int index, Node<T> child) throws IndexOutOfBoundsException {
        if (index == getNumberOfChildren()) {
            // this is really an append
            addChild(child);
            return;
        } else {
            children.get(index); //just to throw the exception, and stop here
            children.add(index, child);
        }
    }
     
    /**
     * Remove the Node<T> element at index index of the List<Node<T>>.
     * @param index the index of the element to delete.
     * @throws IndexOutOfBoundsException if thrown.
     */
    public void removeChildAt(int index) throws IndexOutOfBoundsException {
        children.remove(index);
    }
 
    public T getData() {
        return this.data;
    }
 
    public void setData(T data) {
        this.data = data;
    }
     
    public String toString() {
        StringBuilder sb = new StringBuilder();
        sb.append("{").append(getData().toString()).append(",[");
        int i = 0;
        for (Node<T> e : getChildren()) {
            if (i > 0) {
                sb.append(",");
            }
            sb.append(e.getData().toString());
            i++;
        }
        sb.append("]").append("}");
        return sb.toString();
    }
}

Reading and Writing Trees from Database

The Tree and Node classes are not very useful by themselves, since what I am after is the ability to store and retrieve my Tasks from a database. I will use Hibernate for my ORM layer. Since Hibernate, to the best of my knowledge, does not work with generic Java code, we create fully specified subclasses out of our generic classes above, and the Task POJO itself. The Task database table itself follows the standard pattern for representing hierarchical data in relational tables, with an embedded parent id that points back to the row's logical parent. The diagram below illustrates this pattern.

And here is the code for the subclasses of the generic Tree and Node classes, the POJO representing the Task object, the Hibernate XML mapping, the abstract base Hibernate DAO class, and the DAO classes for TaskTree and Task.

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
/**
 * Non-generic subclass of Tree<Task>
 */
public class TaskTree extends Tree<Task> {
    public TaskTree() {
        super();
    }
}

/**
 * Non-generic subclass of Node<T>
 */
public class TaskNode extends Node<Task> {
    public TaskNode() {
        super();
    }
}

/**
 * Simple POJO with one to one mapping to database table Task.
 */
public class Task implements Serializable {
    private Long id;
    private Long parentId;
    private String taskName;
    private String taskDescription;
    private String allocatedTo;
    private Integer estimatedHours;
    private Integer actualHours;
    private String status;
 
    public Task() {
        super();
    }
     
    public Task(String taskName) {
        this();
        setTaskName(taskName);
    }
     
    public Integer getActualHours() {
        return this.actualHours;
    }
 
    public void setActualHours(Integer actualHours) {
        this.actualHours = actualHours;
    }
 
    public String getAllocatedTo() {
        return this.allocatedTo;
    }
 
    public void setAllocatedTo(String allocatedTo) {
        this.allocatedTo = allocatedTo;
    }
 
    public Integer getEstimatedHours() {
        return this.estimatedHours;
    }
 
    public void setEstimatedHours(Integer estimatedHours) {
        this.estimatedHours = estimatedHours;
    }
 
    public Long getId() {
        return this.id;
    }
 
    public void setId(Long id) {
        this.id = id;
    }
 
    public Long getParentId() {
        return this.parentId;
    }
 
    public void setParentId(Long parentId) {
        this.parentId = parentId;
    }
 
    public String getStatus() {
        return this.status;
    }
 
    public void setStatus(String status) {
        this.status = status;
    }
 
    public String getTaskDescription() {
        return this.taskDescription;
    }
 
    public void setTaskDescription(String taskDescription) {
        this.taskDescription = taskDescription;
    }
 
    public String getTaskName() {
        return this.taskName;
    }
 
    public void setTaskName(String taskName) {
        this.taskName = taskName;
    }
     
    public String toString() {
        return ReflectionToStringBuilder.reflectionToString(this, ToStringStyle.DEFAULT_STYLE);
    }
}

The Hibernate mapping file for the Task object looks like this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE hibernate-mapping PUBLIC "-//Hibernate/Hibernate Mapping DTD 3.0//EN" "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd" >
<hibernate-mapping>
                                                                                
  <class name="com.mycompany.beans.Task" table="Task" lazy="false">
                                                                                
    <id name="id" type="java.lang.Long" unsaved-value="null">
      <generator class="native" />
    </id>
                                                                                
    <property name="parentId" type="java.lang.Long">
      <column name="parentId" />
    </property>
                                                                                
    <property name="taskName" type="java.lang.String">
      <column name="taskName" />
    </property>
                                                                                
    <property name="taskDescription" type="java.lang.String">
      <column name="taskDescription" />
    </property>
                                                                                
    <property name="allocatedTo" type="java.lang.String">
      <column name="allocatedTo" />
    </property>
                                                                                
    <property name="estimatedHours" type="java.lang.Integer">
      <column name="estimatedHours" />
    </property>
                                                                                
    <property name="actualHours" type="java.lang.Integer">
      <column name="actualHours" />
    </property>
                                                                                
    <property name="status" type="java.lang.String">
      <column name="status" />
    </property>
                                                                                
  </class>
</hibernate-mapping>

Finally, in keeping with our thinking that the Tree object should be a collection and concern itself only with navigation, and the Node object should reference the actual object, we create two separate DAO (Data Access Objects). The client will normally reference the TreeDao, and the TreeDao will internally reference the TaskDao as needed. Here is the code for the TreeDao and TaskDao classes. We have also refactored the common code into an abstract class AbstractHibernateDao (which extends Spring's HibernateDaoSupport).

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
/**
 * A base DAO object that needs to be extended by all our individual DAOs. This
 * class provides the base functionality to interact with the Hibernate ORM.
 * @author Sujit Pal (sujit.pal@comcast.net)
 * @version $Revision$
 */
public abstract class AbstractHibernateDao extends HibernateDaoSupport {
 
    public AbstractHibernateDao() {
        super();
    }
 
    public Serializable save(Object entity) {
        return getHibernateTemplate().save(entity);
    }
     
    public void saveOrUpdate(Object entity) {
        getHibernateTemplate().saveOrUpdate(entity);
    }
     
    public void update(Object entity) {
        getHibernateTemplate().update(entity);
    }
     
    public void remove(Object entity) {
        getHibernateTemplate().delete(entity);
    }
     
    public Object get(Class clazz, Serializable id) {
        return getHibernateTemplate().get(clazz, id);
    }
     
    public Object load(Class clazz, Serializable id) {
        return getHibernateTemplate().load(clazz, id);
    }
     
    public List find(final String hqlQuery, final String[] params, final Object[] paramValues, final Type[] paramTypes, final Integer start, final Integer count) {
        return (List) getHibernateTemplate().execute(new HibernateCallback() {
            public Object doInHibernate(Session session) throws HibernateException, SQLException {
                Query query = session.createQuery(hqlQuery);
                int paramsLength = (params == null ? 0 : params.length);
                if (params != null) {
                    if (paramValues == null || paramValues.length != params.length) {
                        throw new IllegalArgumentException("ParamValues not completely specified for all params");
                    }
                    if (paramValues == null || paramTypes.length != params.length) {
                        throw new IllegalArgumentException("ParamTypes not completely specified for all params");
                    }
                }
                for (int i = 0; i < paramsLength; i++) {
                    query.setParameter(params[i], paramValues[i], paramTypes[i]);
                }
                if (start == null) {
                    query.setFirstResult(0);
                } else {
                    query.setFirstResult(start.intValue());
                }
                if (count == null) {
                    query.setMaxResults(0);
                } else {
                    query.setMaxResults(count.intValue());
                }
                return query.list();
            }
        }, true);
    }
}

/**
 * Dao for the Task POJO.
 */
public class TaskDao extends AbstractHibernateDao {
     
    private final static Logger log = Logger.getLogger(TaskDao.class);
 
    public TaskDao() {
        super();
    }
     
    /**
     * Remove a Task object from the database.
     * @param task the Task to remove.
     */
    public void remove(Task task) {
        super.remove(task);
    }
     
    /**
     * Saves a Task to the database if it is a new one, else updates it.
     * @param task the Task to insert or update.
     * @return the task id.
     */
    public Long saveOrUpdate(Task task) {
        if (task.getId() == null) {
            return (Long) save(task);
        } else {
            update(task);
            return task.getId();
        }
    }
     
    /**
     * Returns the Task pointed to by the task id. Returns a NULL if the
     * Task is not found in the database.
     * @param taskId the task id to look up.
     * @return the Task object corresponding to the id.
     */
    public Task get(Long taskId) {
        return (Task) super.get(Task.class, taskId);
    }
     
    /**
     * Returns all the children of the Task which has the specified parent id.
     * @param parentId the id of the parent Task.
     * @return a List of Tasks which are children of the Task specified.
     */
    @SuppressWarnings("unchecked")
    public List<Task> findByParentId(Long parentId) {
        return super.find("from Task where parentId = :parentId",
            new String[] {"parentId"},
            new Object[] {parentId},
            new Type[] {Hibernate.LONG},
            new Integer(0), new Integer(1));
    }
     
    /**
     * Returns the first task with the specified name. This will typically never     * be used directly from the application, mainly for use in tests.
     * @param taskName the name of the task to return.
     * @return the first task with the specified name.
     */
    @SuppressWarnings("unchecked")
    public Task findByName(String taskName) {
        List<Task> tasks =  super.find("from Task where taskName = :taskName",
            new String[] {"taskName"},
            new Object[] {taskName},
            new Type[] {Hibernate.STRING},
            new Integer(0), new Integer(1));
        return tasks.get(0);
    }
}

/**
 * Data Access Object for the TaskTree object. This is not a true Dao, since it
 * delegates to the TaskDao object for any database operations.
 */
public class TaskTreeDao {
     
    private static final Logger log = Logger.getLogger(TaskTreeDao.class);
     
    private TaskDao taskDao;
 
    public TaskTreeDao() {
        super();
    }
     
    public void setTaskDao(TaskDao taskDao) {
        this.taskDao = taskDao;
    }

    /**
     * Saves a TaskTree object.
     * @param taskTree a TaskTree object.
     */     
    public void saveOrUpdate(TaskTree taskTree) {
        List<Node<Task>> tasks = taskTree.toList();
        // save the tree in reverse order, starting from the leaf nodes
        // and going up to the root of the tree.
        int numberOfNodes = tasks.size();
        for (int i = numberOfNodes - 1; i >= 0; i--) {
            Node<Task> taskElement = tasks.get(i);
            Task task = taskElement.getData();
            taskDao.saveOrUpdate(task);
            Long parentId = task.getId();
            for (Iterator<Node<Task>> it = taskElement.getChildren().iterator(); it.hasNext();) {
                Node<Task> childElement = it.next();
                Task childTask = childElement.getData();
                childTask.setParentId(parentId);
                taskDao.saveOrUpdate(childTask);
            }
        }
    }
     
    /**
     * Gets a single TaskTree by id. This is the head of a recursive method call
     * to getRecursive().
     * @param id the id of the TaskTree.
     * @return a TaskTree for that id.
     */
    public TaskTree get(Long id) {
        TaskTree taskTree = new TaskTree();
        Node<Task> rootElement = new Node<Task>(taskDao.get(id));
        getRecursive(rootElement, taskTree);
        taskTree.setRootElement(rootElement);
        return taskTree;
    }
     
    /**
     * Recursively descends the Tree and populates the TaskTree object.
     * @param taskElement the root Node.
     * @param tree the TaskTree object to populate.
     */
    private void getRecursive(Node<Task> taskElement, TaskTree tree) {
        List<Task> children = taskDao.findByParentId(taskElement.getData().getId());
        List<Node<Task>> childElements = new ArrayList<Node<Task>>();
        for (Iterator<Task> it = children.iterator(); it.hasNext();) {
            Task childTask = it.next();
            Node<Task> childElement = new Node<Task>(childTask);
            childElements.add(childElement);
            getRecursive(childElement, tree);
        }
        taskElement.setChildren(childElements);
    }
}

Usage

Using the above classes, if we have built up a TaskTree object, we can save it using the following call:

1
2
3
    TaskTree taskTree = buildUpOurTaskTreeSomehow();
    TaskTreeDao taskTreeDao = getReferenceToATaskTreeDao();
    taskTreeDao.saveOrUpdate(taskTree);

and retrieve it back from the database using this call:

1
2
3
    TaskNode rootElement = getOurRootNodeSomehow();
    TaskTreeDao taskTreeDao = getReferenceToATaskTreeDao();
    TaskTree taskTree = taskTreeDao.get(rootElement.getId());

So, although Java allows the Java programmer to be blissfully unaware of the complexity of building data structures by hand, sometimes it becomes necessary to do so, simply because the data structures provided by Java is not sufficient for the task at hand. There are a number of good books on the subjects, as mentioned in my last blog post, and if you don't want to spend any money, check out The Particle's Guide to Java Data Structures. I've only just skimmed it, and it could do with improvement, but hey, its free.

92 comments (moderated to prevent spam):

Anonymous said...

Approach for persisting the tree and the associated task nodes using Hibernate is very inefficient. Specially calling taskDAO.saveOrUpdate in a loop for each Task.

Sujit Pal said...

Thanks for your comment. I was wondering how you would approach a similar problem, ie persisting a tree structured business object with Hibernate.

Thanks

Sujit Pal said...

Actually, I think I know what you mean. I recently came across Joe Celko's "Trees in SQL" article, and what I have here is a tree represented using the Adjacency List model, a better, more performant, although less intuitive representation is the Nested Set model. I will remember to use this model the next time I have to build a tree. Thanks for the pointer.

Debee said...

I have follolwed the same instructions you have mentioned here.At the end you specified that

TaskTree taskTree = buildUpOurTaskTreeSomehow();
TaskTreeDao taskTreeDao = getReferenceToATaskTreeDao();

can you please be more elaborate on this. I wantr to make it work. Also please let me know if you gave any better solution to handle this.

Sujit Pal said...

Its been a while since I wrote this, and I dont have the code around to check...

However, the buildUpTheTaskTreeSomehow() stands for a set of calls that will populate the task tree using a tree of tasks. How this is done will depend on your application.

The getReferenceToTaskDao() could be as simple as:
TaskDao td = new TaskDao()
or you could have it injected into your configuration if you are using some sort of dependency injection.

Hope this helps,
Sujit

Anonymous said...

Hi

I am using JDK 1.4 and need a basic tree structure.

The examples that was provided uses 1.5. Is there a possibility that u have the code for 1.4?

If you happen to have it, it would be great if u could post it as well.


Thanks
Paul

Sujit Pal said...

Hi Paul,

No I don't have 1.4 code for this, but this should be fairly easy to derive from this. When I say Tree<E> or Node<E>, all I am saying is that it can take any object of type E, so you can just get rid of the angle brackets and the stuff within it to make it Java 1.4. As long as you ensure that a Tree of Foo contains Nodes whose payload is an instance of Foo, you should be fine.

Hope this helps,
Sujit

Anonymous said...

Hi Sujit

Thanks.

Quick question: Can one say that this structure is similar to the Composite pattern. Can one write a composite pattern to have a tree-like behavior? Or is it different? if so in what way.

All i anm looking is for a tree like struture and the capability to traverse the particular tree path given a node.


Thanks
Paul

Sujit Pal said...

Hi Paul,

Yes the structure is a Composite pattern. So Tree contains Nodes, and Nodes contains references to other nodes. I think you should be able to traverse the tree given the root node and successive calls to getChildren(), if I remember the original data structure correctly.

-sujit

Waverick said...

The JDots project offers a more-or-less the same functionality

http://jdots.sourceforge.net

I use it in all my projects.

Sujit Pal said...

Thanks for the link Waverick, the project looks interesting, although the object naming is kind of strange, but will check it out.

Webster said...

Celko's approach is good, but I found the XPath Accelerator approach to be better. Check it out:

http://www.inf.uni-konstanz.de/dbis/teaching/ws0304/datatypes/xpath-accelerator.html

Anonymous said...

I used both the Node and Tree classes in one of my programming projects. Very nice code, easy to read and use. Also, it works wonderfully with XLMEnconder to save Node and Tree data to disk.

Thanks.

Sujit Pal said...

Hi, thanks for the feedback and glad it worked well for you.

Sujit Pal said...

Hi webster, this is an interesting approach, and I am sure everybody has used it for properties files keys naming, but this is a really good way to decompose the tree and store it in a relational database. Thanks for the link!

Anonymous said...

the easier way out would have been to use the JTree data strcuture provided by the java swing utilities ... ofcourse it wont be the corret thing to do ... but jtree does do the job ... it even maintains the original data formats when a object is assigned to a tree node... but thats just me being lazy :P

Sujit Pal said...

Thanks, I did look at the JTree some time back, but the data structure is quite complicated since its oriented to Swing programming. I actually missed the whole Java desktop GUI programming phase, coming to Java from a C and database background. So I found that I'll have to learn a lot of Swing related stuff to just use JTree, which is why I went the way of a pure data structure. Although I guess if you've already been programming in Swing, using JTree would be relatively painless.

Farooq M Khan said...

Good approach, I was thinking of using DOM Document instance to store my data which needs a tree like structure. Anyone tried this.
Do you thing there are any disadvantages with this approach

Julien said...

Really thanks for all. This article and code will be very helpfull. I have integrated this in 5 minutes and it works perfectly.
Thanks ! good job.

A french developper.

Sujit Pal said...

Cool! Thanks for the comment, Julien, glad it helped.

Sujit Pal said...

Hi Farooq, if you look at the comments on this post, there are suggestions to use the jdots project and the Swing JTree. So using the JDOM Document should probably also work fine, although I have used it to only do XML parsing stuff. I personally think that a Tree should be a core data structure in the JDK (java.utils) instead of developers having to depend on implementations created in response to specific requirements, which consequently may have additional functionality tacked on to work against the particular requirement. Just the purist in me ranting, I guess, but pragmatically, sure, whatever works.

rajani said...

Thanks it worked like a charm. I had problem while walking through the tree, as I had my tree nodes had unique gif images for each node.
But it worked great to begin with!

Sujit Pal said...

Thanks Rajani, glad it worked for you.

Andy said...

Thanks for this article, it's really helpful. I too am wondering why on earth there isn't a built-in Tree structure in Java!

My background is in Computer Science and after a few years off programming I am only just starting to get back into it in a serious way. I'm just wondering why you chose to make a Node consist of a data item and a list of nodes, rather than a data item and a list of trees? The classic tree structure (in Haskell) is as follows:

data Tree a = Null | Node a [Tree a]

which means that a Tree of type "a" can be Null (i.e. the empty tree), or it can be a Node of type "a" followed by a list of Trees of type "a".

Is this just imperative programming expedience or am I missing something?

Andy

Sujit Pal said...

Hi Andy, thanks for the definition. It is what I should have used if I had thought of it at the time. What I am doing currently is storing a list of Nodes instead of the Tree a on the RHS. This is a much cleaner and more succinct definition.

Jack said...

sujit, thanks a lot for the example.

i am a recent newb to java and am trying very hard to learn everything I can.

Suppose if I use only the tree[t] and the node[t] classes to create a tree in a java app. in that case how should i create the tree object in the main class?

i am trying to make a tree of strings.

thanks again.

Sujit Pal said...

Hi Jack, its been a while since I wrote this, so it may not be really accurate, but here goes:

Tree<String> t = new Tree<String>();
Node<String> root = new Node<String>("root");
root.addChildren(Arrays.asList(new String[] {"foo", "bar"}));
t.setRootElement(root);

The red Guy said...

I believe the toString method of the node should be as follows:

public String toString() {
StringBuilder sb = new StringBuilder();
sb.append("{").append(getData().toString()).append(",[");
int i = 0;
for (Node<T> e : getChildren()) {
if (i > 0) {
sb.append(",");
}
sb.append(e.toString());
i++;
}
sb.append("]").append("}");
return sb.toString();
}

Anyway - thanks for the code :)

Sujit Pal said...

Thanks for the correction, and yes, you are right, the recursive toString() method is better.

Anonymous said...

Hi All,
I took the inspiration from this example and had implemented building tree. if any one is interested, just let me know. I know it's not an optimized way. But surely works for me. Moreover it is written for JDK 1.4 (older version).

Thanks a lot,
Satish Dasari

Vijay said...

Thanks
I am trying to use your code to store some list of values which are in tree form.
I need few clarifications.
1) In findByParentId() why the limit is set to 1? Only 1 child is fetched in each level. If I remove the condition query.setMaxResults(); then the nodes are rendered in one single list without ordering.
2) As I do not use spring framework for Hibernate session I have changed find() method as:

public List find(final String hqlQuery, final String[] params,
final Object[] paramValues, final Type[] paramTypes,
final Integer start, final Integer count) {

Query query = getSession().createQuery(hqlQuery);
int paramsLength = (params == null ? 0 : params.length);
if (params != null) {
if (paramValues == null || paramValues.length != params.length) {
throw new IllegalArgumentException(
"ParamValues not completely specified for all params");
}
if (paramValues == null || paramTypes.length != params.length) {
throw new IllegalArgumentException(
"ParamTypes not completely specified for all params");
}
}
for (int i = 0; i < paramsLength; i++) {
query.setParameter(params[i], paramValues[i], paramTypes[i]);
}
if (start == null) {
query.setFirstResult(0);
} else {
query.setFirstResult(start.intValue());
}
if (count == null) {
//query.setMaxResults();
} else {
query.setMaxResults(count.intValue());
}
return query.list();
}

Is this the correct way of implementing the spring HibernateCallback.

Kindly suggest me.

Thank you,
Vijay

Sujit Pal said...

Hi Vijay, thanks for catching the problem, if it is a problem. Unfortunately, its been a while since I wrote it, so I don't really remember why the #-children should be restricted to 1, but logically I agree it should not be. Also the callback impl looks good, if it works for you, it must be good, right :-).

phonician said...

i would like to thank you for the nice work.

i wanted just to ask you if you know a way to insert Nodes for the Leaves of the Tree?? i need this to use it with your Code in one of my projects.

many thanx once more

Sujit Pal said...

Hi, thanks for the compliment.

To insert a node at the leaf, I think you would need to make sure its a leaf, and then add a node to it. Perhaps something like this?

if (node.getChildren().size() == 0) {
..Node childNode = new Node();
..node.addChild(childNode);
}

To navigate to the leaf, you would start at the root, then recursively call getChildren() until the Node has no children (and is therefore the leaf).

Anonymous said...

The code is much simpler if you initialise a node's children on creation. This simplifies the code a lot, but does use up a little more memory as even leaf nodes have an empty arraylist. For most people this won't be too much of a problem as large trees are not common.

In insertChildAt(), simply propogating the IndexOutOfBounds exception from the get() is a hidden side-effect. I'd prefer an explicit check of the index against the size of the child list and to throw the exception explicitly. Also, the 'else' in that method is redundant (because of the return).

Some of the comments in Node are slightly confusing. getChildren() doesn't return the children, it returns the list of children. Also, it should return a copy of the list so as to protect the list structure.

This list doesn't allow navigation up the tree, only down.

Apart from that, thanks, you've saved me some typing.

Sujit Pal said...

Hi, thanks for the review, I will incorporate your suggestions.

Cuong said...

Hello Sujit,

I just read your blog and I'm wondering how can I add a child into a tree by specifying the parent's name. I think recursive would be useful but I still cannot get it to work.

Let's say i have a tree like this:

Portfolio
|
Edition
|
Presentation
| |
PresentationType AccessRights

If I want to add a child node into Edition, I just give "Edition" as a parameter in my method.

I appreciate your help.
You can send me an email at: cuong.thai@gmail.com

Thanks a lot.

Cuong said...

Hello Sujit,

I'm wondering how can I code to add the child at a parent specified by name. For example, I have this tree:

Presentation
|
Edition
|
Presentation
| |
PresentationType AccessRight

Say, i want to add a child for Edition, I will pass "Edition" as part of my paramater in my method.

Thanks a lot.

Mark said...

There are a couple of tree classes in java.util:

http://java.sun.com/j2se/1.5.0/docs/api/java/util/TreeMap.html

http://java.sun.com/javase/6/docs/api/java/util/TreeSet.html

Sujit Pal said...

@Mark: yes, but in my understanding, the Tree data structures are hidden from the caller - both the TreeSet and TreeMap use a tree as an internal data structure to allow sortable keys.
@Cuong: its been a while since I wrote this, but I will take a look at the code and your use-case and comment later.

rahul popat said...

thank u so very much.. awesome.

Sujit Pal said...

Thanks Rahul, and you are welcome.

Mitch said...

How about performance for large tree structures? I always had a hard time when I needed to traverse thousands of tree node.
It results in a lot of lazy calls to the database.

Sujit Pal said...

Hi Mitch, not sure, since I haven't used this code for large trees. But in case of large trees (or graphs), you can try using either serialized versions of the data structure, or a graph database such as neo4j, which is still lazy, but optimized to this sort of usage.

Lopa said...

Very useful information with excellent documentation! Great job.

Yash said...

Hey Sujit,
First of all, thanks for this excellent work.

I know it's been while since you wrote it :) but I have tiny doubt.
Shouldn't overridden method "toString" of Node be like this one?

---------------------------------------------------
public String toString() {
StringBuilder sb = new StringBuilder();
sb.append("{").append(getData());
sb.append(getChildren());
sb.append("}");
return sb.toString();
}
----------------------------------------------------

Here as getChildren() is a list of Node, it already knows how to
represent it as String.

Also, we don't have to call toString on objects appending to StringBuilder as
it appends the string representation of the Object argument. This can also save you from ugly null pointer exception.

--Thanks--

Sujit Pal said...

@Lopa: thanks for the kind words.
@Yash: thanks, and yes, you are right. Like you said, its been a while, perhaps today I would do things a bit differently :-).

Anonymous said...

Hi Sujit,

You should post an improved version hehe, now that you have more experience (I suppose).

I liked your implementation, I'm wondering how you would proceed nowadays. I have exactly the same requirement: a relational db with a recursive table, and be able to represent that in Java as a tree, and filling objects/creating/updating/etc through Hibernate...

I'm tempted to do a straight copy-paste haha.

Thanks for the code.
John

Sujit Pal said...

Hi John, thanks for the comment. I don't have a real answer to your question...there are concerns about this implementation pointed out by various commenters, some of them come with solutions. I would advise using these solutions to make it more robust.

This probably doesn't qualify as an implementation, but more recently (about 2-3 years ago I think), I have used one of the ideas in the comments to represent a tree as a flat name-value set - the keys are dotted tuples, much like package names in Java, and use SortedSet.headSet() and tailSet() to navigate to the correct location.

Anonymous said...

Hi Sujit!

I know this article has been discuss for years but I was wondering if you could post a sample of a nested set model implementation of this problem. If you could also place some detailed comments to easily understand it, that will be deeply appreciated.

Looking forward to your immediate response.

Thanks!

Sujit Pal said...

Hi Anonymous, its been a while since I wrote this code so putting in comments (apart from whats already in there) is out, unfortunately. I completely forgot about the nested set model though, thanks for the reminder. Not sure when I will be able to do it though, depends on if I need a tree data structure in the near future.

shanthiya said...

hi.. really u code is very help full for me... i'm a college student dong package in m-way tree in java..
can i know how to delete a node and search a particular node from the tree.. im also having a doubt that what modification would i able to do to add one more field to a paricular node.

with regards
giri

Sujit Pal said...

Thanks Shanthiya. There is no good way to delete or search for a node in the code as written, but you can imitate Neo4j's strategy (kind of), and maintain a HashMap of the object with the node's location. So if an object matches one in the HashMap, then it can jump to the element and mark it deleted (or return the current position in case of search). Adding should be fairly simple, you just add to the ArrayList.

sandy said...

Thanks sujit... for ur suggestion.. sorry to disturb u again... im not able to find Neo4j's strategy here.. would u help me by providing it... and also i want to know that did u have any code like this to implement b tree in java

Sujit Pal said...

Its not in the code, its in my comments above - maintain a hashmap with a reference to the node, and jump to the reference and mark it deleted. The code in the post models an m-way tree, for a binary tree you probably don't need to maintain a hashmap (although that would certainly be quicker), you can leverage the fact that the binary tree must be sorted to navigate down to the node you want to delete or add.

sandy said...

thanks sujit.... what i asked is for Btree that is balanced m-way tree not a binary tree

jkhines said...

Thanks, Sujit. I am brushing up on my Java and this was an excellent example for me to go through. Thanks for posting it!

Sujit Pal said...

You are welcome, glad it helped.

Rocky said...

Perhaps we could use Directory Service (its a tree...)... or something like JDOM..

But, anyway... i want to build a ver huge tree, and be able to persist it quickly.. wat would be the best approach.

Sujit Pal said...

A directory service would be a good option for what you are looking to do, I think. You can also build your own directory service using a hashmap (in-memory, cache, etc) where the keys are dotted tuples. I would probably not choose JDOM if I were you, only because you probably don't want to pull everything into memory.

Deepak Lalchandani said...

Hi Salmon,
Could you please post your entire re-fined source code with table structure,POJOS,HBM and XML files in a ZIP file so that it is easier to download and work on it.

Please email me the above Zip file to deepakl_2000@yahoo.com once done

Sujit Pal said...

Hi Deepak, no I can't, sorry. Don't have it, all that there is in the post.

sameera said...

Hi Sujit,

Thanks a lot for the detailed example. Today I came across with a tree requirement for my java project and with this help got it working with no time

Instead hibernate, I used OpenJPA

thanks a lot again.

cheers

sameera

Sujit Pal said...

Thanks Sameera, glad it helped.

ahmetolgun said...

very nice and clear implementation. Thanks a lot. I plan to use this, to show a menu in a web page.

Sujit Pal said...

Thanks ahmetolgun, glad it helped you.

Anonymous said...

hello!
i'm doing an academic project where i need to construct an nary tree.yr code was really helpful.thanks for sharing it.but,as i'm a newbie 2 hibernate..i just wanted to know whether any more code has to be written to get the dao working(other than the main of course)..like should we create a session object?..i'm getting a nullpointer exception when i run my code..so i was wondering whether we have to do anything else which might ve been too obvious that u omitted mentioning it!!

Sujit Pal said...

Hi, its been a very long time since I wrote this, and I don't use Hibernate anymore, we use Spring-JDBC where I work. I am almost certain that you do need a Hibernate Session though...and the contents of the NPE should tell you more on what the code is expecting.

Mark said...

Thanks Sujit. This was very helpful. Very clear and exactly what I needed.

Sujit Pal said...

Thanks Mark, glad it helped.

gem said...

Thanks Sujit, it was very helpful for me.

gem

Sujit Pal said...

Thanks gem, glad it helped.

Anonymous said...

It's funny how a small chunk of code can help so many people :D

Thanks for the post.
C├║ddles

Sujit Pal said...

Thanks, glad it helped, and you're very welcome :-).

John said...

Nice post and I am in total agreement about Java needing a tree structure in it's standard library.

One more comment, I assume the member variables of the Node class should be private and to access you should use the getters and setter, in your class the member variables are public, is this a mistake?

Sujit Pal said...

Thanks John, and yes, the public settings seem to be an oversight on my part, since I have a public getter, it seems redundant to have the member variable itself be public.

Vishva Deepak said...

Hi Sujit ,

I need a help from you. I have list of agent and every agent have hierarchy level so i want to display these agent in tree form .

Sujit Pal said...

Hi Vishwa, I am guessing that you want to load your level annotated list into a tree structure, right? I would basically run through the list sequentially, checking previous level and if it changes, then either climb up or down the tree structure you are populating.

Priyanka said...

sir, i want to ask u one thing that after adding the node, if we add a dependency then how to check that the tree remains acyclic....

Sujit Pal said...

Hi Priyanka, I don't think there is a way to do it with the data structure itself. One way to check would be to maintain a Set of nodes already in the tree and not allow a node to be added if it already exists in the Set. Since you are building the tree top-down, the set would contain all the ancestors (and siblings) that have already been added.

mmrush said...

Hi,
I need to create a family tree. How can I adapt your code to use dynamic nodes?

For example,

for each member, I need to create an object Node sister = new Node();

How can I make a class to add members, for example tree.addnode("father");

Thanks a lot.

Sujit Pal said...

Hi mmrush, perhaps something like this:

Tree t = new Tree<Node>();
Node father = new Node("father");
t.setRootElement(father);
Node sister = new Node("sister");
father.addChild(sister);

Looking back at this code though, I think I should have added a getNodeById() method which uses an internal hashmap to store the node by id and Node should have a mandatory parameter "id". Then it becomes much simpler to locate a node in a tree rather than have to keep track of it yourself.

David said...

Above all, thank you for your work.
I have read a lot of webs and it seems there are no implementation for a BINARY (no ordered) balanced tree with Hibernate.

How would you change your code to do so?

Thanks in advance

Sujit Pal said...

Hi David, you are welcome, and I think one can build a binary tree out of this data structure quite easily (the rule being that each node can have either 0 or exact 2 children). Perhaps subclass the Node class and block addChild() (maybe throw UnsupportedOperationException), and instead add a addLeftChild() and addRghtChild() methods which append to positions 0 and 1 of your list. During saving you could validate that each node contains exactly 0 or 2 children.

Drago said...

Hi Sujit,

first of all, thank you for this article and code, they are very helpful!

I was curious if this code can be used in commercial projects, and if there are any restrictions/requirements for such a use?

Thanks!

Sujit Pal said...

Thanks Drago, and no, there are no restrictions on the use of this code. I came up with it myself, so there are no derivative licenses to worry about, and I don't have any restrictions to the use of my code if you find it useful. It has served me well at one point, so if it serves your purposes, by all means use it.

dan said...

im new to programming can u add these method to node class, and how could i implement them
public Node[] getSiblings()
{
Returns the sibling nodes of the current node
}

public Node[] children()
{
Returns the children of the current node as an
array of Node types
}

Sujit Pal said...

Hi Dan,

The code here is based on something I did a long time ago. But the two can be implemented quite easily with a bit of additional information. I recently used the ideas in this post to build a slightly different tree, like so:

class Tree<K> {
..TreeNode<K> root;
..List<TreeNode<K>> children;
}

class TreeNode<K> {
..K current;
..TreeNode<K> parent;
..List<TreeNode<K>> children;
}

So as you add each child to a TreeNode, you maintain a reference to the TreeNode in child.parent.

Then getSiblings(node) is just node.parent.children, and children() is just node.children.toArray().

Khuzaime Hadzri said...

Hello..im a final year student. my project is 2-way test suite generator for random value.this project use java eclipse.

Sujit Pal said...

Hi Khuzaime, thats good to know, but not sure how this relates to the blog post?

ajmath shaik said...

Excellent info ............
what is data structure and its types

Sujit Pal said...

Hi Ajmath, thanks for the link, nice summary of common data structures.