The Badgerfish convention defines a standard way to convert an XML document to a JSON object. The project's website lists tools written in PHP and Ruby, and even a web service, but I needed one for Java. Since the conversion rules are nicely enumerated on the site, it did not seem terribly difficult to write one myself, so I did. The converter is modeled after the JDOM XMLOutputter, and can output either a compact format (for computer programs) or a pretty format (for humans). Unlike the JDOM XMLOutputter, however, it only provides methods that work with a JDOM Document object. An additional convenience method, outputString(String), accepts an XML string and converts it to a JDOM Document internally. The code is shown below:
package com.mycompany.myapp.converters;
import java.io.IOException;
import java.io.OutputStream;
import java.io.StringReader;
import java.io.Writer;
import java.util.List;
import net.sf.json.JSONObject;
import org.apache.commons.lang.StringUtils;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.jdom.Attribute;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.Namespace;
import org.jdom.input.SAXBuilder;
import org.jdom.output.Format;
/**
* Provides methods to convert an XML string into an equivalent JSON string
* using the Badgerfish convention described in http://badgerfish.ning.com.
*
* Conversion Rules copied from the website are enumerated below:
*
* 1. Element names become object properties
* 2. Text content of elements goes in the $ property of an object.
* <alice>bob</alice>
* becomes
* { "alice": { "$" : "bob" } }
* 3. Nested elements become nested properties
* <alice><bob>charlie</bob><david>edgar</david></alice>
* becomes
* { "alice": { "bob" : { "$": "charlie" }, "david": { "$": "edgar"} } }
* 4. Multiple elements at the same level become array elements.
* <alice><bob>charlie</bob><bob>david</bob></alice>
* becomes
* { "alice": { "bob" : [{"$": "charlie" }, {"$": "david" }] } }
* 5. Attributes go in properties whose names begin with @.
* <alice charlie="david">bob</alice>
* becomes
* { "alice": { "$" : "bob", "@charlie" : "david" } }
* 6. Active namespaces for an element go in the element's @xmlns property.
* 7. The default namespace URI goes in @xmlns.$.
* <alice xmlns="http://some-namespace">bob</alice>
* becomes
* { "alice": { "$" : "bob", "@xmlns": { "$" : "http:\/\/some-namespace"} } }
* 8. Other namespaces go in other properties of @xmlns.
* <alice xmlns="http://some-namespace" xmlns:charlie="http://some-other-namespace">bob</alice>
* becomes
* { "alice": { "$" : "bob", "@xmlns": { "$" : "http:\/\/some-namespace", "charlie" : "http:\/\/some-other-namespace" } } }
* 9. Elements with namespace prefixes become object properties, too.
* <alice xmlns="http://some-namespace" xmlns:charlie="http://some-other-namespace"> <bob>david</bob> <charlie:edgar>frank</charlie:edgar> </alice>
* becomes
* { "alice" : { "bob" : { "$" : "david" , "@xmlns" : {"charlie" : "http:\/\/some-other-namespace" , "$" : "http:\/\/some-namespace"} } , "charlie:edgar" : { "$" : "frank" , "@xmlns" : {"charlie":"http:\/\/some-other-namespace", "$" : "http:\/\/some-namespace"} }, "@xmlns" : { "charlie" : "http:\/\/some-other-namespace", "$" : "http:\/\/some-namespace"} } }
*
* @author Sujit Pal
*/
public class JsonOutputter {
private final Log log = LogFactory.getLog(getClass());
private int indent = 0;
/**
* Set the format for the outputter. Default is compact format.
* @param format the format to set.
*/
public void setFormat(Format format) {
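// Fall back to compact output (indent 0) when the format has no indent
// string, so switching from a pretty format back to compact takes effect.
String indentString = format.getIndent();
indent = (indentString != null) ? indentString.length() : 0;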
}
/**
* Converts a JDOM Document into a JSON string and writes the result into
* the specified OutputStream.
* @param document the JDOM Document.
* @param ostream the OutputStream.
* @throws IOException if one is thrown.
*/
public void output(Document document, OutputStream ostream) throws IOException {
// Encode explicitly as UTF-8 rather than relying on the platform default charset.
ostream.write(outputString(document).getBytes("UTF-8"));
}
/**
* Converts the JDOM Document into a JSON string and writes the result into
* the specified Writer.
* @param document the JDOM Document.
* @param writer the Writer.
* @throws IOException if one is thrown.
*/
public void output(Document document, Writer writer) throws IOException {
writer.write(outputString(document));
}
/**
* Convenience method that accepts an XML string and returns a String
* representing the converted JSON Object.
* @param xml the input XML string.
* @return the String representation of the converted JSON object.
* @throws IOException if one is thrown.
* @throws JDOMException if one is thrown.
*/
public String outputString(String xml) throws IOException, JDOMException {
SAXBuilder builder = new SAXBuilder();
Document doc = builder.build(new StringReader(xml));
return outputString(doc);
}
/**
* Converts the JDOM Document into a JSON String and returns it.
* @param document the JDOM Document.
* @return the JSON String representing the JDOM Document.
*/
public String outputString(Document document) {
Element rootElement = document.getRootElement();
JSONObject jsonObject = new JSONObject();
JSONObject namespaceJsonObject = getNamespaceJsonObject(rootElement);
processElement(rootElement, jsonObject, namespaceJsonObject);
processChildren(rootElement, jsonObject, namespaceJsonObject);
if (indent == 0) {
return StringUtils.replace(jsonObject.toString(), "/", "\\/");
} else {
return StringUtils.replace(jsonObject.toString(indent), "/", "\\/");
}
}
/**
* Process the children of the specified JDOM element. This method is recursive.
* The children for the given element are found, and the method is called for
* each child.
* @param element the element whose children need to be processed.
* @param jsonObject the reference to the JSON Object to update.
* @param namespaceJsonObject the reference to the root Namespace JSON object.
*/
private void processChildren(Element element, JSONObject jsonObject, JSONObject namespaceJsonObject) {
List<Element> children = element.getChildren();
JSONObject properties;
if (jsonObject.has(getQName(element))) {
properties = jsonObject.getJSONObject(getQName(element));
} else {
properties = new JSONObject();
}
for (Element child : children) {
// Rule 1: Element names become object properties
// Rule 9: Elements with namespace prefixes become object properties, too.
JSONObject childJsonObject = new JSONObject();
processElement(child, childJsonObject, namespaceJsonObject);
processChildren(child, childJsonObject, namespaceJsonObject);
if (! childJsonObject.isEmpty()) {
properties.accumulate(getQName(child), childJsonObject.getJSONObject(getQName(child)));
}
}
if (! properties.isEmpty()) {
jsonObject.put(getQName(element), properties);
}
}
/**
* Process the text content and attributes of a JDOM element into a JSON object.
* @param element the element to parse.
* @param jsonObject the JSONObject to update with the element's properties.
* @param namespaceJsonObject the reference to the root Namespace JSON object.
*/
private void processElement(Element element, JSONObject jsonObject, JSONObject namespaceJsonObject) {
JSONObject properties = new JSONObject();
// Rule 2: Text content of elements goes in the $ property of an object.
if (StringUtils.isNotBlank(element.getTextTrim())) {
properties.accumulate("$", element.getTextTrim());
}
// Rule 5: Attributes go in properties whose names begin with @.
List<Attribute> attributes = element.getAttributes();
for (Attribute attribute : attributes) {
properties.accumulate("@" + attribute.getName(), attribute.getValue());
}
if (! namespaceJsonObject.isEmpty()) {
properties.accumulate("@xmlns", namespaceJsonObject);
}
if (! properties.isEmpty()) {
jsonObject.accumulate(getQName(element), properties);
}
}
/**
* Return a JSON Object containing the default and additional namespace
* properties of the Element.
* @param element the element whose namespace properties are to be extracted.
* @return the JSON Object with the namespace properties.
*/
private JSONObject getNamespaceJsonObject(Element element) {
// Rule 6: Active namespaces for an element go in the element's @xmlns property.
// Rule 7: The default namespace URI goes in @xmlns.$.
JSONObject namespaceProps = new JSONObject();
Namespace defaultNamespace = element.getNamespace();
if (StringUtils.isNotBlank(defaultNamespace.getURI())) {
namespaceProps.accumulate("$", defaultNamespace.getURI());
}
// Rule 8: Other namespaces go in other properties of @xmlns.
List<Namespace> additionalNamespaces = element.getAdditionalNamespaces();
for (Namespace additionalNamespace : additionalNamespaces) {
if (StringUtils.isNotBlank(additionalNamespace.getURI())) {
namespaceProps.accumulate(additionalNamespace.getPrefix(), additionalNamespace.getURI());
}
}
return namespaceProps;
}
/**
* Return the qualified name (prefix:elementname) of the element.
* @param element the element whose qualified name is required.
* @return the element name qualified with its namespace.
*/
private String getQName(Element element) {
if (StringUtils.isNotBlank(element.getNamespacePrefix())) {
return element.getNamespacePrefix() + ":" + element.getName();
} else {
return element.getName();
}
}
}
The only dependencies for this code are commons-lang, commons-logging, JDOM and json-lib. I guess I could have just used the methods built into String, but I have gotten too used to StringUtils doing null-safe operations for me. JDOM happens to be my favorite XML parsing and generation toolkit by far, even though there are many toolkits that are more popular because they are faster. I also prefer json-lib to the more popular org.json module for JSON work because of the way json-lib is architected.
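For readers who would rather experiment with the rules before pulling in those libraries, here is a minimal sketch that covers rules 1 through 5 using only the DOM parser bundled with the JDK. It is not the JsonOutputter above: the class name BadgerfishSketch and its methods are made up for illustration, namespace declarations (rules 6 to 9) are simply skipped, forward slashes are not escaped, and the JSON is assembled by hand rather than through json-lib. Unlike JsonOutputter, an element with no text, attributes or children comes out as {} instead of being dropped.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical JDK-only sketch of Badgerfish rules 1-5; namespaces are ignored. */
class BadgerfishSketch {

  /** Parses an XML string and returns compact Badgerfish-style JSON. */
  static String convert(String xml) {
    try {
      Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml))).getDocumentElement();
      // Rule 1: the element name becomes the property name.
      return "{" + quote(root.getNodeName()) + ":" + toJson(root) + "}";
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse XML", e);
    }
  }

  private static String toJson(Element element) {
    // Property name -> already-serialized JSON values, in insertion order.
    Map<String, List<String>> props = new LinkedHashMap<>();

    // Rule 2: trimmed text content goes in the "$" property.
    StringBuilder text = new StringBuilder();
    for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
      if (n.getNodeType() == Node.TEXT_NODE || n.getNodeType() == Node.CDATA_SECTION_NODE) {
        text.append(n.getNodeValue());
      }
    }
    String trimmed = text.toString().trim();
    if (!trimmed.isEmpty()) {
      add(props, "$", quote(trimmed));
    }

    // Rule 5: attributes become "@name" properties. Namespace declarations
    // are skipped in this sketch, and DOM does not guarantee attribute order.
    NamedNodeMap attributes = element.getAttributes();
    for (int i = 0; i < attributes.getLength(); i++) {
      String name = attributes.item(i).getNodeName();
      if (!name.equals("xmlns") && !name.startsWith("xmlns:")) {
        add(props, "@" + name, quote(attributes.item(i).getNodeValue()));
      }
    }

    // Rule 3: child elements become nested properties (recursive).
    for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
      if (n.getNodeType() == Node.ELEMENT_NODE) {
        add(props, n.getNodeName(), toJson((Element) n));
      }
    }

    // Rule 4: a name seen more than once is emitted as an array.
    StringBuilder json = new StringBuilder("{");
    for (Map.Entry<String, List<String>> entry : props.entrySet()) {
      if (json.length() > 1) {
        json.append(',');
      }
      List<String> values = entry.getValue();
      json.append(quote(entry.getKey())).append(':')
          .append(values.size() == 1 ? values.get(0) : "[" + String.join(",", values) + "]");
    }
    return json.append('}').toString();
  }

  private static void add(Map<String, List<String>> props, String key, String value) {
    props.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
  }

  /** Minimal JSON quoting: backslash, double quote and common control characters. */
  private static String quote(String s) {
    return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"")
        .replace("\n", "\\n").replace("\r", "\\r").replace("\t", "\\t") + "\"";
  }

  public static void main(String[] args) {
    System.out.println(convert("<alice><bob>charlie</bob><bob>david</bob></alice>"));
  }
}
```

Calling BadgerfishSketch.convert("<alice><bob>charlie</bob><bob>david</bob></alice>") returns {"alice":{"bob":[{"$":"charlie"},{"$":"david"}]}}, which matches the expected output for rule 4.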
Most of the rules have expected outputs for a given input, so testing the converter was simply a matter of writing a JUnit test case and making sure the inputs returned the expected outputs. Here is the JUnit test I wrote to test the converter.
package com.mycompany.myapp.converters;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.jdom.output.Format;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
/**
* Test for XML to JSON conversion tool.
* @author Sujit Pal
*/
public class JsonOutputterTest {
private final Log log = LogFactory.getLog(getClass());
private JsonOutputter jsonOutputter;
@Before
public void setUp() throws Exception {
jsonOutputter = new JsonOutputter();
// this call is redundant, really
jsonOutputter.setFormat(Format.getCompactFormat());
}
/**
* Rule 1: Element names become object properties
* <foo><bar><baz>baztext</baz></bar></foo>
* becomes:
* {"foo":{"bar":{"baz":{"$":"baztext"}}}}
*/
@Test
public void testBadgerfishRule1() throws Exception {
String xml = "<foo><bar><baz>baztext</baz></bar></foo>";
String json = jsonOutputter.outputString(xml);
log.debug("Rule 1:" + json);
Assert.assertEquals("{\"foo\":{\"bar\":{\"baz\":{\"$\":\"baztext\"}}}}", json);
}
/**
* Rule 2: Text content of elements goes in the $ property of an object.
* <alice>bob</alice>
* becomes
* {"alice":{"$":"bob"}}
*/
@Test
public void testBadgerfishRule2() throws Exception {
String xml = "<alice>bob</alice>";
String json = jsonOutputter.outputString(xml);
log.debug("Rule 2:" + json);
Assert.assertEquals("{\"alice\":{\"$\":\"bob\"}}", json);
}
/**
* Rule 3: Nested elements become nested properties
* <alice><bob>charlie</bob><david>edgar</david></alice>
* becomes
* {"alice":{"bob":{"$":"charlie"},"david":{"$":"edgar"}}}
*/
@Test
public void testBadgerfishRule3() throws Exception {
String xml = "<alice><bob>charlie</bob><david>edgar</david></alice>";
String json = jsonOutputter.outputString(xml);
log.debug("Rule 3:" + json);
Assert.assertEquals("{\"alice\":{\"bob\":{\"$\":\"charlie\"},\"david\":{\"$\":\"edgar\"}}}", json);
}
/**
* Rule 4: Multiple elements at the same level become array elements.
* <alice><bob>charlie</bob><bob>david</bob></alice>
* becomes
* {"alice":{"bob":[{"$":"charlie"},{"$":"david"}]}}
*/
@Test
public void testBadgerfishRule4() throws Exception {
String xml = "<alice><bob>charlie</bob><bob>david</bob></alice>";
String json = jsonOutputter.outputString(xml);
log.debug("Rule 4:" + json);
Assert.assertEquals("{\"alice\":{\"bob\":[{\"$\":\"charlie\"},{\"$\":\"david\"}]}}", json);
}
/**
* Rule 5: Attributes go in properties whose names begin with @.
* <alice charlie="david">bob</alice>
* becomes
* {"alice":{"$":"bob","@charlie":"david"}}
*/
@Test
public void testBadgerfishRule5() throws Exception {
String xml = "<alice charlie=\"david\">bob</alice>";
String json = jsonOutputter.outputString(xml);
log.debug("Rule 5:" + json);
Assert.assertEquals("{\"alice\":{\"$\":\"bob\",\"@charlie\":\"david\"}}", json);
}
/**
* Rule 6: Active namespaces for an element go in the element's @xmlns property.
* Rule 7: The default namespace URI goes in @xmlns.$.
* <alice xmlns="http://some-namespace">bob</alice>
* becomes
* {"alice":{"$":"bob","@xmlns":{"$":"http:\/\/some-namespace"}}}
*/
@Test
public void testBadgerfishRule6And7() throws Exception {
String xml = "<alice xmlns=\"http://some-namespace\">bob</alice>";
String json = jsonOutputter.outputString(xml);
log.debug("Rule 6+7:" + json);
Assert.assertEquals("{\"alice\":{\"$\":\"bob\",\"@xmlns\":{\"$\":\"http:\\/\\/some-namespace\"}}}", json);
}
/**
* Rule 8: Other namespaces go in other properties of @xmlns.
* <alice xmlns="http://some-namespace" xmlns:charlie="http://some-other-namespace">bob</alice>
* becomes
* {"alice":{"$":"bob","@xmlns":{"$":"http:\/\/some-namespace","charlie":"http:\/\/some-other-namespace"}}}
*/
@Test
public void testBadgerfishRule8() throws Exception {
String xml = "<alice xmlns=\"http://some-namespace\" xmlns:charlie=\"http://some-other-namespace\">bob</alice>";
String json = jsonOutputter.outputString(xml);
log.debug("Rule 8:" + json);
Assert.assertEquals("{\"alice\":{\"$\":\"bob\",\"@xmlns\":{\"$\":\"http:\\/\\/some-namespace\",\"charlie\":\"http:\\/\\/some-other-namespace\"}}}", json);
}
/**
* Rule 9: Elements with namespace prefixes become object properties, too.
* <alice xmlns="http://some-namespace" xmlns:charlie="http://some-other-namespace"> <bob>david</bob> <charlie:edgar>frank</charlie:edgar> </alice>
* becomes
* {"alice":{"bob":{"$":"david","@xmlns":{"$":"http:\/\/some-namespace","charlie":"http:\/\/some-other-namespace"}},"charlie:edgar":{"$":"frank","@xmlns":{"$":"http:\/\/some-namespace","charlie":"http:\/\/some-other-namespace"}},"@xmlns":{"$":"http:\/\/some-namespace","charlie":"http:\/\/some-other-namespace"}}}
*/
@Test
public void testBadgerfishRule9() throws Exception {
String xml = "<alice xmlns=\"http://some-namespace\" xmlns:charlie=\"http://some-other-namespace\"> <bob>david</bob> <charlie:edgar>frank</charlie:edgar> </alice>";
String json = jsonOutputter.outputString(xml);
log.debug("Rule 9:" + json);
Assert.assertEquals("{\"alice\":{\"bob\":{\"$\":\"david\",\"@xmlns\":{\"$\":\"http:\\/\\/some-namespace\",\"charlie\":\"http:\\/\\/some-other-namespace\"}},\"charlie:edgar\":{\"$\":\"frank\",\"@xmlns\":{\"$\":\"http:\\/\\/some-namespace\",\"charlie\":\"http:\\/\\/some-other-namespace\"}},\"@xmlns\":{\"$\":\"http:\\/\\/some-namespace\",\"charlie\":\"http:\\/\\/some-other-namespace\"}}}", json);
}
}
I did not know much about the Badgerfish convention until quite recently. This move towards converting XML into a standard JSON format seems really cool, and I wonder if it is widely used. Frankly, given that so many Java applications use XML and JSON, I was hoping to just snag the code from the net rather than have to write it myself. If you use Java and generate JSON using the Badgerfish convention, I would love to know of alternative approaches you may be using for the conversion.
I use the following style sheet with XSLT to translate XML to JSON in Java. Works pretty well:
http://www.bramstein.nl/xsltjson/
Thanks very much, Erik, I'll check out the link.
Could you please clarify the licensing terms for this code? I am about to write similar code, and I would love to avoid rewriting it since this code already exists.
Hi Krokodil, you are welcome to use the code if it is of use to you. There are no licensing terms attached, except maybe attribution if you are using this as part of some open source project.
Thank you very much and congratulations for this class (and class test): this is exactly what I was looking for for my project.
You are welcome.
Great work Sujit. This is what I'm looking for. Do you have similar code for JSON to XML conversion?
Again thanks for this.
Regards,
Siva
Thanks for the kind words Siva. I haven't built one for JSON to XML (I don't need one), but it should be simple to do using Jackson to convert the JSON to a Java object and then XStream from Java object to XML.
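As a rough illustration of the reverse direction (this is not the Jackson and XStream route mentioned above, just a hand-rolled sketch), the following assumes the Badgerfish JSON has already been parsed into nested Maps and Lists by whatever JSON library you use. The class name BadgerfishToXmlSketch and its toXml method are hypothetical, and @xmlns properties are ignored, so namespaced documents would need extra work.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: Badgerfish-style nested Maps back to an XML string. */
class BadgerfishToXmlSketch {

  /**
   * Renders one element. The map holds "$" (text), "@name" (attributes) and
   * child entries whose values are either a Map (one child) or a List of Maps.
   */
  @SuppressWarnings("unchecked")
  static String toXml(String name, Map<String, Object> props) {
    StringBuilder attrs = new StringBuilder();
    StringBuilder body = new StringBuilder();
    for (Map.Entry<String, Object> entry : props.entrySet()) {
      String key = entry.getKey();
      Object value = entry.getValue();
      if (key.equals("$")) {
        // Text content of the element.
        body.append(escape(String.valueOf(value)));
      } else if (key.equals("@xmlns")) {
        // Namespace handling is deliberately left out of this sketch.
        continue;
      } else if (key.startsWith("@")) {
        // Attribute: strip the leading "@".
        attrs.append(' ').append(key.substring(1)).append("=\"")
            .append(escape(String.valueOf(value))).append('"');
      } else if (value instanceof List) {
        // Array: one child element per entry, all with the same name.
        for (Object item : (List<Object>) value) {
          body.append(toXml(key, (Map<String, Object>) item));
        }
      } else {
        // Single nested child element.
        body.append(toXml(key, (Map<String, Object>) value));
      }
    }
    return "<" + name + attrs + ">" + body + "</" + name + ">";
  }

  /** Escapes the characters that are special in XML text and attribute values. */
  private static String escape(String s) {
    return s.replace("&", "&amp;").replace("<", "&lt;")
        .replace(">", "&gt;").replace("\"", "&quot;");
  }

  public static void main(String[] args) {
    Map<String, Object> alice = new LinkedHashMap<>();
    alice.put("$", "bob");
    alice.put("@charlie", "david");
    System.out.println(toXml("alice", alice));
  }
}
```

With the map built in main, toXml("alice", alice) returns <alice charlie="david">bob</alice>, the input of rule 5.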
This is working perfectly fine.
Used to convert normal XML to Badgerfish-style JSON:
XML tags are denoted with $
XML attributes are denoted with @
Very useful for converting XML to Badgerfish JSON.
Thank you Arshad for the confirmation!