from September 2014
Article:

JSON or XML

What's Best for Your API?

Now that Drupal 8 has built-in support for Web Services, you’re likely thinking about exposing the content in your site with an API. But should you make the data available in JSON, XML, or both?

 Lady Justice

A Short History of XML and JSON

XML and JSON are the primary formats used for data exchange on the web. XML was born when some individuals involved in the Standard Generalized Markup Language (SGML) effort became early adopters of the Web. SGML is a way of defining languages for marking up documents, like HTML; XML borrowed many of the core principles and simplified the rest. The initial draft of XML was completed by a subcommittee of the W3C’s SGML Activity in 1996. Even in the early drafting process, it had support from many large technology companies.

In contrast, JSON (JavaScript Object Notation) is known for having been more discovered than invented: Douglas Crockford saw that language constructs already existing in JavaScript could be used to represent objects as strings. He coined the term JSON for this usage in 2001. It didn’t go through the standardization process, in part because it is a proper subset of the JavaScript standard. When Crockford was told by clients that they couldn’t use JSON because it wasn’t a standard, he bought json.org and put up a Web page declaring it a standard. JSON slowly gained popularity as people discovered the page. Since then, it has become an official standard, and support for encoding to and decoding from JSON has been added to many languages.

The choice between these two has been a topic of debate for nearly a decade.

Why is JSON Better?

JSON is lightweight. It often takes fewer characters to transmit the same information. For example, compare the following data in XML with the same data in JSON.

XML:
<root>
<foo>text goes here</foo>
<bar>and here</bar>
</root>

JSON:

{
  "foo": "text goes here",
  "bar": "and here"
}
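
The size difference can be checked directly. The following is a minimal sketch in Python, assuming compact serialization (no indentation or extra whitespace) for both formats; the `record` variable is only an illustration built from the sample data above.

```python
import json
import xml.etree.ElementTree as ET

# The sample data from the comparison above.
record = {"foo": "text goes here", "bar": "and here"}

# Compact JSON: no whitespace between tokens.
json_text = json.dumps(record, separators=(",", ":"))

# Equivalent XML: a <root> wrapper with one child element per key.
root = ET.Element("root")
for key, value in record.items():
    ET.SubElement(root, key).text = value
xml_text = ET.tostring(root, encoding="unicode")

print(len(json_text), "characters of JSON")
print(len(xml_text), "characters of XML")
```

The gap comes mostly from XML's closing tags, which repeat every element name. The exact savings depend on the data, so treat this as an illustration rather than a benchmark.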

JSON is also easier for people to read, which is handy when you want to see what data an unfamiliar service is giving you, or debug a tricky property path. JSON also maps better to objects in many object-oriented languages. When you run json_decode on a JSON string in PHP, you get back a familiar-looking object that is easy to traverse. In contrast, working with XML often requires querying the data structure using XPath or XQuery.
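
To make the contrast concrete, here is a rough sketch in Python rather than PHP; `json.loads` plays roughly the role that json_decode plays in PHP. The sample strings reuse the foo/bar data from the earlier comparison, and the variable names are illustrative only.

```python
import json
import xml.etree.ElementTree as ET

json_text = '{"foo": "text goes here", "bar": "and here"}'
xml_text = "<root><foo>text goes here</foo><bar>and here</bar></root>"

# JSON decodes straight into native dictionaries, lists, and strings.
data = json.loads(json_text)
foo_from_json = data["foo"]

# XML parses into an element tree that is queried with a path expression.
# Note: ElementTree supports only a limited subset of XPath.
root = ET.fromstring(xml_text)
foo_from_xml = root.findtext("./foo")

print(foo_from_json)
print(foo_from_xml)
```

Both lookups return the same text. The difference is that the JSON result is an ordinary native structure, while the XML result has to be reached through a query against the tree.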

Why is XML Better?

The X in XML stands for extensible. It is possible to write different sub-languages of XML, which make sense within the context of a particular use case. For example, the KML standard makes it possible to describe geographic locations:

<Point>
        <coordinates>-74.006393,40.714172,0</coordinates>
</Point>

The XML ecosystem provides tools for defining such domain-specific sets of elements, called schemas, and the use of them is pretty common. There is now a standard for the same in JSON called JSON Schema, but it still isn’t core to most JSON usage.

The benefit of XML Schema is arguable, and there have been heated debates around the JSON Schema effort, e.g. the “JSON Schema considered harmful” thread on the IETF's apps-discuss list.

The structure of XML documents is more regularized. If a field switches from a single value to a multi-value field, it won’t break your code in the same way as it likely will in JSON. Just as with the previous point, it is possible to structure your data in the same resilient way in JSON, but such structuring would often be seen as over-engineering in a JSON format.
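
A small sketch in Python may help illustrate the point. The `tag` field and the `article` wrapper are hypothetical, chosen only for this example. The XML reader returns a list whether one or several values are present, while the JSON reader has to normalize the value itself, because a typical JSON API would switch from a string to an array.

```python
import json
import xml.etree.ElementTree as ET


def tags_from_xml(xml_text):
    # findall returns a list whether there are zero, one, or many <tag> elements.
    return [element.text for element in ET.fromstring(xml_text).findall("tag")]


def tags_from_json(json_text):
    value = json.loads(json_text)["tag"]
    # A single value typically arrives as a bare string and several values as
    # an array, so wrap the string case to always hand back a list.
    return value if isinstance(value, list) else [value]


single_xml = "<article><tag>drupal</tag></article>"
multi_xml = "<article><tag>drupal</tag><tag>api</tag></article>"
single_json = '{"tag": "drupal"}'
multi_json = '{"tag": ["drupal", "api"]}'

print(tags_from_xml(single_xml), tags_from_xml(multi_xml))
print(tags_from_json(single_json), tags_from_json(multi_json))
```

Client code that skips the normalization step and treats `tag` as a string would fail once the field becomes multi-valued. Always emitting an array in the JSON, even for a single value, avoids this at the cost of a slightly heavier format.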

Why You Probably Want to Pick JSON

Ultimately, the most important consideration is what the consumers of your service expect and can handle. And this is likely to be JSON.

More and more, client tools are defaulting to JSON, even to the exclusion of other formats. For example, the very popular AngularJS, used to create single-page applications, only supports exchanging data in JSON. It currently doesn't even support the application/x-www-form-urlencoded format traditionally used to POST form values (though support can be added through custom code or modules on top of core).

As client tools come to expect JSON, more and more services default to JSON. Tim Bray, a co-editor of the original XML draft, notes that “Most server-side APIs these days are JSON-over-HTTP.” This in turn makes supporting XML even less relevant for client tools such as AngularJS, creating a self-amplifying loop.

Tools like AngularJS are on a path to becoming even more popular. Previously, a major downside of single-page applications (SPAs) was that search engines like Google could not process their contents. However, in May of this year, Google announced that it was starting to solve this problem by executing the JavaScript on pages. This removes the one reservation that many businesses had about switching over to an SPA architecture – its negative impact on SEO. As Robin Berjon, an editor of the HTML spec, notes in his Web 2024 predictions, this is likely to spur massive growth in content being exchanged in JSON.

For more information on Google's handling of single page applications, check out the announcement on Webmaster Central.

This is not to say that JSON is always the right choice. For example, you may want to target a particular consumer ecosystem that makes use of XML. In this case you will probably also be making use of namespaces and standardized elements and attributes. It is unlikely that you would be able to use a simple XML serialization of your Drupal data model, which is what Drupal 8 core’s serialization module gives you.

Although you can make your data available in both formats, that increases the maintenance burden of your API, and really, it’s unlikely that anyone who can process your XML won’t be able to process JSON.

Bottom line: unless you are targeting a consumer who you know depends on XML, your best bet is to go with JSON.

Image: "Lady Justice" by Scott is licensed under CC BY-NC-SA 2.0