Robin Berjon

XML Bad Practices


At the XML Prague 2009 conference I presented a paper on "Designing XML/Web Languages: A Review of Common Mistakes". Since much of the subject matter presented there is still the topic of heated discussion amongst specialists, I thought it a good idea to make a pass through that paper, updating it based on feedback I have received and newer examples, and post it in blog form. I will post each section here as I go through this process. Today, we start simply with the introduction.

XML is now over ten years old and can euphemistically be dubbed a success. That being said, I don't believe I need convince readers that not all of its uses have been successful. Over time, many bright minds have attempted to describe how to best make use of it when designing vocabularies, but I believe it is safe to say that those efforts, no matter how excellent, have not been sufficient in ensuring that all applications of XML are produced in an entirely sane manner.

Part of the reason for that is education and outreach: people will often just grab XML and run, without digging around for best practices. But a larger problem is that XML combines simplicity and flexibility in such a way that a set of best practices only gets one so far in avoiding pitfalls. This does not mean that we are doomed to repeat mistakes over and over again, simply that we need to learn from our experience.

That is why this paper does not try to define a nice and simple manual as an amulet against poor vocabulary design, but rather intends to show some mistakes so that we may learn from them. As such, its organisation is more that of a shopping list than of a treatise on XML.

Several errors outlined here use SVG as their source. This does not mean that SVG is the only language to make those mistakes, nor does it mean that SVG is a bad XML vocabulary; in fact, SVG rocks. While these mistakes are not at all SVG-specific, there are several reasons for me to appeal to it as an example in several instances:

As a final note before we delve into these mistakes, I would like to make it clear that this domain does not deal in absolutes. There are cases in which one may consider these mistakes to be good solutions; and a few of what I describe as bad practices may even be controversial to the point that they are considered by some as the right option in all situations. I do not see that as an issue: as a community we can discuss and disagree. What matters is that when choosing one way of designing a language over another, one be informed of the discussion so as to make one's own decision.

Table of Contents

Namespace Issues
XML Is For Humans
Language Issues
Wishful Thinking and Doe-Eyed Beliefs
