XML Bad Practices
Unreadable Names
After a number of articles on namespace bad practices, here follow a few that deal with the human in the loop. XML was intended to be human-readable. While that idea may make some people chortle, it is still a worthy goal to design with human readability, and writability, in mind. After all, if all one needs is a way to dump data that only machines will read, in a format available anywhere, other options such as JSON or YAML will be faster and simpler. This article is part of a series based on the paper "Designing XML/Web Languages: A Review of Common Mistakes", which I presented at the XML Prague 2009 conference. Many of the mistakes in this section are by no means limited to XML and tend to apply to other contexts as well, notably programming, but they are nevertheless worth recalling.
There are two primary ways in which one can make element and attribute names hard to read.
The first is to make them too short when they are not common elements. It is a good idea, for instance, to use p for paragraphs as it is an extremely common element, but it is more dubious to use s. Is that going to be for strike-through or sup text? Should it have been kept for span?
The other is compound names. Those can be difficult to read even for native speakers of the language from which the names come (typically English), despite their natural feel for word boundaries; they often become hell for people who do not know the language well. The most common offender in this category is DocBook (I believe largely for historical reasons, and then for consistency). To wit: personblurb, personname, audioobject, imageobjectco, inlinemediaobject, qandadiv, classsynopsisinfo, citebiblioid, simplemsgentry... the itemizedlist goes on.
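To make the two failure modes concrete, here is a rough Python sketch of a name linter for a vocabulary. Nothing in it comes from the paper: the allowlist of acceptable short names, both thresholds, and the function names are assumptions chosen for illustration. The compound check is a crude proxy, since without a dictionary it only spots long unbroken lowercase runs. It will miss short compounds such as qandadiv and may misfire on long single words.

```python
import re
import xml.etree.ElementTree as ET

# Assumed allowlist: short names common enough to stay short (hypothetical).
COMMON_SHORT = {"p", "a", "b", "i"}
# Assumed thresholds; tune them for your own vocabulary.
MIN_LENGTH = 3
MAX_UNBROKEN_RUN = 9


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified names."""
    return tag.rsplit("}", 1)[-1]


def check_name(name):
    """Return a reason string if the name looks hard to read, else None."""
    if len(name) < MIN_LENGTH and name not in COMMON_SHORT:
        return "too short"
    # Longest stretch of lowercase letters with no hyphen, underscore,
    # digit, or capital letter to mark a word boundary.
    longest = max((len(run) for run in re.findall(r"[a-z]+", name)), default=0)
    if longest > MAX_UNBROKEN_RUN:
        return "possible run-together compound"
    return None


def lint_names(xml_text):
    """Collect element and attribute names from a document and check each."""
    root = ET.fromstring(xml_text)
    names = set()
    for element in root.iter():
        names.add(local_name(element.tag))
        names.update(local_name(attr) for attr in element.attrib)
    findings = []
    for name in sorted(names):
        reason = check_name(name)
        if reason:
            findings.append((name, reason))
    return findings


if __name__ == "__main__":
    sample = "<article><p>x</p><s>y</s><personname/><qandadiv title='t'/></article>"
    for name, reason in lint_names(sample):
        print(f"{name}: {reason}")
```

On the sample document this flags s and personname, while p passes because it is on the allowlist. A check like this can only prompt a human to take a second look at a name; it cannot decide whether the name is actually readable.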
Before you call in Norm Walsh to ask if he demands revenge, I'd like to point out that I've already bought him copious amounts of beer for stating the above while he was in the room. At least I think I did.