Cavaliers, roundheads and XML

Oil IT Journal editor Neil McNaughton gets behind XML Schema and hears echoes of earlier attempts to constrain programmers’ flights of fancy. But while attempting to make his own web pages validate, he discovers that it is the web browser itself which refuses to take any notice of the standards. A big problem for corporate IT?

One of the oldest IT anecdotes concerns some NASA Fortran code from the dawn of time in which a mistyped comma turned into a decimal point. Thus instead of kicking off a ‘do loop’ as in

DO 10 I=1,10

the compiler initialized a new variable ‘DO10I’ as in

DO10I = 1.10

This error has been credited with causing the failure of NASA’s Mariner I Venus probe, although it seems that this is not exactly how things happened. What I find interesting in this piece of IT apocrypha is the lesson it holds for programmers. This, and a host of other errors, can be avoided simply by declaring your variables! Good programming shops do this and take it further by making variable names tell a story, so that an integer variable might be declared as nAnecdoteRepeats, reminding the editorialist/programmer that this variable holds the number of times a particular story has been told.
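The two readings of the statement above can be reproduced with a toy classifier. This is only a sketch, assuming the classic fixed-form rule that blanks are not significant; the function name classify and its output strings are invented for illustration, and it is in no way a real Fortran parser.

```python
import re


def classify(statement: str) -> str:
    """Toy classifier for a single fixed-form Fortran statement."""
    # Fixed-form Fortran ignores blanks, so squeeze them out first.
    squeezed = statement.replace(" ", "").upper()

    # DO <label> <variable> = <start> , <end>
    # The comma is the only thing that marks this as a loop.
    loop = re.fullmatch(r"DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),(.+)", squeezed)
    if loop:
        label, var, start, end = loop.groups()
        return f"loop to label {label}: {var} from {start} to {end}"

    # Without the comma, what is left is NAME = value. Under implicit
    # typing the name needs no declaration, so nothing flags the typo.
    assign = re.fullmatch(r"([A-Z][A-Z0-9]*)=(.+)", squeezed)
    if assign:
        name, value = assign.groups()
        return f"assignment: {name} = {value}"

    return "unrecognized"


print(classify("DO 10 I=1,10"))
print(classify("DO 10 I=1.10"))
```

With the comma, the statement is reported as a loop over I. With the decimal point, it is reported as an assignment to DO10I, a variable nobody ever asked for.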

Humpty Dumpty

Now it should be realized that computer programmers, like the population at large, divide into two camps: pleasure-seeking cavaliers and roundheads. To the former, the very notion of declaring a variable is an attack on their personal freedom. The ‘cavalier’ programmer, like Humpty Dumpty, expects variables to mean just whatever they want them to mean at any given point in the program. Some languages encourage these bad habits by not having variable types, by implicit typing, by allowing coercion or through the ‘gender-neuter’ variant type of Visual Basic. What is important here is that although these dangers are well understood, compiler writers have in the past aided and abetted cavalier programmers.

Minimize risk

I say in the past because modern programming languages and programming shops increasingly mandate variable declaration, along with other quality-oriented best practices, to minimize the risk of program failure. Such constraints on what is going to happen within a program are a contract between the programmer and the compiler, which continuously tests for non-conformance.
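The ‘contract’ idea can be sketched in a few lines. Python has no compulsory declarations, so the class below (DeclaredScope, a name invented here) imitates one at run time, whereas a real compiler would do the checking before the program ever ran. Treat it as an illustration under those assumptions rather than a recommended pattern.

```python
class DeclaredScope:
    """Hypothetical scope that refuses undeclared or mistyped variables."""

    def __init__(self):
        self._types = {}   # variable name -> declared type
        self._values = {}  # variable name -> current value

    def declare(self, name, declared_type):
        """Register a variable and its type: the programmer's half of the contract."""
        if name in self._types:
            raise NameError(f"{name!r} is already declared")
        self._types[name] = declared_type

    def assign(self, name, value):
        """Store a value, testing for non-conformance on every assignment."""
        if name not in self._types:
            raise NameError(f"{name!r} was never declared")
        expected = self._types[name]
        if not isinstance(value, expected):
            raise TypeError(
                f"{name!r} is declared as {expected.__name__}, "
                f"got {type(value).__name__}"
            )
        self._values[name] = value

    def read(self, name):
        """Return the current value of a declared, assigned variable."""
        return self._values[name]


scope = DeclaredScope()
scope.declare("nAnecdoteRepeats", int)
scope.assign("nAnecdoteRepeats", 3)
print(scope.read("nAnecdoteRepeats"))
```

Here an attempt such as scope.assign("DO10I", 1.10) raises a NameError instead of quietly creating a new variable, which is the whole point of the contract.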


What brought on the old Fortran anecdote and the preaching about variables was my reading of a book on XML Schema*, a dry tome which set out to render digestible the massive, uncompromising official W3C documentation. I’m afraid that the book failed to turn the subject into a good bedtime read. XML is not for cavaliers; it’s not much fun. XML is serious stuff! The spec goes way beyond the ‘<CuteTag>Hello</CuteTag>’ of the presentation to executives. XML is about extending the contractual nature of programming into the data domain. It is about data typing, constraints and validation.
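To give a flavor of what typing and validating data means, here is a hand-rolled sketch using only the Python standard library. It is not W3C XML Schema, which states such rules declaratively in a schema document of its own; the element names, the ANECDOTE_TYPES table and the validate function are all invented for this illustration.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical "schema": element name -> function that parses its text and
# raises ValueError when the text does not fit the declared type.
ANECDOTE_TYPES = {
    "title": str,
    "repeats": int,
    "firstTold": date.fromisoformat,
}


def validate(xml_text, types=ANECDOTE_TYPES):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    seen = set()
    record = ET.fromstring(xml_text)
    for child in record:
        parser = types.get(child.tag)
        if parser is None:
            problems.append(f"undeclared element <{child.tag}>")
            continue
        seen.add(child.tag)
        try:
            parser((child.text or "").strip())
        except ValueError:
            problems.append(f"<{child.tag}> has invalid value {child.text!r}")
    # Every declared element is treated as mandatory in this sketch.
    for missing in sorted(set(types) - seen):
        problems.append(f"missing element <{missing}>")
    return problems


GOOD = ("<anecdote><title>Mariner I</title><repeats>3</repeats>"
        "<firstTold>2002-01-15</firstTold></anecdote>")
BAD = ("<anecdote><title>Mariner I</title><repeats>many</repeats>"
       "<CuteTag>Hello</CuteTag></anecdote>")

print(validate(GOOD))
print(validate(BAD))
```

The first record passes with no complaints. The second is rejected on three counts: a non-integer repeat count, an undeclared CuteTag element and a missing date.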


Which, I admit, returns me to the subject of last month’s editorial. We have been trying to validate the HTML frames-based pages which have served reasonably well for the past three years or so. It turns out that it is impossible to produce frames that look right in the main browsers and that validate at the same time. I will spare you the details; it’s all to do with where the “frameborder” attribute goes. The W3C DTD says ‘here’ and the browsers want ‘there’. Now I must be a roundhead programmer by nature, because I like frames. With a few lines of code you can create quite complex interactivity and, although they have been around for a while (since way before XML), they correctly separate content from presentation.


My enthusiasm for frames is not shared by the majority of web developers today. A few lines of compact code achieving a lot is anathema to the cavalier. Look at almost any website (in Explorer use Tools->View Source) and you will see pages of JavaScript (full of untyped variables, by the way!) catering for various combinations of operating system and browser. The commercial ‘browser wars’ have left scorched earth in their wake. Instead of writing to a DTD, most web programmers spend their time coding around browser idiosyncrasies. This undoubtedly provides endless futzing-work for the programming community.

Fault oblivious

IT today is predicated on the browser being a neutral, consistent vehicle for HTML, XML web services and the rest. This, unfortunately, is not the case. If you switch on debugging in your browser (Tools->Internet Options) you will get an error on around one page in every three you visit. Because browsers are designed to be fault oblivious rather than fault tolerant, these errors, and the complex code that generates them, are normally hidden.
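The difference between swallowing a fault and reporting it can be seen by feeding the same broken markup to two parsers from the Python standard library. This is a sketch only: the lenient html.parser module stands in for a browser’s behavior, which is an assumption made for illustration, and the helper names are invented.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Lenient parser: records the start tags it sees and never complains."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def lenient_parse(markup):
    """Parse in the 'fault oblivious' style and return the tags found."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def strict_parse(markup):
    """Return None if the markup is well formed, else the parser's message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return str(err)
    return None


BROKEN = "<p>Unclosed <b>bold text</p>"

print(lenient_parse(BROKEN))  # the missing </b> goes unreported
print(strict_parse(BROKEN))   # the XML parser names the fault
```

The lenient parser carries on as though nothing had happened, while the strict XML parser stops and reports the mismatched tag. The first behavior hides the error from the user and from the author of the page alike.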

Open Source

The current state of affairs is really quite remarkable. The whole infrastructure of the internet is a monument to interoperability through standards compliance, except for the last link in the chain: the software delivery vehicle itself, the browser. Corporations which intend to leverage web technology (that means all of you!) should reflect on this situation. It will be impossible to realize the stability and predictability benefits of XML ‘contract-based’ development until we have compliant browsers. A great project for the Open Source movement, but why doesn’t corporate IT get into the driving seat where it belongs?

* XML Schema, JJ Thomasson, Eyrolles 2002. ISBN 2-212-11195-9 (in French).


Some great new content this month: Jerome Bellian of the University of Texas at Austin has contributed an illustrated exposé of the use of lasers in digital geological outcrop mapping.

© Oil IT Journal - all rights reserved.