Hatton on forensic software engineering

Following last month’s editorial, we are pleased to print Les Hatton’s analysis of ‘software woes.’ Hatton, of Oakwood Computing Associates, studies ‘defects’ in seismic and other code.

Seismic processing IT guru Les Hatton (Oakwood Computing Associates) addressed the Birkbeck School of Crystallography in London last month on his specialty, ‘Forensic Software Engineering,’ or how to evaluate software reliability, security, cost and other ‘woes’. Hatton, Professor of Forensic Software Engineering at Kingston University, offered survival strategies for avoiding software failure. His first observation is that many failures of software-controlled systems could have been avoided by applying techniques that are already well known.

Fail gracefully

Such well-established engineering techniques include designing systems to fail gracefully, minimizing the effect on users. When failures do occur, it should always be possible to trace them back to a root cause. The aerospace industry came in for particular criticism, with the uncontrolled dive of an Airbus attributed to ‘a fault in the automatic pilot.’ Test pilots of the F/A-22 Raptor fighter used to spend an average of 14 minutes per flight rebooting critical systems; this is now down to ‘only’ 36 seconds per flight.

Problem-prone computers

Similar problems pushed vehicle recalls in the US motor industry above 19 million in 2003, this despite engineering ‘never being better’. Experts cited ‘problem-prone’ computers as a significant factor. The US National Institute of Standards and Technology (NIST) estimated in 2002 that software failure costs the US $60 billion per year, while a report from the UK’s Royal Academy of Engineering estimated that around £17 billion would be ‘wasted’ on software projects in 2004.

Forensic analysis

Hatton’s specialty, forensic software analysis, involves investigating defects in executable programs, both applications and operating systems. Code quality is measured in faults per thousand executable lines of source code (KXLOC). The ‘state of the art’ is represented by the NASA Shuttle software, estimated at around 0.1 faults per KXLOC. Windows 2000 is thought to lie in the 2-4 faults per KXLOC range, while Linux fares much better at around 0.5. It is thought that 5-10% of such faults will have a significant effect on the results or behavior of a system.
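
To put such densities in perspective, a back-of-envelope calculation is enough. The C sketch below assumes a hypothetical one-million-line system at the low end of the Windows 2000 figure; the sizes are illustrative, not measured:

#include <stdio.h>

int main(void)
{
    /* Illustrative assumptions only: a hypothetical system of one
     * million executable lines (1,000 KXLOC) at the low end of the
     * Windows 2000 estimate, with 5% of faults assumed serious. */
    const double size_kxloc    = 1000.0; /* 1M executable lines */
    const double faults_per_k  = 2.0;    /* faults per KXLOC */
    const double serious_share = 0.05;   /* 5-10% quoted; take the low end */

    const double latent  = size_kxloc * faults_per_k;
    const double serious = latent * serious_share;

    printf("Expected latent faults:  %.0f\n", latent);  /* 2000 */
    printf("Expected serious faults: %.0f\n", serious); /* 100 */
    return 0;
}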

Compiler quality

Forensic software engineering involves the analysis of software processes, products and project failures, along with the study of operating system reliability, security and compiler quality. Hatton’s investigations to date suggest some guidelines for software project planners: no sub-task in a software system should be longer than one week; projects should be tracked weekly, with progress published; and planners should allow for the fact that programmers underestimate the time taken to do things by about 50%.

Parsing engines

Hatton has developed a range of code parsing engines for a variety of domains, from embedded systems to geophysical processing. The latter involved the study of nine independently developed FORTRAN packages for seismic processing; the tests produced nine ‘subtly different’ views of the geology. The types of defect detected by the analysis proved to have exceptionally long lives and to have ‘cost a fortune.’ Hatton advocates more static testing of code, to establish the faults-per-KXLOC count, rather than relying on the usual dynamic, run-time testing.
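
The appeal of static testing is that it examines every path in the source, while dynamic testing only exercises the paths the test data happens to drive. The contrived C fragment below, not taken from Hatton’s tools, shows the kind of long-lived defect a static checker flags immediately but a test suite can miss for years:

#include <stdio.h>

/* Contrived illustration: 'scale' is only assigned on one branch,
 * so the other path reads an indeterminate value. A static checker
 * reports 'scale may be used uninitialized' at once; a test suite
 * that never passes calibrated == 0 will run clean indefinitely. */
static double amplitude(double sample, int calibrated)
{
    double scale;
    if (calibrated)
        scale = 1.0;
    /* missing 'else': scale is indeterminate when calibrated == 0 */
    return sample * scale;
}

int main(void)
{
    printf("%f\n", amplitude(0.5, 1)); /* the only path the tests exercise */
    return 0;
}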

Compiler complexity

But even the best programmers will have a hard time producing quality code because of the complexity and unpredictability of modern programming languages. These are ‘riddled with poorly defined behavior’ and, in Hatton’s words, ‘numerical computations are often wrong however they are done.’ Forensic analysis can be applied to OS reliability, security, the arithmetic environment and compiler quality.
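
Both complaints are easy to reproduce in C, the language much of Hatton’s published analysis targets. The sketch below pairs two textbook pitfalls rather than examples from the talk:

#include <stdio.h>

static int counter = 0;
static int next(void) { return ++counter; }

int main(void)
{
    /* Poorly defined behavior: C leaves the evaluation order of the
     * operands of '-' unspecified, so 'diff' may legally be +1 or -1
     * depending on the compiler. */
    int diff = next() - next();
    printf("diff = %d\n", diff);

    /* Numerical surprise: floating-point addition is not associative,
     * so two algebraically identical sums give different answers. */
    double a = 1e16, b = -1e16, c = 1.0;
    printf("(a + b) + c = %g\n", (a + b) + c); /* prints 1 */
    printf("a + (b + c) = %g\n", a + (b + c)); /* prints 0 */
    return 0;
}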

OS of choice

Hatton has little time for Windows as an operating system, citing a mean time between failures (MTBF) for Windows 2000/XP in the 100-300 hour range, against over 50,000 hours for Linux. Most modern compilers fail the standard validation suites; in fact, NIST, now partly privatized, stopped validating compilers in 2000 because it was no longer cost-effective. Hatton’s recommendations: if you want a reliable and secure OS, don’t use Windows; computer arithmetic should be checked on mission-critical code; and compiler quality ‘should not be taken for granted.’
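
Checking the arithmetic on mission-critical code can start with a handful of start-up assertions on the floating-point environment, in the spirit of the classic ‘paranoia’ test program. The checks below are illustrative assumptions about what a given code might require, not a prescribed list:

#include <assert.h>
#include <float.h>
#include <stdio.h>

int main(void)
{
    /* IEEE 754 double precision carries 15 significant decimal digits. */
    assert(DBL_DIG >= 15);

    /* Machine epsilon should match 64-bit IEEE arithmetic (2^-52). */
    assert(DBL_EPSILON <= 2.3e-16);

    /* FLT_EVAL_METHOD == 0 means expressions are evaluated in their
     * stated type, with no hidden extended-precision intermediates. */
    printf("FLT_EVAL_METHOD = %d\n", (int)FLT_EVAL_METHOD);

    puts("arithmetic environment checks passed");
    return 0;
}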

More on software forensics, code quality and Hatton’s other passion, javelin throwing, at www.leshatton.org.

