Watson at Digital Plant

‘Massively parallel probabilistic evidence-based’ system heralds move from ‘descriptive’ to ‘prescriptive’ analytics. Target applications include knowledge preservation of graying workforce.

Katherine Frase, VP research, IBM, provided the keynote to this month’s Digital Plant event in Houston. Frase noted the massive amount of data ‘at rest’ in databases along with the exploding amount of data ‘in motion’ from sensors and real time feeds. Such ‘big data’ can fuel three types of analysis: descriptive (rear view mirror stuff), predictive (what if) and prescriptive (control). IBM’s conventional toolset (SPSS, Cognos, Maximo, and Ilog) already provides ‘stochastic process optimization in the face of noisy data.’ Examples include equipment failure prediction, truck route optimization and minimizing unplanned maintenance of offshore platforms.

But the next frontier is the 80% of information that is ‘unstructured,’ i.e. text. Enter natural language processing (NLP) and IBM’s Jeopardy-winning ‘Watson’ with its DeepQA software. Watson is a ‘massively parallel probabilistic evidence-based architecture.’ The system was primed with multiple information sources including old Jeopardy questions and answers. Watson builds multiple candidate answers, which are scored and ranked only when all processing is complete.
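The ‘multiple possible answers’ approach can be illustrated with a toy sketch. This is not IBM’s DeepQA code: the scorer functions, weights and logistic offset below are all made-up assumptions, chosen only to show candidates being scored in parallel against evidence and ranked once every score is in.

```python
from concurrent.futures import ThreadPoolExecutor
from math import exp

# Hypothetical evidence scorers: each returns a score in [0, 1] for a candidate.
def keyword_overlap(question, candidate, evidence):
    """Best fraction of question words found in any supporting passage."""
    q_words = set(question.lower().split())
    passages = evidence.get(candidate, [])
    if not passages or not q_words:
        return 0.0
    return max(len(q_words & set(p.lower().split())) / len(q_words)
               for p in passages)

def passage_count(question, candidate, evidence):
    """More supporting passages give a higher score, saturating at 1.0."""
    return min(len(evidence.get(candidate, [])) / 3.0, 1.0)

# (scorer, weight) pairs; the weights are illustrative assumptions.
SCORERS = [(keyword_overlap, 2.0), (passage_count, 1.0)]

def confidence(question, candidate, evidence):
    """Combine weighted scores into a confidence via a logistic function."""
    z = sum(w * scorer(question, candidate, evidence) for scorer, w in SCORERS)
    return candidate, 1.0 / (1.0 + exp(-(z - 1.5)))  # offset 1.5 is arbitrary

def rank_candidates(question, candidates, evidence):
    """Score every candidate in parallel, then rank once all are done."""
    with ThreadPoolExecutor() as pool:
        scored = list(pool.map(
            lambda c: confidence(question, c, evidence), candidates))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    evidence = {
        "Shakespeare": ["Hamlet and Macbeth were written by Shakespeare",
                        "Shakespeare is the author of Hamlet"],
        "Marlowe": ["Marlowe wrote Doctor Faustus"],
    }
    for name, conf in rank_candidates(
            "author of Hamlet and Macbeth", ["Marlowe", "Shakespeare"], evidence):
        print(f"{name}: {conf:.2f}")
```

The point of the sketch is that no candidate is discarded early: every candidate carries a confidence, and the ranking is produced only after all of them have been scored.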

Watson uses the Apache Unstructured Information Management Architecture (UIMA) as a ‘standardized approach to handling information.’ Actually, the version that played the game was below the skill level of top Jeopardy players. The tie at the end of the first round was considered a good result by IBM’s researchers.

IBM has tried Watson on healthcare, replacing Shakespeare and the Bible with Gray’s Anatomy and the Merck Index. Early attempts to play the American College of Physicians’ Doctor’s Dilemma challenge were not successful. But IBM has tweaked the system and demonstrated Watson’s merit as an aid to diagnostics.

Frase made a rather soft sales pitch to oil and gas. In the future, Watson may help in ingesting and preserving knowledge of operating procedures from a retiring workforce. Oil and gas, with its multiple information sources such as seismics, well data and real time feeds, should be a good candidate for big data analytics.

Comment: UIMA apart, Frase failed to make clear the contribution of open source software to Watson. According to Heaton Research, Watson runs on Linux and Hadoop. IBM reported a $1 billion/year investment in open source back in 2000 and produces the excellent IBM developerWorks resource.

© Oil IT Journal - all rights reserved.