Jim Crompton on data, integration and semweb’s promise

Chevron’s iField guru thinks the data pipeline is kinked. Will semantics straighten it out?

In the keynote address to the World Wide Web Consortium’s (W3C) Semantic Web in Oil and Gas Workshop (page 6), Chevron’s iField program advisor Jim Crompton recalled the bygone days of the well-organized paper file room, saying, ‘Life has never been that good since.’ Companies ‘lost control’ of data in the passage to the digital world. Surveys show that even today, 30-70% of our time is spent finding data. Knowledge workers may only have access to a small fraction of the information they need, so ‘they give up looking and go with a best guess.’ Unfortunately, as the collective experience of the workforce diminishes, the quality of ‘best guesses’ is declining.

The information age has given us Google-type full text search. This is all very well, but for accurate search, you still need to tag documents and you need to know if you have found everything. The reality is that we can still only search accurately within a given environment. Chevron’s intranet search is OK, but does not include email and many databases. Search has become siloed.

Meanwhile, the information pipeline is fed with increasingly large amounts of data from low-cost sensors, with implications for HSE, operations and maintenance. Downhole intelligent completions provide real-time data and an exponential increase in data gathering capacity. Our modeling capability is likewise now ‘massive’. But between the expanding data volumes and the modeling capability there is a ‘kink in the pipeline,’ a yawning gap between data collection and modeling/analysis for decision support.

But we are reasonable people! How did we screw it up? The answer is in the economic cycles – when the cycle is down, there is no money for information management. When the going gets good, acquisitions and mergers kick in. Chevron has experienced ten years of data hell following the Texaco merger! After a merger, nobody knows how the legacy systems work. The next worst thing is when someone leaves and hands over their spreadsheet. Engineers are not very good at documenting what they do.

The business impact of the current state of affairs is that engineers use month-old data for decisions. Optimization works fine at a small scale, but multiple attempts at optimization at different scales hamper the ‘big picture’ optimization. It is hard to react to dynamic changes such as water breakthrough or equipment failure. Today we are reactive and we need to be more proactive.

Regarding the semantic web, haven’t we been here before? Yes, perhaps not exactly, but there have been other attempts to bridge IM and the business. Even IT folks shy away from this problem.

Crompton believes that the path to data sanity lies in data governance, a reference and master data architecture spanning structured and unstructured data, and data quality. Such an information architecture is a new thing for Chevron. Standardization is important but ‘we need to go further and allow for portfolio rationalization, we need a common language.’

Crompton advocates a three-tier architecture. A service-oriented architecture (SOA) may be a way to achieve this, but oil and gas does not yet have a taxonomy to support it, which is a potential role for the semantic web. This could be applied to legacy data by re-tagging and structuring the data. The industry has had some success with XML data protocols. Chevron came late to the WITSML party but has been more proactive with PRODML. There have been some successes here, but Chevron is still ‘kicking the rock’ on internal deployment. We already tried the one big data model approach and it failed miserably. Today Chevron has 600TB of data (70% technical, 25% business, 5% financial) and this is growing. There are currently 300 million Office documents.
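To put the quoted split in absolute terms, the short sketch below converts the percentages into terabytes. Only the 600TB total and the 70/25/5 shares come from the talk; the names `TOTAL_TB`, `SHARES_PCT` and `breakdown` are illustrative, not anything Chevron uses.

```python
# Figures as quoted in the talk: 600TB split 70% / 25% / 5%.
TOTAL_TB = 600
SHARES_PCT = {"technical": 70, "business": 25, "financial": 5}


def breakdown(total_tb, shares_pct):
    """Return the volume in TB for each category (hypothetical helper)."""
    # The quoted shares should account for the whole data estate.
    if sum(shares_pct.values()) != 100:
        raise ValueError("shares must sum to 100%")
    return {name: total_tb * pct / 100 for name, pct in shares_pct.items()}


volumes = breakdown(TOTAL_TB, SHARES_PCT)
for name, tb in volumes.items():
    print(f"{name}: {tb:.0f} TB")
```

On these figures, technical data alone comes to about 420TB, against roughly 150TB of business data and 30TB of financial data.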

© Oil IT Journal - all rights reserved.