PNEC Data Integration 2004, Houston

This was a well-attended PNEC (250 pre-registered), with a high proportion of company presentations reporting on real-world data management achievements. Catalogues remain popular, as witnessed by papers from Shell and Halliburton, and the W3C Semantic Web initiative is emerging as a potential solution to the taxonomy problem. Russian oil major Yukos presented a refreshing look at the merits of building vs. buying software. PNEC now welcomes both PPDM and POSC, resulting in a timid but significant joint POSC/PPDM presentation. Panel members agreed that data management remains under-recognized and under-funded. ChevronTexaco, ExxonMobil and Shell all plugged our ‘star of the show’, Innerlogix’ Datalogix data clean-up tool. On the technology front, ExxonMobil presented a new XML standard for fluid reporting, ‘FluidReportML’.

Carol Tessier described how Pioneer now has 100 communities and around 1,000 users. Even superficial ‘lipstick on the pig’ solutions can be useful, as was moving Pioneer’s Artesia ERP ‘green screens’ to the browser. Pioneer’s portal leverages various components including Schlumberger’s Decision Point (ArcIMS-like GIS), Unify NXJ (workflow) and Citrix MetaFrame for Unix access. OpenSpirit is being integrated and will become a ‘critical piece of future developments’. Content management for Sarbanes-Oxley reporting is ‘just exploding for us’.


According to Pat Meroney, ConocoPhillips (CP) has been ‘immersed in the merger’ for the last 18 months. Meroney’s group, which had to contend with shared servers and Oracle databases ‘everywhere,’ was tasked with improving data delivery and developing a data management strategy. CP has a home-grown data integration layer between databases and front ends such as the web browser, GIS and Excel. The solution involves a ‘metadata server’ and web services, based on a spatial data catalog. Business and spatial data are stored in separate databases, with GIS browsing through ArcIMS.


Guy Moore related another post-merger data management tale. The starting point was ‘a mess’ with Chevron, an IESX user, and Texaco on OpenWorks. The Gulf of Mexico (GOM) unit supported 120 earth scientists, over 1,000 projects and 16TB of seismic interpretation data. ChevronTexaco (CT) set up a change management unit of IT support staff and subject matter experts, located at the Landmark ‘super site’ in Houston. Documenting conversion responsibilities and decisions helped avoid ‘finger pointing’ and ensured projects ran smoothly. CT went from 311 to 24 OpenWorks projects; and data volumes declined to 5TB—a testament to interpreters’ ability to ‘chuck stuff away’ (into the archive). This entailed considerable savings—for every 10% of CT’s data volumes archived, a $250,000 saving in data server costs was achieved.

Semantic Web

Oil IT Journal editor Neil McNaughton first encountered the ‘semantic web’ when developing an RSS news feed for Oil IT Journal. One quick win for RSS was the ability to read the feed from a Java-based smart phone. The techniques which form the semantic web may impact upstream taxonomy development and sharing. These include widespread use of XML namespaces to expose local taxonomies and the use of the Resource Description Framework (RDF), a simple triple-based data modeling construct which allows metadata to be embedded in complex XML documents. RDF metadata is embedded in Adobe PDF, where it leverages the Dublin Core metadata standard for bibliographic information. McNaughton’s paper is available on the website.
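The triple construct at the heart of RDF can be sketched in a few lines. This is a minimal illustration, not McNaughton’s code: the Dublin Core and document URIs follow the real DC namespace, but the document URI and the stored statements are hypothetical examples.

```python
# RDF models every metadata statement as a (subject, predicate, object)
# triple. The DC namespace URI is the real Dublin Core elements namespace;
# the document URI and triples below are hypothetical examples.

DC = "http://purl.org/dc/elements/1.1/"
doc = "http://example.com/reports/pnec-2004.pdf"  # hypothetical resource

triples = [
    (doc, DC + "title",   "PNEC Data Integration 2004"),
    (doc, DC + "creator", "Oil IT Journal"),
    (doc, DC + "date",    "2004"),
]

def objects_of(subject, predicate):
    """Return all objects asserted for a (subject, predicate) pair."""
    return [o for s, p, o in triples if s == subject and p == predicate]

titles = objects_of(doc, DC + "title")
```

In a real deployment the triples would live in an RDF store and be serialized as RDF/XML embedded in the PDF’s XMP packet, but the data model is exactly this flat list of statements.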

XML standards

Trudy Curtis (PPDM) and Alan Doniger’s (POSC) joint presentation was perhaps most significant for the simple fact that it took place. Doniger and Curtis promised agreement on schema design principles—especially on units of measure—and on ‘profileable’ schemas. In the context of catalogues and taxonomies, the Petroleum Industry Data Dictionary (PIDD) ‘is coming back to life’.


Mike Underwood (ChevronTexaco) advocates a proactive approach to data management, developed by adapting best practices from ‘heritage’ Chevron and Texaco. Some 500 OpenWorks project databases were consolidated using Innerlogix’ Datalogix data QC and cleanup application. Datalogix usage was extended to project management with the adoption of proactive processes for project database management. ChevronTexaco is well pleased with the Innerlogix tools—Underwood said ‘Datalogix is the best thing our data managers have seen in the last five years.’


Shell’s data managers now deploy ‘pick lists’ of standard attribute names and quality processes, according to John Kievit. Shell’s workflows pipe data from various databases and public data sources into interpretation systems (OpenWorks) and data browsers (PowerExplorer). Shell used to have data ‘hoarders,’ who claimed to be ‘too busy to archive’. This inevitably led to wasted time looking for inaccessible data. Today, Shell has a ‘golden bucket’ of QC’d, screened and compliant data. Shell established standard names for well logs and curves. ‘Amended’ data vendor contracts now stipulate the format and delivery mechanism. Data is loaded via an ‘advanced data transfer’ system using data ‘blending rules’ to establish which records to keep. Data history capture is automated using Innerlogix’ Datalogix QC workflow, a ‘health check’ for Shell’s corporate data. The data workflow is a resource-intensive process, with thousands of vendor transactions per month to be QC’d before loading. Shell has now ‘stopped the bleeding’, reduced log deliverables and made it easier to use data.
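The idea behind ‘blending rules’ can be sketched as a deterministic merge: when several sources deliver a record for the same curve, a rule ranks them and keeps one winner. The source names, priorities and fields below are illustrative assumptions, not Shell’s actual rules.

```python
# Hypothetical 'blending rules' sketch: prefer more trusted sources,
# break ties by the most recent delivery. All names are illustrative.

SOURCE_PRIORITY = {"internal": 0, "vendor_a": 1, "vendor_b": 2}  # lower wins

def blend(records):
    """Keep one record per curve name according to the blending rules."""
    best = {}
    for rec in records:
        key = rec["curve"]
        # Rank tuple: trusted source first, then newer delivery year.
        rank = (SOURCE_PRIORITY[rec["source"]], -rec["delivery_year"])
        if key not in best or rank < best[key][0]:
            best[key] = (rank, rec)
    return [rec for _, rec in best.values()]

records = [
    {"curve": "GR", "source": "vendor_b", "delivery_year": 2003},
    {"curve": "GR", "source": "vendor_a", "delivery_year": 2002},
    {"curve": "DT", "source": "internal", "delivery_year": 2001},
]
kept = blend(records)
```

Making the rules explicit and centralized, rather than leaving the choice to individual loaders, is what turns the load step into the repeatable ‘health check’ described above.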

Panel discussion

Ellen Hoveland (Anadarko) bemoaned data management’s low profile; the subject has ‘little recognition’ and is generally regarded as ‘plumbing.’ Sarbanes-Oxley means that we all have to do a better job. Charles Fried (BP) believes we are ‘still in the stone age regarding databases.’ Problems exist with data in Excel spreadsheets, and shared drives are ‘all filling up.’ Alan Doniger (POSC) described structured and unstructured data as a continuum, noting that the same metadata constructs should be used across the board. This should leverage metadata standards like Dublin Core and semantic web techniques such as OWL and RDF. Trudy Curtis (PPDM) likewise believes data management does not get the respect it deserves. Curtis recommended the technologies developed by the W3C as having application to data integration. Knut Bulow (Landmark Graphics) spoke of ‘trans-integration,’ which is ‘more than technology’ and implies a corporate philosophy and implementation. Fried stated that BP has done ‘a poor job’ of tagging metadata from BP’s heritage datasets. Madeline Bell (ExxonMobil) set out everyday concerns of the interpreter, such as ‘is the base map complete?’, ‘have I used every log?’ and ‘what color should this horizon be?’ Fried concluded by opining that little has changed over the years, that all these disparate data types are ‘a pain in the butt’ to manage and that there is ‘still no money’ for this activity.


Robert Aydelotte (ExxonMobil) described ExxonMobil’s ‘standardized’ XML-based fluid reporting protocol. The protocol integrates ExxonMobil’s Common Operating Environment (COE) and proprietary fluid characterization applications. Previously, fluid property data was orphaned and often in the ‘DODD store’ (data on disk in desk)! FluidReportML captures context in the form of a PDF document describing how things were done. This is stored on the file system while the XML goes to ExxonMobil’s corporate database. Information includes fluid transfer, reservoir conditions, fluid data, J-curve and separator tests and lots more. FluidReportML has been submitted to POSC as a candidate for a POSC standard.
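To give a flavor of what such a report might look like, here is a hypothetical XML fragment built with Python’s standard library. The element and attribute names are invented for illustration; the actual FluidReportML schema was submitted to POSC and is not reproduced here.

```python
# Hypothetical illustration in the spirit of FluidReportML.
# All element names, attributes and values are invented examples.
import xml.etree.ElementTree as ET

report = ET.Element("fluidReport", well="EX-1")          # hypothetical root
cond = ET.SubElement(report, "reservoirConditions")
ET.SubElement(cond, "pressure", uom="psia").text = "4500"
ET.SubElement(cond, "temperature", uom="degF").text = "210"
test = ET.SubElement(report, "separatorTest", stage="1")
ET.SubElement(test, "gor", uom="scf/bbl").text = "850"

xml_text = ET.tostring(report, encoding="unicode")
```

The point of the protocol is that data like this lands in a queryable corporate store, with the accompanying PDF preserving the lab context that raw numbers lose.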

E&P Taxonomies

Jeroen Kreijer described Shell’s previous efforts, including the Shell Expro Discovery work and NAM’s own catalogue. The former revealed weaknesses in Shell UK’s IM practices. The NAM initiative shared the same objectives, but was largely disconnected from the Discovery work, ‘typical of Shell!’ Kreijer notes that taxonomies can be hierarchies or lists. For Kreijer, ‘unstructured data doesn’t exist, it is rather data of unknown structure’. Shell’s catalogue system (with reference to Flare Consultants) links content and context into knowledge collections to support asset integration, process compliance, organizational views and project delivery. The best ‘connection point’ for setting up a catalogue is the specific task—drilling a well, inspecting a pipeline—because of the ‘universality of the action.’ Shell’s ‘globally usable catalogue’ uses the concept of ‘document natures’. This strange term is used deliberately, because ‘no one knows what it means, so you can spend time with them explaining what the system is all about’. Some 2,500 ‘natures’ uniquely identify each item.


Vitaly Krasnov’s company, troubled Russian supermajor Yukos, operates several giant oilfields in Siberia, typically with thousands of wells per field. Operational decisions are supported by near real-time modeling and simulation. This is enabled by ‘tight integration of knowledge, IT and data’ at the local level, where ‘everyday decisions are based on simulator output’. But Krasnov’s most startling revelation was that all the software used was developed in-house. This is because, in Krasnov’s words, ‘We can’t use commercial software to add value to our assets. These tools are over-generalized and lack vital features. They are over-complicated and hard to use. They offer poor connectivity to the corporate database and no national language support. It is also impossible to introduce innovations in a reasonable time.’ Yukos adapts its software to its corporate knowledge, not vice versa. Little is left to chance: education in the use of the software is provided through a technical website, backed up by an ‘expertise team’. The operational workflow begins with the selection of a promising sector via the ‘web waterflood portal.’ Other tools guide users through well studies with PVT and SCAL ‘wizards,’ reducing analysis time ‘from months to hours’. The R-Viewer visualizes streamline simulation and history matching performed on the YUSIM simulator and ‘draws users towards useful information.’


David Lamar-Smith (Halliburton) described the ‘content problem’ of unmanaged, uncategorized information. Halliburton opted for a simple corporate taxonomy merging existing intranet product line hierarchies with SAP equipment and organizational codes. The University of Tulsa Abstracts and POSC’s catalogue were also incorporated. All this was put in a giant spreadsheet for loading into the Interwoven content management system and Plumtree portal. Documents can be published in various ways: a ‘customer’ view, specific to a client, a business view (e.g. HR) or a product line view. Taxonomies are categorized by ‘facets’ including content type (technical document, sales document etc.), location, E&P lifecycle and business process. The system automates and simplifies document classification. Search is enhanced by a search engine that understands the taxonomy. Smith strongly recommends combining enterprise hierarchies into a single taxonomy, removing redundancies and associating synonyms.
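The faceted scheme described above can be sketched simply: each document carries several independent facet values, and search intersects them. The facet names and sample documents below are illustrative assumptions, not Halliburton’s actual taxonomy.

```python
# Sketch of faceted classification: each document is tagged along
# independent axes (facets). Names and values are illustrative only.

docs = [
    {"title": "Drill bit spec", "content_type": "technical document",
     "location": "Houston", "lifecycle": "drilling"},
    {"title": "Q3 price list", "content_type": "sales document",
     "location": "Houston", "lifecycle": "marketing"},
]

def facet_search(documents, **facets):
    """Return documents matching every requested facet value."""
    return [d for d in documents
            if all(d.get(k) == v for k, v in facets.items())]

hits = facet_search(docs, content_type="technical document")
```

Because facets are orthogonal, the same corpus supports the customer, business and product line views mentioned above without re-filing any document.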

This report is abstracted from a 15 page report produced as part of The Data Room’s Technology Watch Reporting Service. For more information on this subscription-based service please email



© Oil IT Journal - all rights reserved.