Duncan Irving (Teradata) kicked off the 2017 PNEC conference with a ‘historical perspective on business analytics,’ covering the labor savings and decreased ‘time-to-insight’ that business computing, from the late 1940s on, brought to resource planning, supply chain intelligence and the elimination of non-productive time. Irving’s thesis is that such data-driven transformation was principally a feature of low-margin, highly competitive businesses. Oil and gas has been largely insulated from such pressures thanks to high oil prices.
The situation is changing now and the need for cost control and high process visibility has revealed a gap between business and IT capabilities, and a feeling that the business operates ‘despite, rather than in tandem with IT.’ In oil and gas, operational technology, petrotechnical and other disciplines are siloed by point software solutions and databases. Leveraging insights across the different domains remains a ‘significant data management challenge.’ Operations are plagued with ‘brittle’ data architectures, poorly contextualized interpretations and lost data lineage. A new approach to data management is needed.
Enter the ‘curated data lake,’ a combination of software and hardware technologies that enables access to all a company’s data, from a seismic trace up. The data lake requires a new management paradigm as well; Irving advocates the creation of a chief data officer role and a team of analysts accustomed to working in a data-driven fashion. Most importantly, industry needs to move away from the ‘unsustainable’ way it procures analytical capabilities as point solutions or outsourced capabilities.
Cora Poché provided an update on Shell’s technical data management (TDM) initiative, which includes a control framework describing how the TDM discipline is managed, and ESSA, a program to ‘eliminate, simplify, standardize and automate’ operating procedures to reduce time and costs and enhance data quality. One ESSA program targets subsurface data, now addressed with a ‘front-office,’ co-located with business assets, and a central ‘back-office.’ Key locally-executed tasks are being re-assigned to the central data operations center where they can be streamlined and optimized with a lean approach. Well headers, logs and commercial data have proved amenable to back-office loading to the Shell corporate repository. Likewise seismic archival, spatialization of navigation data and potential field data loading.
Francis Morandini described how Energistics’ Resqml standard is now a key component of Total’s proprietary, in-house developed ‘Sismage’ CIG* geoscience interpretation suite. Resqml leverages the Energistics transfer protocol (ETP) to provide file-less data transfer using web sockets, streaming data over the network from one application to another. ETP also facilitates data exchange in and out of the cloud. ETP adds a unique identifier to data sources to unambiguously tie data objects together. Data is defined in schemas and ‘serialized’ using Apache Avro. Serialization means that a rich Resqml data object can be turned into an XML string and transmitted over the wire. ETP also hides database and architecture dependencies, making for services that are independent of the physical representation of the data. Total is now considering using ETP to drive cloud-based data exchanges and broaden adoption across the Sismage-CIG platform. Total has developed a Resqml plug-in for Sismage using the gSOAP toolkit and the FesAPI, an open source, multi-language Resqml library from F2I Consulting. The work has unveiled some issues with the Resqml protocol which will be addressed in a future release. Morandini encouraged others to join Total in support of the Resqml initiatives and to take part in Energistics SIGs to ensure that their own developers will not be troubled by missing Resqml features. In general, Resqml has proved stable and provides robust functionality and can support new capabilities as they are developed.
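The schema-driven, file-less transfer that ETP provides can be sketched in a few lines. The example below is a pure-Python illustration only: real ETP messages are serialized with Apache Avro and streamed over web sockets, and the record layout and function names here are our own assumptions, not the Energistics wire format. The key ideas it shows are the unique identifier attached to each data object and the round-trippable binary encoding.

```python
import struct
import uuid

def serialize_log_channel(name: str, values: list, uid: uuid.UUID) -> bytes:
    """Pack a channel record into a compact binary message.

    Illustrative only: ETP itself uses Apache Avro for serialization;
    struct stands in here to show the schema-driven transfer idea.
    Layout: 16-byte UUID, name length, value count, name, doubles."""
    name_bytes = name.encode("utf-8")
    header = struct.pack(">16sHI", uid.bytes, len(name_bytes), len(values))
    return header + name_bytes + struct.pack(f">{len(values)}d", *values)

def deserialize_log_channel(msg: bytes):
    """Recover (uuid, channel name, values) from a serialized message."""
    uid_bytes, name_len, n = struct.unpack_from(">16sHI", msg)
    offset = struct.calcsize(">16sHI")
    name = msg[offset:offset + name_len].decode("utf-8")
    values = list(struct.unpack_from(f">{n}d", msg, offset + name_len))
    return uuid.UUID(bytes=uid_bytes), name, values
```

In a real deployment the deserializer would run in the receiving application (or cloud service), with the UUID tying the incoming object back to its source, which is what makes the transfer independent of any physical file or database layout.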
Mara Abel from the Federal University of Rio Grande do Sul, Brazil reported on an ambitious attempt, carried out with help from IBM, to recast the PPDM ‘What is a well’ document as an ontology-based standard using the web ontology language OWL. At issue is the fact that the concept of a well changes according to who is producing data. Geologists, reservoir engineers, drillers and managers each have their own viewpoint. The PPDM Wiaw document sets out to provide common ground with definitions of key well components. While these have advanced understanding of the well lifecycle, Mara’s group wanted to see if an ontological analysis of the Wiaw concepts could lead to an even more rigorous set of definitions that would be amenable to machine-to-machine interactions. The Wiaw concepts were formalized into a domain ontology, leveraging the BFO upper ontology. The OWL definitions and Wiaw concepts can be downloaded for visualization in the ontologist’s modeling tool of choice, Protégé. The modeling process revealed certain ‘ambiguities, conflicts and vagueness’ in the Wiaw definitions. Mara noted in conclusion that the ontological analysis ‘was carried out without the support of authors in clarifying the meaning of the concepts.’ It is hoped that feedback from the PPDM standard authors will clarify how the definitions should be interpreted and extend the formalization with the axioms whose absence currently limits the application of ontology in exploration.
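The payoff of such a formalization is that class definitions become machine-checkable. The toy sketch below illustrates the idea with transitive subClassOf inference over a handful of triples; the class names are illustrative inventions, not the actual Wiaw vocabulary, and real work would use OWL axioms and a reasoner via Protégé rather than hand-rolled Python.

```python
# Toy stand-in for OWL subClassOf axioms: (subject, predicate, object)
# triples. Class names are illustrative, not the actual Wiaw vocabulary.
AXIOMS = {
    ("Wellbore", "subClassOf", "WellComponent"),
    ("WellheadEquipment", "subClassOf", "WellComponent"),
    ("SidetrackWellbore", "subClassOf", "Wellbore"),
}

def is_subclass(cls: str, ancestor: str, axioms=AXIOMS) -> bool:
    """Transitive subClassOf inference — the kind of machine-to-machine
    reasoning a rigorous ontological formalization enables."""
    if cls == ancestor:
        return True
    return any(is_subclass(parent, ancestor, axioms)
               for subj, pred, parent in axioms
               if subj == cls and pred == "subClassOf")
```

With rigorous definitions, a consuming application can answer ‘is this object a kind of well component?’ without a human interpreting prose definitions.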
Jay Hollingsworth explained how Energistics’ latest standards, in particular Witsml 2.0, are now based on the common technical architecture. This allows for information on the trustworthiness of data to be transferred in real time, along with the bulk data. The new data assurance object is designed to make data ‘auditable, traceable, and fit for purpose.’ While the new functionality does not address the issue of data quality, it does provide support for ‘processes and rules that standardize data management, simplify data gathering for data analytics, and help reduce software development life cycles.’ Information on sensor data precision, calibration and other parameters can be included in the transfer. Rules may be determined by the values of the data itself or may be defined by company policy. As an example, a company may mandate that well location information must be provided as a WGS84 latitude, longitude pair. Energistics is now working on a blueprint to help companies realize the full potential of the new data assurance object.
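The WGS84 policy rule cited above can be made concrete with a short validation sketch. This is our own minimal illustration of the kind of company-policy rule the data assurance object supports, not Energistics code; the record field names are assumptions.

```python
def check_wgs84_location(record: dict) -> list:
    """Policy rule from the text: well location must be provided as a
    WGS84 latitude/longitude pair. Returns a list of failure messages
    (empty list means the record passes). Field names are illustrative."""
    failures = []
    if record.get("crs") != "WGS84":
        failures.append("location CRS must be WGS84")
    lat, lon = record.get("latitude"), record.get("longitude")
    if lat is None or not -90.0 <= lat <= 90.0:
        failures.append("latitude missing or out of range")
    if lon is None or not -180.0 <= lon <= 180.0:
        failures.append("longitude missing or out of range")
    return failures
```

A data-value-driven rule (e.g. flagging a sensor reading outside its calibrated range) would take the same shape, with the failure messages travelling alongside the bulk data as assurance metadata.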
Joanna Reast works in ConocoPhillips’ Analytics innovation center of excellence (the AICoE), founded in 2014 to support data science initiatives. In parallel, an enterprise data curation function was established to streamline data collection for the AI/machine learning effort. Curating data for analytics requires different processes and skills from traditional data management. New skills include discovering and evaluating new data sources and navigating regulations on the use of personal data. When a source of interest has been found, a sustainable and auditable data pipeline is created into the analytics environment. Data sources of interest include in-house data warehouses, competitor databases, vendor portals and application data stores.
Ron Clymer presented Devon’s ‘Sifter’ well log management solution, an in-house development around components from IHS (LOGarc), Perigon Solutions (iPoint) and EnergyIQ (TDM). The Devon Sifter is a log quality assurance methodology that captures a log’s confidentiality, vintage and provenance, runs files through a common suite of business rules and quarantines failures. Sifter performs a triage function to send logs into the appropriate LOGarc instances, keeping, for instance, MWD-sourced data separate from wireline or interpreted logs. iPoint’s curve alias tables assure consistent tagging of logs. Sifter also interfaces with Microsoft Outlook to capture contextual metadata in emails. The environment now feeds QC’d data into Halliburton/Landmark’s DecisionSpace Geoscience interpretation environment and the underlying OpenWorks data store. The DecisionSpace Integration Server now provides ‘near real-time’ data integration, bulk loading and automatic versioning. OpenWorks also delivers Sifter-derived source context information to end-users. Another beneficiary of the Sifter-curated data set is Devon’s data and analytics team, which can now access regional-scale well information in a Hadoop data lake of over 10 million files and 80 billion rows of curve data.
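The rules-then-route triage pattern described above can be sketched briefly. This is a generic illustration of the technique, assuming hypothetical field names and destinations; Devon’s actual Sifter rules and LOGarc instance names are not public.

```python
def triage(log: dict, rules) -> str:
    """Run a log header through a rule suite; quarantine on any failure,
    otherwise route by acquisition type (keeping MWD-sourced data apart
    from wireline or interpreted logs, per the separation described above).
    Field names and destinations are illustrative."""
    if any(not rule(log) for rule in rules):
        return "quarantine"
    routes = {"MWD": "logarc_mwd", "wireline": "logarc_wireline"}
    return routes.get(log.get("type"), "logarc_interpreted")

# Illustrative business rules; the real rule suite is proprietary.
RULES = [
    lambda log: bool(log.get("uwi")),            # well identifier present
    lambda log: log.get("vintage") is not None,  # acquisition date known
]
```

Keeping the rules as plain data (a list of predicates) makes the suite easy to extend as new quality checks are agreed with the business.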
Teradata’s Jane McConnell picked up the data lake theme with a demonstration of the use of ad hoc analytics to investigate how different parameters influenced the repeatability of 4D, permanent ocean bottom seismic monitoring. (We reported from last year’s ECIM conference on Statoil’s 4D monitoring tests with Teradata.) Along with the seismic data itself, other data streams of interest included rGPS positions of the air gun array, gun fill time, air pressure and the exact fire time for each shot. Additionally, multiple dGPS systems provide per-second positional data of streamers, ship’s heading and speed. All making for a true ‘big data’ set in a wide range of data formats. In addition, ‘hindcast’ weather data was obtained from a third-party provider along with yet more spreadsheets and observers’ log data. All this was mashed up using a combination of SAXification and nPath to look for hidden patterns in the data, visualized with Sankey graphs. The search turned up trumps, allowing for further optimization of the shooting operations. McConnell warned that, while in other branches of analytics a correlation may be a sufficient basis for a business change, in scientific disciplines, especially in high-cost or safety-critical industries, it is necessary to demonstrate the underlying physical cause of a correlation. To do this requires an interdisciplinary team of data scientists, computer and domain specialists.
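‘SAXification’ refers to symbolic aggregate approximation (SAX), which turns a numeric time series into a short string of letters so that pattern-matching tools like nPath can work on it. A minimal sketch, with segment count and alphabet chosen for illustration (Teradata’s actual implementation details are not given in the talk):

```python
import bisect
import statistics

def saxify(series, segments=4, breakpoints=(-0.6745, 0.0, 0.6745)):
    """Minimal SAX: z-normalize the series, reduce it with piecewise
    aggregate approximation (PAA), then map each segment mean to a
    letter. These breakpoints split a standard normal distribution
    into four equiprobable bins, one per letter of 'abcd'."""
    mean = statistics.fmean(series)
    sd = statistics.pstdev(series) or 1.0   # guard against flat series
    z = [(x - mean) / sd for x in series]
    seg = len(z) // segments
    paa = [statistics.fmean(z[i * seg:(i + 1) * seg])
           for i in range(segments)]
    return "".join("abcd"[bisect.bisect(breakpoints, v)] for v in paa)
```

A steadily rising gun-pressure trace, for instance, SAXifies to a string like ‘abcd’, while a flat one maps to a single repeated letter, making trend shapes directly searchable as substrings.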
Gabriela Bantau described how ExxonMobil has evolved an ‘agile’ process to develop a user-friendly front end to ‘all available’ data sources. The process, known as ‘DevOps,’ involves coding, testing and releasing software prior to monitoring and evaluation of user behavior and fitness for purpose. Results of the evaluation are incorporated into the following development cycle. Exxon sees IT’s role in oil and gas as changing from ‘system-centered’ to ‘user-centered’ design. An internal ‘data accessibility’ project involved the creation of a front-end to the corporate repository to serve as the primary mechanism for data loading, visualization and export to engineering and geoscience applications, enabling the retirement of a number of legacy applications and workflows. The old ExxonMobil mindset was to develop a tool and ‘let the users figure out for themselves how to use it.’ Consumer technology has forced a rethink. Today’s services and solutions should be ‘seamless and simple,’ as they are in users’ day-to-day lives. Key DevOps paradigms leveraged in the process were the minimum viable product, the Kanban agile framework and a scrum methodology. All of which allowed users to be directly engaged with the development process, speeding development and cultivating trust between users and the project team. One takeaway was that the project did not adhere strictly to any of the above methodologies but rather combined appropriate facets of each.
Robert Albright (HuvrData) is ‘solving the drone data dilemma!’ Drones, aka unmanned aerial vehicles, are used today in mapping and surveying pipelines and facilities. Sensors on drones create considerable data volumes, maybe over 25 GB/day, which presents considerable data management challenges. Huvr has partnered with EdgeData to apply machine learning to analyzing drone imagery in near real time. While ML shows promise, much more development is needed. Tests on wind turbine blade imagery are encouraging, although surveys need to be flown in a very repeatable manner. Current accuracy is ‘about 85%.’ Huvr has also performed corrosion inspection of Hyundai’s Gusto P10000 ultra-deepwater drillship. Although some drone contractors claim ‘intrinsically safe’ hardware, Albright advises caution, ‘There are no Intrinsically safe drones that I am aware of!’ Moreover, guidelines and best practices for UAV use in industrial settings are nascent. Albright advises users to work with the industry on these, particularly through Stress Engineering’s Drone Users Group.
Last year, Chris Josefy (EP Energy) advocated ‘small’ data management. He now wants to ‘kill’ traditional data management! While all strive for good data delivery, it is the ‘how’ that is the problem. The ‘traditional’ model, as exemplified by the Data Governance Institute’s DGI Framework, is highly formalized, with committees, councils, data stewards, policies and definitions. Instead, Josefy argues for ‘stealth data governance.’ Rather than creating new positions, ‘identify those already doing the work and add accountability and expectations.’ That way anyone can be a data steward and the tools of the trade are those you already have. Josefy’s presentation reminded us of a much earlier advocate of ‘non-traditional’ upstream data management. Back in 1998 we reported from an earlier PNEC where Unocal’s Marion Eftink introduced the concept of ‘back-door’ data management with a similar objective.
More from PNEC.
* Chaîne Intégrée Géosciences.
© Oil IT Journal - all rights reserved.