2018 PNEC E&P Data Management Conference, Houston*

ConocoPhillips on the ‘million wells yet to be drilled in the US.’ Total’s data management revolution. Fixing Shell’s data ‘Tower of Babel.’ DrakeWell, ‘your data pipeline is your competitive advantage.’ BP’s GIS-enabled common operating picture. CGG on machine learning and document classification. Newfield’s daily text crawler. Talus, ‘don’t put all your data online.’ Shell, the Data Doc and sustainable petrophysical workflows. Enaxis defines the data lake. Wipro on the power of ‘small’ data management. Anadarko and Perigon’s ‘intelligent autoloader.’ PDS’ open source Witsml server.

The Petroleum Network Education Conferences (PNEC) annual E&P data management conference was acquired by PennWell (publisher of the Oil & Gas Journal) in 2013. PennWell itself was acquired by Blackstone-backed event organizer Clarion just before the 2018 event. The 2018 turnout topped 500.

ConocoPhillips’ Greg Leveille sees data management as blending into analytics. As such, it is part of one of the two recent ‘revolutions’ in oil and gas, analytics, the other being shale, the unconventional revolution. We are living through a period of ‘amazing change.’ 90% of US wells are horizontal and the volumes found in the past ten years are mindboggling! There are maybe a million wells yet to be drilled into US unconventional targets, with correspondingly massive future resources exceeding the 450 billion barrels of historical production. Data analytics is key to completion design, a multi-dimensional problem spanning production, geoscience, rock mechanics and more. ConocoPhillips has seen a 50% reduction in shale ‘spud to spud’ time with the application of analytics. Unconventionals represent a fantastic proving ground for new technology as experiments can be performed far more quickly than offshore. Leveille sees more digital/analytics progress to come. Another area of progress is automated drilling, ‘removing the human from the drilling process,’ leveraging technologies such as the instrumented top drive. Real-time measurements can be related to historical data in microseconds for optimization. In conclusion, ‘both revolutions are still relatively immature technically but already combine powerfully.’

Elodie Laurent reported on another ‘revolution,’ in Total’s data management. In the last few years, Total has re-tooled its data organization, introducing automation, tracking and auditing its processes, reviewing contracts and making more use of standards. The data management organization provides a single point of contact and a data gate ‘dropsite’ for well and seismic data. QC’d data is available through a web portal for upload into Sismage (Total’s in-house seismic workstation), Petrel, Geolog and its corporate databases. A SQL database tracks requests across the data workflow, adding comments and metadata. A ‘QSY’ app for SEG-Y seismic QC includes a map and section viewer, ‘a huge time saver.’
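
Laurent did not detail the tracking database. As a minimal sketch only, assuming a simple relational layout (table and column names are invented), such request tracking might look like this:

```python
import sqlite3  # stand-in for whatever RDBMS Total actually uses

conn = sqlite3.connect("data_requests.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS data_request (
        request_id   INTEGER PRIMARY KEY,
        dataset_type TEXT,   -- 'well' or 'seismic'
        received_at  TEXT,   -- drop-site arrival timestamp
        qc_status    TEXT,   -- e.g. 'pending', 'passed', 'failed'
        target_app   TEXT,   -- Sismage, Petrel, Geolog, corporate database
        comments     TEXT    -- free-text metadata added along the workflow
    )""")
conn.execute(
    "INSERT INTO data_request (dataset_type, received_at, qc_status, target_app, comments) "
    "VALUES (?, ?, ?, ?, ?)",
    ("seismic", "2018-05-01", "passed", "Sismage", "SEG-Y checked with QSY viewer"),
)
conn.commit()
```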

Leah Camilli was troubleshooting a massive spreadsheet containing data on Shell’s Brazilian unit and found critical data that had been overlooked. Also, different data was being used for the same objectives, a ‘Tower of Babel.’ Although users wanted ‘another database,’ this was the last thing they really needed. The crux of the matter was data visibility. Camilli deployed Spotfire with Alteryx in the background (this was later replaced with native Spotfire functionality). Spotfire now sees all original data through a common UWI linking its various corporate data stores. The solution provides Spotfire dashboard plots of lithology in ‘a successful combination of the Spotfire toolset with embedded data management personnel.’ ‘Tomorrow, AI may dominate the workplace but right now it’s people interacting with people.’
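
Camilli gave few implementation specifics. The underlying idea, joining separate stores on a common UWI ahead of visualization, can be sketched in a few lines of Python (data and column names are invented):

```python
import pandas as pd

# Invented extracts from two corporate data stores, both keyed on UWI
headers = pd.DataFrame({"UWI": ["300123", "300124"], "field": ["Campos", "Santos"]})
lithology = pd.DataFrame({"UWI": ["300123", "300124"], "top_lith": ["shale", "sand"]})

# A common UWI lets Spotfire (or any tool) see all original data in one view
combined = headers.merge(lithology, on="UWI", how="outer")
print(combined)
```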

Brian Boulmay provided an update on BP’s OneMap Esri-based geospatial data management system. BP’s GIS platform has expanded beyond its original upstream focus and is now used across the company, in shipping, pipelines, oil spill analysis and HSE. OneMap provides a ‘common operating picture’ for BP’s 11,000 users, blending BP functional data, local system-of-record data and an individual’s project data. Boulmay places GIS at the center of a hub, with geoscience, business intelligence and other domains at the edges of the platform, ‘location is key to everything.’ In the Q&A, Boulmay agreed that this picture may look different for other user communities. For instance, ‘SAP is peripheral to us, and we are peripheral to them.’ This is the advantage of the platform and shared/linked data approach.

Karen Blohm (CGG) reported on reevaluation work done for Kuwait Gulf Oil that included quick-look interpretation of a diverse set of new and legacy data. Some 200,000 files without metadata were processed using AgileDD’s ‘iQC’ machine learning automated document classification tool along with CGG’s own PleXus application. The whole process ran in a cloud environment and produced a ‘good enough’ deliverable in a ‘previously impractical’ time frame.
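
AgileDD did not disclose how iQC works internally. As an illustration only, and not a description of iQC, a supervised classifier over extracted document text could be bootstrapped along these lines (labels and training snippets are invented):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented hand-labelled sample used to bootstrap classification of the
# remaining unlabelled files
train_texts = ["final well report KGO-12 ...", "stacked seismic section line 42 ..."]
train_labels = ["well_report", "seismic_section"]

clf = make_pipeline(TfidfVectorizer(stop_words="english"),
                    LogisticRegression(max_iter=1000))
clf.fit(train_texts, train_labels)

# Classify a previously unseen document's extracted text
print(clf.predict(["core description and petrophysical summary ..."]))
```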

Scot Nesom explained that Newfield Exploration, as an unconventional player, sees a lot of data coming in daily. In 2017 this amounted to 600GB, 75 million documents and around 100TB of derived data. Nomenclature across this large and diverse data set can be ‘nuanced,’ leading to some 90% of project time being spent finding and preparing data and a measly 10% on analysis. To rectify the situation, Newfield built a crawler that searches and indexes everything. The system uses metadata relevance ‘boosters’ and captures usage patterns to enhance subsequent discovery. Some 7.5 billion terms are indexed and refreshed every 24 hours. The PPDM reference model helped fix the nomenclature issues. Semantic layers have been built for individual use cases, ‘not the rigid taxonomies of Livelink.’ Newfield has now ‘flipped the 90/10 rule.’
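
Newfield’s crawler is proprietary. A toy sketch of the ‘booster’ idea, in which hits in curated metadata fields outscore hits in raw document text (field names and weights are assumptions), follows:

```python
# Toy relevance scoring: hits in curated metadata outweigh hits in body text.
# Field names, weights and documents are invented for illustration.
BOOSTS = {"well_name": 5.0, "formation": 3.0, "body": 1.0}

documents = [
    {"well_name": "Smith 1-H", "formation": "Woodford", "body": "daily drilling report ..."},
    {"well_name": "Jones 2-H", "formation": "Meramec", "body": "Woodford offset comparison ..."},
]

def score(doc, term):
    term = term.lower()
    return sum(boost for field, boost in BOOSTS.items() if term in doc.get(field, "").lower())

hits = sorted(documents, key=lambda d: score(d, "Woodford"), reverse=True)
print([(d["well_name"], score(d, "Woodford")) for d in hits])
```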

David Morrison (Talus) observed that only 5% of corporate data is ever accessed, ‘so why spend more to have it all online?’ Data storage is costly, and costs escalate with the desired speed of retrieval. Paying millions for fast online systems doesn’t make much sense when all the users need is data on their local workstation. The cloud is seen as making such problems disappear. This is true to some extent, as cloud providers have massive economies of scale. But if you move your data center to the cloud, the intrinsic costs will be about the same and you will need a very high speed link to the cloud. Data in the cloud may be cheap to store but costly to retrieve. Putting all your data online is taking a sledgehammer to a simple problem of visibility. The answer? Give geoscientists good data visualization, give data managers storage options ... and check out Talus’ hybrid data storage offering.
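
Morrison quoted no figures. A back-of-envelope sketch with entirely hypothetical unit costs shows why retrieval (egress) charges matter when only a small fraction of the archive is ever touched:

```python
# Hypothetical unit costs ($/GB/month for storage, $/GB for egress) -- not vendor quotes
archive_tb = 500                 # total archive size, TB
accessed_fraction = 0.05         # 'only 5% of corporate data is accessed'
fast_online_cost = 0.10          # keep everything on fast online storage
cold_storage_cost = 0.01         # keep everything cold ...
egress_cost = 0.09               # ... but pay to pull back what is actually used

gb = archive_tb * 1000
all_online = gb * fast_online_cost
tiered = gb * cold_storage_cost + gb * accessed_fraction * egress_cost
print(f"all online: ${all_online:,.0f}/month   tiered: ${tiered:,.0f}/month")
```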

Randy Petit (Shell) cited Thomas Redman, Data Quality Solutions’ ‘Data Doc,’ who has it that the key to data quality is to get it right first time. Fixing data once it’s in the database is ‘unsustainable.’ The Data Doc’s rule of ten holds that if data costs $1 to capture, it costs $10 to go around the loop and fix when bad, and $100 when a decision is made on bad data. This philosophy has informed Shell’s ‘sustainable petrophysical workflow,’ which implements multiple controls upstream at the time of capture. The use of commercial software can make it hard to implement data quality controls. Shell has built its own dashboard that checks new logs against multiple attributes. The dashboard executes monthly checks on the log database to find and fix errors and perform root cause analysis. This has led to fewer ‘out of control’ situations and load time is now under a week. Logging contractor contracts also need tweaking to ensure incoming data quality.
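
The dashboard itself was not shown. A minimal sketch of the kind of attribute checks it might run on newly loaded logs (column names and valid ranges are invented) follows:

```python
import pandas as pd

# Invented extract of newly loaded log curves
logs = pd.DataFrame({
    "uwi": ["W-001", "W-001", "W-002"],
    "depth_m": [1500.0, 1500.5, -10.0],   # negative depth should be flagged
    "gr_api": [85.0, None, 40.0],         # missing gamma ray value
})

checks = {
    "depth_in_range": logs["depth_m"].between(0, 10000),
    "gr_present": logs["gr_api"].notna(),
}
for name, passed in checks.items():
    bad = logs[~passed]
    if not bad.empty:
        print(f"check '{name}' failed for rows:\n{bad}\n")
```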

Tommy Ogden (Enaxis Consulting) has been working with an unnamed supermajor on a GIS/data lake. But what is a data lake? For Ogden, it is a heterogeneous accumulation of relational databases, imagery, video, PDFs and other documents, all of which arrive from data ‘tributaries that stream into the lake.’ Users are allowed read-only access. Key to such access is the data catalog, which ‘goes beyond an index with data definitions, formats and constraints.’ The catalog allows a schema-on-read approach, as opposed to the schema-on-write of the RDBMS. Access is provided to a refined data area for analysis with Spotfire/Tableau/SQL Server. Alternatively, data may be pulled into a ‘user-defined’ area for discovery and use by ‘citizen data scientists.’
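
To illustrate the schema-on-read idea, here is a small sketch in which the catalog entry, rather than the storage layer, supplies the definitions, formats and constraints applied at read time (the catalog contents are invented):

```python
import io
import pandas as pd

# Raw file lands in the lake untyped; nothing is enforced at write time
raw_csv = io.StringIO("uwi,spud_date,td\nW-001,2017-03-04,3450\nW-002,2018-01-15,2980\n")

# Catalog entry supplies definitions, formats and constraints at read time
catalog_entry = {"dtypes": {"uwi": "string", "td": "float64"}, "parse_dates": ["spud_date"]}

df = pd.read_csv(raw_csv, dtype=catalog_entry["dtypes"],
                 parse_dates=catalog_entry["parse_dates"])
assert df["td"].gt(0).all(), "constraint from catalog: total depth must be positive"
print(df.dtypes)
```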

Mark Priest (Wipro) sees data management as (still) fighting an uphill battle. Management tends to switch off and customers come back at you fighting! This has been an issue for three decades, but why? To an extent it is down to apathy towards ‘over-hyped’ data management and the perception that ‘it’s just IT.’ In the aftermath of the 2014 downturn, we are all asked to ‘do much more with less.’ Enter ‘small data management,’ i.e. things that you can do without a budget! Keep your eyes and ears open to your customers’ pain points. Learn Windows PowerShell, Unix scripting, Python, R and sed/awk. Count the hours spent on a painful process. Or just count things (and time-stamp them to establish trends). Start out by hooking an Excel table to a database and ‘in a few hours you have a dashboard.’ This you can use to identify data issues or to build a master table of who needs to see what data and link it to Active Directory. Some zero-cost ‘skunky’ projects are still running six years later. Priest cited a home-brew clone of OpenIT that tracks the use of ‘very expensive’ software.
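
In the spirit of Priest’s zero-budget examples, hooking an Excel table to a database and counting things can be sketched with nothing more than Python’s standard data stack (file and column names are invented):

```python
import sqlite3
import pandas as pd

# Pull a shared Excel tracking sheet into a queryable table
df = pd.read_excel("data_requests.xlsx")       # invented spreadsheet name
df["loaded_at"] = pd.Timestamp.now()           # time-stamp to establish trends

with sqlite3.connect("small_dm.db") as conn:
    df.to_sql("requests", conn, if_exists="append", index=False)
    # Count things: open issues per data type, ready for a dashboard front end
    summary = pd.read_sql(
        "SELECT data_type, COUNT(*) AS open_issues FROM requests "
        "WHERE status = 'open' GROUP BY data_type", conn)
print(summary)
```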

Chris Hanton (Perigon Solutions) reported on a core database project performed for Anadarko. Much legacy core data arrives as scanned imagery and reports rather than as digital data. And when core or PVT data is digital, it can come in multiple vendor formats that change over time. Once data is structured, loading with a conventional linear data flow including QC results in ‘many bottlenecks and redundancies.’ Enter a new approach to data transfer: lose the ‘linear’ transfer and go for ‘quick, efficient population’ of clean data into databases and analytics. Perigon supports ‘intelligent autoloading/crowdsourced’ data loading from a shared staging area. A generic query builder enables connections to any JDBC-compliant database. Anadarko’s James Miller took over to confirm the poor state of Anadarko’s data prior to Perigon’s intervention. The project was initiated following a top-level request from Anadarko’s new advanced analytics and data science team. iPoint was chosen for core data consolidation, initially as a manual solution and later with bespoke bulk smart data loaders. This has seen an 80-90% success rate in integrating core data with master well header records for some 115,000 wells. Data is now available for analytics in tools such as Denodo and R, and for use in E&P interpretation packages.
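
Perigon’s loaders are proprietary. A much-simplified sketch of the shared staging-area pattern, in which anything dropped into a folder is parsed and pushed to the database once it passes a basic check (paths and schema are invented), might read:

```python
import pathlib
import sqlite3
import pandas as pd

STAGING = pathlib.Path("staging")          # shared drop area for core data files

with sqlite3.connect("core_db.sqlite") as conn:
    for path in STAGING.glob("*.csv"):     # each vendor export lands here as CSV
        df = pd.read_csv(path)
        if {"uwi", "depth_m", "porosity"}.issubset(df.columns):   # basic sanity check
            df.to_sql("core_analysis", conn, if_exists="append", index=False)
            path.rename(path.with_suffix(".loaded"))
        else:
            print(f"{path.name}: missing required columns, left in staging for review")
```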

Finally, a note from PNEC exhibitor PDS which has developed an open source Witsml server. This acts as a format converter from LAS, LIS and WITS to the ‘standard’ Witsml mandated by larger operators. Many smaller logging companies do not have the resources to perform such transformations. The issue is all the more problematic as not all majors mandate the same Witsml version, and the conversion is non-trivial.
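
Conversion specifics vary with the Witsml version. As a rough illustration only, and not the PDS implementation, reading a LAS file with the lasio library and emitting a simplified Witsml-style log XML might look like this (element names are simplified and not schema-compliant):

```python
import xml.etree.ElementTree as ET
import lasio

las = lasio.read("example.las")                 # invented input file

# Assumes the LAS well section carries a UWI header item
log = ET.Element("log", uidWell=str(las.well["UWI"].value))
for curve in las.curves:
    ET.SubElement(log, "logCurveInfo", mnemonic=curve.mnemonic, unit=curve.unit or "")

data = ET.SubElement(log, "logData")
for row in zip(*(c.data for c in las.curves)):  # one comma-separated node per depth step
    ET.SubElement(data, "data").text = ",".join(str(v) for v in row)

ET.ElementTree(log).write("example_witsml.xml", xml_declaration=True, encoding="utf-8")
```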

The next PNEC conference will be held in Houston from the 21st to 23rd May 2019. More from PNEC Conferences.

* on-the-spot report.


© Oil IT Journal - all rights reserved.