ECIM 2015, Haugesund

NPD, ‘data management key value.’ RocQC, ‘data management in bad shape.’ Statoil’s model store. Shell’s data health checks. Schlumberger’s data/IM as a service. Halliburton, CGG, Teradata on upstream analytics. CDA on data in the downturn. OMV and ArcticWeb. Hampton’s remote data QC.

The NPD’s Maria Juul celebrated 20 years of ECIM and 50 years of activity on the Norwegian shelf, stating that data is a keystone of the Norwegian exploration model. NPD’s role is to advise government and to govern data publishing, release and use. Industry and government’s data objectives may differ as companies seek exclusivity and confidentiality while the government strives for sharing and cooperation. Tax breaks mean that 78% of data costs are refunded, so government ‘has some say in the matter.’ Competition for new licenses should be based on creativity and new concepts rather than on access to data. Diskos has been a major contribution to the Norwegian economy and data management has been key to value creation and growth.

Ian Barron (RocQC) observed that while there is a general perception that oil and gas data management is ‘healthy and doing well,’ he sees it as in ‘bad shape and getting worse.’ Presentations emphasize data management successes and the same issues are ‘solved’ again and again. The reality is that the last few years have seen more catastrophic failures and users are striving for better data. Barron enumerated a few data management busts: wells without a depth reference or with bad geodetics. Elsewhere, different departments in the same NOC use different coordinate reference systems. Data is being trashed during loading to interpretation systems, producing an ‘inextricable mess.’ There remains a universal misconception that a lat/long pair specifies a unique location! Budget cuts mean that often there is nobody left to fix such issues. Barron concluded by asking how many companies were prepared to certify their data to an ISO quality standard. There were no takers.

Maria Lehmann presented a data ‘success’ in the form of Statoil’s ‘corporate model store’ (CMS) for subsurface geological models. The CMS is designed to provide life-of-field support, with audit trails of who did what and which software versions and parameters were used. The solution was developed to address rapidly growing storage requirements and unsustainable archiving and management costs. The CMS has cut the number of models from 67,000 to 17,000 with good metadata. 110 terabytes were deleted in the ‘massive’ cleanup of (mostly) Roxar RMS projects.

Shell’s Andries Helmholt presented on the value of technical data ‘health checks.’ These are conducted at Shell’s business units twice yearly, in a combination of self-assessment and peer review. A week is spent checking data metrics against business rules and compliance. A ‘checklist-guided conversation’ leads into a ‘no surprises’ meeting with the business and a final report. ‘It may sound dry but it is real fun and puts a happy face on the business.’ Technical data management is intense but rewarding and essential to ensure that data remains ‘alive and kicking.’

Teradata’s Niall O’Doherty proposed to bust some big data myths as follows. Myth n° 1, big data is something new. O’Doherty cited Oil IT Journal’s account of the seminal PNEC presentation of Wal-Mart’s Teradata-based data environment. In 2006 Wal-Mart had a 900 terabyte data environment and could ‘answer any business question at any time.’ Today this has expanded to a multi-petabyte store in a single repository. Myth n° 2, big data is ‘big.’ It is actually more about flexibility in data exploration. Myth n° 3, it is about Hadoop. ‘This one drives me crazy.’ Few companies have achieved value from Hadoop. Yahoo and Google don’t use it; they use Spark, Ceph, Presto and Apache Mesos. The key is to use technology that relates to your problem and to remember that ‘explicit or implicit, there is always at least one schema!’ Myth n° 4, the age of science is over, the data will tell us everything. No! Simple models and lots of data always trump elaborate models based on less data. Your experts are still needed. Myth n° 5, it is for big companies. Big data is for all. It is a component of ‘data management 2.0.’ What is needed is some self-education on what is possible. You should be able to answer your management regarding Presto, Ceph and so on, and be ready to bring a competitive advantage to your data strategy.

Anne-Sophie Beck Sylvesteren described how Schlumberger is providing remote data and information management as a service. The client, a UK oil major (who could that be?), was struggling to get buy-in for a multi-year, multi-million dollar data project whose value was appreciated neither by the business nor by data managers. Moreover, data management did not fit into the ITIL framework used by IT. Objectives were finally met with a shift from a project to a service focus. Data management is achieved remotely using Schlumberger’s secure VPN link to its service hub in Pune, India. Data managers and petrotech support work on data that stays in the US and UK. A web-based system allows end users to load data into project workspaces.

Halliburton’s Lapi Dixit opined that E&P has a legitimate claim to being one of the top industries dealing with big, complex data, but that realizing the value is hard. Part of the problem is poor definitions and ‘old business intelligence solutions repurposed as big data.’ Geology doesn’t fit the paradigm and predictive analytics don’t work on seismic data. The big data toolbox may not be so applicable to E&P. Having said that, acquisition-through-processing workflows are inefficient and an obstacle to holistic seismic analysis. Machine/deep learning could be used to speed seismic processing and to analyze stuck pipe and other causes of non-productive time. Data should be kept alive in a virtual pool leveraging a big data compute architecture (Apache Spark, SparkSQL, Streaming, MLlib and GraphX) and Landmark Earth, the ‘industry’s first complete E&P cloud offering,’ a.k.a. a ‘converged infrastructure appliance.’
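
By way of illustration, here is a minimal, hypothetical sketch (not Landmark’s implementation) of the kind of Spark/MLlib workflow alluded to above: drilling records held in a shared DataFrame ‘pool’ and a classifier trained to flag stuck-pipe risk. The column names and toy data are assumptions.

```python
# Hypothetical sketch only: keep drilling records in a Spark DataFrame 'pool'
# and fit an MLlib classifier to flag stuck-pipe risk from a few parameters.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("npt-sketch").getOrCreate()

# Toy drilling records: hook load, torque, rate of penetration, stuck flag.
rows = [(250.0, 12.0, 30.0, 0.0), (310.0, 28.0, 5.0, 1.0),
        (240.0, 10.0, 35.0, 0.0), (300.0, 25.0, 8.0, 1.0)]
df = spark.createDataFrame(rows, ["hookload", "torque", "rop", "stuck"])

# Assemble a feature vector and train a simple logistic regression model.
features = VectorAssembler(inputCols=["hookload", "torque", "rop"],
                           outputCol="features").transform(df)
model = LogisticRegression(labelCol="stuck").fit(features)
model.transform(features).select("stuck", "prediction").show()
```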

Henri Blondelle (CGG) provided a more concrete take on how to wield the data science Swiss army knife in the service of upstream data management. CGG’s iQC prototype big data application replaces manual data load and QC with machine learning-based database population. CGG’s new tools of the trade include MapReduce, decision trees, latent semantic analysis and clustering. iQC establishes a link between unstructured (text) information and the database. A hybrid approach involves a first-pass automated classification which is checked by a human expert for errors. The errors are then fed back to improve the classifier. An iQC prototype was demoed on 4,000 documents from the UK’s CDA database to extract well names and document types. CGG plans to embed iQC in its ‘next generation’ Plexus data management platform and is inviting interested parties to join in its big data learning effort.
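
A minimal sketch of such a hybrid classify-and-correct loop, assuming a scikit-learn stack (this is an illustration, not CGG’s iQC): latent semantic analysis feeds a decision tree classifier, and expert corrections are folded back into the training set. The toy documents and labels are invented.

```python
# Hypothetical sketch of a hybrid classification loop: LSA + decision tree,
# with human-corrected labels fed back into the training data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Toy corpus standing in for well documents; labels are document types.
train_docs = ["final well report 15/9-19", "composite well log listing",
              "drilling programme and mud summary", "core analysis report"]
train_labels = ["report", "log", "programme", "core"]

lsa = make_pipeline(TfidfVectorizer(), TruncatedSVD(n_components=3))
clf = DecisionTreeClassifier(random_state=0)
clf.fit(lsa.fit_transform(train_docs), train_labels)

# First-pass automated classification of newly scanned documents.
new_docs = ["well log listing for 15/9-20", "core photographs and analysis"]
predicted = clf.predict(lsa.transform(new_docs))

# A human expert reviews the predictions, corrects any errors ...
corrected = ["log", "core"]  # expert-confirmed labels

# ... and the corrections are appended to the training set so the
# classifier improves on the next pass.
train_docs += new_docs
train_labels += corrected
clf.fit(lsa.fit_transform(train_docs), train_labels)
```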

Duncan Irving (Teradata) presented a machine learning approach to basin analysis on a dataset from New Zealand’s Taranaki basin. Well logs were processed using ‘SAX,’ Teradata Aster’s ‘symbolic aggregate approximation,’ a.k.a. Sax-ification. This represents log curve values as discretized alphanumeric buckets, e.g. AABBCBAA. Facies are identified with ‘dynamic time warping’ using nPath, Teradata Aster’s regular pattern matching toolset. The approach identified previously unnoticed features such as hot shales and other facies. Pierre Marchand showed how KMeans clustering and nPath can be used to analyze production volumes, well status and stratigraphy. KMeans and nPath are just two of the 100 or so big data techniques available in Teradata’s Aster 6 toolset.
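
For the curious, a minimal Python sketch of SAX-style discretization (an illustration, not Teradata Aster’s implementation): z-normalize the curve, average it over segments, then map each segment mean to a letter using Gaussian breakpoints. The synthetic ‘gamma ray’ curve and the function name are assumptions.

```python
# Sketch of symbolic aggregate approximation (SAX) for a 1-D log curve.
import numpy as np
from scipy.stats import norm

def sax_word(curve, n_segments=8, alphabet="ABCD"):
    """Return an n_segments-letter SAX word; curve length must be a
    multiple of n_segments."""
    x = (curve - curve.mean()) / curve.std()        # z-normalize
    paa = x.reshape(n_segments, -1).mean(axis=1)    # piecewise aggregate means
    # Breakpoints that split a standard normal into equiprobable regions.
    cuts = norm.ppf(np.linspace(0, 1, len(alphabet) + 1)[1:-1])
    return "".join(alphabet[np.searchsorted(cuts, v)] for v in paa)

# A synthetic curve: 64 samples reduced to an 8-letter bucket string.
rng = np.random.default_rng(0)
log_curve = np.cumsum(rng.normal(size=64))
print(sax_word(log_curve))   # e.g. a word of the form 'AABBCBAA'
```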

Terry Alexander provided a summary of the ECIM/CDA workshop held earlier this year on the subject of staying ‘sustainable’ at $60 oil. Speakers called for less tax, more (even free!) seismics and better IT and R&D collaboration. Big data/analytics is seen as an antidote to the loss of corporate memory as folks are laid off. Decommissioning is seen as a growth area for data managers, once the associated regulations have been established. Geotechnical data has bypassed everybody; let’s get on the bandwagon!

Jens Jacobsen showed how OMV is using the ArcticWeb data portal to curate large volumes of public data and to plan HSE/medevac for its exploration wells. The app captures medevac regulations, distance to shore, weather windows (icebergs) and other parameters that allow OMV to judge whether a more detailed analysis is required. ArcticWeb embeds Kadme’s Whereoil system.

Arun Narayanan (Schlumberger) and Lapi Dixit (Landmark) both set out their big data/analytics stalls. For Narayanan, analytics need to embed end-user workflows and must be ‘more than dashboards.’ Analytics is a parallel track to the classical ‘engineered’ approach of well and seismic interpretation. Showing a slide with a hundred or so buzzwords, Narayanan intimated that we need to ‘understand all of these.’ A study on public domain data from the Eagle Ford shale enabled an ‘84% accurate’ forecast of production.

Landmark advocates a ‘different path’ to data management where ‘open source’ is no longer a dirty word. Although the oil industry is in decline, tech is booming. Open source presents an opportunity to leverage new technology even as E&P’s IT spend is down. Dixit’s buzzword pastiche included Spark, MLlib and GraphX, ‘all in memory,’ arguing for a vendor-operated solution for all of the above.

Wally Jakubowicz (Hampton Data Services) commented on the dichotomy between the data manager’s desire for order and the reality of multiple file systems and duplicated data. Attempting to replace such chaos with ordered, cleansed, databased information is ‘difficult and mostly impossible.’ A preferred route is to use continual, domain-specific data mining to monitor changes. This is done in a cloud-based service that provides the E&P data manager with a dashboard and GIS window showing what objects have been added or edited, what is duplicated and where it is located. Hampton Data’s GeoScope offers remote analysis of company metadata in a ‘cloudy’ data management solution.
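
The following is a hypothetical sketch (not Hampton’s GeoScope) of the kind of metadata crawl such a service relies on: walk a file tree, fingerprint each file by content hash, and report duplicates and recently modified objects. The root path is illustrative only.

```python
# Sketch of a file-system metadata crawl for duplicate and change detection.
import hashlib, os, time
from collections import defaultdict

def crawl(root, recent_days=7):
    by_hash = defaultdict(list)          # content hash -> list of paths
    recent = []                          # files modified inside the window
    cutoff = time.time() - recent_days * 86400
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            by_hash[digest].append(path)
            if os.path.getmtime(path) > cutoff:
                recent.append(path)
    duplicates = {h: p for h, p in by_hash.items() if len(p) > 1}
    return duplicates, recent

dups, changed = crawl("/data/projects")   # illustrative path
print(f"{len(dups)} duplicated objects, {len(changed)} changed this week")
```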

Tim Hollis (Schlumberger) presented a structured approach to migrating OpenWorks and GeoFrame data into Studio for Wintershall. The approach used methodology from the Technology Services Industry Association. One key enabler was the fact that Wintershall gave Schlumberger access to its IT environment. More from ECIM.

