ECIM at 25

The 25th edition of the Haugesund EU Community for Information Managers (ECIM) conference was sold out. We report on the following: Equinor – ‘data ages like wine’. 50 years of the NPD. Wintershall: ‘climate KPIs a huge data management problem’. OSDU experience at Shell and Equinor. Dell – ‘industry coalescing around OSDU’. DataOps 101. Diskos 2.0. NSTA on ‘Section 34’ infrastructure reporting. Offshore Energies UK – ESG reporting via the SEQual portal. Halliburton – ‘data management is going to disappear’. SLB and the ‘sheer weight of buzzwords!’

In his wide-ranging keynote address to the 2022 ECIM Data Management Conference in Haugesund, Norway, Equinor’s Harald Wesenberg stated that ‘Apps age like fish, but data ages like wine’. This explains why so many digital oilfield projects fail (not only in Equinor!). Development methodologies like scrum and agile are easy and cheap to deploy. But software development is not like civil engineering: there are no first principles to build on and little relationship between cost and delivery. Agile teams may involve many players but not, perhaps, so many users. Successful software comes from successful user conversations. Development teams need to be immersed in the environment so that they build what users need and understand the data.

This is where things get complicated. Sensors are inherently unreliable and information needs context. ‘Do you know what you are measuring?’ A term like ‘cargo’ can mean different things to different stakeholders, and terminology alignment between domain specialists and data scientists is tricky. Again, context is key to capturing ‘silent deviations’, i.e. unknown local practices and adjustments. Applying manual data checks up-front may lead to stress and overwork. There are frequently differences between how outsiders think a process works and what is really going on. Developers have strong opinions on how software should work. The project manager in the middle translates requirements, but may have his/her own biases (groupthink). Apophenia, the tendency to perceive meaningful connections between unrelated things, is another gotcha.

Wesenberg warns of the ‘3 Cs’ – career, competence and corruption – which have led to falsified nuclear safety data, questionable Alzheimer’s R&D and more. Inspiration may be found in the work of Steven Shorrock. To understand and manage our complex systems, Wesenberg suggests the Cynefin framework for managing complexity. In any event, ‘What got me here will not get you here – you need to make your own journey’. Wesenberg wound up with a quote from Jeff Bezos: ‘When anecdotes and data disagree, the anecdotes are usually right!’ In other words, ‘you may have been measuring wrong’.

Hilde Nordbo traced the 50-year history of the Norwegian Petroleum Directorate (NPD). With the 1969 discovery of Ekofisk it was clear that the oil business was going to be massive for Norway. The NPD has maximized the value from oil and gas for Norwegian society and Norway now has ‘the best oil industry in the world’. A key component here has been the proper management of resources and of data, and the establishment of a national geoscience resource was prescient. Geobank houses physical samples, logs and tapes, including some 160 km of cores, while Diskos holds 13 petabytes of data. Diskos has also been a success factor: a recent huge find is attributed to reworking old data in the repository. In the context of the energy transition, is oil still necessary? ‘Yes, in combination with emerging sources.’ All this is part of the NPD’s preparation for the future, with data repurposed for new uses.

Mathias Hartung (Wintershall Dea) addressed the issue of data governance in a ‘mature yet transforming operator’, faced with the challenges of post-merger integration, energy security, the pandemic and ‘net zero’. The focus today is on operational efficiencies, quality decisions and compliance/reporting. This all mandates data quality and fixing the ‘garbage in, garbage out’ problem which ‘even the best data science cannot fix’. Hartung leverages a SIPOC approach (supplier – input – process – output – customer) which is applied ‘right to left’ (q.v. ‘right to left thinking’ elsewhere in this issue). ‘SIPOC fixes GIGO’ in a process that goes from Scada systems into Wintershall’s data warehouse (an implementation of Steinhaus Information Systems’ TeBis historian), a hub that feeds data out to stakeholders in the GPOE unit (well integrity – see also Cegal on ‘New ways of working’), reservoir management (OFM), ESG and finance. Wintershall’s dedicated data and information management organization has a direct line to corporate management. Like the data, the unit crosses all departments: subsurface, process, information management and data science. Documents are managed in a combination of OneDrive, SharePoint and (for enterprise content required for compliance and reporting) a dedicated document management system with links to the physical archive of paper documents. Hartung observed that Wintershall’s climate change KPIs are ‘exploding’ and are migrating from manual estimates to automated reporting and management (see the sketch below). There will be some 50k climate KPIs to report on in 2023, a ‘huge data management problem’. Data governance is essential for digitalization and can be done by professionals focusing on business processes and data readiness. Governance delivers efficiencies, boosts morale and assures compliance. ‘Data and information flow is what we do.’ Hartung is open to collaboration on data governance best practices.
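
By way of illustration, here is a minimal Python/pandas sketch of what migrating one such KPI from manual estimates to automated reporting might look like. The file name, tag name, units and emission factor are all assumptions for the example, not Wintershall’s actual TeBis setup.

```python
import pandas as pd

# Hypothetical historian export: timestamped flare gas flow from a Scada tag.
flare = pd.read_csv("historian_flare_export.csv", parse_dates=["timestamp"])
flare = flare.set_index("timestamp")

EMISSION_FACTOR = 2.7  # t CO2 per t of flare gas burned (illustrative value)

# Aggregate high-frequency flow readings (t/h) to daily tonnes of gas,
# then convert to a daily CO2 KPI ready for automated ESG reporting.
daily_gas = flare["flare_flow_t_per_h"].resample("1D").mean() * 24
daily_co2 = (daily_gas * EMISSION_FACTOR).rename("flaring_co2_t")

print(daily_co2.reset_index().head())
```

Multiply this by thousands of tags and many KPI definitions and the scale of the ‘50k climate KPIs’ data management problem comes into focus.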

Henk Tijhof presented Shell’s OSDU* experience to date and its take on OSDU strategy and vision. Shell initiated the OSDU program, contributing components of its in-house developed SDU, the subsurface data universe. Technology is changing relative to legacy systems and Shell is ‘trying to figure out where we are going’, with a focus on workflows. Some users are keen, but many are hard to convince. The majority say ‘OK, if you want to change, go ahead – I have a day job!’ The expectation is of company-wide access to all global data for AI/ML workflows. Such business processes can be digitally orchestrated in the cloud, but Shell is struggling with data in the cloud. Shell’s subsurface and wells (SSW) digital ecosystem is more than OSDU. With the move to an OSDU future cloud architecture, Shell ‘wants to partner, not to develop our own solutions’. With the new ways of orchestrating business workflows, it remains important to keep the benefits of legacy apps. OSDU needs more market solutions. Data migration is an issue: OSDU is currently more of a data transfer platform. In any event, Shell’s OSDU program is large and running at full capacity. Shell is committed to supporting OSDU, but this has never been done at scale. Generating and consuming data in the cloud via OSDU in support of end-to-end seismic workflows has value, but is technically difficult. The OSDU Forum needs to stay strong, with sustainable and equitable contributions from members. We need a commitment to go all-in with OSDU as a platform for data storage. Here there are a lot of things to solve, like data immutability and working with apps. A workspace concept needs to be designed and implemented ‘asap’ to avoid going back to the original project databases. In a slight dig at the scrum approach, Tijhof confessed to being an ‘old school waterfall man’. It’s key to get the design principles right. More thought is needed on data management, catalogues etc. We need more teamwork to test, break and fix.

* Originally the Open Subsurface Data Universe. Now just ‘OSDU’, to allow infinite scope creep!

David Holmes (Dell) provided a more positive spin on the OSDU state of play, speaking on behalf of The Open Group. There are today some 200 companies paying TOG fees and providing resources, an ‘enormous community’. One major release involved a ‘million person minutes’ of Webex calls. The OSDU mission is to ‘reduce data silos’ and provide an exit ramp for existing technology*: an environment where we can deploy new stuff without three years of integration effort, thanks to a ‘cloud and open standards-based ecosystem’. OSDU needs to be ‘tech-agnostic’, i.e. poly-cloud, on-prem or public cloud. OSDU started in the subsurface but is extending across energy and broadening its scope, notably with the Open Footprint forum (OFP). This means ‘using modern IT to industrialize data management’, with cloud capabilities displacing legacy (although ‘we are not there yet!’). Open source is ‘at the heart of OSDU**’. But not everything is ‘free’: open source means ‘differently expensive’. Service providers are building commercial OSDUs. The consultants, notably SLB, are in on the act. While you could roll your own, a vendor edition means less spent on platform capabilities. You need to re-evaluate your IP and keep just what is core, ‘just write the cool stuff’. Support the platform, grow domain coverage and think how to take this forward. OSDU is one of the few things around which industry is coalescing.

* An interesting remark, potentially with major consequences for the vendors of ‘legacy technology’.

** Actually it is not really clear if OSDU is open source.

Andreas Sandvik Jakobsen (Sopra Steria) and Bruce Chalmers (Vår Energi) offered a DataOps 101. DataOps (DO) will ‘liberate data from its sources and deliver it to a person, app or system’. DO is ‘governed by agile’. Vår Energi uses DO to deliver data products to its geoscientists and engineers, leveraging data science, domain expertise and data management. Cleaning exploration data for data science use is not easy. Quality challenges abound: trends can mislead, deviated wells are awkward to handle and there are many different data sources. DO helps with data cleansing, adding context, dashboard building and integration with Petrel. Wells are consolidated in a Python/Pandas DataFrame along with NPD attributes and viewed in a Python GUI (see the sketch below). Was it worth it? Merging multiple data sets adds ‘debugging knowledge’. Most value-add came from data consolidation and prep with DO. DO can dispel or verify data myths by comparing different data sets/sources and identifying bias. The resulting data product can be used over time as an ‘always on’ tool for interpreters. The trick is to combine data science, domain expertise and data management into one ‘cog’ that drives the interpretation machine. DO means an automated data pipeline that is accessible and communicable and that supports enterprise data governance. DataOps engineer is ‘the sexiest job in analytics’. In the Q&A, Oil IT Journal asked what DataOps is – is it anything more than just using Python? The answer was that ‘DO is technology agnostic, it could be Python, PowerBI’. OK, but what is it?
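
As a rough illustration of the consolidation step described above, the following Python/pandas sketch merges well data from two sources with NPD attributes into a single DataFrame. File names, column names and the join key are assumptions for the example, not Vår Energi’s actual pipeline.

```python
import pandas as pd

# Hypothetical inputs: per-source well exports and an NPD attribute list.
wells_a = pd.read_csv("wells_source_a.csv")   # e.g. log-derived attributes
wells_b = pd.read_csv("wells_source_b.csv")   # e.g. a Petrel export
npd = pd.read_csv("npd_wellbores.csv")        # public NPD wellbore attributes

# Normalize the join key across sources before merging.
for df in (wells_a, wells_b, npd):
    df["wellbore"] = df["wellbore"].str.strip().str.upper()

# Consolidate into one DataFrame. An 'outer' merge keeps wells missing from
# either source, making gaps and disagreements between sources easy to spot.
wells = (wells_a
         .merge(wells_b, on="wellbore", how="outer", suffixes=("_a", "_b"))
         .merge(npd, on="wellbore", how="left"))

# Simple 'myth-busting' QC: duplicates and wells absent from the NPD reference.
print(wells[wells.duplicated("wellbore", keep=False)])
print(wells[wells["npdid_wellbore"].isna()])  # hypothetical NPD ID column
```

The outer merge is the design choice that surfaces inter-source disagreements, the ‘debugging knowledge’ mentioned above.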

There was a full house for Einar Landre’s report from the trenches on Equinor’s experience of OSDU adoption. Since its 2018 launch by Shell and Johan Krebbers, OSDU has grown to 220 member companies and 2,600 contributors on the Slack channel. OSDU is now ‘near to production ready’. Hitherto, data has suffered from broken lineage and lack of provenance. In the well delivery process, knowledge and intent are lost in an operational ‘fog of war’. OSDU can help here to maintain lineage in delivery, with artefacts that point to their predecessors in the workflow (see the sketch below). Data provenance can be fixed with metadata captured at source. But where is OSDU in Equinor’s journey to the cloud? Equinor’s ‘Omnia’ cloud is a bespoke data warehouse running in Microsoft Azure and preceded OSDU. One Omnia component, the subsurface data lake (SSDL), is being refactored to run OSDU code inside Omnia. This is slotted for ‘soft production’ by year-end 2022. OSDU currently runs with Equinor custom schemas; in the future, Equinor data will leverage OSDU well-known schemas and ultimately move to pure OSDU, with attributes on demand from apps/users. So where is the system of record? Seemingly, ‘there is no golden record, only an endless journey of insight’. The golden record/single source of truth is a fallacy. A data platform needs to support the endless data lifecycle, the continuous distillation of raw and immature datasets into fit-for-purpose datasets that are ‘bearers of knowledge’. These can be archived either along the datatype/context axis or along the artefact/project axis; OSDU supports both. A legal risk assessment of OSDU is underway in Equinor. The first data to store will be non-confidential, ‘low hanging fruit’. Legal data tags are work in progress, as are data management, governance and compliance. Are we there yet? No!
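
To make the lineage idea concrete, here is a simplified sketch of an OSDU-style record, expressed as a Python dict, in which an ancestry block points back to the predecessor artefact in the workflow and legal tags travel with the data. Kinds, identifiers and tag names are illustrative assumptions, not Equinor’s actual schemas.

```python
# Illustrative OSDU-style record. The 'ancestry' block carries lineage by
# pointing at predecessor records; 'legal' carries the compliance tags
# discussed above. All identifiers below are made up for the example.
processed_log = {
    "kind": "osdu:wks:work-product-component--WellLog:1.0.0",
    "legal": {
        "legaltags": ["equinor-non-confidential"],  # hypothetical tag
        "otherRelevantDataCountries": ["NO"],
    },
    "ancestry": {
        # Provenance: the raw log this dataset was distilled from.
        "parents": ["partition:work-product-component--WellLog:raw-123:1"],
    },
    "data": {
        "Name": "Processed gamma ray, wellbore 15/9-19 A",
        "WellboreID": "partition:master-data--Wellbore:15-9-19-A:",
    },
}
```

Each ‘distillation’ step adds a new record whose parents point one step back, so the ‘endless journey of insight’ remains traceable end to end.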

Maria Juul (NPD) presented Diskos 2.0, the fifth manifestation of Norway’s oil and gas data repository. Diskos is ‘the world’s largest NDR’, with 13 petabytes of data, over 300 users and 33 member organizations. Diskos is managed by three partners: the NPD (regulator and administrator), its members (contributors and users) and the operator (the technology provider). Landmark and Kadme return as operators of Diskos 2.0, which will go live in January 2023. Enhancements in 2.0 include a new API for access to well, seismic, trade and production data, automated reporting, dropsite data ingestion, improved QC and tagging of reporting requirements. Third parties can add data for completeness. More data will be available from the public portal. Diskos now runs in the public cloud, on both AWS and Azure. A virtual data room, served from AWS Stockholm, is available as an additional service. While Diskos is said to be ‘OSDU ready’, there are no plans to ‘merge’ with OSDU. OSDU is not considered mature and ‘we need to understand more to see if this is where we want to go’.

Bee Smith and Zahir Ibrahim described how the UK North Sea Transition Authority (NSTA) is ‘driving data quality improvement through Section 34 infrastructure reporting’. NSTA took over management of the UK National Data Repository in March 2022. The initial focus is on data quality improvement, under impetus from the 2016 Energy Act. Ready access to climate data is a prerequisite for the energy transition. The Energy Pathfinder has launched, extending NSTA’s brief to span oil and gas, CCS, electrification, hydrogen and offshore wind. One emerging use is decommissioning, where new data attributes and a system of record are needed. Section 34 of the UK Offshore Energy Act empowers the NSTA ‘to require information and samples’. To which end, an ‘agile’ approach has been deployed to improve data reporting. This involves some 45 types of infrastructure and here, ‘only shapefiles or geodatabases are accepted, no more spreadsheets!’ Data is run through an FME Workbench which applies some 36 checks on duplicates, attributes and spatial integrity (see the sketch below). Future plans include tighter definitions, more attributes, more checks and more automation. NSTA is driving innovation through better quality data. In the Q&A, NSTA was asked why there was no facility for non-GIS data (such as PDFs or reports of corrosion) and why the IOGP Spatial Data Model was not considered. This was acknowledged to be a possible future option.
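
For a flavor of the checks described, here is a minimal Python/geopandas sketch of duplicate, attribute and spatial-integrity validation on a hypothetical infrastructure submission. It stands in for, rather than reproduces, the NSTA’s FME workbench; the file name, column names and bounding polygon are assumptions.

```python
import geopandas as gpd

# Hypothetical submission: a shapefile layer of infrastructure features.
gdf = gpd.read_file("pipelines_submission.shp")
issues = []

# Duplicate check: no two features should share an asset identifier.
dupes = gdf[gdf.duplicated("ASSET_ID", keep=False)]
if not dupes.empty:
    issues.append(f"{len(dupes)} features share an ASSET_ID")

# Attribute check: mandatory fields must be populated.
for col in ("ASSET_ID", "OPERATOR", "STATUS"):
    missing = int(gdf[col].isna().sum())
    if missing:
        issues.append(f"{missing} features missing {col}")

# Spatial integrity: geometries must be valid and within the expected region
# (a crude North Sea bounding box here, purely for illustration).
invalid = int((~gdf.geometry.is_valid).sum())
if invalid:
    issues.append(f"{invalid} invalid geometries")
region = gpd.GeoSeries.from_wkt(
    ["POLYGON((-4 50, 12 50, 12 62, -4 62, -4 50))"], crs=4326)[0]
outside = ~gdf.to_crs(4326).geometry.within(region)
if outside.any():
    issues.append(f"{int(outside.sum())} features outside the expected region")

print("\n".join(issues) or "All checks passed")
```

A real workbench would of course carry many more rules, but the pattern – load, test, report – is the same.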

Sakthi Norton (Offshore Energies UK) addressed the evolving value of data in ‘a North Sea in transition’. ESG factors are gaining in importance and are now ‘as important as profits’. This is changing data management, as ESG data is now required to cover carbon, biodiversity risk, water management, social factors, safety, governance and anti-bribery. There is a lot of data to collect and maintain, and also a lot of overlap with traditional functions in HR, operations and finance. Much of this data is already captured but it is ‘siloed, messy, and not necessarily purposed for ESG’. OEUK’s current focus is on large operators and the supply chain. OEUK is there to support industry-wide ESG reporting via the SEQual registry of suppliers and buyers. ‘We need to be ready for ESG data.’

Chandra Yeleshwarapu (Halliburton) bravely stated that ‘data management as a function is going to disappear!’ It will be replaced by ‘DataOps’. You need to think of data as a product, you need a data pipeline, and thus you need an enterprise architecture. Today, companies are funding data science rather than data management. This means developers are in the wrong place. Better to bring them together in one scrum team, interlocking the cogs of MLOps, DataOps and DevSecOps that ‘constitute the enterprise architecture’.

Having already heard two talks on DataOps without understanding much, we listened up to SLB’s Jamie Cruise’s talk on ‘data without borders’ and digital transformation via ‘DataOps’. For Cruise, ‘data chaos is the enemy of the industry’ and current E&P environments impede digital transformation, especially at the data layer. What is needed is a platform that connects producers to consumers. This is hard to achieve because companies have spent years developing ‘something that looks like a platform’. Now we have OSDU, the data platform and a ‘pure open source community’*. So now what? Cruise invokes ideas from outside the industry, notably ‘data as a product’. He cited McKinsey and the Harvard Business Review’s ‘A better way to put your data to work’ (HBR, June 2022). Here we have a system of record feeding a data platform feeding applications. ‘Alongside of data as a product we have DataOps’ and ‘data starts speaking to humans in a trusted voice’ (one might think that Cruise has his tongue in his cheek at this point). Product thinking will change data boundaries, beyond corporate, vendor and national data silos. ‘The sheer weight of buzzwords will mean we start thinking of data products!’ (OK, he definitely is kidding!) Standard platforms productize the collaboration model between producers and consumers. However, data is a ‘business’, enabled by data loading, curation and peer-to-peer data marketplaces. All of which could make up a digital energy environment, and all of which are fee-paying. OSDU is ‘just a piece of the puzzle’.

* Again – this is moot.

More from ECIM, the EU Community for Information Managers.

© Oil IT Journal - all rights reserved.