In his wide-ranging keynote address to the 2022 ECIM Data Management Conference in Haugesund, Norway, Equinor’s Harald Wesenberg stated that ‘Apps age like fish, but data ages like wine’. This explains why so many digital oilfield projects fail (not only in Equinor!). Development methodologies like scrum and agile are easy and cheap to deploy. But software development is not like civil engineering. There are no first principles to build on and little relationship between cost and delivery. Agile teams may involve many players but not, perhaps, so many users. Successful software comes from successful user conversations. Development teams need to be immersed in the environment so that they build what users need and understand the data.
This is where things get complicated. Sensors are inherently unreliable and information needs context: ‘Do you know what you are measuring?’ A term like ‘cargo’ can mean different things to different stakeholders. Terminology alignment between domain specialists and data scientists is tricky. Again, context is key to capturing ‘silent deviations’, i.e. unknown local practices and adjustments. Applying manual data checks up-front may lead to stress and overwork. There are frequently differences between how outsiders think a process works and what is really going on. Developers have strong opinions on how software should work. The project manager in the middle translates requirements, but may have his/her own biases (groupthink). Apophenia (the tendency to perceive meaningful connections between unrelated things) is another gotcha.
Wesenberg warned of the ‘3 Cs’, career, competence and corruption, which have led to falsified nuclear safety data, questionable Alzheimer’s R&D and more. Inspiration may be found in the work of Steven Shorrock. To understand and manage our complex systems, Wesenberg suggested the Cynefin framework. In any event, ‘What got me here will not get you here – you need to make your own journey’. Wesenberg wound up with a quote from Jeff Bezos: ‘When anecdotes and data disagree, the anecdotes are usually right!’ In other words, ‘you may have been measuring wrong’.
Hilde Nordbo traced the 50-year history of the Norwegian Petroleum Directorate (NPD). With the 1969 discovery of Ekofisk it was clear that the oil business was going to be massive for Norway. The NPD has maximized the value from oil and gas for Norwegian society and now Norway has ‘the best oil industry in the world’. A key component here has been the proper management of resources and of data. The establishment of a national geoscience resource was prescient: Geobank houses physical samples, logs and tapes, including some 160 km of cores, while Diskos holds 13 petabytes of data. Diskos has also been a success factor. A recent huge find is attributed to reworking old data in Diskos. In the context of the energy transition, is oil still necessary? ‘Yes, in combination with emerging sources’. All are part of the NPD’s effort to prepare for the future, with data repurposed for new uses.
Mathias Hartung (Wintershall Dea)
addressed the issue of data governance in a ‘mature yet transforming
operator’, faced with the challenges of post-merger integration, energy
security, the pandemic and ‘net zero’. The focus today is on
operational efficiencies, quality decisions and compliance/reporting.
This all mandates data quality and fixing the ‘garbage in garbage out’
problem which ‘even the best data science cannot fix’. Hartung
leverages a SIPOC approach
(supplier – input – process – output – customer) which is applied
‘right to left’ (q.v. ‘right to left thinking’ elsewhere in this
issue). ‘SIPOC fixes GIGO’ in a process that goes from Scada systems
into Wintershall’s data warehouse (an implementation of Steinhaus
Information Systems’ TeBis historian), a hub that feeds data out to stakeholders in the GPOE unit (well integrity – see also Cegal on ‘New ways of working’), reservoir management (OFM),
ESG and finance. Wintershall’s dedicated data and information
management organization has a direct line to corporate management. Like
the data, the unit crosses all departments – subsurface, process,
information management and data science. Documents are managed in a
combination of OneDrive, SharePoint and (for enterprise content
required for compliance and reporting) in a dedicated document
management system with links to the physical archive of paper
documents. Hartung observed that Wintershall’s climate change KPIs are
‘exploding’ and are migrating from manual estimates to automated
reporting and management. There will be some 50k climate KPIs to report
on in 2023, a ‘huge data management problem’. Data governance is
essential for digitalization and can be done by professionals focusing
on business processes and data readiness. Governance delivers
efficiencies, boosts morale and assures compliance. ‘Data and
information flow is what we do’. Hartung is open to collaboration on data
governance best practices.
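For readers unfamiliar with the technique, Hartung’s SIPOC flow might be encoded and walked ‘right to left’ along the following lines. This is a minimal Python sketch; the stage contents are our own illustrative assumptions, not Wintershall’s actual map.

```python
# A minimal SIPOC map for the Scada-to-warehouse flow described above.
# Stage contents are illustrative assumptions, not Wintershall's actual map.
sipoc = {
    "supplier": "Scada systems",
    "input": "sensor time series",
    "process": "validate, contextualize and load into the TeBis data warehouse",
    "output": "curated operational data",
    "customer": ["GPOE (well integrity)", "reservoir management (OFM)", "ESG", "finance"],
}

# 'Right to left': start from the customer and work back to the supplier,
# so each stage is justified by the needs of the stage downstream of it.
for stage in reversed(list(sipoc)):
    print(f"{stage}: {sipoc[stage]}")
```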
Henk Tijhof presented Shell’s
OSDU* experience to date and its take on OSDU strategy and vision.
Shell initiated the OSDU program, contributing components of its
in-house developed SDU, the subsurface data universe. Technology is
changing relative to legacy systems and Shell is ‘trying to figure out
where we are going’, with a focus on workflows. Some users are keen,
but many are hard to convince. The majority say ‘OK, if you want to change, go ahead, I have a day job!’ The expectation is of company-wide
access to all global data for AI/ML workflows. Such business processes
can be digitally orchestrated in the cloud. But Shell is struggling
with data in the cloud. Shell’s subsurface and wells (SSW) digital
ecosystem is more than OSDU. With the move to an OSDU future cloud
architecture, Shell ‘wants to partner, not to develop our own
solutions’. With the new ways of orchestrating business workflows, it
remains important to keep the benefits of legacy apps. OSDU needs more
market solutions. Data migration is an issue; OSDU is currently more of a data transfer platform. In any event, Shell’s OSDU program is large
and running at full capacity. Shell is committed to supporting OSDU, but
this has never been done at scale. Generating and consuming data in the
cloud via OSDU in support of end-to-end seismic workflows has value,
but is technically difficult. The OSDU Forum needs to stay strong with
sustainable and equitable contributions from members. We need a
commitment to go all-in with OSDU as a platform for data storage. Here
there are a lot of things to solve like data immutability and working
with apps. A workspace concept needs to be designed and implemented
‘asap’ to avoid going back to the original project databases. In a
slight dig at the scrum approach, Tijhof confessed to being an ‘old
school waterfall man’. It’s key to get the design principles right.
More thought is needed on data management, catalogues etc. We need more teamwork to test, break and fix.
* Originally the Open subsurface data universe. Now just ‘OSDU’, to allow infinite scope creep!
David Holmes (Dell) provided a
more positive spin on the OSDU state of play, speaking on behalf of The
Open Group. There are today some 200 companies paying TOG fees and
providing resources, an ‘enormous community’. One major release
involved a ‘million person minutes’ of Webex calls. The OSDU mission is
to ‘reduce data silos’ and provide an exit ramp for existing technology*, an environment where we can deploy new stuff without three years of integration effort, thanks to a ‘cloud and open standards-based ecosystem’. OSDU needs to be ‘tech-agnostic’, i.e. poly-cloud, on-premises or public cloud. OSDU started in the subsurface but is extending across energy, broadening its scope, notably with the Open Footprint Forum (OFP). This means ‘using modern IT to industrialize data management’
with cloud capabilities displacing legacy (although ‘we are not there
yet!’). Open Source is ‘at the heart of OSDU**’. But not everything is free: open source means ‘differently expensive’. Service providers
are building commercial OSDUs. The consultants, notably SLB, are in on
the act. While you could roll your own, a vendor edition means less
spent on platform capabilities. You need to re-evaluate your IP and
just keep what is core, ‘just write the cool stuff’. Support the
platform, grow domain coverage and think how to take this forward. OSDU
is one of the few things around which industry is coalescing.
* An interesting remark, potentially with major consequences for the vendors of ‘legacy technology’.
** Actually it is not really clear if OSDU is open source.
Andreas Sandvik Jakobsen (Sopra-Steria) and Bruce Chalmers (Vår Energi) offered a DataOps 101. DataOps (DO) will ‘liberate data from its sources and deliver it to a person, app or system’. DO is ‘governed by agile’. Vår Energi uses DO to deliver data products to its
geoscientists and engineers, leveraging data science, domain expertise
and data management. Cleaning exploration data for data science use is
not easy. Quality challenges abound: trends can mislead, deviated wells
are awkward to handle and there are many different data sources. DO
helps with data cleansing, adding context, dashboard building and
integration with Petrel. Wells are consolidated in a Python/Pandas DataFrame
along with NPD attributes and viewed in a Python GUI. Was it worth it?
Merging multiple data sets adds ‘debugging knowledge’. Most value add
came from data consolidation and prep with DO. DO can dispel or verify
data myths, by comparing different data sets/sources and identifying
bias. The resulting data product can be used over time as an ‘always on’ tool for interpreters. The trick is to combine data science, domain
expertise and data management into one ‘cog’ that drives the
interpretation machine. DO means an automated data pipeline that is
accessible and communicable and that supports enterprise data
governance. DataOps engineer is ‘the sexiest job in analytics’. In the
Q&A, Oil IT Journal asked what DataOps is: is it anything more than just using Python? The answer was that ‘DO is technology agnostic, it could be Python, PowerBI’. OK, but what is it?
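For the curious, the kind of consolidation Jakobsen and Chalmers describe might look something like the following sketch, assuming hypothetical CSV exports of in-house well headers and NPD wellbore attributes (file and column names are our inventions).

```python
import pandas as pd

# Hypothetical exports: an in-house well list and NPD wellbore attributes.
wells = pd.read_csv("wells_internal.csv")          # e.g. wellbore_name, md, tvd
npd = pd.read_csv("npd_wellbore_attributes.csv")   # e.g. wellbore_name, purpose, content

# Normalize the join key before merging - a typical DataOps cleansing step.
for df in (wells, npd):
    df["wellbore_name"] = df["wellbore_name"].str.strip().str.upper()

# Consolidate into a single DataFrame. A left join keeps in-house wells
# with no NPD match; the indicator column flags them for follow-up.
merged = wells.merge(npd, on="wellbore_name", how="left", indicator=True)
unmatched = merged[merged["_merge"] == "left_only"]
print(f"{len(unmatched)} wells lack NPD attributes")
```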
There was a full house for Einar Landre’s report from the trenches on Equinor’s
experience of OSDU adoption. Since its 2018 launch by Shell and Johan
Krebbers, OSDU has grown to 220 member companies and 2,600 contributors
on the Slack channel. OSDU is now ‘near to production ready’. Hitherto
data has suffered from broken lineage and lack of provenance. In the
well delivery process, knowledge and intent are lost in an operational ‘fog of war’. OSDU can help here to maintain lineage in delivery with
artefacts that point to their predecessors in the workflow. Data
provenance can be fixed with metadata captured at source. But where is
OSDU in Equinor’s journey to the cloud? Equinor’s ‘Omnia’ cloud is a
bespoke data warehouse running in Microsoft Azure that preceded OSDU.
One Omnia component, the subsurface data lake (SSDL) is being
refactored to run OSDU code inside Omnia. This is slotted for ‘soft
production’ by year-end 2022. OSDU currently runs with Equinor custom schemas; in the future, Equinor data will leverage OSDU well-known schemas and ultimately move to pure OSDU, with attributes on demand from apps/users. So where is the system of record? Seemingly ‘there is no golden record, only an endless journey of insight’. The golden
record/single source of truth is a fallacy. A data platform needs to
support the endless data lifecycle, the continuous distillation of raw
and immature datasets into fit-for-purpose datasets that are ‘bearers
of knowledge’. These can be archived either along the datatype/context
axis or along the artefact/project axis; OSDU supports both. A legal
risk assessment of OSDU is underway in Equinor. The first data to store
will be non-confidential ‘low-hanging fruit’. Legal data tags are work in progress, as are data management, governance and compliance. Are we
there yet? No!
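As an aside, lineage and legal tagging of the kind Landre describes map onto the OSDU storage record format roughly as follows. This is a simplified sketch; identifiers and tag names are invented for illustration.

```python
# Simplified OSDU storage record. The ancestry block points to the
# predecessor artefacts this record was derived from, preserving lineage;
# legal tags carry the compliance metadata captured at source.
# Identifiers and tag names are invented.
record = {
    "kind": "osdu:wks:work-product-component--WellLog:1.0.0",
    "ancestry": {
        "parents": ["opendes:work-product-component--WellLog:raw-run-1:1"],
    },
    "legal": {
        "legaltags": ["opendes-public-norway-non-confidential"],
        "otherRelevantDataCountries": ["NO"],
    },
    "data": {"WellboreID": "opendes:master-data--Wellbore:NO-1234:"},
}
```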
Maria Juul (NPD)
presented Diskos 2.0, the fifth manifestation of Norway’s oil and gas
data repository. Diskos is ‘the world’s largest NDR’ with 13 petabytes
of data, over 300 users and 33 member organizations. Diskos is managed by three partner organizations: the NPD (regulator and administrator), its members (contributors, users) and the operator (the technology provider). Landmark and Kadme return as operators of Diskos 2.0, which will go live in January 2023. Enhancements in 2.0 include a new API for access to well, seismic, trade and production data, automated
reporting, dropsite data ingestion, improved QC and tagging of
reporting requirements. Third parties can add data for completeness.
More data will be available from the public portal.
Diskos now runs in the public cloud, on both AWS and Azure. A virtual
data room, served from AWS Stockholm, is available as an additional
service. While Diskos is said to be ‘OSDU ready’, there are no plans to
‘merge’ with OSDU. OSDU is not considered mature and ‘we need to
understand more to see if this is where we want to go’.
Bee Smith and Zahir Ibrahim described how the UK North Sea Transition Authority
is ‘driving data quality improvement through Section 34 infrastructure reporting’. NSTA took over management of the UK National Data Repository in March 2022. The initial focus is on data quality improvement under
impetus from the 2016 Energy Act. Ready access to climate data is a
prerequisite to the energy transition. The Energy Pathfinder has launched, extending NSTA’s brief to span oil and gas, CCS, electrification, hydrogen and offshore wind. One emerging use is
decommissioning, where new data attributes and a system of record are
needed. Section 34 of the UK
Offshore Energy Act empowers the NSTA ‘to require information and samples’. To which end, an ‘agile’ approach has been deployed to
improve data reporting. This involves some 45 types of infrastructure
and here, ‘only shapefiles or geodatabases are accepted, no more
spreadsheets!’ Data is run through an FME Workbench which applies some
36 checks on duplicates, attributes and spatial integrity. Future plans
include tighter definitions, more attributes, more checks and more
automation. NSTA is driving innovation through better quality data. In
the Q&A, NSTA was asked why there was no facility for non-GIS data
(such as PDFs or reports of corrosion) and why the IOGP Spatial Data
Model was not considered. This was acknowledged to be a possible future
option.
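NSTA’s FME workbench itself was not shown, but the flavor of such checks can be sketched in Python with GeoPandas (file, layer and column names are our assumptions).

```python
import geopandas as gpd

# Load a submitted infrastructure layer (shapefile or geodatabase layer).
gdf = gpd.read_file("pipelines_submission.shp")
issues = []

# Duplicate check: the same asset identifier reported more than once.
dupes = gdf[gdf.duplicated(subset="asset_id", keep=False)]
if not dupes.empty:
    issues.append(f"{len(dupes)} duplicate asset_id rows")

# Attribute check: mandatory fields must be populated.
for col in ("asset_id", "status", "operator"):
    missing = int(gdf[col].isna().sum())
    if missing:
        issues.append(f"{missing} rows missing '{col}'")

# Spatial integrity check: geometries must be valid and non-empty.
bad_geom = int((~gdf.geometry.is_valid | gdf.geometry.is_empty).sum())
if bad_geom:
    issues.append(f"{bad_geom} invalid or empty geometries")

print("\n".join(issues) or "All checks passed")
```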
Sakthi Norton (Offshore Energies UK)
addressed the evolving value of data in ‘a North Sea in transition’.
ESG factors are gaining in importance and are now ‘as important as
profits’. This is changing data management as ESG data is now required
to cover carbon, biodiversity risk, water management, social, safety, governance and anti-bribery. There is a lot of data to collect and
maintain and also lots of overlap with different traditional functions
in HR, operations, and finance. Much of this data is already captured
but it is ‘siloed, messy, and not necessarily purposed for ESG’. OEUK’s
current focus is on large operators and the supply chain. OEUK is there
to support industry-wide ESG reporting via the SEQual registry of suppliers and buyers. ‘We need to be ready for ESG data’.
Chandra Yeleshwarapu (Halliburton)
bravely stated that ‘Data management as a function is going to
disappear!’ It will be replaced by ‘DataOps’. You need to think of data
as a product, you need a data pipeline and thus you need an enterprise
architecture. Today, companies are funding data science rather than
data management. This means developers are in the wrong place. Better
to bring them together in one scrum team, interlocking the cogs of
MLOps, DataOps and DevSecOps that ‘constitute the enterprise architecture’.
Having already heard two talks on DataOps without understanding much, we listened up to SLB’s Jamie Cruise talking on ‘data without borders’ and digital transformation
via ‘DataOps’. For Cruise, ‘data chaos is the enemy of the industry’
and current E&P environments impede digital transformation
especially at the data layer. What is needed is a platform that
connects producers to consumers. This is hard to achieve because
companies have spent years developing ‘something that looks like a
platform’. Now we have OSDU, the data platform, and a ‘pure open source community*’. So now what? Cruise invokes ideas from outside the industry
notably ‘data as a product’. He cited McKinsey and the Harvard Business
Review’s ‘A better way to put your data to work’ (HBR, June 2022).
Here we have a system of record feeding a data platform feeding
applications. ‘Alongside data as a product we have DataOps’ and
‘data starts speaking to humans in a trusted voice’ (one might think that Cruise has his tongue in his cheek at this point).
Product thinking will change data boundaries - beyond corporate, vendor
and national data silos. ‘The sheer weight of buzzwords will mean we
start thinking of data products!’ (OK, he definitely is kidding!)
Standard platforms productize the collaboration model between producers
and consumers. However, data is a ‘business’, enabled by data loading, curation and peer-to-peer data marketplaces, all of which could make up a digital energy environment and all of which are fee-paying. OSDU is
‘just a piece of the puzzle’.
* Again – this is moot.
More from ECIM, the EU Community for Information Managers.
© Oil IT Journal - all rights reserved.