OSDU - why no data egress?

Back from the 2024 ECIM data management conference, Oil IT Journal editor Neil McNaughton remains puzzled as to what OSDU is trying to achieve. While OSDU is developing paths for data ingestion to the cloud, data egress and use remain conflicted. Whatever happened to the consumption zones? Why is there 'turmoil in the Forum'? He concludes by running OSDU against the DARPA/Heilmeier 'catechism' for project evaluation … with underwhelming results.

There is a popular, not to say tired, parable that speakers trot out to embellish their talks: that of the blind men discovering an elephant. I propose to repurpose this tale in an attempt to describe OSDU, The Open Group's Open Subsurface Data Universe. In my retelling, there are no blind men (or women). All are sighted, intelligent and well-meaning. The only problem is: they are all inside the elephant.

In my (admittedly limited) interactions with the OSDU community, everyone has some knowledge of a piece of the puzzle, but when quizzed on related matters, like how and why, people back off, deflate, to the extent even of admitting that I, a mere journal editor, may know more about OSDU than they do! The OSDU crowd are all inside the beast, operating some more or less obscure bit of the machinery. But nobody knows the big picture. There is a good reason for this: there isn't really a big picture at all. There are multiple vague objectives that shimmer and change with the telling.

Flashback to 2019, when I interviewed Johan Krebbers, the father of OSDU. I put it to him that the 2019 OSDU booklet 'read like poetry' and asked, 'After 20 or 30 years of working on data issues, what's new?' He answered, 'the cloud'. So the big picture is something in the cloud, which is literally nebulous, 'nebula' being the Latin for cloud.

Speaking of cloud, a rain check. I went back to the OSDU Forum's website where, stripping away the verbiage, we read that the objective is a 'common data platform architectural design underpinning how our industry works with its data'. This raises the question: is it a common data platform or a common platform design? The Forum's top-level position continues that this platform is to 'reduce costs, break down data silos, enable innovation and bring data together in one location'. Worthy aims indeed, but how are they to be realized?

Digging deeper into the Forum we find some curious text announcing the 'Quick Start Guides', which 'provide a way to rapidly start using the system'. Well, if they did not, they wouldn't be quick start guides, would they? Duh! The web page boldly announces that 'the following User Guides are available to you'. But no, it is not guides plural, for there is only one: the OSDU Operator Data Loading Quick Start Guide.

In the absence of a quick start guide to getting data out of OSDU, I have been digging around in earlier OSDU material. It seems that there was, early on in OSDU's history, a plan for data egress in the form of 'consumption zones' tuned to particular use cases. Only one significant consumption zone has seen the light of day so far: Esri's geospatial CZ. Why is there not a multitude of 'open source' CZs in the labyrinthine OSDU Git repository? Why is there, as Chris Brockman opined at the 2024 ECIM data management conference, 'every data type in the world and no functionality'? The answer is that data access is where the CSPs and other holders of the OSDU keys (think SLB) are to cash in. Attempts to facilitate open data access, or to make OSDU open as in 'free', are causing the 'turmoil' in the Forum*.

While the CSPs, IOCs, major vendors and others tussle over who does what, let's return to the big picture. On the one hand, OSDU is emerging as a route for major data migration programs into the cloud. Anyone who has been involved in data transcription (for that is what it is) will know that such programs are extremely complex when done 'at scale'. They can involve data loss, as formats and operating systems may not align. Also, such programs do not in themselves bring huge immediate benefits to an operator. They are more in the line of the 'cost of doing business' as repositories are decommissioned and new technology comes along. There is no reason to think that a transcription to an OSDU cloud will avoid these gotchas and costs.

The Heilmeier catechism was recently brought to my attention. This is the US Defense Advanced Research Projects Agency's (DARPA) approach to evaluating a new project, named for former DARPA director George Heilmeier. I propose to run it across OSDU (as I see it).

Heilmeier: What are you trying to do? Articulate your objectives using absolutely no jargon.

OSDU: Err… move data to the cloud?

Heilmeier: How is it done today, and what are the limits of current practice?

OSDU: Quite a lot of data has been in the cloud since before the cloud was called the cloud.

Heilmeier: What is new in your approach and why do you think it will be successful?

OSDU: It is an open source attempt with support from many. This may or may not lead to ‘success’.

Heilmeier: Who cares? If you are successful, what difference will it make?

OSDU: Data that was not previously in the cloud will be in the cloud. This may not make much of a difference at all!

Heilmeier: What are the risks?

OSDU: The whole project gets bogged down in acrimony between competing protagonists.

Heilmeier: How much will it cost?

OSDU: No idea!

Heilmeier: How long will it take?

OSDU: Well, it has been running for six years so far…

Heilmeier: What are the mid-term and final 'exams' to check for success?

OSDU: Err… pass…

* See this issue’s lead.

