Editorial - Discovery - the future of E&P Data Modelling? (November 1996)

PDM reports on the state of play in the Discovery Project, the second attempt (and probably the last) to merge the data models of the PPDM Association and POSC.

As the first signs of winter arrived in Calgary, a rather important meeting was taking place in the granite-clad high-rise of Petrocanada's offices. The AGM of the Public Petroleum Data Model Association had to deliberate on the continuing development of the PPDM data model (see insert) but was also the scene of the first public reporting on the preliminary outcome of project Discovery. This, for those of you who either missed previous editions of PDM or were not paying attention, is an attempt to reconcile the disparate approaches of the Petrotechnical Open Software Corporation (POSC) and PPDM. The project's core team is made up of representatives from Landmark, Chevron and Schlumberger, with representatives from POSC and PPDM helping out. In other words, this is a business-driven (in the true sense of that much misused phrase) initiative. The Discovery sponsors either want to encourage a rapprochement between PPDM and POSC, or at least to send a clear message to the outside world that they are making a very serious attempt to do so, and that if nothing comes of it, then it ain't their fault!


What is at issue in this co-operative effort between competitors is not so much a drive towards interoperability (we saw in the October PDM that, in the case of one major vendor, a POSC-compliant implementation was in no way intended to facilitate this). The position of the Discovery sponsors is simply that two data models out there is one too many. Furthermore, because their client universe is split roughly between those who use PPDM in one guise or another and those who have stated that only compliance with POSC will do for them, this situation is having a very real impact on their business and on their development and maintenance effort. A single data model will avoid the necessity for a "forked-tongue" marketing strategy, and will roughly halve the data model development effort for these major players.


So what is the problem here? What is project Discovery trying to do? Reporting at the PPDM AGM, Mary Kay Manson of Chevron Canada gave an update on Discovery's progress to date. While the long-term strategic objective is to meld the two models, the current Discovery project is a preparation for the next major release of the PPDM data model, version 4.0, which, if all goes well, will become a "subset" of POSC's Epicentre model. It should be possible to obtain the new model from Epicentre via an automated projection methodology, in other words a set of rules for generating one model from another. Thus PPDM will benefit from the rigor of POSC's purist data model, and POSC will have a new "subset" of happy clients.


So much for the overview. The reality of Project Discovery is more prosaic. While project Discovery set out initially with the objective of providing a common model, it was soon realized that this was a harder task than first thought. Fundamental differences between the two models led to the scope of the project being reined in to test the feasibility of mapping from an Epicentre logical data model to a PPDM-like relational implementation. The area of seismic acquisition was selected as a test-bed. The methodology is to use a projection tool to manufacture a physical model (Oracle tables and relations) from the Epicentre logical model. This has been found to be a very unstable process, with small changes in the projection method making for radically different models - a reflection of the complexity of the system.
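To give a feel for why a projection can be so sensitive to its rules, here is a minimal sketch in Python. It is not the Discovery projection tool, and nothing in it comes from Epicentre or PPDM: the `Entity` class, the `project` function, the two rule names and the toy `survey` entities are all hypothetical. It assumes only that a logical model has entities with subtypes, and that the projection must decide how subtypes become tables.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entity:
    """A hypothetical logical-model entity (not an Epicentre definition)."""
    name: str
    attributes: List[str]
    supertype: Optional[str] = None


def project(entities: List[Entity],
            subtype_rule: str = "table_per_entity") -> Dict[str, List[str]]:
    """Project a logical model to {table_name: [column, ...]}.

    subtype_rule is an illustrative switch:
      - "table_per_entity": each subtype gets its own table with a
        foreign key column pointing at its supertype.
      - "fold_into_supertype": subtype attributes are merged into the
        root supertype's table, with a "kind" discriminator column.
    """
    by_name = {e.name: e for e in entities}
    tables: Dict[str, List[str]] = {}
    for e in entities:
        if e.supertype is None:
            tables.setdefault(e.name, ["id"]).extend(e.attributes)
        elif subtype_rule == "table_per_entity":
            tables[e.name] = ["id", f"{e.supertype}_id", *e.attributes]
        elif subtype_rule == "fold_into_supertype":
            root = e
            while root.supertype is not None:
                root = by_name[root.supertype]
            cols = tables.setdefault(root.name, ["id"])
            if "kind" not in cols:
                cols.append("kind")
            cols.extend(a for a in e.attributes if a not in cols)
        else:
            raise ValueError(f"unknown subtype rule: {subtype_rule}")
    return tables


# Toy logical model, invented for illustration only.
MODEL = [
    Entity("survey", ["name"]),
    Entity("survey_2d", ["line_count"], supertype="survey"),
    Entity("survey_3d", ["bin_size"], supertype="survey"),
]

per_entity = project(MODEL)
folded = project(MODEL, subtype_rule="fold_into_supertype")
print(per_entity)
print(folded)
```

With the first rule the toy model yields three tables; with the second it yields one wide table. A single switch in the rule set changes the shape of every query written against the result, which is the kind of instability described above, here on a model of three entities rather than several thousand.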


Other problems arise from the different scopes of the two models. Some differences between PPDM and POSC come from the simple fact that different people, with a different focus, have worked on the models at different times. While PPDM is strong in the domains which have been central to its members' activity - such as the dissemination of well data - POSC's top-down design and all-encompassing intended scope mean inevitably that it is, at least locally, a bit thin on the ground. Another major difference stems from the inevitable compromise that is implicit in all data models using a relational database. They can either be normalized (see side box) - for precise data modeling - or non-normalized for performance. Generally speaking, Epicentre is normalized and PPDM is not.
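The normalized versus non-normalized trade-off can be illustrated with a small sketch using Python's built-in `sqlite3` module. The tables and columns (`operator`, `well_norm`, `well_flat`) and the data are invented for the example and are taken from neither Epicentre nor PPDM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: the operator name is stored once and referenced by key.
    CREATE TABLE operator (
        operator_id INTEGER PRIMARY KEY,
        name        TEXT
    );
    CREATE TABLE well_norm (
        well_id     INTEGER PRIMARY KEY,
        well_name   TEXT,
        operator_id INTEGER REFERENCES operator(operator_id)
    );
    -- Non-normalized: the operator name is repeated on every well row.
    CREATE TABLE well_flat (
        well_id       INTEGER PRIMARY KEY,
        well_name     TEXT,
        operator_name TEXT
    );
    INSERT INTO operator VALUES (1, 'Acme Oil');
    INSERT INTO well_norm VALUES (10, 'A-1', 1), (11, 'A-2', 1);
    INSERT INTO well_flat VALUES (10, 'A-1', 'Acme Oil'), (11, 'A-2', 'Acme Oil');
""")

# Renaming the operator touches one row in the normalized design...
norm_rows_touched = conn.execute(
    "UPDATE operator SET name = 'Acme Energy' WHERE operator_id = 1"
).rowcount
# ...but every well row in the flat design.
flat_rows_touched = conn.execute(
    "UPDATE well_flat SET operator_name = 'Acme Energy' "
    "WHERE operator_name = 'Acme Oil'"
).rowcount

# Reading is the other way round: the normalized design needs a join,
# the flat design answers from a single table.
norm_result = conn.execute(
    "SELECT w.well_name, o.name FROM well_norm w "
    "JOIN operator o ON o.operator_id = w.operator_id ORDER BY w.well_id"
).fetchall()
flat_result = conn.execute(
    "SELECT well_name, operator_name FROM well_flat ORDER BY well_id"
).fetchall()

print(norm_rows_touched, flat_rows_touched)
print(norm_result == flat_result)
```

Both designs return the same answer. The normalized one keeps each fact in one place, at the cost of a join on every read; the flat one reads from a single table, but must be updated consistently wherever the fact is repeated.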

Where's the data?

A further fundamental difference between the two models, and one of particular importance to project Discovery, is where to store data. You may think that a database is a good place to start with, but things are not that simple. A historical difference between the PPDM and POSC specifications is the actual location of bulk data (i.e. well logs and seismic trace data). In all implementations of both models, the actual data is generally speaking stored outside of the database, on the file system, either in native formats such as SEG-Y or LIS or in proprietary binaries. Project Discovery, however, appears to be following the pure POSCian line, with data to be stored within the database. This, as we discussed last month, can have deleterious effects on query performance, and may be a stumbling block for Discovery take-up if it is too hard-wired into the model.
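The two storage options can be sketched as follows, again with Python's built-in `sqlite3` module. This is an illustration only: the table layouts, the file name `line_001.bin` and the raw float32 encoding are assumptions made for the example, not SEG-Y, LIS, or anything specified by PPDM or POSC.

```python
import os
import sqlite3
import struct
import tempfile

# A made-up trace of four samples, packed as little-endian float32.
samples = [0.0, 0.5, -0.25, 1.0]
payload = struct.pack(f"<{len(samples)}f", *samples)

conn = sqlite3.connect(":memory:")
# Option 1: the database holds only a pointer to bulk data on the file system.
conn.execute(
    "CREATE TABLE trace_external ("
    "trace_id INTEGER PRIMARY KEY, file_path TEXT, "
    "byte_offset INTEGER, n_samples INTEGER)"
)
# Option 2: the bulk data itself lives in the database as a BLOB.
conn.execute(
    "CREATE TABLE trace_internal ("
    "trace_id INTEGER PRIMARY KEY, n_samples INTEGER, samples BLOB)"
)

path = os.path.join(tempfile.mkdtemp(), "line_001.bin")
with open(path, "wb") as f:
    f.write(payload)

conn.execute(
    "INSERT INTO trace_external VALUES (1, ?, 0, ?)", (path, len(samples))
)
conn.execute(
    "INSERT INTO trace_internal VALUES (1, ?, ?)", (len(samples), payload)
)


def read_external(trace_id):
    """Look up the pointer in the database, then read the file."""
    file_path, offset, n = conn.execute(
        "SELECT file_path, byte_offset, n_samples "
        "FROM trace_external WHERE trace_id = ?", (trace_id,)
    ).fetchone()
    with open(file_path, "rb") as f:
        f.seek(offset)
        return list(struct.unpack(f"<{n}f", f.read(4 * n)))


def read_internal(trace_id):
    """Fetch the samples directly from the database row."""
    n, blob = conn.execute(
        "SELECT n_samples, samples FROM trace_internal WHERE trace_id = ?",
        (trace_id,)
    ).fetchone()
    return list(struct.unpack(f"<{n}f", blob))


print(read_external(1) == read_internal(1))
```

Both routes return the same samples. The difference is in what the database has to carry: in the first layout the rows stay small and the bulk bytes are someone else's problem; in the second every trace travels through the database engine. The sketch says nothing about performance, which depends on data volumes and the database in question.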


Project Discovery's schedule has been broken down into four phases quaintly named "Strawman", "Woodman", "Ironman" and "Steelman". This is a handy way of delimiting the progress of a project as it allows for infinite variation of content and progress. Discovery is today in the "Ironman" phase - although what this means is not yet clear. The strong pro-Discovery lobby within PPDM is pushing to extend project Discovery's scope to substantially the whole of the PPDM data model scope for release as PPDM version 4.0. The zealots are even forecasting a release date towards the end of 1997 for this. This sounds like tempting fate in view of prior similar announcements, but with the support for Discovery coming from the sponsors and the newly elected PPDM board, it may be more achievable than at the time of the last POSC/PPDM merger attempt back in 1993.


Two mindsets will have to evolve before Discovery brings home the bacon. First, the rank and file of the PPDM membership will have to loosen up and accept that the next generation of the PPDM model is going to be very different from the 2.x and 3.x versions. This will be hard, because PPDM is a "physical" model, in other words it is implemented and used. Data and applications are plugged in and running off it, so that any change (even between prior mainstream versions of PPDM) has always been painful (see the PPDM article in this issue). Secondly, POSC will have to enhance the marketing profile they ascribe to Discovery. Discovery will be still-born if POSC continues to see it as a "subset project". POSC will have to consider the Discovery data model as THE relational projection of Epicentre. If this happens, then even the recalcitrant

© Oil IT Journal - all rights reserved.