A recent meeting of the SEG standards committee addressed the problem of storing large, sometimes very large, sequential data files such as seismic recordings in the public cloud. Current practice is to use the ‘object’ storage services that are available in all of the cloud vendors’ portfolios. The problem with this naïve approach is that the cloud APIs retrieve an object in its entirety, making data access sluggish, especially if only a small number of traces is required. A similar issue is roiling the research community, as large scientific data storage formats like HDF5 (as used by Energistics in RESQML) suffer from the same problem. HDF5 ‘works’ in the cloud but runs extremely slowly. Such issues have been taken up by the data science community to the extent that vanilla HDF (and SEG-Y) are now widely deprecated for cloud use. But what is going to replace them?
The issue of storing sequential data in the cloud echoes earlier agonizing over storing seismics on disk. See for instance our 2007 article, ‘Was the future random? Disk vs. tape-based data archival’. There are many solutions to this problem, including Troika’s Minima and Bluware’s volume data store (VDS). Such solutions re-arrange sequential seismic data into an indexed format suitable for random access. At issue today is how such work-arounds can be ported to the cloud. The HDF world has addressed the problem with ‘Kita’. But Kita is not open source and thus fails to meet the requirements of both Energistics and OSDU.
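The general idea of an index for random access can be sketched in a few lines. The example below is illustrative only and does not represent how Minima, VDS or Kita work internally. It assumes a SEG-Y file with no extended textual headers and fixed-length traces, so that a file starts with a 3,200-byte textual header and a 400-byte binary header, and each trace carries a 240-byte header. The function names `trace_offset` and `build_index`, and the inline/crossline survey geometry, are hypothetical.

```python
# Sketch: map (inline, crossline) keys to byte offsets in a SEG-Y file.
# Assumes no extended textual headers and a fixed trace length.

TEXT_HEADER_BYTES = 3200    # SEG-Y textual file header
BINARY_HEADER_BYTES = 400   # SEG-Y binary file header
TRACE_HEADER_BYTES = 240    # header preceding each trace's samples


def trace_offset(trace_number, samples_per_trace, bytes_per_sample=4):
    """Byte offset of a trace (0-based), counted from the start of the file."""
    trace_bytes = TRACE_HEADER_BYTES + samples_per_trace * bytes_per_sample
    return TEXT_HEADER_BYTES + BINARY_HEADER_BYTES + trace_number * trace_bytes


def build_index(keys, samples_per_trace, bytes_per_sample=4):
    """Build a lookup table from survey keys, listed in file order, to offsets."""
    return {
        key: trace_offset(i, samples_per_trace, bytes_per_sample)
        for i, key in enumerate(keys)
    }


# Hypothetical survey: 3 inlines x 4 crosslines, 1,500 samples per trace.
keys = [(il, xl) for il in range(100, 103) for xl in range(200, 204)]
index = build_index(keys, samples_per_trace=1500)
print(index[(101, 202)])
```

With such a table, a reader can seek directly to the trace it needs instead of scanning the file from the beginning. Real files with variable trace lengths would need the offsets to be recorded during a scan rather than computed.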
OSDU came out of the starting blocks advocating ‘OpenVDS’, an open source subset of Bluware’s proprietary VDS. Since then, Bluware has been pushing OpenVDS quite aggressively, notably in an AAPG interview where chief product officer Andy James dissed the 1975 SEG-Y format, citing ‘inherent limitations’ in existing seismic data formats and vaunting the merits of VDS.
The SEG’s standards committee is not terribly pleased by this turn of events. In the virtual meeting, the discussion addressed the perceived ‘uselessness’ of SEG-Y as a cloud format. The committee believes that ‘uselessness’ is a matter of implementation and that the SEG needs to provide guidance on how SEG-Y should be used in the cloud, rather than abandoned. The idea is to publish a Guidance Note, similar to those that accompany the IOGP’s standards, that will keep SEG-Y alive in the modern world without being beholden to a single vendor solution, even one that is ‘open’. What guidance the Note will contain remains to be seen. This is quite a thorny problem but, as one committee member remarked, ‘We just need to combat the erroneous opinion that SEG-Y is useless’.
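One way in which implementation might matter, offered here as our own illustration and not as the content of the forthcoming Guidance Note, is partial retrieval. Object stores are generally accessed over HTTP, and the major ones accept a standard ‘Range’ request header that asks for a span of bytes instead of the whole object. The sketch below computes such a header for a run of consecutive traces. It assumes fixed-length traces and no extended textual headers; the function name `range_header` is hypothetical and no particular vendor’s SDK is implied.

```python
# Sketch: HTTP Range header covering a run of consecutive SEG-Y traces.
# Assumes a 3,600-byte file header (3,200 textual + 400 binary),
# 240-byte trace headers and a fixed number of samples per trace.

FILE_HEADER_BYTES = 3200 + 400
TRACE_HEADER_BYTES = 240


def range_header(first_trace, trace_count, samples_per_trace, bytes_per_sample=4):
    """Return a dict holding the Range header for traces
    first_trace .. first_trace + trace_count - 1 (0-based)."""
    trace_bytes = TRACE_HEADER_BYTES + samples_per_trace * bytes_per_sample
    start = FILE_HEADER_BYTES + first_trace * trace_bytes
    end = start + trace_count * trace_bytes - 1  # HTTP byte ranges are inclusive
    return {"Range": f"bytes={start}-{end}"}


# Ten traces starting at trace 500, 1,500 four-byte samples each.
print(range_header(500, 10, 1500))
```

The resulting dictionary can be passed as request headers by any HTTP client. Whether this gives acceptable performance depends on request latency and on how well the trace order in the file matches the access pattern.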
Comment: The Bluware/SEG-Y kerfuffle recalls the Petroware/LAS tiff we reported on from last year’s ECIM. In both cases, a vendor is offering a ‘modern’ version of a legacy standard. It is probably a good thing that these initiatives are forcing the standards community to react, at least in the case of the SEG. The LAS custodian, the Canadian Well Log Society, has not reported any issues with LAS. A CWLS spokesperson told Oil IT Journal, ‘LAS 2.0 is the standard used by the majority of users and is fully supported and regularly updated’. LAS applications (‘LasApps’), software and certification tools are available from the CWLS website.
© Oil IT Journal - all rights reserved.