Hadoop for upstream

PointCross unveils open source stack leveraging ‘big data’ technology for well and seismic data management. Is this the most significant development in the upstream since the CDP stack?

The highlight of the 2012 SPE Intelligent Energy event, held in Utrecht, Netherlands this month, was the low key introduction by PointCross (HQ, Foster City, California) of two upstream data stores and an open source software stack of data management tools. The new products comprise seismic and well data management solutions along with a generic enterprise taxonomy and search engine. PointCross’ Abhilash Naroth told Oil IT Journal, ‘Today’s data status quo turns around Witsml well data and relational databases, often PPDM. We are changing this paradigm with a unified repository based on Apache Hadoop, combining these and other industry formats using the scalable technology deployed by Google, Facebook and Twitter.’

Hadoop is an open source framework for distributed storage and processing, built around a distributed file system designed for use across ‘petascale’ clusters. PointCross’ drilling data server and repository (DDSR) uses the ‘HBase’ non-relational database, which is modeled on Google’s ‘BigTable.’ BigTable is a multi-dimensional data store that underpins Google Earth and Google Finance. Its data model is simple: each value is indexed by a row string, a column string and a time stamp. Developers can further structure the strings for a particular purpose. PointCross uses this BigTable-style model to spatialize seismic positional data for rapid retrieval.
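To make the row, column and time stamp model concrete, here is a minimal in-memory sketch in Python. It is not HBase and not PointCross’ schema. The class name `SparseTable`, the row key layout (`survey/inline/crossline`) and the column names are hypothetical, chosen only to illustrate how structured key strings might support prefix-based retrieval of positional data.

```python
from collections import defaultdict


class SparseTable:
    """Toy model of a BigTable-style map: (row, column, timestamp) -> value."""

    def __init__(self):
        # (row, column) -> {timestamp: value}; each cell keeps all its versions.
        self._cells = defaultdict(dict)

    def put(self, row, column, timestamp, value):
        """Store one version of a cell."""
        self._cells[(row, column)][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest version at or before `timestamp` (newest overall if None)."""
        versions = self._cells.get((row, column))
        if not versions:
            return None
        if timestamp is None:
            return versions[max(versions)]
        candidates = [t for t in versions if t <= timestamp]
        if not candidates:
            return None
        return versions[max(candidates)]

    def scan(self, row_prefix):
        """Yield (row, column, latest value) for rows starting with `row_prefix`, in key order."""
        for (row, column) in sorted(self._cells):
            if row.startswith(row_prefix):
                yield row, column, self.get(row, column)


# Hypothetical row keys: survey name / inline / crossline, zero padded so
# that lexicographic order matches numeric order.
table = SparseTable()
table.put("north_sea_3d/0100/0200", "pos:x", 1, 431250.0)
table.put("north_sea_3d/0100/0200", "pos:y", 1, 6521400.0)
table.put("north_sea_3d/0100/0201", "pos:x", 1, 431275.0)
table.put("gulf_2d/0001/0001", "pos:x", 1, 250000.0)

# All cells for inline 0100 of one survey, found by key prefix alone.
inline_cells = list(table.scan("north_sea_3d/0100/"))
```

Because rows are kept in key order, encoding survey and line numbers into the row string means related traces sit next to each other, which is the kind of purpose-specific structuring of strings the data model allows.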

Other elements of the open source stack provided by the Apache Software Foundation include Chukwa, used to parse data and store it across HBase and Hive. Hive maps ‘traditional’ SQL queries from industry standard apps to HBase commands. Spatial data is indexed using a convex hull algorithm that pulls up seismic polygons for fast, scale sensitive retrieval. Alongside the DDSR is a seismic data store and a taxonomy engine that provides lookup and cross referencing of common industry data types such as well log curve mnemonics from the major contractors.
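PointCross has not said which convex hull algorithm it uses or how the result is indexed. As an illustration only, the Python sketch below uses Andrew’s monotone chain, a standard method, to reduce a set of trace positions to an outline polygon, then tests whether a query point falls inside it. The function names and the sample coordinates are assumptions made for this example.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain. Returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last vertex while it would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def hull_contains(hull, point):
    """True if `point` is inside or on a counter-clockwise convex polygon."""
    n = len(hull)
    if n < 3:
        return False
    return all(cross(hull[i], hull[(i + 1) % n], point) >= 0 for i in range(n))


# Made-up trace positions for a small survey: four corners plus interior points.
traces = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
outline = convex_hull(traces)
inside = hull_contains(outline, (1, 1))
outside = hull_contains(outline, (3, 1))
```

Storing only the outline means a map query can be checked against a handful of vertices per survey rather than against every trace position.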

All of PointCross’ Hadoop solutions are available for deployment within the corporate firewall or through Amazon’s elastic compute cloud for sharing access with oil field service providers. PointCross sees the scalability of such solutions as well suited to data mining applications that look for patterns in well logs, seismic data volumes and operational data. PointCross’ leveraging of an open source stack is an interesting departure from its previous ‘allegiance’ to the Microsoft Upstream Reference Architecture. MURA is apparently absent from the DDSR, as indeed it seems to have been from Microsoft’s own Global Exploration Forum (see our report on page 7 of this issue). More from PointCross on epsales@pointcross.com.

© Oil IT Journal - all rights reserved.