Hadoop in the upstream

Apache trials Hadoop-based geo-analytics. Devon and Noble report on big data initiatives. OSIsoft’s Project CAST bears fruit in a new PI System Integrator for business intelligence.

Speaking at a Houston chapter meeting of the Esri Petroleum User Group last month, Apache’s Bruce Sanderson described the spatial data management window as ‘too short.’ Apache is trialing a Hadoop deployment to capture GIS and unstructured data from multiple sources. With help from Cloudera, Apache Sqoop was used as an ETL* engine to transfer data into the Hadoop file system. The open source Snakebite Python interface to Hadoop feeds real time truck routes and directional drilling data to ArcGIS. A ‘geo-analytics’ interface embeds Cloudera’s Impala analytic database for Hadoop.
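
To illustrate the kind of plumbing involved, the following minimal Python sketch uses the open source Snakebite client to list and read files landed in the Hadoop file system. The namenode host, HDFS paths and record layout are hypothetical, and the hand-off to ArcGIS is left as a comment.

    from snakebite.client import Client

    # Connect to the Hadoop namenode (default RPC port 8020); host name is illustrative.
    client = Client('namenode.example.com', 8020, use_trash=False)

    # List the files that a Sqoop import has landed for the directional drilling feed.
    for entry in client.ls(['/data/drilling/directional_surveys']):
        print(entry['path'], entry['length'])

    # Read a file's contents; text() decompresses if necessary and yields it as a string.
    for content in client.text(['/data/drilling/directional_surveys/part-m-00000']):
        for line in content.splitlines():
            # Hypothetical record layout: measured depth, inclination, azimuth, ...
            md, inc, azi = line.split(',')[:3]
            print(md, inc, azi)  # stand-in for pushing the point to ArcGIS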

Hadoop also featured earlier this year at the 2015 OSIsoft User Conference, in a joint presentation from John Baier (OSIsoft) and Don Morrison (Devon Energy). Baier cited a high failure rate on business intelligence projects, due in part to complex and slow data export from PI and other systems. To date, the vision of wide data access has proved elusive. Enter the new PI System Integrator for business intelligence, which feeds real time data directly into business intelligence tools such as Spotfire, Teradata and Tableau, or into the Hadoop file system.

Devon has used a prototype of the integrator to pipe real time data into its tools of choice, which include Spotfire, IHS Harmony and Excel. PI is used across the company’s assets, from drilling to production. Data transfer is currently handled with PI DataLink, which works for smaller data sets but does not scale to the multi-million point data sets that Devon receives from its unconventional operations. The Hadoop use case is an attempt to capture and blend real time data from PI and other sources to see how the Hadoop toolset performs.

A follow-up presentation from Matt Ziegler (OSIsoft) and Wes Dyk (Noble Energy) set out to demonstrate the value of data science in production analysis and optimization. As we reported in our last issue, Hortonworks has been helping Noble aggregate data across Scada, subsurface and other systems. OSIsoft’s big data push began last year as Project CAST, which set out to deliver ‘data delivered on your terms, in your language, to the tools you use, and to the people that can make a difference.’

The question now arises: where does this leave the business intelligence bells and whistles that OSIsoft itself provides? The latest PI System release addresses this with new functionality that allows predictive data to be stored natively and exposed throughout the PI System, ‘sharing data from 3rd party analytics and machine learning tools throughout the organization.’ More from OSIsoft.

* Extract, transform and load.
