BP presentations at the SPE’s digital energy (DE) event and PPDM’s annual Houston event demonstrate the growing role of open source software in high-end upstream information management. At DE, Mohamed Sidahmed presented work that uses the ‘R’ statistical programming language to ‘augment’ operations monitoring by mining unstructured drilling reports. Unstructured text, hitherto the ‘missing link’ in the information workflow, contains valuable information on the root causes of deviations from plan and helps address ‘inadequate reaction to real time changes.’
R-based text analytics leverage the collective knowledge stored in BP’s Well Advisor, looking for interesting patterns. Visual representations (word clouds) integrate with existing surveillance systems and can provide early warning of, for instance, pump failure. More sophisticated techniques such as ‘latent Dirichlet allocation’ help identify precursor events hidden in the data. Reports with similar content can then be attached to the root causes of non-productive time and rarer high-impact events. Data-driven learning is now embedded in BP’s CoRE real time environment.
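The word cloud step rests on counting how often terms appear across report text. The following is a minimal sketch of that counting in Python rather than the R used in the presented work; the sample reports, the stop-word list and the function name `term_frequencies` are invented for illustration and are not taken from BP’s system.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a production list would be far longer.
STOP_WORDS = {"the", "a", "an", "and", "at", "of", "to", "in", "on", "was", "is"}

def term_frequencies(reports):
    """Count non-stop-word terms across a list of free-text reports."""
    counts = Counter()
    for text in reports:
        # Lower-case the text and keep alphabetic tokens only.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts

# Invented sample text, not real drilling reports.
reports = [
    "Pump pressure dropped at 0400, pump restarted.",
    "Lost circulation observed; pump pressure unstable.",
    "Tripping out of hole to change bit.",
]

freqs = term_frequencies(reports)
print(freqs.most_common(3))
```

The resulting counts are what a word cloud would scale its terms by. Topic models such as latent Dirichlet allocation start from the same kind of term counts but go further, grouping co-occurring terms into topics.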
Meanwhile, at the PPDM Houston data management symposium, Meena Sundaram presented a ‘self service’ architecture deployed at BP’s Lower 48/Gulf of Mexico unit.
BP’s service-oriented architecture is now up and running. Applications and data sources are exposed as REST endpoints, ‘providing scalability and adaptability to technology innovation.’ The infrastructure stack builds on a Cloudera ‘data lake’ of over 35 domain-specific data sources. These feed into business intelligence and descriptive analytics applications. The system also supports enterprise-level activities from production accounting to budgets and reserves reporting, along with bespoke ‘on demand’ business scenarios and ad hoc queries.
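To make the idea of data sources exposed as REST endpoints concrete, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for illustration: the `/sources/...` URL layout, the `DATA_SOURCES` registry, the record fields and the port are hypothetical and do not describe BP’s actual services.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical registry: each data source is a callable returning records.
DATA_SOURCES = {
    "wells": lambda: [{"id": "W-001", "status": "producing"}],
    "production": lambda: [{"well_id": "W-001", "oil_bbl": 120.5}],
}

def route(path):
    """Map a request path such as /sources/wells to (status, JSON body)."""
    parts = [p for p in path.split("/") if p]
    if parts == ["sources"]:
        # List the names of the available data sources.
        return 200, json.dumps(sorted(DATA_SOURCES))
    if len(parts) == 2 and parts[0] == "sources" and parts[1] in DATA_SOURCES:
        return 200, json.dumps(DATA_SOURCES[parts[1]]())
    return 404, json.dumps({"error": "unknown resource"})

class SourceHandler(BaseHTTPRequestHandler):
    """Serve GET requests by delegating to route()."""

    def do_GET(self):
        status, body = route(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

def serve(port=8000):
    """Start a local server; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), SourceHandler).serve_forever()
```

Calling `serve()` and requesting `/sources/wells` would return the sample well record as JSON. Keeping the routing in a plain function separate from the HTTP handler means a new data source is one more registry entry, which is one way to read the ‘adaptability’ claim.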
Sundaram describes enterprise-level data access as a ‘chicken and egg problem. Do you clean the data or show the data?’ BP has opted for the ‘show’ option, along with governance and data improvement through use. Today the data lake uses Cloudera MapReduce/HDFS, Voyager GIS data discovery and Amazon CloudSearch. ‘Big data’ tools including Solr and Lucene are also used. The toolset is now evolving to offer prescriptive analytics. BP’s near term goal is to move the supporting infrastructure to the Amazon Web Services cloud. More from the PPDM Houston conference and from SPE Digital Energy in the next edition of Oil IT Journal.
© Oil IT Journal - all rights reserved.