HansonWade 2019 Applied Data Analytics Upstream, Houston

ConocoPhillips ExtraTrees ML for SAGD. Apache on cost functions in forecasting. Schlumberger’s data-driven prognostics and health management. Woodmac’s new Analytics Lab.

Christopher Olsen presented a detailed data-driven approach to steam allocation optimization at ConocoPhillips’ Athabasca oil sands ‘Surmont’ acreage, which uses the steam assisted gravity drainage (SAGD) process. The objective was a full-field tool that reacts to live process data and well performance to optimize steam allocation in real time. A massive amount of data was available on wells, injection, pressures, logs and many more ‘closely tracked’ process variables. A comprehensive analysis of data acquisition and tests of various ML-based optimizations led to an approach based on a virtual flow meter and recurrent neural networks. RNNs are said to excel at forecasting sequences, whether the next characters in a text string or the next values in a time series. The RNN was trained on historical data for several key variables and used to forecast what should happen over the next several hours. The output feeds other ConocoPhillips models to predict oil/water rates (a virtual flow meter fills gaps in the data). A first production version was released in late 2018 using ExtraTrees (a variant of random forest) based forecasting. This has since been updated to incorporate deep learning, and models now retrain automatically on new data with reinforcement learning.
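By way of illustration only (this is not ConocoPhillips’ actual model), the following minimal sketch shows how an ExtraTrees regressor can be used for multi-step forecasting of a single process variable from its lagged values. The synthetic series, lag window and horizon are assumptions made for the example.

```python
# Hedged sketch: ExtraTrees-based multi-step forecasting from lagged values.
# The data and parameters are illustrative, not the Surmont implementation.
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(0)
# Hypothetical hourly process variable, e.g. a steam injection rate
series = np.cumsum(rng.normal(0, 1, 2000)) + 100.0

LAGS, HORIZON = 24, 6  # use the last 24 hours to forecast 6 hours ahead

# Build a supervised dataset of (lag window -> future values)
X, y = [], []
for t in range(LAGS, len(series) - HORIZON):
    X.append(series[t - LAGS:t])
    y.append(series[t:t + HORIZON])
X, y = np.asarray(X), np.asarray(y)

model = ExtraTreesRegressor(n_estimators=200, random_state=0)
model.fit(X, y)  # multi-output regression: one output per forecast step

# Forecast the next HORIZON hours from the most recent window
forecast = model.predict(series[-LAGS:].reshape(1, -1))[0]
print(forecast)
```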

David Fulford (Apache) introduced his presentation on the importance of cost functions in production forecasting with a recap of Anscombe’s Quartet, a statistics classic on the multiplicity of statistical interpretations that can be derived from the same data, and on the importance of graphics in an analysis. To evaluate a machine learning algorithm, a quantification of ‘goodness of fit’ is required, aka a cost function. The choice of cost function(s) can be as important as the choice of predictive model, and understanding the data model gives insight into the appropriate choice of cost function. Uncertainty is a fundamental characteristic of modeling. A ‘best fit’ is not the same as a best forecast, and it does not mean that only one set of model parameters fits the data!
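To make the point concrete, here is a hedged sketch (not Fulford’s actual workflow) fitting an Arps hyperbolic decline curve to noisy rate data under three different cost functions. The data, starting values and bounds are invented; the takeaway is that each cost function yields different ‘best fit’ parameters and hence different long-range forecasts.

```python
# Illustrative only: cost-function choice changes the fitted decline parameters.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
t = np.arange(1, 61)                               # months on production
true = 1000 / (1 + 0.9 * 0.08 * t) ** (1 / 0.9)    # 'true' hyperbolic decline
q = true * rng.lognormal(0, 0.15, t.size)          # multiplicative noise
q[[10, 30]] *= 3.0                                 # a couple of outliers

def arps(p, t):
    qi, Di, b = p
    return qi / (1 + b * Di * t) ** (1 / b)

def fit(cost):
    res = minimize(lambda p: cost(q, arps(p, t)), x0=[800, 0.1, 1.0],
                   bounds=[(1, None), (1e-4, 2), (0.01, 2)])
    return res.x

l2  = fit(lambda obs, pred: np.sum((obs - pred) ** 2))                 # least squares
l1  = fit(lambda obs, pred: np.sum(np.abs(obs - pred)))                # robust to outliers
log = fit(lambda obs, pred: np.sum((np.log(obs) - np.log(pred)) ** 2)) # relative error

print("L2 :", l2)
print("L1 :", l1)
print("log:", log)
```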

Schlumberger’s Enterprise Solutions unit is a long-time user of machine and deep learning for real-time predictive maintenance of its frac pumps, as principal data scientist Jay Parashar explained. Schlumberger’s prognostics and health management (see also the PHM Society) uses data-driven PHM as opposed to reliability-based maintenance. The ‘state of the art’ PHM system encodes time-series data in a polar form and performs a ‘Gram matrix-like’ operation on the resulting angles. Convolutional neural networks are also used. While ‘PHM has impacted the definition of maintenance in a big way’, machine learning results still ‘need to be explainable’.
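The ‘polar form plus Gram-matrix-like operation’ reads like the published Gramian Angular Field technique, so the sketch below shows that generic encoding (an assumption on our part, not Schlumberger’s internal implementation). It turns a 1-D sensor trace into a 2-D image-like array suitable for a convolutional neural network.

```python
# Hedged sketch of a Gramian Angular Field style encoding of a time series.
import numpy as np

def gramian_angular_field(x):
    # Rescale the series to [-1, 1] so each value can be read as a cosine
    x = np.asarray(x, dtype=float)
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1, 1))        # polar (angular) encoding
    # Gram-matrix-like operation: cos(phi_i + phi_j) for every pair of steps
    return np.cos(phi[:, None] + phi[None, :])

# Hypothetical pump sensor trace; the resulting 2-D array can feed a CNN
signal = np.sin(np.linspace(0, 8 * np.pi, 128)) + 0.1 * np.random.randn(128)
image = gramian_angular_field(signal)
print(image.shape)   # (128, 128) image-like representation of the time series
```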

Preston Cody presented Woodmac’s new Analytics Lab, a data consortium that works along the lines of similar data sharing efforts in insurance and banking. The Lab uses a ‘give to get’ model where members agree to exchange data with certain other members of the lab; members only receive comprehensive analyses if they contribute data themselves. Woodmac acts as an independent facilitator that manages and harmonizes data across multiple sources, without bias, and provides ‘statistically sound’ analytics. The Analytics Lab is backed by a cloud-based platform that enables computationally intensive analytics on large datasets. Woodmac’s parent company Verisk adds its computer security expertise and compliance with major security and governance regulations. More from HansonWade.

© Oil IT Journal - all rights reserved.