Pointcross completed its drilling data server and repository (DDSR) proof of
concept for Chevron earlier this year. The PoC demonstrated the use of
a Hadoop distributed file system and HBase ‘big table’ data repository
for storing and analyzing a multi-terabyte test data set from Chevron’s
drillers.
Currently, such data is spread across
multiple data stores, and search and retrieval are problematic. There
is a ‘knowledge gap’ between data scientists and domain
specialists. The former’s algorithms may be running on a
limited data subset, while the latter tend to develop spreadsheet-based
point solutions in isolation.
A big data solution, however, is
not magic. Pointcross’ specialists have developed a classification
schema for drilling documents, a drilling taxonomy and an
Energistics-derived standard for units of measure. This has enabled
documents and data sources such as LAS well logs, spreadsheets, text
files and Access databases to be harmonized and captured to a single
repository.
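To give a feel for what unit harmonization means in practice, here is a minimal Python sketch that converts values to one canonical unit per quantity before loading. The table, the unit labels and the `to_canonical` function are assumptions made for illustration; they are not taken from the Energistics standard or from Pointcross' implementation.

```python
# Hypothetical conversion table: source unit -> (canonical unit, factor).
# Unit labels and coverage are illustrative assumptions only.
TO_CANONICAL = {
    "m": ("m", 1.0),
    "ft": ("m", 0.3048),        # international foot, exact
    "kPa": ("kPa", 1.0),
    "psi": ("kPa", 6.894757),   # pounds per square inch to kilopascals
}


def to_canonical(value, unit):
    """Convert a value to the canonical unit registered for its source unit."""
    try:
        canonical_unit, factor = TO_CANONICAL[unit]
    except KeyError:
        # Fail loudly rather than load a value with an unknown unit.
        raise ValueError(f"no conversion registered for unit {unit!r}")
    return value * factor, canonical_unit
```

Rejecting unknown units, rather than passing them through, is a design choice of this sketch: it keeps unconverted values out of the shared repository.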
One facet of the PoC was the ability to scan volumes of well log curve data to detect ‘patterns of interest,’ an artificial intelligence-type approach to identifying log signatures. Drillers can use these to pinpoint issues such as mud losses or stuck pipe. The technique can also automate the identification of formation tops.
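The article does not say how the ‘patterns of interest’ are detected. As one simple stand-in, the Python sketch below slides a short template along a curve and reports windows whose shape correlates strongly with it. The `find_pattern` function, its `threshold` parameter and the choice of normalized correlation are all assumptions for illustration, not Pointcross' method.

```python
import math


def _zscore(values):
    """Scale a window to zero mean and unit variance (all zeros if flat)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        return [0.0] * n
    return [(v - mean) / std for v in values]


def find_pattern(curve, template, threshold=0.9):
    """Return (start_index, score) for windows that correlate with the template."""
    m = len(template)
    t = _zscore(template)
    hits = []
    for start in range(len(curve) - m + 1):
        w = _zscore(curve[start:start + m])
        # Mean product of z-scores is the Pearson correlation of the two windows.
        score = sum(a * b for a, b in zip(w, t)) / m
        if score >= threshold:
            hits.append((start, score))
    return hits
```

Because each window is normalized, the match depends on the shape of the signature rather than its absolute amplitude, so a spike template can match a spike of a different size. A production system working on billions of records would need something far more scalable than this brute-force loop.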
The sample data set comprised some 6,000 wells with over three billion records and half a million ‘other’ documents. All in all, around nine terabytes of data were loaded to the Hadoop cluster, set up in Chevron’s test facility in San Ramon.
A key component of the Pointcross IT stack is a ‘semantic data exchanger’ that maps and connects disparate data sources to the DDSR. A significant effort was put into a mnemonic harmonization program. This was to compensate for the plethora of terminology and abbreviations that plague upstream data sets, causing ‘misperceptions and increased complexity’ for data scientists.
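As a rough illustration of what mnemonic harmonization involves, the following Python sketch maps vendor-specific curve mnemonics to a single canonical name. The alias table and the `harmonize` and `harmonize_curves` functions are hypothetical, chosen for illustration only; they do not come from the Pointcross DDSR.

```python
# Hypothetical alias table: canonical curve name -> mnemonics seen in the wild.
# The entries are illustrative assumptions, not Pointcross' actual dictionary.
ALIASES = {
    "GAMMA_RAY": ["GR", "GRC", "SGR", "GAMMA"],
    "BULK_DENSITY": ["RHOB", "DEN", "RHOZ"],
    "RATE_OF_PENETRATION": ["ROP", "ROPA", "ROP5"],
}

# Invert the table once so each lookup is a single dictionary access.
LOOKUP = {
    alias: canonical
    for canonical, aliases in ALIASES.items()
    for alias in aliases
}


def harmonize(mnemonic):
    """Return the canonical name for a mnemonic, or None if it is unknown."""
    key = mnemonic.strip().upper()
    return LOOKUP.get(key)


def harmonize_curves(mnemonics):
    """Split a list of mnemonics into mapped and unmapped entries."""
    mapped, unmapped = {}, []
    for m in mnemonics:
        canonical = harmonize(m)
        if canonical is None:
            unmapped.append(m)
        else:
            mapped[m] = canonical
    return mapped, unmapped
```

Collecting the unmapped mnemonics instead of discarding them leaves a worklist for a domain specialist, which is presumably where much of the harmonization effort goes.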
Pointcross is now pitching a
second phase of the PoC with enhanced functionality, more data and use
cases. These include more taxonomic harmonization, data curation and
event extraction from PDF documents, daily reports, trip sheets and
more. Phase two will investigate how other causes of nonproductive time
can be attributed to lithology, bottom hole assemblies, and crew
schedules.
Pointcross is also extending its big data offering,
under the Omnia brand, into a ‘total business solution for exploration
and production.’ Omnia includes solution sets for geophysical, shale,
production asset management, big data and enterprise search. More from Pointcross.
© Oil IT Journal - all rights reserved.