CDA’s unstructured data challenge

How to ‘make sense’ of a disparate document archive of 11,000 wells and 2,000 seismic surveys.

Earlier this year, the UK’s joint industry upstream data holder Common Data Access (CDA) announced an ‘unstructured data challenge’ on its website, asking the industry if it could ‘make sense’ of the documentation describing over 11,000 wells and 2,000 seismic surveys. A ‘small number of data and information analytics tool vendors’ were invited to apply their expertise to extract information from 50 years’ worth of reports, log images and other unstructured data types.

Speaking at this month’s ECIM data management conference in Haugesund, Norway, Paul Coles presented Schlumberger’s findings in the context of a wider ‘intelligent repository’ project. This has produced a spatially enabled keyword database capable of delivering automatic composite logs and assembling interpretations. Tools used included Wipro’s Holmes AI platform, Apache Solr enterprise search and others for automated text summary and classification. Machine learning was used to establish data relationships and to correct the roughly 15% of the data set that had ‘esoteric’ curve names or wrongly labeled logs. Automated petrophysical interpretation was combined with cuttings descriptions from reports, all running on a ‘scalable cloud-based platform.’ The result is an automated mapping system for the UKCS, delivered in Petrel, that brings insights into hydrocarbon systems and future business opportunities. More from CDA.
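The presentation as reported does not say how the curve-name correction worked beyond 'machine learning.' As a rough illustration of the underlying problem only, the sketch below maps non-standard log curve names onto canonical mnemonics using a lookup table plus fuzzy string matching from Python's standard library. This is a simple rule-based stand-in, not Schlumberger's method. The alias table, the function name and the 0.8 similarity cutoff are all assumptions made for the example.

```python
import difflib

# Hypothetical alias table (an assumption for this sketch):
# canonical curve mnemonic -> name variants seen in legacy logs.
CANONICAL_ALIASES = {
    "GR": ["GR", "GAMMA", "GAMMA_RAY", "GRC", "SGR"],
    "RHOB": ["RHOB", "DEN", "DENSITY", "RHOZ"],
    "NPHI": ["NPHI", "NEUTRON", "NPOR", "TNPH"],
    "DT": ["DT", "SONIC", "DTC", "AC"],
}

# Reverse lookup: every known variant -> its canonical mnemonic.
ALIAS_TO_CANONICAL = {
    alias: canonical
    for canonical, aliases in CANONICAL_ALIASES.items()
    for alias in aliases
}


def normalize_curve_name(raw, cutoff=0.8):
    """Return the canonical mnemonic for a raw curve name, or None.

    An exact match on the cleaned name is tried first; otherwise the
    closest known alias is used if its similarity reaches `cutoff`.
    """
    key = raw.strip().upper().replace(" ", "_").replace("-", "_")
    if key in ALIAS_TO_CANONICAL:
        return ALIAS_TO_CANONICAL[key]
    close = difflib.get_close_matches(
        key, list(ALIAS_TO_CANONICAL), n=1, cutoff=cutoff
    )
    return ALIAS_TO_CANONICAL[close[0]] if close else None


if __name__ == "__main__":
    for name in ["gamma ray", "Densty", "XYZ123"]:
        print(name, "->", normalize_curve_name(name))
```

Names that match nothing come back as None, so they can be queued for manual review rather than silently relabeled. A production system would presumably also look at the curve data itself (value ranges, units), which name matching alone cannot do.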

© Oil IT Journal - all rights reserved.