Service provider Sinequa has developed an information retrieval system, running in the cloud, that leverages domain-specific knowledge representation.

Mathilde Fourquet presented Total’s ‘Cognitive search’ platform at the 2019 ‘Big Data Paris’ conference. Cognitive search seeks to develop the ‘transversality’ of information retrieval by leveraging unstructured textual data across different fields of expertise, and to support new uses of information, including natural language queries. Cognitive search is said to improve the user experience for all Total employees. Total is a long-time advocate of best practices in data science, having engaged France’s Sinequa to develop a refining and chemicals information portal back in 2008. In 2016, Total awarded Sinequa a service contract to develop a dedicated platform to handle some 50 million (now up to 150 million) engineering and geoscience documents.

The work has involved consolidating multiple text processing and portal initiatives from around the company into an information retrieval competency center for the Total group. The center is developing technology for dynamic categorization of 20 million geoscience documents, using AI to classify scanned documents by well, basin and field. Natural language processing has been applied in a proof of concept on rotating equipment. This has produced a 70% hit rate for ‘relevant’ document retrieval, thanks to a major, domain-specific ontology and semantic description effort. The approach is now being extended to ten new domains, with improved querying and results. The solution was initially deployed on premises, where it encountered performance issues. It is now running in the cloud.

One as-yet unsolved problem is cross-discipline retrieval. Total is seeking a broader, industry-wide solution that takes in the big picture. This should allow natural language queries to retrieve relevant documents, images or videos. Total is now working with Sinequa on a federated search platform and a new interface, tuned to new use cases. For Fourquet, ‘the key is to build a model and user experience that integrates with our own DNA and problem set’.

