Watson and the weather

Neil McNaughton asks if the current big data and analytics hoopla is justified. In some fields, like finance, and probably oil and gas, it can be hard to test AI objectively. In weather forecasting though, there is a vast body of data and also a rich forward modeling tradition that should allow for a head-to-head comparison of the two approaches. Looks like IBM is on the case already.

Hedge funds, day traders and others seeking get-rich-quick schemes are looking intensely at what has come to be known as big data and analytics. On the information technology side, the analytics approach is being supported by trendy new technologies – data lakes, machine learning, artificial intelligence and so on. Current media reporting focuses on the promise of such technology, the upside, along with its risks to humanity, the downside. What is not so clear is the possibility that the ‘big data and analytics’ (BDA) movement may be another false dawn for artificial intelligence, the blindside.

It is a good idea to step back from the fray and look at what BDA is trying to do. On the one hand, there is the expectation that, by analyzing a very large amount of data, hidden trends and truths may emerge. The day traders are hoping for some hitherto unseen correlation that foretells the future value of an investment. BDA promises to beat traditional analyses using charts or fundamentals. Unfortunately, finance does not lend itself to rigorous testing because if a new martingale is discovered, it is unlikely to be widely reported. If it is made public, investors will quickly adapt and lessen its effectiveness.

In science and engineering, BDA is up against a different kind of traditional analysis, generally known as forward modeling. If you are manufacturing an airplane wing or other mechanical part, or if you are studying the flow of chemicals in a reactor, the science behind these activities is well understood. Detailed computer models of the part or process are now used to test how proposed modifications affect deformations, flow rates and such, work that was previously done with real physical models. Should there remain uncertainty in a digital model’s validity, other measurements can be added to ‘ground truth’ and refine the model.

Forward modeling is a very powerful tool and it has been used for decades to model fluid flow in an oil reservoir. But this activity is different from the engineering fields above in that models of oil fields are built on incomplete data. A few sample boreholes, flow rate tests on wells and a relatively short period of oil or gas production do not make for a numerical model detailed enough to give unequivocal results. Here, forward modeling becomes an exercise in approximation and interpolation, with results that may vary greatly depending on what assumptions were used.

It is this kind of under-specified problem that is the predilection of the BDA enthusiasts. Instead of physics-based forward modeling, just throw all the information you have into the computer and let the machine figure out what is happening using machine learning/artificial intelligence. This approach is presented in some circles as a ‘no-brainer.’ To quote ConocoPhillips’ Richard Barclay, a strong advocate of the use of AI in the oilfield, ‘If you are still using physics-based models [as opposed to BDA/AI] then you are leaving money on the table.’ Strong words indeed, backed up enthusiastically by the IT/consulting community who see a whole new field of opportunities for getting a foot in the door and doing ‘disruption.’

Oilfield modeling is, and will likely remain, closer to financial modeling in that it is difficult to know when you have got something right. There are far too many ways of doing the calculations. Again, truth lies in the future, and folks will keep on arguing about the outcomes and how much, if any, money is still ‘on the table.’ Acquiring more data is often prohibitively expensive. In reality, the techniques behind today’s big data approaches have been used for decades in the oil and gas business and in many other verticals without much apparent ‘disruption.’

Weather forecasting, on the other hand, is an area that ought to provide a good test of full-physics forward modeling against BDA/AI. Currently, weather forecasters use forward modeling of the physics of air motion, a spinning earth, sunlight, water uptake from the oceans and so on to provide quite accurate forecasts of the weather a few days out. Weather forecasters have also been collecting data, lots of it, for decades if not centuries. All of which should make a great litmus test for forward modeling versus analytics.
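
To make the idea of a head-to-head test a little more concrete, here is a minimal sketch of how two forecasts might be scored against what actually happened. Everything in it is an assumption for illustration: the temperature numbers are made up, the names `physics`, `analytics` and `persistence` are hypothetical, and the scoring rule (a mean-squared-error skill score relative to a naive ‘tomorrow will be like today’ forecast) is just one common choice among many.

```python
import math


def rmse(forecast, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(forecast) != len(observed) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((f - o) ** 2 for f, o in zip(forecast, observed))
    return math.sqrt(squared / len(observed))


def skill_score(forecast, reference, observed):
    """Share of the reference forecast's mean squared error removed by `forecast`.

    1.0 is a perfect forecast, 0.0 is no better than the reference,
    and a negative value is worse than the reference.
    """
    mse_forecast = rmse(forecast, observed) ** 2
    mse_reference = rmse(reference, observed) ** 2
    if mse_reference == 0:
        raise ValueError("reference forecast is perfect; skill is undefined")
    return 1.0 - mse_forecast / mse_reference


# Made-up daily temperatures (degrees C), purely illustrative.
observed = [12.0, 14.0, 13.0, 15.0, 16.0]
physics = [12.5, 13.5, 13.5, 14.5, 16.5]      # hypothetical forward-model forecast
analytics = [11.0, 15.0, 12.0, 16.0, 15.0]    # hypothetical BDA/AI forecast
persistence = [11.0, 12.0, 14.0, 13.0, 15.0]  # naive baseline: the previous day's reading

for name, forecast in (("physics", physics), ("analytics", analytics)):
    print(name,
          "rmse:", round(rmse(forecast, observed), 3),
          "skill vs persistence:", round(skill_score(forecast, persistence, observed), 3))
```

With real data the comparison would need the same locations, lead times and verification period for both approaches, which is exactly where access to the archives comes in.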

I say ‘should make,’ but weather data is not all that easy to access. The national weather authorities have invested heavily in computing resources that generally use the forward modeling approach and may not be all that keen on sharing their data with folks who may use it to compete with them in their commercial forecasting, or perhaps to develop a better, cheaper means of forecasting. There are notable exceptions. The US NOAA has opened its data for public use in its ‘Big data project,’ but even here, public access is ‘limited to 10% of NOAA’s 20 terabytes of daily output’.

Alongside the ‘official’ met data collected by NOAA, a more grass roots data collection effort is exemplified by WeatherUnderground, which links together hundreds of thousands of private individuals’ weather stations (one of which I own!). In 1995 WU started out as a militant ‘underground’ activity but its potential was spotted early on by The Weather Company (owner of the US TV station The Weather Channel), which acquired WU in 2012. Earlier in 2016, TWC signed with IBM, which is now to host its weather data and apply the ‘cognitive’ BDA technology of its Jeopardy-winning Watson to weather forecasting. The question now is, will IBM’s Watson beat the traditional forecasters? If it does, will we even know?


© Oil IT Journal - all rights reserved.