High performance computing in the geophysical spotlight

Petaflops? They’re here now! IBM, Chevron, Tierra, Convey and Intel slug it out over which hardware will prevail.

From the HPC standpoint, the SEG was overshadowed by HPC's flagship event, SC2008, the following week, where the big news was the breaking of the petaflop barrier by the Los Alamos Lab's 'Roadrunner.' According to IBM's Tom McClure, most oils see the in-house petaflop barrier falling by 2010. But the reality is that one oil company already has a petaflop installed! The Linux cluster brought about this revolution, but the 'x86' architecture is now hitting the 'power wall,' leading to a proliferation of esoteric architectures in projects such as Repsol's Kaleidoscope. Here, IBM's Cell BE delivers phenomenal compute bandwidth (120 Gflops in a 22 sq. ft. bay). The catch is that you need to vectorize your software.
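'Vectorizing' means re-casting inner loops so that one instruction operates on several samples at once. The sketch below is a generic illustration of the idea, not actual Cell SPU code (the real thing uses SPU intrinsics and 128-bit registers); the vec4 type and routine names are made up for the example.

```cpp
// Generic sketch of SIMD-style restructuring – not actual Cell SPU code.
// 'vec4' stands in for a 128-bit register holding four floats.
struct vec4 { float v[4]; };

// Scalar form: one sample per loop iteration.
void saxpy_scalar(float a, const float *x, float *y, int n)
{
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

// Vectorized form: four samples per iteration (n assumed to be a multiple
// of 4 and the arrays suitably aligned). On a SIMD unit such as the Cell's
// SPEs, the inner loop maps to a single fused multiply-add per vec4.
void saxpy_vec4(float a, const vec4 *x, vec4 *y, int n4)
{
    for (int i = 0; i < n4; ++i)
        for (int k = 0; k < 4; ++k)
            y[i].v[k] = a * x[i].v[k] + y[i].v[k];
}
```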

Tamas Nemeth (Chevron) is using a field programmable gate array (FPGA) from Maxeler Technologies. Capacity hikes can be obtained through better technology or through parallelism; FPGAs offer both, at the expense of some awkward programming. Various memory and bandwidth 'gotchas' mean that speedup may be good for some routines and less so for others. Given the amount of work involved in setting up a job, it is necessary to predict speedup for different hardware solutions. FPGAs are easy to predict and hard to program, while the reverse is true for the GPU. But there is potential for predictable, high computational speedup.

Alex Loddoch (Chevron) explained how GPUs, acting as co-processors, speed up 3D acoustic wave equation migration by over 10-fold, heralding the 'return of vector computing.' NVIDIA's CUDA holds center stage here, with today's GPUs offering 240 cores and up to 4 GB of memory. Memory transfer rates are good but capacity is limited – 4 GB is too small for a 20 GB, billion-cell model. Data must be sliced, leading to more data transfer and reduced efficiency. Comparisons with 'equivalent' CPU-based solutions are hard, but Loddoch reckons 'an order of magnitude of speedup is achievable and programming is easy.'
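Loddoch did not show code; the sketch below is a generic, scaled-down CUDA illustration of the slicing he described. The model is processed as depth slabs sized to fit device memory, with halo planes re-copied for each slab. All sizes, the constant velocity and the wave_step kernel are placeholders, not Chevron's implementation.

```cuda
// wave_slab.cu – minimal sketch of a GPU acoustic wave stencil where the
// model is cut into depth slabs because it does not fit in device memory.
// Sizes are scaled down so the sketch actually runs on a typical card.
#include <cuda_runtime.h>
#include <algorithm>
#include <cstdio>
#include <vector>

// One 2nd-order leapfrog update of the pressure field on a regular grid.
__global__ void wave_step(const float *p_prev, const float *p, float *p_next,
                          const float *vel, int nx, int ny, int nz,
                          float dt, float h)
{
    int ix = blockIdx.x * blockDim.x + threadIdx.x;
    int iy = blockIdx.y * blockDim.y + threadIdx.y;
    int iz = blockIdx.z * blockDim.z + threadIdx.z;
    if (ix < 1 || iy < 1 || iz < 1 || ix >= nx - 1 || iy >= ny - 1 || iz >= nz - 1)
        return;
    size_t sz = (size_t)nx * ny;                     // stride between z planes
    size_t i  = iz * sz + (size_t)iy * nx + ix;
    float lap = p[i - 1] + p[i + 1] + p[i - nx] + p[i + nx]
              + p[i - sz] + p[i + sz] - 6.0f * p[i];
    float r = vel[i] * dt / h;
    p_next[i] = 2.0f * p[i] - p_prev[i] + r * r * lap;
}

int main()
{
    const int nx = 256, ny = 256, nz = 512;          // ~34M cells (scaled down)
    const int slab_nz = 128;                         // what fits on the card
    const float dt = 1e-3f, h = 10.0f;

    size_t ncells = (size_t)nx * ny * nz;
    std::vector<float> p_prev(ncells, 0.0f), p_cur(ncells, 0.0f),
                       p_next(ncells, 0.0f), vel(ncells, 2000.0f);

    // Device buffers hold one slab plus a one-plane halo on each side.
    size_t slab_cells = (size_t)nx * ny * (slab_nz + 2);
    float *d_prev, *d_cur, *d_next, *d_vel;
    cudaMalloc(&d_prev, slab_cells * sizeof(float));
    cudaMalloc(&d_cur,  slab_cells * sizeof(float));
    cudaMalloc(&d_next, slab_cells * sizeof(float));
    cudaMalloc(&d_vel,  slab_cells * sizeof(float));

    dim3 block(16, 16, 2);
    dim3 grid((nx + 15) / 16, (ny + 15) / 16, (slab_nz + 2 + 1) / 2);
    size_t plane = (size_t)nx * ny;

    // One time step: every slab is shipped in, updated and shipped back.
    // In a real multi-step run these PCIe transfers recur at every step
    // (or every few steps with wider halos) – the efficiency loss that
    // slicing brings with it.
    for (int z0 = 1; z0 < nz - 1; z0 += slab_nz) {
        int zn = std::min(slab_nz, nz - 1 - z0);     // interior planes in slab
        size_t off = (size_t)(z0 - 1) * plane;       // start at lower halo
        size_t n   = (size_t)(zn + 2) * plane;       // interior + both halos
        cudaMemcpy(d_prev, p_prev.data() + off, n * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpy(d_cur,  p_cur.data()  + off, n * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpy(d_vel,  vel.data()    + off, n * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemset(d_next, 0, n * sizeof(float));    // zero edges the kernel skips
        wave_step<<<grid, block>>>(d_prev, d_cur, d_next, d_vel,
                                   nx, ny, zn + 2, dt, h);
        // Copy back only the interior planes this slab is responsible for.
        cudaMemcpy(p_next.data() + (size_t)z0 * plane, d_next + plane,
                   (size_t)zn * plane * sizeof(float), cudaMemcpyDeviceToHost);
    }
    printf("one sliced time step done\n");
    cudaFree(d_prev); cudaFree(d_cur); cudaFree(d_next); cudaFree(d_vel);
    return 0;
}
```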

Christof Stork (Tierra Geophysical) compared the performance of a finite difference algorithm across FPGA, Cell, GPU and x86 architectures. The appeal of this novel hardware stems from the fact that 'Moore's law is done, kaput.' Stork warned that just as MIPS used to be dismissed as a 'meaningless indicator of performance,' GFlops should be taken with a pinch of salt. Stork's analysis came out in favor of working hard to speed up Intel's CPU-based architecture, although 'heterogeneous cores are the future.'

The SEG finished up with an HPC workshop chaired by Keith Gray, with around 150 in attendance. Steve Briggs told how Headwave is focused on prestack volume visualization and analysis. Here, GPU programming is 'much easier now that CUDA is there,' although Briggs warned, 'You can do a lot of cool stuff with GPUs but they are not going to handle full datasets.' They present many bottlenecks and roadblocks, and this in the face of expanding datasets.

Steve Wallach (Convey Computer) described programming Cell-like architectures as 'a nightmare.' Programmer productivity is where it's at; performance is a giveaway. Convey is refocusing on uniprocessor performance with a new 'hybrid core' computer that will expose standard Fortran, C and C++ to programmers. This uses an Intel socket and a Xilinx FPGA, tuned in hardware for verticals such as oil and gas and financial services.

In the ensuing panel discussion, both Jim Ballew (Appro) and Pradeep Dubey (Intel) were skeptical regarding the advantages of the GPU. For Ballew, 'it is and always has been about memory. A GPU could be infinitely fast – the issue is bandwidth between processor and GPU.' Dubey agreed, adding that Intel's answer, 'Larrabee,' will be released 'real soon now.' Regarding claimed GPU speedups, Dubey suggested checking the (Intel) baseline, which could be '2x or 200x off.' In a recent sort benchmark, an Intel four-core chip beat all GPU contenders.
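Ballew's point is easy to see with a back-of-the-envelope timing sketch. The CUDA snippet below (not shown at the workshop; the buffer size and the toy kernel are arbitrary choices) times a host-to-device copy against a trivial kernel on the same data; on typical hardware the bus transfer dwarfs the on-card work.

```cuda
// bandwidth.cu – rough sketch: how long does it take to feed the GPU versus
// to compute on it? Buffer size and the toy kernel are arbitrary.
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

__global__ void scale(float *x, float s, size_t n)
{
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;                 // trivial on-card work
}

static float elapsed_ms(cudaEvent_t a, cudaEvent_t b)
{
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, a, b);
    return ms;
}

int main()
{
    const size_t n = 64u << 20;           // 64M floats = 256 MB
    std::vector<float> host(n, 1.0f);
    float *dev = nullptr;
    cudaMalloc(&dev, n * sizeof(float));

    cudaEvent_t t0, t1, t2;
    cudaEventCreate(&t0); cudaEventCreate(&t1); cudaEventCreate(&t2);

    cudaEventRecord(t0);
    cudaMemcpy(dev, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaEventRecord(t1);
    scale<<<(unsigned)((n + 255) / 256), 256>>>(dev, 2.0f, n);
    cudaEventRecord(t2);
    cudaEventSynchronize(t2);

    float copy_ms = elapsed_ms(t0, t1), kernel_ms = elapsed_ms(t1, t2);
    printf("PCIe copy: %.1f ms (%.1f GB/s)  kernel: %.1f ms\n",
           copy_ms, (n * sizeof(float)) / (copy_ms * 1e6), kernel_ms);
    // On typical hardware the copy dominates: the 'infinitely fast' GPU
    // still waits on the bus between processor and GPU.
    cudaEventDestroy(t0); cudaEventDestroy(t1); cudaEventDestroy(t2);
    cudaFree(dev);
    return 0;
}
```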

This viewpoint was not shared by AMD's Kevin McGrath, who sees a hybrid future. The CPU will dominate the near term for many codes, but GPU-like devices are very good at delivering performance. Programmer productivity is a challenge and needs open industry standards and tools such as Brook+ and OpenCL, as opposed to 'proprietary environments' (like CUDA?).

This report is an extract from The Data Room’s Technology Watch report of the SEG—more from tw@oilit.com.

