SC09, Microsoft in HPC, Nvidia and AMD/ATI...

Cray at number 1. Microsoft down 10 places. ATI (not Nvidia) powers the top GPU-accelerated system. Brown Deer Technology’s David Ritchie compares CUDA, OpenCL and AMD’s Cypress boards.

At the SC09 supercomputing conference held last month in Portland, Oregon, Cray’s ‘Jaguar’ system at Oak Ridge National Lab took the top spot in the TOP500 list with 1.75 petaflops of Linpack performance. Also of note was Microsoft’s slide down the rankings, from number 10 last year to number 20. The problem for Microsoft is that the Windows HPC 2008-based ‘Dawning’ cluster at the Shanghai Supercomputer Center has not been upgraded in the interim, and in HPC, standing still is not an option.

How does this affect the upstream? Microsoft’s HPC solution specialist Mark Ades, speaking at an IBM-sponsored event at last month’s SEG, acknowledged that ‘Linux is very dominant in this [HPC] space, especially in seismic processing.’ Microsoft is now pushing its agreement with Novell/Suse and its Linux management platform for pure-play HPC, while its own Windows HPC 2008 is to focus on ‘high requirement workstation jobs’ like reservoir engineering and the Excel Runner for humungous spreadsheets.

Intriguingly, the fastest GPU-accelerated machine in the TOP500 is at the National SuperComputer Center in Tianjin. This uses ATI’s Radeon HD 4870 accelerators to achieve 563 teraflops and takes the number 5 slot. We were curious to see how ATI was shaping up against Nvidia in number crunching and quizzed Brown Deer Technology*’s David Ritchie, who provided the following.

‘NVIDIA’s CUDA has certainly done well in the GPGPU community and you will have heard a good deal about ‘Fermi,’ due to launch next year, promising 520-630 GFLOPS double and 1.0-1.25 TFLOPS single precision peak performance. On the other side of the fence, AMD/ATI is now shipping Cypress boards providing peak performance of 544 GFLOPS double and 2.7 TFLOPS single precision, along with a dual GPU board that provides 928 GFLOPS/4.6 TFLOPS! Cypress also provides many of the hardware ‘advances’ that Fermi promises, such as fused multiply-add instructions. Paper specs are nice but meaningless if you cannot program the boards. For this reason, OpenCL, with its industry-wide support, is a welcome introduction to GPGPU.

From my point of view, industry standardization of the GPGPU programming API puts AMD/ATI, with its superior hardware specs, in a good position. We are likely headed for a time of code portability more akin to what we find with multicores, where the battle is between hardware, not SDKs. This should be good for the industry since, from a programmer’s point of view, portable applications put real competition in place.

Unlike the situation with multicore, getting good performance with GPGPU will still require a good deal of algorithm tuning since the compilers are far less mature than GCC or the Intel compilers. We have had good success with the last generation of AMD/ATI hardware in several projects and are focusing now on exploiting the increased performance of the Cypress boards to accelerate existing algorithms, while transitioning those algorithms to OpenCL.’
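
As a rough check on the Cypress peak figures quoted above, peak throughput can be estimated as stream processors times clock speed times two floating-point operations per cycle (one multiply-add). The sketch below is ours, not Ritchie’s. The stream-processor counts (1,600 and 3,200), the clocks (850 MHz and 725 MHz) and the one-fifth double-precision rate are assumptions based on our understanding of AMD’s published specifications for the single and dual-GPU boards; none of them comes from the article. `peak_gflops` is a hypothetical helper name.

```python
# Back-of-envelope peak throughput for a GPU board.
# ASSUMPTIONS (not from the article): stream-processor counts, clock speeds
# and the 1/5 double-precision rate are illustrative values believed to match
# AMD's published Cypress-generation specifications.

def peak_gflops(stream_processors, clock_mhz, flops_per_cycle=2, dp_divisor=5):
    """Return (single, double) precision theoretical peak in GFLOPS.

    flops_per_cycle=2 counts one multiply-add per stream processor per cycle.
    dp_divisor=5 assumes double precision runs at one fifth of the single rate.
    """
    # processors x MHz x flops/cycle gives MFLOPS; divide by 1000 for GFLOPS
    single = stream_processors * clock_mhz * flops_per_cycle / 1000.0
    double = single / dp_divisor
    return single, double


if __name__ == "__main__":
    boards = {
        "single-GPU Cypress (assumed 1600 SPs @ 850 MHz)": (1600, 850),
        "dual-GPU Cypress (assumed 3200 SPs @ 725 MHz)": (3200, 725),
    }
    for name, (sps, mhz) in boards.items():
        single, double = peak_gflops(sps, mhz)
        print(f"{name}: {single:.0f} GFLOPS single, {double:.0f} GFLOPS double")
```

Under these assumptions the estimate gives 2,720/544 GFLOPS (single/double) for the single-GPU board and 4,640/928 GFLOPS for the dual-GPU board, in line with the figures quoted. These are theoretical peaks only; sustained performance depends on the tuning effort Ritchie describes.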

* Brown Deer provides HPC/GPU optimization services to clients including Exxon Mobil and Shell. More from browndeertechnology.com.

Lexicon: GPU—graphics processing unit. GPGPU—general purpose GPU (for computing rather than graphics). CUDA—Nvidia’s GPU programming language. OpenCL—an embryonic cross-platform accelerator language.

© Oil IT Journal - all rights reserved.