Was the future random? Disk vs. tape-based data archival

Landmark’s Eleanor Jack and David Holmes review progress on random access archive media.

Back in 1999 at the SEG, Eleanor Jack presented a paper entitled ‘The future is random: Archiving seismic data to random access media*.’ In an ECIM workshop, Jack, now with Landmark, teamed with colleague David Holmes to examine the degree to which the 1999 ‘future’ predictions have come true, taking a fresh look at the data archival landscape. In 1999, random access archival media meant CD/DVD or optical disk. Magnetic (spinning) disk was considered too expensive for archival. In 2007, archival is still to tape—either IBM 3590E, 3592 or LTO. 3592 capacity is now 750GB going on a TB and LTO4 is currently 800GB. Will these be the solution in 2015? For Jack, this is far from clear, ‘tapes getting bigger without evolving, just like the dinosaurs!’

MAID

Holmes then surveyed the 2007 random access media landscape. Nearline storage now includes CDROM, DVD, memory stick and portable disk (now a major data delivery mechanism for Landmark—even for prestack seismic although it is unsuitable for archiving). Online options include SLED, ‘single large expensive disk’, JBOD ‘just a bunch of disks’ (for gamers) and RAID (but watch out for deletions, not good for the archive).

Cooling

For all continuously spinning disks, heating and cooling are major issues; in this context, tape is ‘way out ahead.’ One answer is a new technology called MAID, a ‘massive array of idle drives.’ A typical MAID vendor is Copan, whose 42 unit cabinet sports a 700TB capacity. Data is spread across the drives, only a few of which are active at any time; the rest are powered down between accesses. MAID has 6 times the data density of a 3592 tape library. Latency for the first GB is acceptable and much better than a robot. But the system has the same delete issues as RAID. Finally, hierarchical storage managers (HSM) bring organization to complex storage architectures. An HSM manages backup to disk in a transparent fashion.

Media neutral

Jack recommends media neutral formats for archival. SEGY can be used for tape or CD. The current seismic field data format is SEG D, whose Rev 2 has a byte stream version that can be stored on anything. Encapsulation is an important technology for gapped tape formats. Encapsulation options include RODE (although this is a complicated option) and the ATLAS TIF format, which replaces tape gaps with a digital marker. Current TIF implementations have a 2GB file size limit, but a TIF 8 version with a 24-byte address gives a file size limit of 8 PB!
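The idea of replacing tape gaps with a digital marker can be illustrated with a minimal Python sketch. The marker layout used here (a 4-byte block kind followed by a 4-byte payload length) and the names `encapsulate` and `decapsulate` are assumptions made for illustration only; this is not the TIF or RODE specification. The sketch shows only how record boundaries and file marks might survive a copy from tape to a plain byte stream.

```python
import struct
from typing import Iterable, List, Optional

# Hypothetical marker layout (NOT the TIF or RODE specification):
# a 4-byte block kind followed by a 4-byte payload length,
# both little-endian unsigned integers.
HEADER = struct.Struct("<II")
KIND_RECORD = 0    # a data record read from tape
KIND_FILEMARK = 1  # stands in for a tape file mark (no payload)


def encapsulate(blocks: Iterable[Optional[bytes]]) -> bytes:
    """Flatten tape blocks into one byte stream; None represents a file mark."""
    out = bytearray()
    for block in blocks:
        if block is None:
            out += HEADER.pack(KIND_FILEMARK, 0)
        else:
            out += HEADER.pack(KIND_RECORD, len(block))
            out += block
    return bytes(out)


def decapsulate(stream: bytes) -> List[Optional[bytes]]:
    """Recover the original sequence of records and file marks."""
    blocks: List[Optional[bytes]] = []
    pos = 0
    while pos < len(stream):
        if pos + HEADER.size > len(stream):
            raise ValueError(f"truncated marker at offset {pos}")
        kind, length = HEADER.unpack_from(stream, pos)
        pos += HEADER.size
        if kind == KIND_FILEMARK:
            blocks.append(None)
        elif kind == KIND_RECORD:
            if pos + length > len(stream):
                raise ValueError(f"truncated record at offset {pos}")
            blocks.append(stream[pos:pos + length])
            pos += length
        else:
            raise ValueError(f"unknown block kind {kind}")
    return blocks


# Example: two records, a file mark, one more record, then a double file mark.
tape = [b"header", b"trace-1", None, b"trace-2", None, None]
stream = encapsulate(tape)
restored = decapsulate(stream)
```

Because every boundary is carried inside the byte stream itself, the result can be written to disk, optical media or tape without loss of the original record structure, which is the point of a media neutral format. Note also that a 4-byte length field caps each record at 4 GB in this sketch; a real format's limits depend on the address widths it chooses.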

XAM

Returning to the archival issue, Holmes noted that current practices do not meet industry requirements. These are a) to remove the requirement for future remastering and b) to be able to throw away the original tapes. Holmes suggests looking again at the eXtensible Access Method (XAM) that is now supported by the storage industry.

Tape or not?

Jack concluded by noting that tape has disappeared from the entertainment industry and from home computing. ‘The present is random. It’s time for the seismic world to catch up.’ There ensued a healthy debate as to the merits or otherwise of tape as an archival medium. The proponents of tape advanced the cooling and energy issues as well as the fact that tape technology’s progress shows no sign of slowing in the immediate future.

* http://link.aip.org/link/?SEGEAB/18/683/1

© Oil IT Journal - all rights reserved.