PUT A VIPER
IN YOUR VAULT

SOHO backup gets supersized on an enterprise scale as new advancements in LTO technology open a golden opportunity to the small business masses.

   
  by Jack Fegreus      
     
 

Linear Tape Open (LTO) is a consortium spearheaded by HP, IBM, and Seagate that started out targeting the mid-to-high-end tape drive market. The best known of the consortium’s specifications homes in on a single-reel cartridge, similar to DLT, dubbed Ultrium. The idea was to concentrate on a tape format for media interchange and compete directly with the likes of SuperDLT from Quantum and Mammoth-2 from Exabyte.

Two years ago, the original Ultrium specification called for a 1/2-inch, linear, bidirectional format that uses a single-reel cartridge, which is slightly smaller (20 percent smaller, to be precise) than a DLT Tape IV cartridge. While the size difference may seem marginal, in a density-conscious computing environment it can matter a great deal in an automation scenario.

 
         
 
openBENCH LABS SCENARIO
UNDER EXAMINATION
Linear Tape Open Ultrium tape drive performance

WHAT WE TESTED

Seagate Viper 200  
http://www.seagate.com


NetVault 6.03 bundle
http://www.bakbone.com

HOW WE TESTED
Dell PowerEdge 2400 Server
http://www.dell.com

QLogic QLA12160 Ultra160 SCSI HBA
QLogic QLA2200 Fibre Channel HBA
QLogic SANbox-8 Fibre Channel switch
http://www.qlogic.com

Imperial Technologies MegaRam-2000
http://www.imperialtech.com

Red Hat Linux v7.3
http://www.redhat.com

openBench Labs oblTape v1.0 benchmark

KEY FINDINGS
  No problems were encountered with media interchange among cartridges from HP, IBM, and Seagate on the Seagate Viper 200.
  The theoretical performance envelope of the Seagate Viper LTO drive was 40% greater than that of an SDLT 220.
  Compared to an Exabyte Mammoth-2, the theoretical performance envelope of the Seagate Viper LTO was 40% greater with no compression, but only 10% greater with compressible data.
  Using NetVault, backup performance of the Seagate Viper showed a 15% edge in throughput over an Exabyte Mammoth-2 with our 5GB test data set.

 

More specifically, that original Ultrium specification called for 8 read/write channels and a native cartridge capacity of 100GB. The specification did not set standards for reliability, form factor, power consumption, or performance. All of those important aspects were left open to promote healthy competition among vendors. It also left the door open for lots of speculation over tape cartridge incompatibilities.

Nonetheless, at the time of the launch, the future directions of Ultrium were rigorously defined for the next 7 years. In particular, the number of tape tracks is about to increase from 384 to 580. This will severely aggravate the problem of signal interference from cross talk on the densely packed Ultrium tape. To dodge this issue, Partial Response Maximum Likelihood (PRML) technology, which originated at NASA to read weak signals from satellites deep in space and is already being used on the current Mammoth-2 and SDLT drives, will be introduced in the second generation of LTO Ultrium drives.

With this wave of new technology about to break over the high end of the tape market and its large automated libraries, it is the perfect time for smaller departmental and well-heeled SOHO sites to begin taking advantage of the current generation of LTO technology, as the price of current drives drops to keep the second generation of LTO technology competitive from a price-performance perspective.

In this light, openBench Labs assessed a standalone, Ultra2 SCSI, Seagate LTO drive dubbed the Seagate Viper 200. In a very aggressive move, Seagate has bundled this tape drive with a special version of the NetVault backup software from BakBone Software for Linux, Unix, or Windows 2000. Based on version 6.03 of NetVault, this is a full-function version of the backup package, for exclusive use with the standalone Seagate Viper.

In essence, this bundling puts two of the preeminent enterprise backup hardware and software technologies, both renowned in tape automation scenarios, into the hands of departmental and SOHO users. When it comes to backing up terabytes of data overnight, sophisticated robotics controlling multiple drives is de rigueur.

 
     
 

But if you are looking at only tens or just a few hundred gigabytes of data, sophisticated robotics and multiple drives are the last things you need. What’s necessary is a super fast tape drive that can stream real-world data at 20MB/sec onto a cartridge that can hold a good 150GB of compressed data. And when you can do all that, plus back up and restore files with world-class software, all for less than $4,500, that’s an option that few would have thought possible.

The hardware that makes this possible is Seagate’s LTO Viper 200 drive. The “200” represents the mythical 200GB of storage capacity possible with “nominal” 2-to-1 compression of data. In the real world where incompressible tar, zip, and JPEG files abound, count on an overall compression ratio of 1.5-to-1 and hope for 1.8-to-1. Nonetheless, LTO drives like the Viper can take the sting out of these problem files with a bit of unique legerdemain.

As part of their repertoire, LTO drives, like HP’s SureStore Ultrium 230e and Seagate’s Viper 200, add an automatic pass-thru mode to their compression circuitry. When the circuitry detects a file that appears to be already compressed, it simply sends the raw data straight to tape. Assuming some reasonable accuracy, this will sacrifice a little capacity for improved throughput as the nightmares of stalled compression queues and multiple tape repositions are avoided.
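The pass-thru idea can be sketched in software. The following is only an illustration, not a description of the Viper's actual circuitry: it assumes a block-by-block decision and uses Python's zlib as a stand-in compressor, and the function name encode_block is hypothetical. Each block is compressed, and if the result is not smaller than the input, the raw block is kept instead.

```python
import os
import zlib
from typing import Tuple


def encode_block(block: bytes) -> Tuple[str, bytes]:
    """Return ("compressed", data) when compression helps, else ("raw", block).

    Hypothetical helper: a real drive makes this choice in hardware on the
    incoming data stream, not with zlib.
    """
    packed = zlib.compress(block)
    if len(packed) < len(block):
        return "compressed", packed
    # Pass-thru: compression would only add overhead, so keep the raw bytes.
    return "raw", block


# Repetitive data stands in for ordinary, compressible files.
text_block = b"backup " * 4096
# Random bytes stand in for already-compressed payloads such as JPEG or zip.
random_block = os.urandom(32 * 1024)

mode_text, out_text = encode_block(text_block)
mode_rand, out_rand = encode_block(random_block)

print(mode_text, len(text_block), "->", len(out_text))
print(mode_rand, len(random_block), "->", len(out_rand))
```

In this sketch the random block is written unchanged, which is the behavior the pass-thru mode is meant to deliver: no capacity gain, but no time wasted either.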

This is particularly important for LTO drives, which have an architecture that is highly dependent upon the continuous streaming of tape. The Ultrium format calls for recording eight data tracks simultaneously. While data is being written, separate read elements are used to verify that the correct data has been recorded and can be recovered on each individual track. By way of comparison, Exabyte’s Mammoth-2 writes 4 tracks of data while simultaneously reading the previous 4 tracks of data.

 
         

 

Using oblTape, our openBench Labs benchmark, we pegged base throughput without hardware compression at 14.2MB per second. That’s about 30% faster than the baseline for Mammoth-2. What’s more, we got the same results when we interchanged LTO media from Seagate, IBM and HP. Rumors aside, we had no problems whatsoever with media interchangeability.

Underneath the hood, oblTape allocates a large block of memory from which it streams data to the device in block sizes of 2^n KB, where n ranges from 0 to 8. By streaming data directly from memory, the benchmark eliminates bus bandwidth contention with other devices. In addition, the oblTape benchmark generates two distinctly different data streams: purely random data and data that falls into a preset frequency pattern that we devised and calibrated to produce a compression ratio on the order of 1.9-to-2.1.
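The general shape of such a benchmark can be sketched as follows. This is not the oblTape source: the 16-symbol alphabet, the 256KB buffer size, the function names, and the use of zlib to estimate compressibility are all assumptions made for illustration, and nothing is written to a tape device. The sketch builds the nine block sizes, fills one buffer with random data and one with patterned data, and slices a buffer into blocks of a chosen size.

```python
import os
import random
import zlib

# Block sizes of 2^n KB for n = 0..8, i.e. 1KB through 256KB.
BLOCK_SIZES_KB = [2 ** n for n in range(9)]


def random_buffer(size: int) -> bytes:
    """Purely random bytes: the incompressible, worst-case stream."""
    return os.urandom(size)


def patterned_buffer(size: int, seed: int = 1) -> bytes:
    """Bytes drawn from a small alphabet so that they compress.

    Assumption: 16 equally likely symbols carry about 4 bits of information
    per byte, so a good compressor should approach 2-to-1. The real benchmark
    pattern is not published here.
    """
    rng = random.Random(seed)
    alphabet = bytes(range(16))
    return bytes(rng.choices(alphabet, k=size))


def blocks(buffer: bytes, block_kb: int):
    """Yield successive blocks of block_kb kilobytes from an in-memory buffer."""
    step = block_kb * 1024
    for offset in range(0, len(buffer), step):
        yield buffer[offset:offset + step]


def zlib_ratio(buffer: bytes) -> float:
    """Software estimate of compressibility (original size / compressed size)."""
    return len(buffer) / len(zlib.compress(buffer))


BUFFER_SIZE = 256 * 1024
patterned = patterned_buffer(BUFFER_SIZE)
incompressible = random_buffer(BUFFER_SIZE)

for kb in BLOCK_SIZES_KB:
    count = sum(1 for _ in blocks(patterned, kb))
    print(f"{kb:>3}KB blocks: {count} writes per buffer")

print(f"patterned ratio: {zlib_ratio(patterned):.2f}")
print(f"random ratio:    {zlib_ratio(incompressible):.2f}")
```

A software compressor will not match the ratio a given drive achieves on the same pattern, which is why a calibration run against a reference drive is still needed.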

 
Our oblTape benchmark helps define a performance envelope for a tape drive. We use the benchmark to mark three critical performance points. The first point is the native throughput rate of the drive with no boost from data compression firmware. Then, using a stream of patterned data that was calibrated to produce a 1.9-to-1 compression ratio on a DLT 7000 drive, we determine an optimistic practical upper limit. Finally, we determine a pessimistic performance level by streaming random incompressible data to the drive.
     
 

When we ran the compressible data stream with hardware compression, write throughput on the Seagate Viper 200 rose to 28.7MB per second, which is perfectly in tune with the design of the benchmark. Nonetheless, the HP Ultrium 230e soared to 32MB per second, which represents a compression factor of 2.3-to-1. Even more impressive, our tests of Exabyte’s Mammoth-2 provided an average compression ratio of 2.5-to-1 on our patterned data stream. Both of these drives utilize a new Adaptive Lossless Data Compression (ALDC) algorithm that purports to provide a 2.5-to-1 compression ratio across multiple data types.

Even more interesting were the results of openBench Labs' worst-case tape scenario. When purely random data is sent to a tape drive while compression is on, the drive typically attempts to compress this data to no avail. As a result, the drive wastes its own embedded CPU cycles trying to compress incompressible data, the filling of data buffers stalls, and throughput degrades. This synthetic test corresponds closely to what happens when backing up highly compressed files such as the JPEG and Flash image files that proliferate on so many web sites.

In normal testing, this loss, measured as the difference in throughput when writing incompressible data with compression turned on versus turned off, is typically about 10%. With both the Seagate and HP Ultrium drives, performance degradation fell to little more than 0.5%. The oblTape benchmark therefore sets the boundary conditions for any tape subsystem. In particular, the expected performance of the Seagate Viper 200 was pegged to be between 29MB per second with highly compressible data and just under 14MB per second for incompressible data.
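The two derived figures used here are simple ratios, sketched below. The function names are illustrative, the Viper throughput numbers are the ones reported in this article, and the 10.0 and 9.0MB per second pair is a made-up example of a drive showing the typical 10% penalty.

```python
def implied_ratio(compressed_mb_s: float, native_mb_s: float) -> float:
    """Effective compression ratio implied by two throughput measurements."""
    return compressed_mb_s / native_mb_s


def compression_penalty(rate_off_mb_s: float, rate_on_mb_s: float) -> float:
    """Fraction of throughput lost on incompressible data when compression is on."""
    return (rate_off_mb_s - rate_on_mb_s) / rate_off_mb_s


# Seagate Viper 200 figures from the text: 14.2MB/s native, 28.7MB/s patterned.
viper_ratio = implied_ratio(28.7, 14.2)

# Hypothetical drive that slows from 10.0 to 9.0MB/s on random data.
typical_penalty = compression_penalty(10.0, 9.0)

print(f"Viper implied ratio: {viper_ratio:.2f}-to-1")
print(f"Typical penalty:     {typical_penalty:.0%}")
```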

 
         

 

We finished our testing by backing up a 5GB data set containing 30,000 files using the bundled version of NetVault that is packaged with the Seagate Viper 200. Using Pkzip, we were able to compress these files by a factor of 2; however, this test took approximately an hour on an otherwise quiet system with a 600MHz CPU. The compression circuitry on the Seagate Viper compressed the data by a factor of 1.4-to-1. This was done while backing up the files with NetVault on a Dell PowerEdge 2400 server running Red Hat Linux 7.3.

Getting all of the technical pyrotechnics of NetVault into play turned out to be quite easy thanks to two remarkably simple X-Windows GUIs. There is the near-trivial nvconfigurator, which takes care of the standard configuration data, and the main nvgui, which is delightful in its logic and simplicity. Even the most recalcitrant "we-don’t-need-no-stinkin’-manuals" operations staff will have no problem with nvgui.

 
The NetVault GUI is common to all software platforms: Linux, Unix, and Windows. It represents one of the rare instances when a GUI is extraordinarily intuitive and quite powerful.
     
 

The nvgui interface comes up with 8 central options for backup, restore, client management, device management, status monitoring, media management, job management, and log viewing. One of the real strengths of the standard NetVault package is a plethora of modules representing the gamut of tape drives and libraries in play at IT sites. The bundled version has only one of these modules: a standalone Seagate Viper.

To ensure that our performance tests would stress the NetVault software and the Viper drive without introducing extraneous I/O concerns, we first needed to choose a disk resource that could handle a sustained I/O rate of 30MB per second (our 2-to-1 average compression goal) and a peak load of 60MB per second (a 4-to-1 compression ratio, which is not uncommon for sparse database files).

 
         

 

Our ideal candidate for this was the Imperial Technologies MegaRam-2000 connected to our SAN. In both read and write tests, we were able to sustain 100MB per second throughput rates, which would be essential for measuring the upper performance capabilities of backup and restore operations with NetVault and the Viper.

Finally, we needed to make one performance adjustment to the default buffer size that NetVault sets up for any tape drive. That default is 257KB (256 + 1). With 512MB of RAM in our server (you can never have enough) we set our drive’s buffer to 30,721KB, or 30MB plus 1KB.

That done, we fired off a series of backups of our 5GB test collection resident on the MegaRam-2000. The average throughput for NetVault in these tests was 20.1MB per second. That put average compression on our data, which is populated by a lot of compression-unfriendly image files and C++ object and executable files, at a little better than 1.4-to-1. As a result, our upper bound on cartridge capacity for this particular data was about 140GB.
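The arithmetic behind those two figures can be checked with a short sketch. It rests on one assumption: that the drive, not the disk or the software, was the bottleneck, so the ratio of backup throughput to the 14.2MB per second native rate approximates the compression achieved. The constant names are illustrative.

```python
NATIVE_CAPACITY_GB = 100.0   # native Ultrium cartridge capacity
NATIVE_RATE_MB_S = 14.2      # oblTape throughput without compression
BACKUP_RATE_MB_S = 20.1      # average NetVault backup throughput
ENVELOPE_MB_S = (14.0, 29.0) # approximate bounds set by oblTape

# Assumes the tape drive is the bottleneck, so extra throughput
# beyond the native rate comes entirely from compression.
ratio = BACKUP_RATE_MB_S / NATIVE_RATE_MB_S
capacity_gb = NATIVE_CAPACITY_GB * ratio
inside_envelope = ENVELOPE_MB_S[0] <= BACKUP_RATE_MB_S <= ENVELOPE_MB_S[1]

print(f"implied compression: {ratio:.2f}-to-1")
print(f"estimated capacity:  {capacity_gb:.0f}GB")
print(f"inside envelope:     {inside_envelope}")
```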

 
Running a real-world backup with a 5GB collection of Open magazine data files, well populated with images along with C++ and Java code projects, our backup throughput fell nicely into the center of our performance envelope as defined by oblTape.
     
 

Our final test was another tour de force for NetVault. When it comes to restoring data back to disk, openBench Labs has yet to find any package that can stream data from tape to disk at speeds comparable to NetVault. Once again, NetVault played the Viper like a Stradivarius as it restored our files at a blistering 17.4MB per second.