SCALABLY SUPER
  Backup and recovery are critical operations for any datacenter and too often the most ignored. As client/server, DBMS-based backup programs rapidly increase in complexity, their chances of running in an HA cluster decrease just as fast. Overland Data's solution: configure a modular SuperDLT tape library, which behaves like a 'cluster'.
         
  by Jack Fegreus      
 
 

Today multiple choices in both linear and helical scan formats offer users more freedom than ever to choose the tape option that best meets their requirements. For the current generation of enterprise-class tape systems, the battle lines are drawn among SuperDLT, backed by Quantum; LTO, backed by HP, IBM, and Seagate; and Mammoth-2, backed by Exabyte. All three technologies offer extraordinary throughput speed and cartridge capacity. At the same time, the explosion in the growth of storage has driven many mid-range enterprises, and even large departments that only a few years ago would never have considered tape automation, into the market for a tape library. International Data Corp. (IDC) projects that tape automation sales will grow at an annual rate of 25% and create a $5-billion market by 2003.

 
   
 
OPENBENCH LABS SCENARIO
UNDER EXAMINATION
• SuperDLT tape drives
• Modular automated tape library with hardware  fail-over capabilities
• MySQL-based backup software

WHAT WE TESTED
(2) Overland Data Neo Series LXN2000 automated tape libraries
OEMed by Compaq as the MSL5620SL
www.overland.com
www.compaq.com

(2) Quantum SuperDLT tape drives
www.quantum.com

BRU Professional Archive Management System
www.bru.com

HOW WE TESTED
Red Hat Linux v7.1
www.redhat.com

QLogic QLA12160 HBA
www.qlogic.com

obltape v1.0 benchmark

KEY FINDINGS
• BRU-Pro easily managed the Overland library and SuperDLT drives.
• BRU-Pro throughput performance is currently limited by a hard-coded data transfer buffer size of 32 KB.

 
 

Quantum’s SuperDLT and the Ultrium line of LTO drives from HP, IBM, and Seagate are linear tape technologies, which differ fundamentally from helical scan, currently represented by Exabyte’s 8mm Mammoth-2 tape drive line.

To drive throughput, linear tape drives move tape rapidly past stationary heads at a speed of up to 160 inches per second (ips). Helical scan drives, on the other hand, rely on a slow-moving tape (1.8 ips) crossing a fast-spinning set of heads mounted on a drum or scanner. The net result is a relative head-to-tape speed for the Mammoth-2 of 547 ips.

In linear drives, the need to move the tape rapidly, then stop and reverse direction in order to record on the next set of tracks, places a high degree of stress on the tape media. This stress makes the tape more susceptible to wear and damage. In current DLT drives, there can be as much as 133 grams of tension on the tape, more than an order of magnitude greater than the 12 grams exerted by the Mammoth-2’s servo-controlled, direct-drive dual-reel mechanism.

This fundamental difference naturally results in different operating characteristics. As the name implies, SuperDLT and LTO write data in long parallel tracks that run the length of the tape. Filling a tape with data requires numerous passes as the tracks wind in a serpentine fashion. Helical scan technology, however, writes data in short angled tracks that run across the width of the tape. Since the axis of rotation is not orthogonal to the tape’s line of motion, the tracks are angled across the width of the tape and all of the tape is utilized on just one pass.

These differences in the way data is laid out on the tape raise interesting implications for performance. The universal truth for all tape drives is that performance is utterly dependent upon the ability to keep the tape streaming across the head. Anything that interrupts the flow of data will significantly impact overall performance as the system is forced to stop and reposition the tape. When such an event occurs, the physical rather than relative speed at which the tape is moving past the head will be a more determining factor in how long it takes to reposition the tape and resume writing data.

 
 
 

To pack 110 GB of uncompressed data on a single cartridge (the previous-generation DLT8000 has a native capacity of only 40 GB), SuperDLT tapes are formatted with 448 data tracks, with each track written at a density of 133 Kbits per inch. That's an areal density of 166 Mbits per square inch. With the tape streaming by the heads at 160 ips, Quantum faced serious technological hurdles to ensure a low bit error rate as well as the ability to scale the areal density over the coming years by upwards of two orders of magnitude. To this end, Quantum pioneered a radical new technology for the SuperDLT drive dubbed Laser Guided Magnetic Recording (LGMR). For the first time in tape history, Quantum joined laser technology with magnetic technology in a single tape system.

 
   
 

LGMR ensures higher cartridge capacities by servoing from optical targets on the media’s back side, which allows 100% of the magnetic surface, as well as 100% of the magnetic heads, to be used for recording data tracks. Traditional magnetic tape designs use 10-20% of the recording surface to store the servo track information. In contrast, Quantum laser-etches optically-read servo tracks on a specially formulated back coating of the media and utilizes a three-beam hologram configuration for exact tracking.

What's more, the laser servo tracks are indelible, which means the servo information cannot be magnetically erased. The indelible servo information removes the need to magnetically pre-format tapes. It also makes it possible to bulk erase SuperDLT cartridges. Traditional media must have the servo tracks re-recorded after a bulk erase, a process that is highly susceptible to environmental variables and therefore highly discouraged.

To handle the higher areal bit density of SuperDLT tape, Quantum introduced Magneto-Resistive Cluster (MRC) read/write heads, which deliver higher data transfer rates and are less susceptible to environmental conditions such as temperature and humidity than traditional heads of equal size.

In addition, the Super DLTtape drive, like the Ultrium LTO and the Mammoth-2, implements advanced Partial Response Maximum Likelihood (PRML) channel technology, which is used by many HDD manufacturers. PRML technology originated at NASA to read weak signals from satellites deep in space. In essence, the channel compares the measured signal from the tape with a known waveform in order to interpret the data. PRML attempts to correctly interpret even small changes in the analog signal, whereas peak detection relies on fixed thresholds. As a result, a drive using PRML can correctly decode weaker signals and read/write data at a higher bit density.

   
 
  Laser Guided Magnetic Recording combines high-density magnetic read/write data recording with an optically assisted servo system. As the media moves through the Pivoting Optical Servo (POS), a laser follows along on the backside of the media tracking embedded optical targets. The POS assembly pivots around a single mounting point to keep the magnetic read/write heads aligned and has a much lower sensitivity to outside influences.
   
 

The net result is a very fast tape drive, which is highly dependent on the ability to stream very large blocks of data to keep it from pausing and repositioning the tape. We first calibrated the SDLT drives using the OpenBench Labs benchmark, obltape v1.0, at block sizes of 128 KB.

Our tape benchmark generates two very different types of data stream: purely random data and data that falls into a preset frequency pattern. The patterned data stream was originally devised and calibrated using Exabyte Mammoth-1 and Quantum DLT 7000 tape drives, which implemented the Digital Lempel-Ziv (DLZ) compression algorithm in hardware. This algorithm purportedly provided a 2-to-1 compression ratio on normal data, and so we devised a means of generating patterned data that consistently produced a compression ratio on the order of 1.9-to-2.1 on those devices.

The obltape benchmark first allocates a large block of memory from which it then streams either patterned or random data to the device. By streaming data directly from memory, the benchmark eliminates bus bandwidth contention with other devices. The data can be streamed in block sizes of 2^n KB, where n ranges from 0 to 8.
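
The approach described above can be sketched in a few lines of Python. This is a minimal illustration only, not the obltape source: the function names, the repeating byte pattern, and the use of an in-memory sink in place of a tape device node are all assumptions made for the example.

```python
import io
import os
import time


def make_buffer(size_bytes, patterned):
    """Return an in-memory buffer of random or repeating-pattern bytes."""
    if not patterned:
        return os.urandom(size_bytes)
    pattern = b"OpenBench Labs patterned data "  # hypothetical pattern
    repeats = size_bytes // len(pattern) + 1
    return (pattern * repeats)[:size_bytes]


def block_sizes_kb():
    """Block sizes of 2**n KB for n = 0..8, as described in the text."""
    return [2 ** n for n in range(9)]


def stream(device, buffer, block_kb):
    """Write the whole buffer to `device` in fixed-size blocks.

    Returns (bytes_written, elapsed_seconds).
    """
    block = block_kb * 1024
    view = memoryview(buffer)  # slice without copying the source data
    start = time.perf_counter()
    written = 0
    for offset in range(0, len(view), block):
        written += device.write(view[offset:offset + block])
    elapsed = time.perf_counter() - start
    return written, elapsed


if __name__ == "__main__":
    data = make_buffer(4 * 1024 * 1024, patterned=True)
    for kb in block_sizes_kb():
        sink = io.BytesIO()  # stand-in for a tape device node (assumption)
        written, elapsed = stream(sink, data, kb)
        rate = written / (1024 * 1024) / elapsed if elapsed > 0 else float("inf")
        print(f"{kb:>3} KB blocks: {rate:,.1f} MB/s")
```

Because the data is generated once and held in memory, the timed loop measures only the writes themselves. Against a real drive, the sink would be replaced by an opened device file, and the reported rates would then reflect the drive rather than memory bandwidth.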

 


 
 
 
 

Using a QLogic Ultra160 HBA to connect the library via LVD SCSI, we pegged base uncompressed throughput at 128-KB blocks to be 10.7MB per second. This level of performance is actually slightly higher than the drive's rather conservative 10MB per second rating. That puts the SDLT in line with the Mammoth-2, but trailing the Ultrium LTO drive from HP, which delivered a native throughput of 13.9MB per second.

When we ran the compressible data stream with hardware compression enabled, throughput on the SuperDLT drive, which implements a DLZ algorithm, nearly doubled as expected to 20.4MB per second. In contrast, the Ultrium LTO and Mammoth-2 drives utilize a new Adaptive Lossless Data Compression (ALDC) algorithm that purports to provide an average compression ratio of 2.5-to-1 across multiple data types. In our benchmark, data compression on the HP Ultrium LTO drive was pegged at 2.3-to-1. In addition, the HP Ultrium implements what HP calls “smart data compression.” HP's compression circuitry has a “pass-thru mode” which switches off compression for incompressible data, which is typical of JPEG and ZIP files. According to HP, this “pass-thru mode” can be upwards of 10% more efficient.
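
As a quick check on the SuperDLT figures, the effective compression ratio can be estimated by dividing compressed throughput by native throughput. The sketch below assumes that the tape-side write rate is the bottleneck in both runs, so host-side throughput scales directly with the compression ratio; the function name is illustrative.

```python
NATIVE_MB_S = 10.7      # uncompressed SuperDLT throughput measured at 128-KB blocks
COMPRESSED_MB_S = 20.4  # throughput with the patterned stream and compression on


def effective_ratio(compressed_mb_s, native_mb_s):
    """Host throughput gain, taken here as a proxy for compression ratio."""
    return compressed_mb_s / native_mb_s


ratio = effective_ratio(COMPRESSED_MB_S, NATIVE_MB_S)
print(f"effective compression ratio: {ratio:.2f}-to-1")
```

The result falls inside the 1.9-to-2.1 band that the patterned data stream was calibrated to produce on DLZ hardware.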

 
   
 

This is explicitly demonstrated in the OpenBench Labs worst-case tape scenario. When purely random data is sent to a drive while hardware compression is on, the drive attempts to compress the data, wastes embedded CPU cycles, and throughput degrades to less than the native streaming transfer rate as buffer management becomes problematic. As a result of its pass-thru mode implementation, the HP Ultrium LTO drive showed the least variance in performance—dropping from 13.9 to 13.7MB per second when purely random data was sent to the drive with hardware compression turned on.
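
The reason random data is the worst case can be illustrated in software. The sketch below uses Python's zlib (DEFLATE) purely as a stand-in; it is not the DLZ or ALDC algorithm the drives implement in hardware, and the repeating byte pattern is a hypothetical example rather than the obltape pattern.

```python
import os
import zlib


def compression_ratio(data):
    """Original size divided by compressed size, using zlib (DEFLATE)."""
    return len(data) / len(zlib.compress(data))


size = 128 * 1024  # one 128-KB block, matching the calibration block size
random_block = os.urandom(size)
patterned_block = (b"backup " * (size // 7 + 1))[:size]  # hypothetical pattern

print(f"patterned: {compression_ratio(patterned_block):.1f}-to-1")
print(f"random:    {compression_ratio(random_block):.3f}-to-1")
```

The random block comes out slightly larger than it went in, because the compressor still adds its own framing while finding nothing to remove. A pass-thru mode avoids spending effort on exactly this kind of input.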

As data becomes more mission critical, and downtime becomes more costly, IT site managers are requiring high availability features on tape automation devices. So while the SuperDLT drives provide a technology-rich basis for a tape subsystem, the real magic of the Overland Data Neo Series lies in the automation robotics. In fact, Overland is neutral when it comes to drive technology and provides robotics for LTO and Sony AIT-2, along with other drive technologies.

Each 5U-high Neo Series library module supports up to two drives and 26 media slots, including a mail slot. Along with power supplies, controllers and robotics for high availability, the library uses "hot pluggable" drive carriers, which allow drive replacement without interrupting backup and restore functions. These hot pluggable drive trays also provide an easy upgrade path towards future SuperDLT drive technologies.

 
OpenBench Labs linked two Neo Series modules as a single virtual library. Library monitoring, control and fail-over management can be done from either the front touch screen or a web applet.
 
 

Nonetheless, what separates the Neo Series from the rest of the pack is the ability to scale with multiple units. Up to 8 Neo Series libraries can be linked together into a single logical "virtual library". As a result, a multimodule virtual library can be configured in an industry-standard rack with up to 16 drives and 208 media slots. We created a more modest 2-unit virtual library in our test scenario with each module hosting one SDLT drive.
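
The scaling arithmetic is simple enough to tabulate. The sketch below uses the per-module limits quoted above and the 110-GB native SuperDLT cartridge capacity; the capacity figure is an upper bound that assumes every slot, including the mail slot, holds a full data cartridge, and the function name is illustrative.

```python
DRIVES_PER_MODULE = 2          # maximum drives per module
SLOTS_PER_MODULE = 26          # media slots per module, including the mail slot
MAX_MODULES = 8                # maximum modules in one virtual library
NATIVE_GB_PER_CARTRIDGE = 110  # SuperDLT native (uncompressed) capacity


def virtual_library_limits(modules):
    """Upper bounds for a virtual library built from `modules` units."""
    if not 1 <= modules <= MAX_MODULES:
        raise ValueError(f"modules must be between 1 and {MAX_MODULES}")
    slots = modules * SLOTS_PER_MODULE
    return {
        "modules": modules,
        "max_drives": modules * DRIVES_PER_MODULE,
        "slots": slots,
        "max_native_gb": slots * NATIVE_GB_PER_CARTRIDGE,
    }


for count in (2, MAX_MODULES):
    print(virtual_library_limits(count))
```

For the full 8-module configuration this reproduces the 16 drives and 208 slots cited above.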

Modules are linked using the XpressChannel, which adds an elevator mechanism for tape cartridge movement. This unique mechanism moves tapes efficiently from module to module, allowing any tape cartridge to be moved to any available drive or media slot in the system. The XpressChannel is composed of a 10U motor drive assembly for the first two library modules plus extensions for each additional module installed in the rack.

Without doubt, the most significant benefit of Overland Data's architecture is the ability of the robotics to continue to operate during hardware fault conditions. Borrowing from the construct of a computer cluster, one of the Neo Series modules is configured as the master controller in a multi-module configuration. If the master controller module fails, a fail-over mode allows another module, acting as the standby master controller, to take over. Fail-over management can be instituted either locally through the library's front panel touch screen or via the library's web interface.

We set up our 2-module Neo Series library via the front panel touch screen on our master module and then monitored the system over the web. To test the library's fail-over mechanism, we powered down the master module and observed as the standby master automatically took ownership of the web interface. In this process, the SDLT drive in the primary master library module became 'grayed out' and the Status Summary reported that the virtual library could not communicate with that drive. In addition, the tape slots in that module, which were now inaccessible, also disappeared from view.
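
The behavior we observed can be modeled with a small toy simulation. This is only a sketch of the master/standby idea under simple assumptions; it is not Overland's firmware logic, and the class names, module names, and status fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    drives: int
    slots: int
    powered: bool = True


@dataclass
class VirtualLibrary:
    modules: list
    master: int = 0  # index of the current master controller module

    def power_down(self, name):
        """Power off a module; promote a standby if it was the master."""
        for index, module in enumerate(self.modules):
            if module.name == name:
                module.powered = False
                if index == self.master:
                    self._promote_standby()
                return
        raise KeyError(name)

    def _promote_standby(self):
        """Hand the master role to the first module that is still powered."""
        for index, module in enumerate(self.modules):
            if module.powered:
                self.master = index
                return
        raise RuntimeError("no powered module left to act as master")

    def status(self):
        """Summarize what remains reachable, as a status page might."""
        live = [m for m in self.modules if m.powered]
        return {
            "master": self.modules[self.master].name,
            "reachable_drives": sum(m.drives for m in live),
            "visible_slots": sum(m.slots for m in live),
        }


library = VirtualLibrary([Module("module-1", 1, 26), Module("module-2", 1, 26)])
library.power_down("module-1")  # fail the original master
print(library.status())
```

After the master is powered down, the model reports the second module as master, with only that module's drive and slots still available, which mirrors what the web interface showed in our test.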

This same information was delivered via the SCSI interface from the library management module to the library management software that we were running on our host server. Here we need to note that we encountered a surprising degree of reticence on the part of enterprise-class backup software ISVs when we discussed running with a SuperDLT drive.


 

To test the Overland Neo Series in a real-world backup scenario, we utilized a backup package familiar to many Linux users, BRU—we actually used the new BRU Professional Archive Management System (BRU-Pro), which is built on MySQL. Despite all of the rumors about the demise of BRU, it is alive and well and living in Arizona with the Tolis Group, which was created as a management buyout when the parent company folded.

BRU-Pro integrates the original BRU utility with a MySQL database and an easy-to-use GUI in order to eliminate the need for systems administrators to know complicated command line sequences and data flags. The design goal of BRU-Pro was to allow a systems administrator with no special skills above a general working knowledge of the native operating system to perform backup and restore tasks on a variety of machines [clients] from a single workstation [control console] writing to a centralized server [tape server].

BRU-Pro will automatically scan all of the SCSI devices on the tape server system and list the appropriate libraries and tape drives. In essence, BRU-Pro relies on the OS to discover and categorize the devices. If the OS recognizes the library, then it should be listed in BRU-Pro, which can invoke a screen that contains a full description of the library and a listing of all of the slots that are present in the library and their current tape status. So with no effort on our part, BRU-Pro came up recognizing both the Overland virtual library and the SuperDLT drives without a hitch.

To help in cataloging backup archives, BRU-Pro enables a systems administrator to group, name, and assign ownership of library slots. In effect, a large library can be logically partitioned into several smaller virtual libraries. The destination can also be configured to use specific tapes from specific slots.

 

With the new GUI and underlying MySQL database, a systems administrator can now easily configure a backup task with a few clicks of a mouse. Files can be selected by choosing entire client machines or directories, or by expanding the directory tree and picking specific files.

To ensure the integrity of a backup archive, BRU defaults to an "Automatically verify" option that triggers the execution of BRU-Pro's checksum verification program immediately at the completion of a backup. A systems administrator can also scan an archive manually at any time. In addition, a completion report can be automatically sent to a specific email address once a backup has finished; you can never overestimate the value of keeping IT auditors happy.

The enhanced archive cataloging that MySQL brings to BRU is most evident in critical recovery situations. Based on the ID of the user logged into BRU, the Restore menu shows a listing of all of the tape servers and client workgroups that have backup archives that the current user can access. Selecting one of these archives then displays a directory tree from which the systems administrator can choose directories or specific files to restore.


 

The final phase of our testing involved a series of backups and restores performed with BRU-Pro. It was in this phase that one of the overt legacy limitations currently remaining in the BRU backup engine became clear.

To simplify device configuration in the past, BRU was architected with a hard-coded 32-KB buffer for data transfers. While this was quite adequate for tape drives two years ago, today's high-speed drives require data buffers on the order of 64 KB or higher. As the results of our obltape benchmark indicate, with a 32-KB buffer, top-end performance results suffer across the board.
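
One way to see why a small buffer hurts is to count how many separate transfers are needed to move the same amount of data. The sketch below is simple arithmetic, on the assumption that each transfer carries some fixed overhead; it does not describe BRU's internals, and the function name is illustrative.

```python
def transfers_needed(total_bytes, buffer_kb):
    """Number of buffer-sized transfers required to move total_bytes."""
    buffer_bytes = buffer_kb * 1024
    return -(-total_bytes // buffer_bytes)  # ceiling division


ONE_GIB = 1024 ** 3
for kb in (32, 64, 128):
    count = transfers_needed(ONE_GIB, kb)
    print(f"{kb:>3}-KB buffer: {count:,} transfers per GiB")
```

A 32-KB buffer needs four times as many transfers per gigabyte as the 128-KB blocks used to calibrate the drives, so any fixed per-transfer cost is paid four times as often.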

As a result, BRU was able to keep these drives only as busy as the limitations of a 32-KB buffer would permit. Nonetheless, backup throughput on the order of 15MB per second is in no way shabby. What's more, the utter simplicity of configuration and ease of use demonstrated by BRU easily make up for the temporary performance penalty, which, like certain legacy file-size restrictions, is being rapidly eliminated by the Tolis Group.
