With the increasing turbulence in the business environment, the forces driving IT activity and reshaping the data center have intensified. Across the corporate landscape, the mandate for peak operations efficiency now has IT redoubling its focus on slowing—if not reversing—rising labor costs by alleviating management complexity. It also brings into the corporate limelight the problem of rising power consumption costs in the data center.
openBench Labs Scenario

Under examination: near-linear IOPS scalability for Windows Server 2008

- The QLogic 2500 Series HBA supported IOPS loads of 200,000 with 4KB I/O and full-duplex throughput at wire speed.
- A single Hyper-V VM was benchmarked at 145,000 IOPS using 4KB random read requests.
- With four VMs simultaneously reading and writing data, Hyper-V sustained 159,000 4KB I/O requests per second, which represented a throughput rate 15% greater than the throughput sustainable with a 4Gbps HBA.
For today's savvy CIO, the solution is not that difficult. Fundamental technology trends (Moore's and Shugart's laws) and sophisticated IT strategies (virtualization and IT service management) are converging to transform the data center. That transformation addresses the problem by emphasizing the management and automation of services within a Virtual Operating Environment (VOE). The challenge for IT is to provide a sufficiently robust physical infrastructure that meets a broader spectrum of concerns, which now go beyond cost, performance, and backwards compatibility.
The QLogic portfolio of 8Gb-per-second FC infrastructure solutions helps meet that challenge via support for a SAN fabric that can keep pace with evolving virtualization technologies and satisfy demands for simplified management; higher reliability, availability, and serviceability (RAS); and lower power consumption. What’s more, the QLogic 8Gbps FC infrastructure enables a VOE, such as Microsoft’s Hyper-V, to scale I/O both for running demanding applications within Virtual Machines (VMs) and for running greater numbers of virtual machines on VOE servers. In both scalability scenarios, physical VOE servers require scalable, high-throughput I/O pipes, backed with a Quality of Service (QoS) guarantee.
The conundrum for IT is that the same forces that drive the solution also drive the problems. Thanks to Moore's law, multi-core multi-processor commodity servers now sport upwards of 16 CPUs. Servers with that magnitude of processing power, however, are a two-edged sword. Well-conceived IT plans for resource consolidation will dramatically cut runaway power and cooling costs. On the other hand, introducing these powerful servers with a weak consolidation scheme will raise rather than lower those environmental costs.
So too, Shugart's law, which drives down the cost of storage hardware, has a very similar effect on storage resources. With 750GB disk drives, multi-terabyte arrays join the family of commodity devices; however, multiple spare 750GB disk drives quickly degrade the level of resource utilization.
This makes a VOE essential for an aggressive resource consolidation initiative. With a VOE in place, one server easily takes on the workloads of multiple servers without disrupting the way that applications are run and managed. At a branch office, a single multi-core multi-processor server can be used to consolidate file, print, web, database, and email servers. In so doing, IT can reduce the costs of system management, maintenance, and staffing.
Nonetheless, increasing the degree to which a site’s resources are virtualized also creates greater resource abstraction, and that too can increase complexity. To control a virtualized world, IT administrators need a consistent, simplified tool set. With Hyper-V, they can reuse their knowledge of managing physical servers, with extensions that carry role-based, dynamic, self-managing systems into the new virtualized environment.
[Figure: We configured both the Hyper-V environment and each of the five VMs running within Hyper-V using the Hyper-V Manager.]
Beyond server consolidation, such a well-designed VOE can vastly improve scalability by leveraging the mobility of VMs to automate sophisticated load balancing, while also addressing traditional IT RAS concerns. With physical servers, IT must trade off processing slowdowns during peak periods against the higher capital and operating costs of provisioning servers for peak usage. On the other hand, when backed by a scalable, high-throughput SAN fabric, like that created by QLogic, virtual machines can be automatically migrated to a more robust VOE server based on real-time processing loads.
We set up a Dell PowerEdge 1900 server with a quad-core Xeon processor and 4GB of RAM to anchor the infrastructure supporting our VOE. For a SAN fabric, we installed a QLE2560 8Gbps HBA in the Dell PowerEdge server and connected the HBA to a port on a QLogic SANbox 5802V switch. We then installed the 64-bit Datacenter edition of Windows Server 2008. The Datacenter edition provides for the unlimited licensing of VMs hosted under Microsoft’s 64-bit server virtualization technology, dubbed Hyper-V. An integrated feature of the 64-bit version of Windows Server 2008, Hyper-V is available with the full releases (Standard, Enterprise, and Datacenter) of Windows Server 2008 and with a thin Server Core release, which provides only a command-line interface.
With quad-core processors and fast PCI-e buses for peripherals, commodity PC servers typically host four to eight VMs. In that kind of environment, an 8Gbps infrastructure can provide an immediate payback through more efficient utilization of storage resources. From the perspective of I/O scalability, a single-ported 8Gbps QLE2560 HBA can provide a Hyper-V server with enough I/O bandwidth to support four VMs, each with the equivalent of dedicated 2Gbps FC bandwidth. In particular, IT avoids the extra costs associated with provisioning multiple HBAs and switch ports for a VOE server.
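The bandwidth-per-VM claim is simple division, and a minimal sketch makes the trade-off visible for the four-to-eight VM range mentioned above. The function name is hypothetical, and the even split is an assumption: real VMs rarely drive I/O at identical rates.

```python
def per_vm_bandwidth_gbps(link_gbps: float, vm_count: int) -> float:
    """Evenly divide the nominal link rate across the VMs sharing one HBA port.

    Assumption: every VM gets an equal share; this ignores protocol overhead
    and any QoS weighting.
    """
    if vm_count <= 0:
        raise ValueError("vm_count must be positive")
    return link_gbps / vm_count


# A single-ported 8Gbps HBA shared by four and by eight VMs.
for vms in (4, 8):
    share = per_vm_bandwidth_gbps(8.0, vms)
    print(f"{vms} VMs on one 8Gbps port: {share:g}Gbps each")
```

At four VMs the share matches the 2Gbps figure in the text; at eight VMs the same port would leave each VM with roughly half of that.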
[Figure: We associated SCSI adapters with the QLogic drivers for the QLE2560 by attaching logical disks connected to the host server via the QLE2560, but not placed online within Windows Server 2008.]
While the primary rationale for server virtualization is simplified resource management, powerful secondary benefits arise from the ease with which IT can consolidate resources in a VOE. In the physical environment, however, most server applications require a significant amount of I/O performance. As a result, a VOE server will need to exhibit significant performance and scalability. To provide a platform for multiple server applications, a VOE server must be able to handle the aggregate I/O required by all of the consolidated application servers.
I/O scalability and performance are particularly important for Hyper-V, which is at the center of an old controversy over virtualization architecture with regard to the handling of I/O. Hyper-V, like Xen, is built on a hypervisor model. Under that architecture, a VM deals with virtual devices, and a microkernel passes I/O requests from those virtual devices to real device drivers. The drivers reside in the base Windows Server 2008 or Server Core installation, which is dubbed the Parent Partition in the Hyper-V architecture. Under Xen, Domain 0 plays the Parent Partition role.
The result is an efficient I/O delegation scheme that has very low overhead at the VM. Under Hyper-V the main delegation components are Virtual Service Clients (VSCs), which are at the bottom of I/O stacks in child partitions; Virtual Service Providers (VSPs), which invoke the actual device drivers in the Parent Partition; and the VMbus, which sends communications across partitions. While this is the preferred method to support I/O, Hyper-V also provides fully emulated virtual devices such as a Fast (10/100) Ethernet adapter and a virtual IDE adapter.
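The delegation path described above can be pictured with a toy model. This is only an illustrative sketch, not Hyper-V code: the class names, the FIFO queue standing in for the VMbus, and the `fake_hba_driver` function are all hypothetical, and real VSC/VSP pairs exchange requests through shared memory rather than Python objects.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class IoRequest:
    vm_name: str
    op: str       # "read" or "write"
    lba: int      # starting logical block address
    length: int   # bytes requested


class VMBus:
    """Stand-in for the cross-partition channel: a simple FIFO queue."""

    def __init__(self):
        self._queue = deque()

    def send(self, request):
        self._queue.append(request)

    def receive(self):
        return self._queue.popleft() if self._queue else None


class VirtualServiceClient:
    """Bottom of the I/O stack in a child partition; it owns no device driver."""

    def __init__(self, vm_name, bus):
        self.vm_name = vm_name
        self.bus = bus

    def submit(self, op, lba, length):
        # The VSC only forwards the request across the bus.
        self.bus.send(IoRequest(self.vm_name, op, lba, length))


class VirtualServiceProvider:
    """Runs in the Parent Partition and invokes the real device driver."""

    def __init__(self, bus, driver):
        self.bus = bus
        self.driver = driver

    def service_pending(self):
        completed = []
        while (request := self.bus.receive()) is not None:
            completed.append(self.driver(request))
        return completed


def fake_hba_driver(request):
    """Hypothetical placeholder for the physical HBA driver."""
    return f"{request.vm_name}: {request.op} {request.length} bytes at LBA {request.lba}"


bus = VMBus()
VirtualServiceClient("vm1", bus).submit("read", 0, 4096)
VirtualServiceClient("vm2", bus).submit("write", 128, 4096)
print(VirtualServiceProvider(bus, fake_hba_driver).service_pending())
```

The point of the sketch is the division of labor: child partitions never touch a driver, so only the Parent Partition needs hardware-specific code.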
In contrast to Hyper-V and Xen, VMware ESX utilizes a direct driver model for client VMs. Under this model drivers are embedded in the virtualization engine along with a scheduler built specifically to handle multiple, high I/O workloads, which means I/O calls are not redirected to a special partition. Naturally each side of the I/O debate thinks their approach is the best and will result in higher VM density ratios, which translate directly into higher consolidation ratios.
To test the ability of the QLogic 8Gbps infrastructure to support Hyper-V scalability with respect to I/O, openBench Labs needed to stress the I/O capabilities of the QLogic QLE2560 HBA and I/O delegation within the Hyper-V environment for multiple VMs and multiple logical disks. That made I/O operations per second (IOPS) a critical metric for our benchmarks.
[Figure: We used Iometer to drive I/O throughput in scalability tests of Windows Server 2008 and Hyper-V. All I/O requests and throughput were measured at the single QLogic HBA port. With 4KB I/O requests and multiple logical drives, we reached I/O levels that surpassed the throughput capabilities of a 4Gbps HBA.]
Our test scenario for VOE scalability represents an absolute worst case for exposing potential SAN and VOE bottlenecks in IOPS performance. We deliberately designed a SAN fabric topology that maximized stress on the QLogic QLE2560 by converging eight 4Gbps data paths (four dual-ported 4Gbps HBAs) onto one port of the QLogic 8Gbps HBA. In such a topology, the I/O bottlenecks associated with arrays built from mechanical drives, which are not germane to SAN transport and VOE issues, would degrade the clarity of our test scenario. That made using a solid state disk (SSD) array, such as the Texas Memory Systems RamSan, essential for our analysis.
Using Hyper-V Manager, openBench Labs configured four identical VMs, which all sported one CPU, 756MB of RAM, and one external disk volume hosted by the RamSan. We used these VMs to test I/O scalability when multiple VMs are running on a VOE server. We also configured a fifth VM (HV10), which sported four CPUs, 3GB of RAM, and four external RamSan-hosted volumes, in order to test I/O scalability for a single VM.
For I/O, Hyper-V provides VMs with virtual SCSI and virtual IDE controllers. By default, Hyper-V requires a virtual IDE controller, which is an emulated device, for the VM boot disk. The VMbus mechanism does not exist at boot time for a VM, which makes it impossible to boot from a disk associated with a virtual SCSI controller. As a result, Hyper-V does not automatically configure a virtual SCSI controller.
What’s more, there is a distinct potential for compatibility issues associated with SCSI device drivers, which must reside in the Parent Partition. As a result, system administrators must add virtual SCSI adapters manually via the Hyper-V Manager. Nonetheless, there is less overhead processing involved with a VSC-VSP pair. In addition, virtual SCSI controllers support deeper SCSI command queues with multiple outstanding SCSI commands on the bus at the same time. That means a virtual SCSI controller should provide measurably higher throughput performance within the VM.
[Figure: By increasing the I/O size to 8KB, which is the size used by typical database-driven applications, I/O throughput reached wire speed for an 8Gbps FC SAN. As a result, our scalability tests began to converge on that limit.]
From the perspective of the Hyper-V Parent Partition, disks for VMs can be either container files, which represent fixed, differencing, or dynamically expanding disks, or raw off-line physical disks. Also dubbed pass-through disks, physical disks are nearly invisible to the Parent Partition and have no size limitation other than what is imposed by the VM’s operating system. What’s more, physical disks can easily be accessed by other physical servers as well as other VMs.
The natural affinity for device sharing makes physical disks a key Hyper-V option for IT sites with FC or iSCSI SAN fabrics. For all benchmark testing, openBench Labs used a virtual SCSI controller and physical disks exported by the Texas Memory Systems RamSan. In these tests, our principal concern centered on the number of IOPS that could be sustained, which provides a critical I/O health measure for a VOE. Our secondary concern was the measurement of I/O throughput, which provides the best insight into SAN fabric infrastructure bottlenecks.
In testing I/O scalability, we first wanted to determine the amount of I/O traffic that a single port on an 8Gbps QLogic QLE2560 HBA could sustain. Second, we wanted to determine how well Hyper-V could utilize the QLE2560 to scale I/O levels using multiple VMs. Finally, we also wanted to determine if a single VM could scale to an I/O load level that would require the use of an 8Gbps QLE2560.
We began by setting up three test scenarios, which used the Intel Iometer benchmark to generate I/O requests. In all of our scenarios, we employed 8GB volumes exported on individual 4Gbps controllers from the Texas Memory Systems RamSan. On each volume, we limited the command queue to 30 outstanding I/O requests. We then split I/O between reads and writes in a 75-to-25 ratio. As a direct result of that split in reads and writes, read channel throughput would become saturated as total throughput for the QLE2560 approached 1075MB per second.
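The link between the 75-to-25 split and the saturation point can be worked out directly: if reads make up three quarters of the traffic, the read direction fills up when total throughput reaches the one-way limit divided by 0.75. The sketch below assumes a nominal one-way rate of 800MB per second for 8Gbps Fibre Channel, a figure that is not stated in the article.

```python
def total_at_read_saturation(read_limit_mb_s: float, read_fraction: float) -> float:
    """Total (read + write) throughput at which reads alone hit their limit."""
    if not 0 < read_fraction <= 1:
        raise ValueError("read_fraction must be in (0, 1]")
    return read_limit_mb_s / read_fraction


# Assumption: nominal one-way payload rate of an 8Gbps FC link (not from the article).
ASSUMED_8G_LIMIT_MB_S = 800.0
READ_FRACTION = 0.75  # the 75-to-25 read/write split used in the tests

total = total_at_read_saturation(ASSUMED_8G_LIMIT_MB_S, READ_FRACTION)
print(f"reads saturate at about {total:.0f}MB/s total")  # about 1067MB/s
```

Under that assumption the ceiling lands a little above 1000MB per second, which agrees with the level at which the 8KB tests reported later in this article converge.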
For a baseline test, openBench Labs focused on assessing the capabilities of the QLE2560. With our quad-core server running Windows Server 2008 without Hyper-V, we set up four 8GB disks on the RamSan and ran Iometer on one through four drives. During the benchmarks, we measured I/O throughput in terms of both IOPS and MB per second.
[Figure: When we viewed Iometer performance with four VMs running under Hyper-V and issuing 8KB I/O requests to a dedicated logical drive on the RamSan, we observed that I/O from the QLogic HBA was perfectly balanced across the four ports connected to the four logical drives.]
Our next test scenario focused on assessing the scalability of the Hyper-V VOE in terms of running multiple VMs with the QLogic 8Gbps SAN fabric infrastructure. This is a key to maximizing server resources and achieving a high consolidation ratio. In this VOE test scenario, we configured four identical VMs. We provisioned each VM with one CPU, 756MB of RAM, and one physical test disk from the RamSan connected to a virtual SCSI controller. We then ran Iometer on one through four VMs.
Our final test examined the ability of a single VM to scale in order to handle a large-scale application. In this test we configured a VM with four CPUs and 3GB of RAM. We configured the I/O subsystem for this VM in two ways: First we connected four test disks to one virtual SCSI adapter and then we connected each test disk to its own virtual SCSI adapter. Each VM can be configured with up to four virtual SCSI adapters. We then ran Iometer on the VM using one through four test drives.
All of these tests were run using three different I/O block sizes: 4KB, 8KB, and 64KB. We started with 4KB I/O blocks, which are used by MS Exchange to support a high volume of email transactions based predominantly on short messages. With 4KB blocks, we placed maximum stress on both the VSC-VSP mechanism within Hyper-V to pass I/O requests and the QLogic QLE2560 I/O engine to maintain I/O traffic with four independent 4Gbps controllers connected to SSDs.
Under Windows Server 2008, IOPS performance increased about 40 percent with the addition of another worker process and a new test disk from the RamSan. With four workers and four disks from the RamSan, IOPS processing reached over 200,000 and total data throughput exceeded 700MB per second. That meant we had surpassed the data throughput level on reads that a single 4Gbps FC port on our server would be able to support.
Running our multiple VM test, performance closely paralleled our Windows Server 2008 test for the first two VMs. When we added the third and fourth VMs, IOPS performance increased by about 20 percent each time. This brought IOPS performance with four VMs to just over 160,000. Once again, for the read component of the data that we were moving, we had exceeded the capabilities of a 4Gbps HBA.
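To see why these 4KB results exceed what a 4Gbps HBA could carry on reads, the IOPS figures can be converted to throughput. This is a rough sketch under stated assumptions: 4KB is taken as 4096 bytes, 1MB as 1,000,000 bytes, and the one-way limit of a 4Gbps FC port as a nominal 400MB per second, none of which the article spells out.

```python
def throughput_mb_s(iops: float, block_bytes: int) -> float:
    """Convert an I/O rate into data throughput, using 1MB = 1,000,000 bytes."""
    return iops * block_bytes / 1_000_000


# Assumption: nominal one-way payload rate of a 4Gbps FC port (not from the article).
ASSUMED_4G_LIMIT_MB_S = 400.0
READ_FRACTION = 0.75  # the 75-to-25 read/write split used in the tests
BLOCK_4KB = 4096

# Rounded IOPS levels quoted in the text for the four-disk and four-VM runs.
for label, iops in (("Windows Server 2008, four disks", 200_000),
                    ("Hyper-V, four VMs", 160_000)):
    total = throughput_mb_s(iops, BLOCK_4KB)
    reads = total * READ_FRACTION
    over = reads > ASSUMED_4G_LIMIT_MB_S
    print(f"{label}: {total:.0f}MB/s total, {reads:.0f}MB/s reads, "
          f"exceeds assumed 4Gbps read limit: {over}")
```

In both cases the read share alone comes out above the assumed 400MB per second, which is the comparison the text draws.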
Following our multiple VM scalability tests, we ran tests on the scalability of a multiprocessor VM with multiple drives. With each disk given its own virtual SCSI adapter, I/O performance with a single VM with multiple drives scaled identically to the I/O scaling exhibited with multiple VMs. Nonetheless, with respect to IOPS, four drives on one virtual SCSI adapter did not scale as well.
Next we utilized random 8KB I/O blocks, which typify the I/O transactions found in database-driven applications. With 8KB blocks, we continued to put significant stress on both the VSC-VSP mechanism and the QLogic QLE2560 to process transactions; however, we also doubled the data carried by each request, which produced a more balanced environment in terms of the importance of IOPS and the volume of data transferred in MB per second.
By doubling the amount of data per request, we crossed a threshold not crossed with 4KB requests: we saturated the read channel of our 8Gbps HBA. With just three drives in our initial Windows Server 2008 scalability tests, we exceeded 122,000 IOPS and reached a total throughput of 960MB per second. As a result, adding a fourth disk produced only a 5 percent improvement in IOPS and in total data throughput. More importantly, all I/O, both reads and writes, coming from the QLogic HBA and measured at the Texas Memory Systems array was perfectly balanced across all four logical disks.
We measured that same pattern in all of our VM scalability tests. With 8KB requests, the ultimate rate-limiting factor was the 8Gbps HBA. As a result, we began to see our scalability tests converge on just under 129,000 IOPS and just over 1000MB per second of data throughput.
In our final tests, we used 64KB I/O blocks, which are found in Business Intelligence (BI) applications such as On-Line Analytical Processing (OLAP), data mining, and data warehousing. I/O in these applications is the antithesis of I/O in messaging applications: there are a limited number of users, and the speed at which large volumes of data can be moved dominates in importance. Data throughput now totally dominated our tests, and the results were identical in all cases because two drives or two VMs were enough to saturate reads on our 8Gbps fabric at wire speed.
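The shift from IOPS-bound to throughput-bound behavior follows from dividing a fixed throughput budget by the block size. The sketch below takes roughly 1000MB per second as that budget, rounded from the saturation level reported for the 8KB tests, and assumes 1MB = 1,000,000 bytes and 1KB = 1024 bytes; the function name is hypothetical.

```python
def iops_ceiling(throughput_mb_s: float, block_bytes: int) -> int:
    """Largest whole number of requests per second that fits in a throughput budget."""
    if block_bytes <= 0:
        raise ValueError("block_bytes must be positive")
    return int(throughput_mb_s * 1_000_000 // block_bytes)


# Assumption: rounded from "just over 1000MB per second" in the 8KB tests.
SATURATED_TOTAL_MB_S = 1000.0

for kb in (4, 8, 64):
    ceiling = iops_ceiling(SATURATED_TOTAL_MB_S, kb * 1024)
    print(f"{kb}KB blocks: at most {ceiling:,} IOPS at {SATURATED_TOTAL_MB_S:.0f}MB/s")
```

With 4KB blocks the ceiling sits well above the 200,000 IOPS observed, so the fabric was not the limit; with 8KB blocks it falls near the level where the tests converged; with 64KB blocks it drops to a small fraction of that, so data volume rather than request rate governs the result.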
The number of IOPS sustained in all of our tests clearly indicates that a Hyper-V VOE based on 8Gbps QLogic FC SAN infrastructure is able to scale and support a high number of VMs, which will easily provide for a high consolidation ratio. Equally important, the scalability that this infrastructure provides a VM enables the hosting of the most I/O-intense applications.