INDUSTRY STANDARD
SERVER BLADES
   
 

With multiple vendors choosing proprietary foundations for blade servers, a support problem looms: a proliferation of incompatible drivers for proprietary blade server systems and their peripherals.

   
 
by Chet Heath, vice president and CTO, OmniCluster Technologies
     
       
   

Nearly every great achievement of civilization was, at its core, the organized collaborative work of individuals operating as a team. When individual computers form teams to accomplish a common objective, these groups are called clusters. While the topic of clustered servers can be complex technically, it can be understood in terms of ordinary business concepts.

The philosophy of modern business assumes that most operations can be administered by partitioning the scope of the work objective into tasks, which are then delegated to specialized individuals charged with the execution of each component of the whole objective. The efficiency of the system depends on how well the components can delegate responsibilities, intercommunicate, collaborate, and respond effectively to each task.

 
     
 

A small direct-sales organization is a team of specialists working together. Orders come in through a mailroom to a manager. The manager delegates to sales representatives the responsibility of locating inventory in the warehouse, fulfilling the order, and delivering the product to the customer. In the warehouse, workers specialize in picking the order, customizing it, and packing it for shipment. The organization as a whole can accomplish far more collaboratively than any one individual could working alone.

That example is analogous to the structure of a scalable server. Multiple servers are arranged, each with a dedicated function, to accomplish a common task, in this case web hosting. One server acts as the receiving agent for requests for web content from users on the Internet, filtering and selecting requests. This function is called a firewall; it also protects the server against deliberate intrusion. The firewall server passes requests for web pages to a load balancer/dispatcher, which manages several content servers. The load balancer assigns the work of fetching web content to idle content servers.

Content servers then fetch the requested content from a disk (the warehouse), organizing and decompressing it where necessary, and ship it over the Internet to the requester. The operation scales easily: content servers are added to meet a desired level of demand. Consequently, this structure is the generic foundation of many of the larger commercial websites in operation.
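The dispatch step described above can be sketched in a few lines of Python. This is only an illustration of the idea of handing requests to idle content servers and scaling by adding servers; the class name, method names, and server names are hypothetical and do not come from any particular load-balancing product.

```python
from collections import deque


class Dispatcher:
    """Toy load balancer: hands each request to the next idle content server."""

    def __init__(self, server_names):
        self.idle = deque(server_names)  # servers waiting for work
        self.busy = {}                   # server name -> request being served
        self.backlog = deque()           # requests waiting for an idle server

    def submit(self, request):
        """Assign the request to an idle server, or queue it if none is free."""
        if self.idle:
            server = self.idle.popleft()
            self.busy[server] = request
            return server
        self.backlog.append(request)
        return None

    def _give_work_or_idle(self, server):
        """Hand the server the oldest backlogged request, else mark it idle."""
        if self.backlog:
            self.busy[server] = self.backlog.popleft()
        else:
            self.idle.append(server)

    def finish(self, server):
        """A server has shipped its content and is ready for more work."""
        del self.busy[server]
        self._give_work_or_idle(server)

    def add_server(self, server):
        """Scale out: a newly added content server picks up backlog at once."""
        self._give_work_or_idle(server)
```

With two servers and three requests, the third request waits in the backlog until a server finishes or a third content server is added, which is the scaling behavior the analogy describes.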

Low-Power Blade Systems

Recently announced systems use low-power processors designed for laptops, mounted on blades or large cards within a specialized rack-mounted server. These machines permit many low-power servers, on the order of 35 watts or less each, to be installed in multiples inside host machines of 5U or smaller. Some of these machines boast that as many as 24 such servers can be mounted inside a proprietary server host box with special cooling, achieving extremely high densities.

Clearly, the density can be achieved inside the box, but it cannot necessarily be deployed in every position in the rack. Twenty-four 35-watt servers in a 5U space come to 168 watts per U, which is well above the guidelines, even if zero watts is assumed for the host server.
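The arithmetic behind that figure is simple enough to check directly. The short Python sketch below reproduces it; the function and parameter names are illustrative, and the host chassis is assumed to draw zero watts, as in the text.

```python
def watts_per_u(blade_count, watts_per_blade, chassis_height_u, host_watts=0.0):
    """Power density of a blade chassis in watts per rack unit (U)."""
    total_watts = blade_count * watts_per_blade + host_watts
    return total_watts / chassis_height_u


# 24 blades at 35 W each in a 5U chassis, host power assumed to be zero
density = watts_per_u(blade_count=24, watts_per_blade=35, chassis_height_u=5)
print(density)  # 168.0
```

Any real host power draw only raises the result, so 168 watts per U is a lower bound for this configuration.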

There is another way to achieve even higher densities, however, and one that does not require replacing existing deployed servers with proprietary foundations. The concept is called a SlotServer. As the name implies, this is a complete yet highly miniaturized Linux or Windows server platform in the format of a standard-length PCI card.

The SlotServer card is built from low-power mobile PC system logic components, which permits installation in almost any PCI-based system, turning that system into a blade server. Each card appears to the system as an independent, network-attached processor on a dedicated segment of a 1-gigabit LAN.

This ultra-fast network is fabricated inside a special ASIC developed by IBM MicroElectronics. The chip implements two PCI GigE adapters back to back, with the network connection between them inside the silicon. This Modular Network Interface Chip (MNIC) connects one PCI GigE adapter to the SlotServer’s internal PCI bus and the other PCI GigE adapter to the edge connector of the card.

When the card is plugged into the system, a peer network is formed between the host and the SlotServer at bus speed. The SlotServer software drivers then adapt each network interface to popular operating systems such as Linux, Windows, and various forms of Unix.

This architecture permits mixing and matching many different operating system environments inside a common box, thereby resolving many server convergence problems. Each card, in turn, may have external connections through its own dedicated 10/100 Ethernet port, an RJ-45 connector on the card bracket, attached through the card's local PCI interface.

BusClustering collects these peer networks into a private high-speed network among all the SlotServer cards and the host system. To software, the cards present drivers to the standard operating system layers, which carry standard network protocols such as NetBEUI or TCP/IP across the MNIC. The BusCluster system architecture also permits direct peer-to-peer communication, private to the SlotServer cards, to reduce the intercommunication burden on the host CPU.

Drivers define control blocks that the cards are directed to fetch from main memory. The control blocks can be linked to one another in sequence, such that very large amounts of data can be transported across the modular network interface chip with minimal burden on the host system CPU.
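The chaining idea can be illustrated with a small simulation. The article does not give the actual control block layout, so the fields below (a buffer address, a length, and a link to the next block) are assumptions chosen only to show how one chain can describe a large transfer that the card walks without further host involvement.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlBlock:
    """Hypothetical control block; the real MNIC layout is not described here."""
    buffer_addr: int          # offset of the data buffer in simulated main memory
    length: int               # number of bytes to move
    next_addr: Optional[int]  # address of the next control block, None at the end


def run_chain(memory, blocks, first_addr):
    """Walk the chain as a card might, copying each described buffer in order.

    `blocks` maps a control block address to its ControlBlock. Returns the
    transferred bytes and the number of control blocks fetched.
    """
    transferred = bytearray()
    fetched = 0
    addr = first_addr
    while addr is not None:
        cb = blocks[addr]  # "fetch" the control block from main memory
        transferred += memory[cb.buffer_addr:cb.buffer_addr + cb.length]
        fetched += 1
        addr = cb.next_addr  # follow the link to the next block
    return bytes(transferred), fetched


# The host sets up three linked blocks once; the walk then needs no host help.
memory = b"AAAABBBBCCCC"
blocks = {
    0x100: ControlBlock(buffer_addr=0, length=4, next_addr=0x200),
    0x200: ControlBlock(buffer_addr=8, length=4, next_addr=0x300),
    0x300: ControlBlock(buffer_addr=4, length=4, next_addr=None),
}
data, count = run_chain(memory, blocks, first_addr=0x100)
```

In this sketch the host's only work is building the chain and handing over the first address, which mirrors the point that the host CPU is involved mainly at the start and end of a large transfer.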

Except for initialization and termination of the large transfers, the host system is relieved of data movement responsibilities. This is a key factor that makes BusCluster architecture scale, such that connectivity and CPU horsepower are added incrementally as multiple cards are added to a system.

SlotServers consume just 10 watts of electric power each and typically operate within the nameplate rating of the host system. In effect, they can convert any server to a blade server and multiply its original web-pages-per-second performance by 2x, 3x, or more, depending on how many SlotServers are installed. In this way, SlotServers permit zero “U” expansion of existing systems without requiring the purchase of new host systems, racks, or floor space.

Expanding performance and function from within also requires far less time to deploy. Using traditional means, doubling the capacity of a web hosting enterprise would require double the servers, double the racks, double the space, and double the power. This typically involves a long planning, finance, and construction cycle. Adding SlotServers within the existing boxes can take just hours per server. With teams of installers, an entire hosting facility can be upgraded in a few days.

Aggregate system performance of SlotServers is governed by three major factors:

CPU: The SlotServer CPU runs at 300 MHz, plus a portion of the host CPU that performs I/O tasks. Most PC architectures are single-processor, so we have become accustomed to comparing CPU clock rates to indicate performance; in this case it isn’t that simple. SlotServers typically attach in multiples, permitting an aggregate of N times 300 MHz for tasks such as web hosting that can be easily separated. This does not hold for a single task.

Bus: A very high-speed network allows the burden of I/O to be shifted to a central host system CPU, leaving the SlotServer with the remaining portion of the total task to perform. A content server is a good example. Many tasks (such as firewall, DNS, DHCP, and load balancing) can be performed adequately when allocated to a single SlotServer. Even when the cards function as an array of content servers, 100 Mbit/s Ethernet translates into 12.5 MB/s of continuous traffic in the worst case, assuming the unlikely full availability of the Ethernet port and no compression of the data coming from the file to the card. There is plenty of residual bandwidth in standard PCI at 132 MB/s to handle as many cards as there are free sockets in a system.

File: The file system does benefit from an upgrade. If a single file system is the focal point of all servers, a SCSI RAID system with striping can be cost-justified, given that it replaces all the file devices in multiple servers. When the SlotServer is operated diskless, all file operations occur under the envelope of the host CPU.
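The CPU and bus figures above can be turned into a rough budget. The Python sketch below uses only the numbers quoted in the three factors (300 MHz per card, 100 Mbit/s Ethernet per card, 132 MB/s for standard PCI); the function names are illustrative. It deliberately simplifies: 132 MB/s is a peak rate, and the host's own disk and network traffic on the same bus is ignored.

```python
ETHERNET_MBIT_PER_S = 100  # external port per card, as quoted above
PCI_MBYTE_PER_S = 132      # standard PCI peak rate, as quoted above
SLOTSERVER_MHZ = 300       # clock rate of one SlotServer CPU


def ethernet_mbyte_per_s(mbit_per_s):
    """Convert a link rate in Mbit/s to MB/s (8 bits per byte)."""
    return mbit_per_s / 8


def max_cards_by_bus(pci_mbyte_per_s, per_card_mbyte_per_s):
    """How many cards could run at full Ethernet rate within the PCI peak."""
    return int(pci_mbyte_per_s // per_card_mbyte_per_s)


def aggregate_mhz(cards):
    """Aggregate clock rate, meaningful only for easily separated tasks."""
    return cards * SLOTSERVER_MHZ


per_card = ethernet_mbyte_per_s(ETHERNET_MBIT_PER_S)
print(per_card)                                     # 12.5
print(max_cards_by_bus(PCI_MBYTE_PER_S, per_card))  # 10
print(aggregate_mhz(4))                             # 1200
```

Under these assumptions the bus could feed about ten cards even if every card saturated its Ethernet port, which is more than the number of free PCI sockets in a typical host.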

When all the virtual disk images are in the same file system and natively accessible by a common application, a common utility called Virtual Disk Manager can create, deploy, remove, and dynamically associate virtual files with multiple SlotServers in the same host server. This permits deployment of an entire cluster of servers in less than an hour, versus a typical day to do the same by Ghosting physical drives. Virtual Disk Manager can also use the network and bus protocols to create a KVM window in the host for each SlotServer, and thereby shut down, reboot, and even replace or reset a hung OS in a server within the cluster.