OPEN SOURCE TKO

by Nancy Cohen

What can knock out old database notions that proprietary means best?

Cost?

All of the above.
More and more CIOs and CTOs like what they hear as Open Source database players spar for position. But is this really the year that Open Source databases like MySQL and PostgreSQL cut into the mindshare and market share enjoyed by Oracle and other proprietary kingpins?
Last year, a number of events broke the mold in what the analysts refer to as the DBMS (database management system) market, valued by Dataquest at $8.8 billion in 2000. Emerging among the bright new kids on the block are commercial distributions of Open Source PostgreSQL and MySQL. We talked to Dalton Han of CommVault and Jaime Bozza of the twin portals The Wireless Developer Network (for the wireless web industry) and GeoCommunity (for the Geographic Information Systems industry). Both Han and Bozza represent able IT planners who understand that the strengths of MySQL and those of PostgreSQL are not one and the same.
Dalton Han is a key technologist in storage management solutions for CommVault Systems. As the man in charge of choosing a database system at a previous job, he easily recalls an Oracle sales rep's visit and the price quotes, which sent him running toward a careful review of Open Source options like PostgreSQL and MySQL. For Han's needs, MySQL won hands down.
"Earlier this year, I had a visiting Oracle sales representative spend about an hour at my former employer’s, explaining the merits of the Oracle Relational Database Management System. He emphasized how Oracle technology is a necessary component of a highly available and high-performance enterprise data-management system. His pitch impressed me; Oracle reps receive extensive sales training.

"I asked him how much all this would cost. The figure was higher than my former employer’s entire hardware investment. We decided to go with a MySQL database for the type of application that we had in mind. Nobody can deny that Oracle has a good product, but because of the type of information that my database would be handling and Oracle’s cost, I implemented the MySQL database on a two-node Linux cluster using Convolo from Mission Critical Linux."

Han is working with metadata, which is a specific kind of information unlike any other: metadata describes the format of other information. Basically, it is data about data. In particular, metadata improves the searching, processing, and filtering of information over the Internet. As metadata languages such as XML (Extensible Markup Language) and engines become more fully developed, this format will be the basis for organizing machine-understandable information about people, things, and concepts.

"Just as the information that powers e-business has become increasingly media-rich, companies are discovering the need to store digital information such as large streaming video files. Relational databases originally developed to store text are now being used to manage these types of data. Metadata is a great format in which to organize such information. The data may be digital media, such as image files and sound or video clips, but it could also be personal, network-status, or quality-of-service information.

"In our MySQL scenario, we were dealing with metadata that determines how content is displayed on a web page by a Java engine. When an end user changes the parameters of an element (say, the x and y coordinates of an image) and saves that image, the Java engine will accordingly change the metadata in the database. In essence, we were intent on using metadata for content management."

Granted, MySQL is not for everyone, but as metadata comes into focus, so does MySQL’s suitability. While MySQL is a widely used high-performance database, many large corporations have passed over MySQL because of transactional and high-availability issues.

A transaction is the grouping of SQL statements as one unit of execution. Transactions provide two benefits: maintaining data integrity and managing data concurrency. Data integrity means that if the transaction does not complete, the database will “roll back” to its state before the attempted transaction. This prevents the database from being partially updated. Data concurrency means that the database must control a number of concurrent connections. Transactions let current users complete their data inserts before new users are allowed to insert data.

Data integrity and concurrency matter most when users are directly accessing the database. With metadata, end users do not send requests directly to the database. Instead, the application (such as a Java engine) uses metadata to manipulate data objects. The application maintains data integrity and manages the concurrency.

"High availability is another issue for MySQL, but I bridged that gap fairly easily by buying a Linux clustering solution from Mission Critical Linux. The software, Convolo, allowed me to cluster two systems to form a failover node in case the primary database server goes down. In essence, Convolo lets MySQL exist as a service independent of the servers. The software, which was easy to set up, requires both nodes to share a centralized storage system, for which I chose a Clariion system from EMC. I should also note that I got good technical support for Convolo."
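To make the "roll back" idea described above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Han's setup: it uses SQLite (bundled with Python) as a stand-in for MySQL, and the `element_meta` table, its columns, and the negative-y check that simulates a failure are all hypothetical. It shows the two coordinate updates for an image being saved as one unit, so a failure halfway through leaves the stored metadata untouched.

```python
import sqlite3


def move_element(conn, element_id, x, y):
    """Save both coordinates of one page element as a single transaction."""
    # "with conn" commits if the block finishes and rolls back if it raises.
    with conn:
        conn.execute("UPDATE element_meta SET x = ? WHERE id = ?", (x, element_id))
        if y < 0:
            # Simulated mid-transaction failure (hypothetical validation rule).
            raise ValueError("y must be non-negative")
        conn.execute("UPDATE element_meta SET y = ? WHERE id = ?", (y, element_id))


def position(conn, element_id):
    """Return the stored (x, y) pair for one element."""
    row = conn.execute(
        "SELECT x, y FROM element_meta WHERE id = ?", (element_id,)
    ).fetchone()
    return row


# In-memory database standing in for the metadata store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE element_meta (id INTEGER PRIMARY KEY, kind TEXT, x INTEGER, y INTEGER)"
)
conn.execute("INSERT INTO element_meta VALUES (1, 'image', 10, 20)")
conn.commit()

move_element(conn, 1, 40, 60)      # both updates commit together
try:
    move_element(conn, 1, 99, -5)  # fails after the first UPDATE has run
except ValueError:
    pass

print(position(conn, 1))  # the failed move left no partial update behind
```

In a design like the one Han describes, where the application rather than the database guards integrity, the application code would have to provide an equivalent all-or-nothing guarantee itself.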
When researching other databases, Han briefly looked at PostgreSQL. "While PostgreSQL supports transactions, we found it slower than MySQL. I decided that the performance difference was reason enough for me to stick with MySQL. Since metadata does not rely on transactions and companies like Mission Critical Linux have released high-availability solutions for MySQL clustering, using MySQL for metadata works nicely. What's more, the whole project actually came in under the planned budget, a fact that greatly satisfied the company’s CFO."
Just as opinionated is Jaime Bozza, WDN/GeoCommunity network administrator and a technologist who does the back-end programming and database design. He tells us that PostgreSQL was clearly the superior alternative for their own particular site needs.
What seemed like the right idea a couple of years ago, using Microsoft and Oracle database platforms to run two sites linking developers to resources in their vertical markets, turned sour. The Wireless Developer Network (for the wireless web industry) and GeoCommunity (for the Geographic Information Systems industry) had security concerns and hefty software costs that made them look for newer options based on Open Source.
The twin portals’ IT planners eventually tested what seemed to be the two main Open Source database acts in town, MySQL and PostgreSQL. What made the sun shine brightest on PostgreSQL? Jaime Bozza first responds with one word: “Transactions.”

“Our sites are heavily visited and interactive,” says Bozza. The PostgreSQL server is driving dynamic applications such as book sales, message boards, mailing lists, and software sharing. Visitors are constantly downloading software, posting messages, and taking part in discussion groups. “We needed a database more robust in its SQL implementation, including transaction support. At that time, PostgreSQL offered those features that MySQL did not.” (MySQL offers transaction support, though not in all configurations.)

Bozza says MySQL always had speed on its side but finds that PostgreSQL release 7.1 turns the tables. “MySQL was always a hands-down winner in speed but PostgreSQL now is comparing very favorably, especially in the concurrent access department, where 7.1 continues to perform well without any noticeable performance loss.”

Bozza’s advice for businesses planning a review of Open Source options is to get rid of any notion that software from proprietary giants Microsoft and Oracle is for ‘big’ organizations while PostgreSQL and MySQL are for the little leaguers. “Size is not always the issue,” he says. “Take SourceForge, which uses PostgreSQL on the back end. It’s rather a question of what your business needs to do.”

Still, even a PostgreSQL enthusiast like Bozza says some businesses view Oracle as the system of choice for high-end transaction tasks like banking and financial services. And while 7.1 is impressive, Bozza recognizes what PostgreSQL still needs if it is to mirror the likes of Oracle: “One of the key features missing is replication and failover support. But some of the replication features are almost ready.”
For organizations that are not as transaction-intensive as credit card companies or others in high-end financial services, Bozza believes PostgreSQL 7.1 is a good choice. “I can say PostgreSQL is turning out to be viable even in the most intensive of applications, especially with the recent efforts we’ve seen.” Bozza notes that 7.1 has added a write-ahead log, which allows consistency to be maintained after an operating system crash without sacrificing speed; eliminated the 8/32 KB row-length limit; and added outer joins and other optimizations. All these “have turned what used to be a great RDBMS into an excellent RDBMS,” asserts Bozza. “If you take reliability and speed as the two big requirements for a database product, PostgreSQL now has both.”

Click to learn why The META Group's Charlie Garry finds Open Source database system adoption to be moving at glacial speed.
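The write-ahead log that Bozza mentions can be pictured with a toy sketch in Python. This is not PostgreSQL's implementation; the `TinyStore` class, the JSON record format, and the file name are assumptions made up for illustration. The idea it shows is the general one: each change is appended to a log and forced to disk before the live data is touched, so after a crash the state can be rebuilt by replaying whatever complete records reached the log.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store that logs each change before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()  # recover state from whatever reached the log

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log:
            for line in log:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break  # a torn final record means that write never completed
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        record = json.dumps({"key": key, "value": value})
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(record + "\n")
            log.flush()
            os.fsync(log.fileno())  # force the log record to disk first
        self.data[key] = value      # only then change the in-memory state


log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
store = TinyStore(log_path)
store.put("balance", 100)
store.put("balance", 75)

# Simulate a crash: a half-written record lands in the log and the
# in-memory state is lost.
with open(log_path, "a", encoding="utf-8") as log:
    log.write('{"key": "balance", "va')

recovered = TinyStore(log_path)
print(recovered.data)  # rebuilt from the complete log records only
```

Because the log is written sequentially, forcing it to disk tends to be cheaper than flushing every changed data file at each commit, which is one plausible reading of the "without sacrificing speed" claim.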