“In some senses, the idea is a very old one,” says Ian Foster, director of the Computation Institute, a joint institute of the University of Chicago and Argonne National Laboratory in Argonne, Ill., and considered by most in the field to be the “father of the grid.” “When the Internet first appeared in the late 1960s, some people talked about how you might be able to create computing utilities. But it was with the emergence of high-speed networks in the early 1990s that people really started looking seriously at how you could link systems together.”
Not surprisingly for such an amorphous concept, there’s some confusion about a clear definition of the term “grid computing.” Foster suggests that the best way to think about it is as a set of technologies that closely dovetail with other similar sets of technologies. “It’s really a continuum from the tightly coupled parallel machines, like IBM’s Blue Gene, to clusters, and then collections of clusters and, in the sciences and some large companies, national-scale grids that link clusters and other systems at many sites.”
Economy and Scale
Grid computing works especially well for repetitive jobs — large batches of independent calculations that don’t require the tight coupling of a parallel supercomputer and can simply be farmed out across machines. Purdue University (West Lafayette, Ind.) CIO Gerard McCartney, who oversees a grid of 6,001 Linux, Windows, Solaris and Macintosh machines that talk to each other using the University of Wisconsin’s Condor grid middleware, says one Purdue faculty member grabs images of viruses from an electron microscope, and then processes the images using the grid. “He could do this on a mainframe that costs millions of dollars. Our way, he essentially does it for free.”
While cost savings are an important argument in favor of grid computing, the time element may be even more important. McCartney, who contends that most computers, even when in use, utilize only 20 percent or so of their capacity, cites the case of a materials scientist at Rice University who uses the Purdue grid to analyze zeolite structures. In one day, says McCartney, he may use a CPU year of computation — roughly 8,760 hours of processor time — running “fairly small calculations that have to happen thousands of times.” Midway through 2007, he’d already used three million hours of computing time, at very little cost. “These are all waste cycles. That’s the point,” says McCartney.
There are few infrastructure requirements for grid computing. Networking capability, of course, is essential, as is the middleware that enables the machines to distribute tasks intelligently. Foster’s Argonne Laboratory, with support from IBM, developed the open-source Globus software, which, he says, “addresses security, data movement, job submission, data replication” and other challenges for large grids. Other popular middleware offerings include the Sun Grid Engine (SGE) and Condor. This software is platform-independent and has the advantage, says McCartney, of being a “lightweight installation — you’re not hiring a cadre of systems programmers to make this happen.”
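To give a sense of how lightweight the job-submission side can be, a Condor submit description for a batch of independent jobs looks roughly like the sketch below. The executable and file names are hypothetical; only the structure reflects Condor’s actual submit-file syntax.

```
# Hypothetical Condor submit description: run the same program
# over 100 independent inputs, one job per input.
universe    = vanilla
executable  = process_image
arguments   = virus_$(Process).tif
output      = out.$(Process).txt
error       = err.$(Process).txt
log         = batch.log
queue 100
```

Running `condor_submit` on a file like this queues all 100 jobs at once; the middleware then matches each one to whatever idle machine becomes available.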
Finally, emphasizes Cheryl Doninger, the R&D director of the enterprise computing infrastructure unit at SAS in Cary, N.C., the software actually using the grid — what’s visible on the individual user’s desktop — has to be grid-enabled. She says that SAS added grid capabilities to its software offerings a year and a half ago.
Expanding the Grid
Grid computing is starting to catch on in many enterprises. It is already being used extensively in the financial services, oil and gas, insurance and pharmaceutical industries. Meanwhile, says Doninger, telcos, the travel industry and the entertainment sector are quickly adopting the technology.
“You can’t really pick an area [of the enterprise] that’s amenable to grid computing,” says McCartney. But, he adds, you can isolate the best uses for the grid. “Science applications work nicely in this environment. Parameter sweeps, statistical analyses and digital rendering also work well.”
Any area of the enterprise that needs lots of computing power, and fast, can benefit. Payroll departments can use the grid to help churn out thousands of paychecks overnight, and then forget about the grid the rest of the time. Programmers at SAS use the grid at night — when they have to quickly process the latest source code builds.
With relatively small initial expenditures, grid computing can enable enterprises to realize extraordinary gains in computing power and efficiency. And in the medium and long term, they can save money. “Grid can run on low-cost commodity and open source operating systems,” says Doninger. “We talk [to our customers] about savings, and a lot of times that’s why they’re starting to look at grid — because of the hardware savings it can bring.”