Peta Computing’s Parallel Universe

Petascale computing is coming of age, opening powerful new modeling
opportunities for CITRIS applications. From the exploration of protein
folding at the atomic level to long-range climate predictions and
turbulence studies, the new computers will give a broad range of users
processing power heretofore reserved for weapons research.

by Gordy Slack


CITRIS researchers will soon have access to a new generation of
high-performance supercomputers far more powerful than those available
today. Known as petascale computers, these new machines will be capable
of performing 10^15 floating-point operations per second (one petaflops).
These parallel machines may employ more than a million processors and
will be able to handle huge data sets. Until now, they have been mainly
the domain of military and other national security applications. With
the delivery of a new petascale computer to Lawrence Berkeley National
Laboratory (LBNL), and with the possibility of a Berkeley team helping
to host another one at Lawrence Livermore National Laboratory (LLNL),
researchers working on climate analysis, genomics, environmental
monitoring, protein analysis, earthquake modeling, nanoscience, and other
CITRIS-related fields will gain access to powerful new modeling tools
within the next four years.

By analyzing at much higher resolution, seismologists here, for
instance, will be able to model earthquakes of different intensities
block by block, according to James Demmel, Professor of
Mathematics and Computer Science at UC Berkeley and founding Chief
Scientist at CITRIS.

James Demmel, CITRIS founding Chief Scientist.

"Until now, models have said, 'this huge area will vibrate about
like this.' But that is not good enough to figure out which buildings
need which kinds of retrofitting," says Demmel. "But with a petascale
machine, you can refine the resolution of your simulations to determine
which blocks and buildings are especially endangered and how best to
retrofit them. It would enable a science-based approach to earthquake
preparedness and response."

The world of huge parallel computers on the petascale has arrived.
If past is indeed prelude and speed increases continue at current
rates, within a decade, at least half of the world's 500 fastest
computers will probably be petascale.

Access to such processing power will allow researchers in the health
and life sciences to engineer proteins down to the atomic level,
opening new doors to the treatment of several types of diseases.
Climate analysis is another key field where petascale simulation will
lead to much better modeling, enabling science-based approaches to
emissions policy or to predicting the effects of global warming on air
quality, agriculture, wildfires, and water supplies.

Scientists studying energy production and efficiency will also gain new
tools, enabling simulations that until now have been too complex to
run: modeling the turbulence and other factors that determine fuel
efficiency, for instance, or designing biofuels.

Before these new giants can be fully exploited, some big challenges
must be addressed. UC Berkeley computer science professor
Katherine Yelick is working with colleagues in the Parallelism Lab to
bring such CITRIS-type applications and the petascale hardware and
systems software together.

UCB computer science professor Katherine Yelick.

"We are trying to expose the best features of the underlying hardware
to the software," says Yelick. "The hardware designers are trying to
innovate and put in fast networks or networks with very interesting
connectivity patterns, and we want to take full advantage of that,"
she says.

Yelick has one foot in the world of system-level software and the other
in that of hardware development, which makes her particularly valuable
to the coordination effort. She and her team have developed new
compilers and programming languages (one based on C and another based
on Java) for the new petascale computers.

One big challenge is the problem of pacing and managing the information
flow through hundreds of thousands of processors. "It is like trying to
get a million people coordinated and doing their jobs at exactly the
same time," says Yelick.

Petascale machines not only have more chips than earlier-generation
supercomputers; each chip also carries more processors. Coordinating the
flow and sharing of so much activity requires new algorithms and new
approaches to applications programming as well, says Yelick.

This is a big problem because the work the computer is trying to do is
not equally distributed among all of its processors. In modeling
weather, for example, the Earth's surface can be divided into
equal-sized parts, each given a dedicated processor. But if a hailstorm
erupts somewhere, there will suddenly be a lot of significant activity
in the processors assigned to those parts of the model. If the rest of
the system has to wait for the processors working on the hailstorm, it
can lose a lot of time, says Yelick.
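
The broad fix is to hand out work dynamically rather than statically.
Below is a minimal sketch of that idea in C with POSIX threads; the
grid size, thread count, and "stormy cell" cost model are invented for
illustration only and do not reflect any actual weather code. Worker
threads claim cells from a shared counter, so a thread that finishes
quiet cells immediately moves on instead of idling while the hailstorm
cells grind away.

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    #define NUM_CELLS   1024   /* grid cells covering the model's surface */
    #define NUM_WORKERS 8      /* stand-in for the machine's processors   */

    static atomic_int next_cell;          /* shared counter: next unclaimed cell */
    static double     result[NUM_CELLS];  /* per-cell simulation output          */

    /* Invented per-cell workload: a few "stormy" cells cost far more
       than the rest, creating the load imbalance described above.      */
    static double simulate_cell(int cell)
    {
        long iterations = (cell % 97 == 0) ? 1000000L : 1000L;
        double x = 0.0;
        for (long i = 0; i < iterations; i++)
            x += (double)(cell + i) * 1e-9;
        return x;
    }

    /* Each worker repeatedly claims the next cell until none remain.
       Because assignment is dynamic, workers that draw quiet cells keep
       moving instead of waiting on the expensive "hailstorm" cells.     */
    static void *worker(void *arg)
    {
        (void)arg;
        for (;;) {
            int cell = atomic_fetch_add(&next_cell, 1);
            if (cell >= NUM_CELLS)
                break;
            result[cell] = simulate_cell(cell);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[NUM_WORKERS];

        atomic_init(&next_cell, 0);
        for (int i = 0; i < NUM_WORKERS; i++)
            pthread_create(&threads[i], NULL, worker, NULL);
        for (int i = 0; i < NUM_WORKERS; i++)
            pthread_join(threads[i], NULL);

        printf("cell 0 result: %f\n", result[0]);
        return 0;
    }

Real petascale schedulers are far more elaborate, but the principle is
the same: keep every processor busy no matter where the storm lands.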

In addition to such load imbalance issues, the team is working to
minimize the time it takes for information to travel around these
computers, some of which can be as big as a tennis court.

"Light travels pretty slowly," explains Demmel. "if processors on
opposite sides of the computer have to send huge amounts of information
back and forth, the time adds up fast."
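
A back-of-envelope calculation makes the point concrete. The short C
sketch below assumes a machine span of about 24 meters (roughly one
tennis court), signals moving at the speed of light in vacuum, and a
2 GHz processor clock; all three figures are illustrative assumptions,
not measurements of any particular system.

    #include <stdio.h>

    int main(void)
    {
        const double span_m    = 24.0;    /* assumed machine span, ~ one tennis court */
        const double light_mps = 3.0e8;   /* speed of light in vacuum, meters/second  */
        const double clock_hz  = 2.0e9;   /* assumed 2 GHz processor clock            */

        double one_way_s = span_m / light_mps;    /* ~80 nanoseconds one way          */
        double cycles    = one_way_s * clock_hz;  /* clock cycles spent just waiting  */

        printf("one-way delay: %.0f ns (~%.0f clock cycles per cross-machine hop)\n",
               one_way_s * 1e9, cycles);

        /* Even at light speed, one trip across the machine costs on the
           order of a hundred clock cycles, which is why minimizing
           long-distance communication matters so much.                   */
        return 0;
    }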

Racks of servers at UCSC.

While Yelick straddles the gap between systems-level programming and
hardware design, Demmel straddles the one between applied math and
applications-level programming. "People who can work across one or
more of those boundaries are very important in making these kinds of
projects hang together," says Yelick.

People like Yelick and Demmel are finding themselves thrust from the
rarefied theoretical atmosphere of the high-end research computer world
to what will soon be the center of a revolution in personal computing.
As personal computers are forced to embrace parallel processors, they
will face some of the same challenges as these high-end scientific
computers.

In addition to the NSF bid to design and host a new petascale computer
for LLNL, Demmel and Yelick are just now completing another proposal
for an
Intel- and Microsoft-funded center for studying parallel computing
applications for personal computers, games, hand-held devices and other
commercial products.

"Now that Moore's Law can no longer be met by making single chips
faster, everything is going to have to be parallel," says Demmel. "The
computer industry will hit a wall unless it figures out how to deal
with large-scale parallelism."