UCSC computer scientists develop solutions for long-term storage of digital data

By Tim Stephens (831) 459-2495; stephens@ucsc.edu

 

Although the digital age is well under way, one crucial detail remains to be
worked out: how to store vast amounts of digital information in a way that
allows future generations to recover it.

[Photo caption: The team that developed Pergamum includes graduate students Kevin Greenan and Mark Storer and associate professor of computer science Ethan Miller.]

 

"The problem is how to build a large-scale data storage system to last 50 to
100 years," said Ethan Miller, associate professor of computer science in the
Baskin School of Engineering at the University of California, Santa Cruz.

 

Tape libraries are widely used for data storage, but digital tape has many
shortcomings as an archival medium. Miller's group has come up with a new
approach, called Pergamum, which uses hard disk drives to provide
energy-efficient, cost-effective storage. The declining cost of hard drives has
made them more competitive with tape, and they offer numerous advantages for
searching and retrieving data. "It's like the difference between a VCR and
TiVo," Miller said.

 

Pergamum, named after the ancient Greek library that made the transition from
fragile papyrus to more durable parchment, is a distributed network of
intelligent, disk-based storage devices. The team that developed it includes
UCSC graduate students Mark Storer and Kevin Greenan, along with researcher
Kaladhar Voruganti of NetApp (formerly Network Appliance), a company that
focuses on storage and data management solutions.

 

Archival storage is a big issue for businesses, partly due to legal
requirements for the preservation of financial and business records, and also
because data mining strategies can turn stored data into a valuable resource.
Long-term storage is also a growing issue for individuals who are filling their
personal computers with digital photos, movies, and documents.

 

"There is a risk that an entire generation's cultural history could be lost
if people aren't able to retrieve that data," Storer said. "Everyone is
switching to digital cameras, but we've never demonstrated that digital data can
be reliably preserved for a long time."

 

The researchers designed the system to provide reliable, energy-efficient
data storage using off-the-shelf components. It also has the ability to evolve
over time as storage technologies change. "You want to avoid 'forklift
upgrades,' where you have to get rid of the old system and transfer all your
data to a whole new system," Miller said.

 

According to Storer, businesses are beginning to recognize that archival
storage is very different from simply backing up their data. "A backup is a
safety net: you hope you won't need it. Archival data you do want to use; it's a
valuable resource and you want to be able to mine it for information," he said.

 

Tapes work well for backups, in which data are written once, rarely read, and
not kept indefinitely. But archival data should be easy to read, query, browse,
and search, and tape has inherent weaknesses in these areas. Existing disk-based
systems offer excellent performance, but rely on power-hungry central
controllers.

 

Pergamum is one of several related projects being developed by researchers in
the Storage Systems Research Center (SSRC) at UCSC's Baskin School of
Engineering.

 

Read the full news release at http://www.ucsc.edu/news_events/text.asp?pid=2130