Superhighway for data-intensive science

The Cave, UC Merced

Imagine watching volcanic plumes erupting miles beneath the ocean’s surface, in real time. Or using machine learning techniques to detect changes in the starry sky night after night. Or zooming in to analyze and visualize our microbiome to improve understanding of cancer and other life-threatening diseases. Data-intensive fields, such as particle physics, astronomy, biomedical sciences, and earth sciences, increasingly involve multiple investigators and institutions working with instruments or supercomputing centers at remote sites. To accelerate discoveries in those fields, researchers require high-performance network connections to transfer large datasets reliably among collaborators or between data collection sites and the lab or desktop.

The Pacific Research Platform (PRP) addresses this increasingly urgent need for advanced cyberinfrastructure. It is designed to create a high-speed “freeway system” for large scientific datasets by connecting campus networks and supercomputing centers on a regional scale. The project has been funded since 2015 by a five-year, $5 million grant from the National Science Foundation (NSF) and is led by researchers at UC Berkeley and UC San Diego in partnership with more than 50 institutions. At Berkeley, the project has been led by CITRIS and the Banatao Institute.

Just as the interstate freeway system transformed the economy, the PRP’s data freeway system has the potential to transform scientific workflows. “Without this, scientists might ask their graduate students to take a flash drive or a disk from one side of campus to the other or from one campus in California to another – physically,” says Camille Crittenden, deputy director of CITRIS and one of the co-principal investigators (PIs) of the Pacific Research Platform. “With this high-speed network infrastructure, we are able to eliminate the need for this ‘sneaker-net’ and reduce the time necessary for transferring and analyzing the data from weeks to days to minutes.”

The PRP draws on the expertise and capacity of each campus to build a shared resource that benefits not only the entire UC system but also academic institutions outside California.

The PRP builds on the Science DMZ design pattern, a network architecture optimized for high-performance scientific applications that the Department of Energy’s ESnet developed between 2010 and 2013. The PRP connects Science DMZs into a regional Science DMZ over the fiber-optic infrastructure of CENIC (the Corporation for Education Network Initiatives in California), upgrading campus networks on a regional scale for data-intensive networking. Overall, the PRP is developing a prototype regional implementation of ESnet’s enhanced Science DMZ architecture, according to the PRP team at CITRIS.

The PRP has been working to help campus-to-campus and inter-campus shared networks achieve speeds of 10 to 100 Gbps (gigabits per second), often 10 to 100 times faster than the commodity internet. The PRP aims to move data 1,000 times faster by 2020.
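As a rough illustration of what those link speeds mean in practice, the Python sketch below estimates idealized transfer times for a single dataset at 1, 10, and 100 Gbps. The 1-terabyte dataset size, the function name, and the efficiency parameter are assumptions chosen for illustration and do not come from the PRP team; real transfers are usually slower because of protocol overhead, disk speed, and network congestion.

```python
def transfer_time_seconds(size_bytes: float, link_gbps: float,
                          efficiency: float = 1.0) -> float:
    """Idealized time to move size_bytes over a link of link_gbps gigabits/s.

    efficiency is a hypothetical fudge factor (0 < efficiency <= 1) for the
    fraction of the nominal link rate a transfer actually achieves.
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = size_bytes * 8                      # bytes -> bits
    bits_per_second = link_gbps * 1e9 * efficiency
    return bits / bits_per_second


ONE_TERABYTE = 1e12  # assumed example dataset size, in bytes (decimal TB)

for gbps in (1, 10, 100):
    seconds = transfer_time_seconds(ONE_TERABYTE, gbps)
    print(f"{gbps:>3} Gbps: {seconds / 60:.1f} minutes")
```

Under these assumptions, each tenfold increase in link speed cuts the transfer time by a factor of ten, which is the arithmetic behind moving a dataset in minutes rather than hours.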

UC Merced 3D visualization wall shows archeological sites all over the world.

The major goals of the PRP fall into two categories: technology development and implementation, and science engagement. Technology development focuses on building and managing the network itself and creating purpose-built data-transfer nodes, work that has been led largely by UC San Diego. Science engagement aims at recruiting participants, working with teams to determine data needs, sharing knowledge with the scientific community, and using the project as an opportunity for education and workforce development. CITRIS has focused on science engagement, connecting with faculty members who have problems that the PRP might be able to help solve.

The PRP has the potential to benefit not only research, but also teaching and learning. The HearstCAVE, an immersive visualization wall at the Phoebe A. Hearst Museum of Anthropology, offers one example. The HearstCAVE (Cave Automatic Virtual Environment) is part of a visualization network developed at UC Berkeley, UC Merced, UCLA, and UC San Diego to share and preserve visualizations of cultural heritage sites and materials at risk from warfare, terrorism, or erosion. The 8.5-by-8-foot 3D wall shows archeological sites all over the world, from Egypt to Australia, and lets visitors delve deeper into topics of interest by using an Xbox controller to explore panoramic scenes in 360 degrees.

The HearstCAVE is part of the “At-Risk Cultural Heritage and the Digital Humanities” project, funded in 2016 by the UC President’s Research Catalyst Awards, which fund multi-campus research in areas of strategic importance to UC that could benefit California and the world. The PRP has been playing a key role in connecting the CAVE kiosks among the four UC campuses through the high-speed network.

At Berkeley, a plan is underway to upgrade the network connection of the HearstCAVE, as well as the campus’s core infrastructure, from the current 1 Gbps to more than 10 Gbps. If implemented, the upgrade is expected to support tasks shared among the CAVE kiosks, such as remote file mounts for large panoramic images and 3D models, distributed photogrammetry workflows in real time, Zoom-based videoconferencing between CAVEs for graduate seminars, and multi-way VR sharing.

CITRIS is working with the Hearst Museum and Berkeley’s Research IT to install a CAVE kiosk in the CITRIS Tech Museum that will connect with the CAVE kiosks in the HearstCAVE and on other UC campuses through a high-speed, high-throughput network.

“This is part of our experiment related to the PRP,” says Chris Hoffman, associate director of Research IT, who oversees the installation and application of visualization technologies on campus. Possible uses include visualizations of digital archaeology projects, biomedical applications, climate or energy data, astronomy, and earth sciences.

“This is not just something to look at, but a platform to inspire new ways to think about research, teaching, and learning,” says Hoffman. “It’s really about helping people discover new ways to accomplish their goals.”

Interest in the PRP sparked a call to scale the project up, and the first National Research Platform (NRP) conference brought researchers together in 2017. The second NRP conference was held in August in Montana.

“We’re aiming to be a pilot and a proof-of-concept here on the West Coast, to see how we can replicate that with regional networks and partners across the country,” says Crittenden. “We want to eliminate distance as a barrier to collaboration for data-intensive fields and harness the power of new networking technology and high-performance computing to expand our understanding of the natural world and shared cultural heritage.”

Pictured at top: Pacific Research Platform connects visualization walls at UC campuses through a new high-speed network.

Photos by Matt Martin