NSF Supports UC Berkeley in Taming the Data Deluge in Astronomy

A multidisciplinary group from UC Berkeley has been awarded a $1,573,550 grant from the National Science Foundation (NSF) to tackle new challenges confronting astronomers who study the dynamic and variable sky. The three-year grant is part of a new Cyber-Enabled Discovery and Innovation (CDI) initiative designed "to create revolutionary science and engineering research outcomes made possible by innovations and advances in computational thinking."

The crux of the challenge is one faced by many disciplines where huge streams of data must be quickly assimilated and acted upon. In astronomy, the Berkeley team sees a particularly important transition occurring. As Principal Investigator Associate Professor Joshua S. Bloom (Astronomy Department) explains: "There simply are not enough telescope resources in the world to chase after everything that goes bump in the night. New algorithms and computational frameworks must be put in place to abstract away the traditional scientist role and their proximity to raw data."

On the horizon is the NSF-sponsored Large Synoptic Survey Telescope (LSST), which will be at the forefront of the effort to collect more astronomy data in the next decade than in all of human history. The "movies" of the changing sky made by LSST will reveal millions of supernova explosions and hundreds of millions of stars that change, some subtly and some dramatically. The haystack is simply getting too large for even an army of needle-hunting humans to find the interesting scientific objects as quickly as they need to be found.

Getting traction on the problem will necessarily draw from research in several arenas outside the traditional purview of astronomers. "The need to understand and react to what is happening in the sky right now with incomplete information presents a particularly rich set of problems in statistics and machine learning. This whole endeavor presents real-world challenges to an otherwise abstract development of algorithms and theory," says Assistant Professor and co-Principal Investigator Noureddine El Karoui (Statistics Department).  

One important complication to the work at hand is that whatever is built to handle today's onslaught of data must be able to scale easily to tomorrow's even larger deluge. "So creating algorithms that scale in a parallelized computing environment is particularly important," says co-Principal Investigator Associate Professor Martin Wainwright (Electrical Engineering and Computer Sciences, and Statistics). To this end, the team, in collaboration with the computational science and engineering program at CITRIS, has been granted time to test parallelized algorithms on the cloud-computing environments of Yahoo!, Amazon and Google.

Other senior personnel collaborating on the NSF project include Masoud Nikravesh (CITRIS and LBNL), John Rice (Statistics Department), Peter Nugent (LBNL), and Horst Simon (LBNL).

For more information:

Josh Bloom
Associate Professor of Astronomy
601 Campbell Hall
University of California, Berkeley
Berkeley, CA 94720
Office: (510) 643-3839

Masoud Nikravesh
CITRIS Director for Computational Science and Engineering
Office: (510) 643-4522
356D Sutardja Dai Hall #1764
Berkeley, CA 94720-1764