yvette@citris-uc.org

Yvette Subramanian

540 Cory Hall, UC Berkeley

Please join us for a talk by Deb Agarwal and Catharine van Ingen in 540 Cory Hall, UC Berkeley.

 


"Early Experience Prototyping a Science Data Service for Environmental Data"
Deb Agarwal, Berkeley Water Center
and
Catharine van Ingen, Microsoft


12:00 p.m. on Wednesday, September 20 in 540 Cory Hall, UC Berkeley. Part of the CITRIS Research Exchange at UC Berkeley. The complete schedule for the fall semester is online at RE-fall2006.

This talk is now archived online at mms://MEDIA.citris.berkeley.edu/Agarwal_and_Ingen_9_20_06.

 

Abstract:
Recognition of the importance of data access as a necessary prerequisite to scientific analysis has sparked development of data archives incorporating data from a variety of sources. This trend has dramatically improved the availability of data and the completeness of data sets in many scientific disciplines. These data, when combined with locally collected field observations, including sensor data and model results, have the potential to enable new scientific analyses. At the same time, there is an increasing desire to do science at scales larger than a single site or watershed and over times measured in years rather than seasons.

At the Berkeley Water Center, we are using data from the Oak Ridge National Laboratory AmeriFlux carbon flux measurement towers to develop and prototype a new server that a collaborating group can use to jointly analyze data across sites. Working with data and metadata from the AmeriFlux data repository, we are developing a scientific data server. This prototype server provides a framework for easy data download, quality checking, cleaning, and storage. The server also includes scientifically important metadata, such as site biome or climate, along with the actual data. The prototype is designed to allow data from other related data sets to be included as needed.

Our goal is to facilitate scientific investigations and enable serendipitous science: a carbon researcher should be able to mine the data very simply to explore temporal or spatial correlations between measurements and across sites. We expect to integrate some of the routine data processing steps and calculations that are often done repeatedly and manually by each investigator using the same data set. We also expect to connect the results to visualization tools that are already commonly used by this community. Our intent is to reduce the barrier these scientists currently face when analyzing AmeriFlux data, without forcing them to abandon familiar desktop analysis tools.
This work is a joint effort by the Berkeley Water Center and Microsoft Research.

Speaker Bio: Deb Agarwal
Deb Agarwal is a researcher with the Berkeley Water Center and is Distributed Systems Department Head at the Lawrence Berkeley National Laboratory, where she has worked since 1994. Her current projects involve research, development, and deployment of computing technologies to support collaborative scientific research in a variety of domains, including providing appropriate controls for securing and sharing access to information and computational resources. Dr. Agarwal holds a Ph.D. in Electrical and Computer Engineering from UC Santa Barbara and a B.S. in Mechanical Engineering from Purdue. Further details are available at http://dsd.lbl.gov/~deba/.

Speaker Bio: Catharine van Ingen
Catharine van Ingen is an architect in the Microsoft Research Silicon Valley E-Science group. Her research focuses on applying commercial data management technologies to help cooperating scientists gain new insights in environmental science. She has been with Microsoft since 1997. Dr. van Ingen holds a Ph.D. in Civil and Environmental Engineering from the California Institute of Technology, an M.S. in Civil Engineering from UC Berkeley, and a B.S. in Civil and Environmental Engineering from UC Irvine. Her home page is http://research.microsoft.com/~vaningen/.