Tarek Zohdi wants students to know what to do with big data. It is a valuable and abundant resource that’s swirling around all kinds of academic disciplines. In some departments, like engineering and astrophysics, it is put to good use. In others, though, it is spilling unused into the hallways and out the windows. Zohdi wants to teach students from all over campus how to put it to work and how to recognize what it can and cannot do for them.
Chris Holdgraf, for example, is about to enter the third year of his PhD program at Berkeley. His field, neuroscience, is undergoing a revolution of sorts. Traditionally, it has emphasized the observable, anatomical, biological, and physiological aspects of the brain. Even today, Holdgraf could earn his PhD studying nothing but these approaches. But powerful novel tools are opening new windows into the human brain by analyzing complex data sets captured from the billions of neurons that orchestrate human behavior. The ability to analyze the gigabytes of data coming out of the brain of an epilepsy patient with electrocorticography (ECoG) implants, for example, is far beyond the ability of any human observer alone. “It’s just much, much too much data,” says Holdgraf, who works in the lab of neuroscientist Robert Knight.
But through the application of mathematical and computational analytics, all of those data can reveal deep secrets about what’s going on in the human brain. Holdgraf and his colleagues, for example, are launching an effort to try to sort out how areas in the brain zero in on environmental features that deviate from the brain’s predictive models. These are the things—a big cat moving in the bushes, say, or a car driving down the wrong side of the street—that our brains must quickly shine the bright spotlight of attention on. But tracking the hundreds of millions of neuronal firings and interconnections that manifest these shifts of attention requires sophisticated analytic tools that seek patterns of meaning in oceans of noisy data.
Neuroscientists still need to understand physiology, of course, says Holdgraf. But they need to understand the brain’s math as well as its meat. And computer programs, particularly those that can crunch huge data sets and generate meaningful models, hold the key to decoding the brain’s math.
Zohdi, a professor in Berkeley’s Department of Mechanical Engineering, also chairs the Designated Emphasis in Computational Science and Engineering, administered by CITRIS. The DECSE is “essentially a graduate minor designed to give PhD students fluency in modeling, simulation, and data analysis tools,” says Zohdi. The program has been adopted by approximately 120 faculty from over 20 departments and graduate programs as diverse as computer science, mathematics, chemistry, mechanical engineering, astronomy, neuroscience, and political science. The program currently has 40 students enrolled and has graduated five.
Upon successful completion of a departmental graduate program, along with the requirements of the designated emphasis, a student’s transcript and diploma will include the DECSE certification. If all goes as planned, for example, Holdgraf will earn a PhD in Neuroscience with a Designated Emphasis in Computational Science and Engineering. In an academic world increasingly reliant on computational methods at its cutting edge, the designation will add value to a degree, says Zohdi, especially in a field in flux, like neuroscience, where employers can’t yet presume that even freshly minted PhDs know their way around the computational world.
When he entered his graduate program, Holdgraf felt unprepared to exploit the opportunities presented by the huge amount of data now available. His undergraduate degree was in neuroscience and psychology; now he wishes it had been in physics so he’d have less catching up to do. He also found a department divided about the value of crunching data with computer models.
“A lot of old-school electro-physiologists or cell biologists would say, ‘I don’t really believe a fact unless it’s a completely observable thing. Either a neuron is firing or it is not, and you can see that under a microscope.’ But if you’re recording with fMRI then you’re getting a data point every couple of seconds from 20,000-plus different sources all at the same time; you’re not going to be able to figure out what’s going on by observation, by yourself. That’s where these more complicated computational techniques can be really useful. They can quickly churn through data and find patterns and structures that exist that would take an individual person a whole lifetime to make sense of.”
“Berkeley’s DECSE program is meant to build on the student’s graduate expertise in a particular field and augment it with a focused addition on computational science,” says Zohdi, who earned his PhD in Computational and Applied Mathematics at the University of Texas at Austin, where much of the early work on computational modeling programs occurred. Much of that early work was done by oil companies using geological data to explore for underground oil deposits. “For them it was big bucks,” says Zohdi.
Some fields adopted computation much earlier than others, says Zohdi. “We crossed that Rubicon 20 years ago in engineering. And other fields are making that same realization, it’s just taking some a little bit longer.”
Social sciences, such as history and anthropology, have been particularly reluctant to embrace computational methods, says Zohdi. But things like online social media are offering troves of valuable research data that require higher-level mathematical skills to analyze.
“Social dynamics—predicting human behavior, as an advertiser or as a cyber security analyst, say—require a deep understanding of computation as a discipline. But most students cannot get an entire degree in computational science. Second best is to introduce as many students as possible to computing as a tool. We figure the DECSE is the proper mechanism to do that,” says Zohdi.
“New commercial and open-source packages for simulation and computational analysis are handed down now from student to student to attack complex problems with sophisticated algorithms,” says Zohdi. On the one hand, “this allows students to use high-performance computing early in graduate school and to become quickly involved in cutting-edge research using computational tools.” On the other hand, “all too often students don’t fully understand or appreciate the limitations of these codes, which can lead to their misuse where they’re inapplicable,” says Zohdi.
The expertise the DECSE students gain goes both ways, says Zohdi. “The computational tools have a lot to give, but they don’t do everything. We’re partly trying to teach students to view these tools skeptically, to know when they can be successfully applied and when they can’t.”