The Fourth Paradigm = programs to manage and mine enormous data sets

In my work with scientists in mouse-related and ag-biotech-related research, a constant challenge is managing the data and information that are collected at a seemingly exponential rate. Our capacity to create this data (knowledge) far outpaces our ability to develop appropriate programs to manage it. As a result, we are data/information heavy (a good thing) but have no real capacity to optimize its sharing and use (a bad thing), even within the tighter (presumably more manageable) boundaries of a given project.


I came across an interesting article in the Harvard Business Review today entitled “The Big Idea: The Next Scientific Revolution” (http://hbr.org/2010/11/the-big-idea-the-next-scientific-revolution/ar/1). According to its author, Tony Hey, experts do have a good understanding of data, and they have the ability to see the often invisible links “between the columns”, finding non-obvious or latent connections within or between disciplines that can serve as catalysts for new and innovative possibilities. But we have almost reached a crucial point. Experts are now DROWNING in data. Information is streaming in at a dizzying rate, making it challenging to organize, analyze and store. The late Jim Gray (American computer scientist and recipient of the Turing Award in 1998) proposed what he called “the fourth paradigm” for scientific exploration.

“[Gray’s] vision of powerful new tools to analyze, visualize, mine, and manipulate scientific data may represent the only systematic hope we have for solving some of our thorniest global challenges” writes Hey. “The fourth paradigm*… involves powerful computers. But instead of developing programs based on known rules, scientists begin with the data. They direct programs to mine enormous databases looking for relationships and correlations, in essence using the programs to discover the rules. We consider big data part of the solution, not the problem. The fourth paradigm isn’t trying to replace scientists or the other three methodologies, but it does require a different set of skills. Without the ability to harness sophisticated computer tools that manipulate data, even the most highly trained expert would never manage to unearth the insights that are now starting to come into focus.”

This ties in nicely with Ostrom’s work on the “commons”. I have a few blog entries on her and the IAD Framework in “Consider Icarus…” (search term = Ostrom).

_ _ _ _ _

*The first two paradigms are experiment and theory; computation/simulation is the third.