To help prepare for O’Reilly’s upcoming Strata Conference, we set out to talk with
some of the leading innovators working with big data and analytics.
We start with a brief conversation with Stefan Groschupf, the CEO of Datameer,
whose Datameer Analytics Solution sits on top of Hadoop. Groschupf, who
has been working with Hadoop for seven years, says he recognized it as
an “amazing” and “very powerful” technology, but one that wasn’t reaching the
people it needed to.
“I saw the power of Hadoop, but I recognized there was a
gap between the data and the people that want to get close to it,” he said. “We’re trying to close this gap.”
Datameer’s solution is one of the commercial tools aiming to fill that gap. Until
recently, tools for crunching big data sets — volumes that far exceed the
capability of something like Excel — have been mostly proprietary and not widely
available in the commercial and open spaces. That’s changing slowly, with
open-source tools and commercial efforts like Datameer’s and IBM’s BigSheets.
As Groschupf explains, there are at least three important steps to working with
big data — or indeed, any data set:
- Getting the data into your computation system (a huge part of the challenge).
- Defining the analytics you want to do.
- Visualizing the data to make it accessible.
Practical application of data will be discussed at the upcoming Strata Conference (Feb. 1-3, 2011). Save 30% on registration with the code STR11RAD.
Groschupf’s products try to make these steps as user-friendly as possible, giving
a wide group of business analysts access to the data.
“What would be great would be giving those insights to the people who need it to
drive product decisions and processes in companies,” Groschupf said, adding
that Datameer aims to avoid the game of telephone that inevitably plays out
between a programmer, data warehouse specialist, analytics person, and
analyst. “Empowering users who need the information is the future.”
You’ll find the full interview in the following video: