We asked the Startup Showcase judges three questions about the big data industry.
The Startup Showcase returns to Strata this month, with 10 startup finalists pitching our panel of judges. We’ve assembled an enviable— and somewhat intimidating— lineup of experts to help narrow down the field.
In the interest of giving our finalists a head start, we asked the judges three questions about the big data industry.
Design compels. Math is proof. Both sides will defend their domains at Strata's next Great Debate.
At Strata Santa Clara later this month, we’re reprising what has become a tradition: Great Debates. These Oxford-style debates pit two teams against one another to argue a hot topic in the fields of big data, ubiquitous computing, and emerging interfaces.
Part of the fun is the scoring: attendees vote on whether they agree with the proposal before the debaters; and after both sides have said their piece, the audience votes again. Whoever moves the needle wins.
This year’s proposition — that design matters more than math — is sure to inspire some vigorous discussion. The argument for math is pretty strong. Math is proof. Given enough data — and today, we have plenty — we can know. “The right information in the right place just changes your life,” said Stewart Brand. Properly harnessed, the power of data analysis and modeling can fix cities, predict epidemics, and revitalize education. Abused, it can invade our lives, undermine economies, and steal elections. Surely the algorithms of big data matter!
But your life won’t change by itself. Bruce Mau defines design as “the human capacity to plan and produce desired outcomes.” Math informs; design compels. Without design, math can’t do its thing. Poorly designed experiments collect the wrong data. And if the data can’t be understood and acted upon, it may as well not have been crunched in the first place.
This is the question we’ll be putting to our debaters: Which matters more? A well-designed collection of flawed information — or an opaque, hard-to-parse, but unerringly accurate model? From mobile handsets to social policy, we need both good math and good design. Which is more critical? Read more…
The cycle of good, bad, and stable has happened at every layer of the stack. It will happen with big data, too.
First, technology is good. Then it gets bad. Then it gets stable.
This has been going on for a long time, likely since the invention of fire, knives, or the printed word. But I want to focus specifically on computing technology. The human race is busy colonizing a second online world and sticking prosthetic brains — today, we call them smartphones — in front of our eyes and ears. And stacks of technology on which they rely are vulnerable.
When we first created automatic phone switches, hackers quickly learned how to blow a Cap’n Crunch whistle to get free calls from pay phones. When consumers got modems, attackers soon figured out how to rapidly redial to get more than their fair share of time on a BBS, or to program scripts that could brute-force their way into others’ accounts. Eventually, we got better passwords and we fixed the pay phones and switches.
We moved up the networking stack, above the physical and link layers. We tasted TCP/IP, and found it good. Millions of us installed Trumpet Winsock on consumer machines. We were idealists rushing onto the wild open web and proclaiming it a new utopia. Then, because of the way the TCP handshake worked, hackers figured out how to DDOS people with things like SYN attacks. Escalation, and router hardening, ensued.
We built HTTP, and SQL, and more. At first, they were open, innocent, and helped us make huge advances in programming. Then attackers found ways to exploit their weaknesses with cross-site scripting and buffer overruns. They hacked armies of machines to do their bidding, flooding target networks and taking sites offline. Technologies like MP3s gave us an explosion in music, new business models, and abundant crowd-sourced audiobooks — even as they leveled a music industry with fresh forms of piracy for which we hadn’t even invented laws. Read more…
The biggest threat that a data-driven world presents is an ethical one.
Since the first of our ancestors chipped stone into weapon, technology has divided us. Seldom more than today, however: a connected, always-on society promises health, wisdom, and efficiency even as it threatens an end to privacy and the rise of prejudice masked as science.
On its surface, a data-driven society is more transparent, and makes better uses of its resources. By connecting human knowledge, and mining it for insights, we can pinpoint problems before they become disasters, warding off disease and shining the harsh light of data on injustice and corruption. Data is making cities smarter, watering the grass roots, and improving the way we teach.
But for every accolade, there’s a cautionary tale. It’s easy to forget that data is merely a tool, and in the wrong hands, that tool can do powerful wrong. Data erodes our privacy. It predicts us, often with unerring accuracy — and treating those predictions as fact is a new, insidious form of prejudice. And it can collect the chaff of our digital lives, harvesting a picture of us we may not want others to know.
The big data movement isn’t just about knowing more things. It’s about a fundamental shift from scarcity to abundance. Most markets are defined by scarcity — the price of diamonds, or oil, or music. But when things become so cheap they’re nearly free, a funny thing happens.
Consider the advent of steam power. Economist Stanley Jevons, in what’s known as Jevons’ Paradox, observed that as the efficiency of steam engines increased, coal consumption went up. That’s not what was supposed to happen. Jevons realized that abundance creates new ways of using something. As steam became cheap, we found new ways of using it, which created demand.
The same thing is happening with data. A report that took a month to run is now just a few taps on a tablet. An unthinkably complex analysis of competitors is now a Google search. And the global distribution of multimedia content that once required a broadcast license is now an upload. Read more…
A compelling crop of companies will present at the Strata Conference + Hadoop World Startup Showcase.
We had a wide range of startups apply for a slot in the Strata Conference + Hadoop World Startup Showcase. Our selection committee, which included investors, entrepreneurs, and executives from SAP — which is sponsoring the event — whittled these down to just a few, who will get a chance to strut their stuff in the Big Apple next week.
All sorts of early-stage firms applied, both those using data as a key differentiator, and those building the next-generation infrastructures that can handle the torrent of information our world produces. We also had applicants who visualize, communicate, and democratize, turning complex, chewy data into bite-sized, interactive nuggets that are easier to digest.
It’s a compelling crop of new entrants into today’s vibrant big data ecosystem, and we’re thrilled to welcome them to next week’s event, where Tim O’Reilly and Fred Wilson face the unenviable task of choosing the top three.
Startup Showcase finalists
Further reading and discussion on the civil rights implications of big data.
A few weeks ago, I wrote a post about big data and civil rights, which seems to have hit a nerve. It was posted on Solve for Interesting and here on Radar, and then folks like Boing Boing picked it up.
I haven’t had this kind of response to a post before (well, I’ve had responses, such as the comments to this piece for GigaOm five years ago, but they haven’t been nearly as thoughtful).
Some of the best posts have really added to the conversation. Here’s a list of those I suggest for further reading and discussion:
Nobody notices offers they don’t get
On Oxford’s Practical Ethics blog, Anders Sandberg argues that transparency and reciprocal knowledge about how data is being used will be essential. Anders captured the core of my concerns in a single paragraph, saying what I wanted to far better than I could:
… nobody notices offers they do not get. And if these absent opportunities start following certain social patterns (for example not offering them to certain races, genders or sexual preferences) they can have a deep civil rights effect
To me, this is a key issue, and it responds eloquently to some of the comments on the original post. Harry Chamberlain commented:
However, what would you say to the criticism that you are seeing lions in the darkness? In other words, the risk of abuse certainly exists, but until we see a clear case of big data enabling and fueling discrimination, how do we know there is a real threat worth fighting?
Looking ahead at big data's role in enterprise business intelligence, civil engineering, and customer relationship optimization.
- Everything is on the Internet.
- The Internet has a lot of data.
- Therefore, everything is big data.
When you have a hammer, everything looks like a nail. When you have a Hadoop deployment, everything looks like big data. And if you’re trying to cloak your company in the mantle of a burgeoning industry, big data will do just fine. But seeing big data everywhere is a sure way to hasten the inevitable fall from the peak of high expectations to the trough of disillusionment.
We saw this with cloud computing. From early idealists saying everything would live in a magical, limitless, free data center to today’s pragmatism about virtualization and infrastructure, we soon took off our rose-colored glasses and put on welding goggles so we could actually build stuff.
So where will big data go to grow up?
Once we get over ourselves and start rolling up our sleeves, I think big data will fall into three major buckets: Enterprise BI, Civil Engineering, and Customer Relationship Optimization. This is where we’ll see most IT spending, most government oversight, and most early adoption in the next few years. Read more…
A glimpse into enterprise use of big data.
Feedback from a recent Strata Online Conference suggests there's a large demand for clear information on what big data is and how it will change business.
Companies that employ data feedback loops are poised to dominate their industries.
We're moving beyond an information economy. The efficiencies and optimizations that come from constant and iterative feedback will soon become the norm for businesses and governments.