Six disruptive possibilities from big data

Specific ways big data will inundate vendors and customers.

My new book, Disruptive Possibilities: How Big Data Changes Everything, is derived directly from my experience as a performance and platform architect in the old enterprise world and the new, Internet-scale world.

I pre-date the Hadoop crew at Yahoo!, but I intimately understood the grid engineering that made Hadoop possible. For years, the working title of this book was The Art and Craft of Platform Engineering. When I started working on Hadoop after a stint in the Red Hat kernel group, many of the ideas that had been jammed into my head, going back to my experience with early supercomputers, all seemed to make perfect sense for Hadoop. This is why I frequently refer to big data as “commercial supercomputing.”

In Disruptive Possibilities, I discuss the implications of the big data ecosystem over the next few years. These implications will inundate vendors and customers in a number of ways, including:

A startup takes on “the paper problem” with crowdsourcing and machine learning

With a new mobile app and API, Captricity wants to build a better bridge between analog and digital.

Unlocking data from paper forms is the problem that optical character recognition (OCR) software is supposed to solve. Two issues persist, however. First, the hardware and software involved are expensive, creating challenges for cash-strapped nonprofits and government. Second, all of the information on a given document is scanned into a system, including sensitive details like Social Security numbers and other personally identifiable information. This is a particularly difficult issue with respect to health care or bringing open government to courts: privacy by obscurity will no longer apply.

The process of converting paper forms into structured data still hasn’t been significantly disrupted by the rapid growth of the Internet, distributed computing and mobile devices. Fields that range from research science to medicine to law to education to consumer finance to government all need better, cheaper bridges from the analog to the digital sphere.

Enter Captricity. The startup, which was co-founded by Jeff J. Lin and Kuang Chen, has its roots in the fieldwork on rural health Chen did as part of his PhD program.

“I was looking at the information systems that were available to these low-resource organizations,” Chen said in a recent phone interview. “I saw that they’re very much bound in paper. There’s actually a lot of efforts to modernize the infrastructure and put in mobile phones. Now that there’s mobile connectivity, you can run a health clinic on solar panels and long distance Wi-Fi. At the end of the day, however, business processes are still on paper because they had to be essentially fail-proof. Technology fails all the time. From that perspective, paper is going to stick around for a very long time. If we’re really going to tackle the challenge of the availability of data, we shouldn’t necessarily be trying to change the technology infrastructure first — bringing mobile phones and iPads to where there’s paper — but really to start with solving the paper problem.”

When Chen saw that data entry was a chokepoint for digitizing health indicators, he started working on developing a better, cheaper way to ingest data on forms.

When data disrupts health care

The convergence of data, privacy and cost has created a unique opportunity to reshape health care.

Health care appears immune to disruption. It’s a space where the stakes are high, the incumbents are entrenched, and lessons from other industries don’t always apply.

Yet, in a recent conversation between Tim O’Reilly and Roger Magoulas, it became evident that we’re approaching an unparalleled opportunity for health care change. O’Reilly and Magoulas explained how the convergence of data access, changing perspectives on privacy, and the enormous expense of care are pushing the health space toward disruption.

As always, the primary catalyst is money. The United States is facing what Magoulas called an “existential crisis in health care costs” [discussed at the 3:43 mark]. Everyone can see that the current model is unsustainable. It simply doesn’t scale. And that means we’ve arrived at a place where party lines are irrelevant and tough solutions are the only options.

“Who is it that said change happens when the pain of not changing is greater than the pain of changing?” O’Reilly asked. “We’re now reaching that point.” [3:55]

(Note: The source of that quote is hard to pin down, but the sentiment certainly applies.)

This willingness to change is shifting perspectives on health data. Some patients are making their personal data available so they and others can benefit. Magoulas noted that even health companies, which have long guarded their data, are warming to collaboration.

At the same time there’s a growing understanding that health data must be contextualized. Simply having genomic information and patient histories isn’t good enough. True insight — the kind that can improve quality of life — is only possible when datasets are combined.



