Arijit Sengupta of BeyondCore uncovers hidden relationships in public health data
The importance of visualizing data is universally recognized. Usually, though, the data is passive input to some visualization tool, and users have to specify the precise graph they want to see. BeyondCore simplifies this process by automatically evaluating millions of variable combinations to determine which graphs are the most interesting, and then highlights these to users. In essence, BeyondCore automatically tells us the right questions to ask of our data.
In this video, Arijit Sengupta, CEO of BeyondCore, describes how public health data can be analyzed in real time to discover anomalies and other intriguing relationships, making them readily accessible even to viewers without a statistical background. Arijit will be speaking at Strata Rx 2013 with Tim Darling of Objective Health, a McKinsey Solution for Healthcare Providers, on the topic of this post.
Donald Berwick discusses health care improvement: goals, exemplary organizations, and being at a turning point
A video interview with entrepreneur Colin Hill
Last week, a wide-ranging interview on data in health care took place between Dr. Donald Berwick and Colin Hill of GNS Healthcare. Dr. Berwick and Hill got together in the Cambridge, Mass. office of the Institute for Healthcare Improvement, a health care reform organization founded by Dr. Berwick, to discuss data issues related to O’Reilly’s upcoming Strata Rx conference.
Berwick returned to IHI after his year as administrator of the Centers for Medicare & Medicaid Services. Throughout these changes he has maintained his stalwart advocacy for better patient care, a campaign that has always been based on a society’s and a profession’s moral responsibility. Even an IHI course for the “Patient Safety Executive” program puts “Building a just culture” on its agenda.
Among the topics Berwick and Hill look at in these videos are the importance of transparency or “turning on the lights,” ways of learning from the health provider system itself as well as from clinical trials, types of personalized medicine, the impediments to collecting useful data that can improve care, exemplary organizations that deliver better healthcare, and how long change will take.
The full video appears below.
Report from OpenClinica conference
Although open source has not conquered the lucrative market for electronic health records (EHRs) used by hospital systems and increasingly by doctors, it is making strides in many other important areas of health care. One example is clinical research, as evidenced by OpenClinica in the field of Electronic Data Capture (EDC) and LabKey for data integration. Last week I attended a conference for people who use OpenClinica in their research or want to make their software work with it.
At any one time, hundreds of thousands of clinical trials are going on around the world, many listed on an FDA site. Many are low-budget and would be reduced to using Excel spreadsheets to store data if they didn’t have the Community edition of OpenClinica. Like most companies with open-source products, OpenClinica uses the “open core” model of an open Community edition and proprietary enhancements in an Enterprise edition. There are about 1200 OpenClinica installations around the world, although estimation is always hard to do with open source projects.
What is Electronic Data Capture? As the technologically archaic name indicates, the concept goes back to the 1970s and refers simply to the storage of data about patients and their clinical trials in a database. It has traditionally been useful for reporting results to funders, audit trails, printing in various formats, and similar tasks in data tracking.
Report from 2013 Health Privacy Summit
The timing was superb for last week’s Health Privacy Summit, held on June 5 and 6 in Washington, DC. First, it immediately followed the 2000-strong Health Data Forum (Health Datapalooza), where concern for patients’ rights came up repeatedly. Second, scandals about US government spying were breaking out and providing a good backdrop for talking about protecting our most sensitive personal information–our health data.
The health privacy summit, now in its third year, provides a crucial spotlight on the worries patients and their doctors have about their data. Did you know that two out of three doctors (and probably more–this statistic cites just the ones who admit to it on a survey) have left data out of a patient’s record upon the patient’s request? I have found that the summit reveals the most sophisticated and realistic assessment of data protection in health care available, which is why I look forward to it each year. (I’m also on the planning committee for the summit.) For instance, it took a harder look than most observers at how health care would be affected by patient access to data, and the practice of sharing selected subsets of data, called segmentation.
What effect would patient access have?
An odd perceptual discontinuity exists around patient access to health records. If you go to your doctor and ask to see your records, chances are you will be turned down outright or forced to go through expensive and frustrating magical passes. One wouldn’t know that HIPAA explicitly required doctors long ago to give patients their data, or that the most recent meaningful use rules from the Department of Health and Human Services require doctors to let patients view, download, and transmit their information within four business days of its addition to the record.
We need to provide data to patients in a form they can understand
Would you take a morning off from work to discuss health care costs and consumer empowerment in health care? Over a hundred people in the Boston area did so on Monday, May 6, for the conference “Empowering Healthcare Consumers: A Community Conversation Conference” at the Suffolk Law School. This fast-paced and wide-ranging conference lasted just long enough to show that hopes of empowering patients and cutting health care costs (which is the real agenda of most of the conference organizers) run up against formidable hurdles–many involving the provision of data to these consumers.
Review of Mayer-Schönberger and Cukier's Big Data
Measuring a world-shaking trend with feet planted in every area of human endeavor cannot be achieved in a popular book of 200 pages, but one has to start somewhere. I am happy to recommend the adept efforts of Viktor Mayer-Schönberger and Kenneth Cukier as a starting point. Their recent book Big Data: A Revolution That Will Transform How We Live, Work, and Think (recently featured in a video interview on the O’Reilly Strata site) does not quite unravel the mystery of the zeal for recording and measurement that is taking over governments and business, but it does what a good popularization should: alert us to what’s happening, provide some frameworks for talking about it, and provide a launchpad for us to debate the movement’s good and evil.
Because readers of this blog have been grappling with these concerns for some time, I’ll provide the barest summary of topics covered in Mayer-Schönberger and Cukier’s extensive overview, then provide some complementary ideas of my own.
Fit2Cure taps the public's visual skills to match compounds to targets
In the inspiring tradition of Foldit, the game for determining protein shapes, Fit2Cure crowdsources the problem of finding drugs that can cure the many under-researched diseases of developing countries. Fit2Cure appeals to the player’s visual–even physical–sense of the world, and requires much less background knowledge than Foldit.
There are about 7,000 rare diseases, fewer than 5% of which have cures. The number of people currently engaged in making drug discoveries is by no means adequate to study all these diseases. A recent gift to Harvard shows the importance that medical researchers attach to filling the gap. As an alternative approach, abstracting the drug discovery process into a game could empower thousands, if not millions, of people to contribute to this process and make discoveries in diseases that get little attention from scientists or pharmaceutical companies.
The biological concept behind Fit2Cure is that medicines have specific shapes that fit into the proteins of the victim’s biological structures like jig-saw puzzle pieces (but more rounded). Many cures require finding a drug that has the same jig-saw shape and can fit into the target protein molecule, thus preventing it from functioning normally.
How the field of genetics is using data within research and to evaluate researchers
Editor’s note: Earlier this week, Part 1 of this article described Sage Bionetworks, a recent Congress they held, and their way of promoting data sharing through a challenge.
Data sharing is not an unfamiliar practice in genetics. Plenty of cell lines and other data stores are publicly available from such places as the TCGA data set from the National Cancer Institute, Gene Expression Omnibus (GEO), and ArrayExpress (all of which can be accessed through Synapse). So to some extent the current revolution in sharing lies not in the data itself but in critical related areas.
First, many of the data sets are weakened by metadata problems. A Sage programmer told me that the famous TCGA set is enormous but poorly curated. For instance, different data sets in TCGA may refer to the same drug by different names, generic versus brand name. Provenance–a clear description of how the data was collected and prepared for use–is also weak in TCGA.
In contrast, GEO records tend to contain good provenance information (see an example), but only as free-form text, which presents the same barriers to searching and aggregation as free-form text in medical records. Synapse is developing a structured format for presenting provenance based on the W3C’s PROV standard. One researcher told me this was the most promising contribution of Synapse toward the shared use of genetic information.
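To make the drug-naming problem mentioned above concrete, here is a minimal sketch of how records that use brand and generic names for the same drug might be grouped together. It is only an illustration: the synonym table, the record layout, and the function names are assumptions made for this example, not anything taken from TCGA, GEO, or Synapse.

```python
from collections import defaultdict

# Hypothetical synonym table mapping brand names to generic names.
# A real curation effort would draw on a maintained drug vocabulary.
BRAND_TO_GENERIC = {
    "gleevec": "imatinib",
    "herceptin": "trastuzumab",
    "tarceva": "erlotinib",
}


def canonical_drug(name: str) -> str:
    """Return a lowercase generic name, or the cleaned input if unknown."""
    key = name.strip().lower()
    return BRAND_TO_GENERIC.get(key, key)


def group_by_drug(records):
    """Group sample IDs under the canonical drug name of each record."""
    groups = defaultdict(list)
    for rec in records:
        groups[canonical_drug(rec["drug"])].append(rec["sample_id"])
    return dict(groups)


# Hypothetical records, as they might appear in differently curated data sets.
records = [
    {"sample_id": "S1", "drug": "Gleevec"},
    {"sample_id": "S2", "drug": "imatinib"},
    {"sample_id": "S3", "drug": "Herceptin "},
]
grouped = group_by_drug(records)
print(grouped)
```

Without the normalization step, samples S1 and S2 would land in separate groups even though they involve the same drug, which is the kind of barrier to aggregation that poorly curated metadata creates.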
Observations from Sage Congress and collaboration through its challenge
The glowing reports we read of biotech advances almost cause one’s brain to ache. They leave us thinking that medical researchers must command the latest in all technological tools. But the engines of genetic and pharmaceutical innovation are stuttering for lack of one key fuel: data. Here they are left with the equivalent of trying to build skyscrapers with lathes and screwdrivers.
Sage Congress, held this past week in San Francisco, investigated the multiple facets of data in these fields: gene sequences, models for finding pathways, patient behavior and symptoms (known as phenotypic data), and code to process all these inputs. A survey of efforts by the organizers, Sage Bionetworks, and other innovations in genetic data handling can show how genetics resembles and differs from other disciplines.
An intense lesson in code sharing
At last year’s Congress, Sage announced a challenge, together with the DREAM project, intended to galvanize researchers in genetics while showing off the growing capabilities of Sage’s Synapse platform. Synapse ties together a number of data sets in genetics and provides tools for researchers to upload new data, while searching other researchers’ data sets. Its challenge highlighted the industry’s need for better data sharing, and some ways to get there.