ENTRIES TAGGED "open access"
How the field of genetics is using data within research and to evaluate researchers
Editor’s note: Earlier this week, Part 1 of this article described Sage Bionetworks, a recent Congress they held, and their way of promoting data sharing through a challenge.
Data sharing is not an unfamiliar practice in genetics. Plenty of cell lines and other data stores are publicly available from such places as the TCGA data set from the National Cancer Institute, Gene Expression Omnibus (GEO), and Array Expression (all of which can be accessed through Synapse). So to some extent the current revolution in sharing lies not in the data itself but in critical related areas.
First, many of the data sets are weakened by metadata problems. A Sage programmer told me that the famous TCGA set is enormous but poorly curated. For instance, different data sets in TCGA may refer to the same drug by different names, generic versus brand name. Provenance–a clear description of how the data was collected and prepared for use–is also weak in TCGA.
In contrast, GEO records tend to contain good provenance information (see an example), but only as free-form text, which presents the same barriers to searching and aggregation as free-form text in medical records. Synapse is developing a structured format for presenting provenance based on the W3C’s PROV standard. One researcher told me this was the most promising contribution of Synapse toward the shared used of genetic information.
John Wilbanks on health data donation, contextual privacy, and open networks.
As I wrote earlier this year in an ebook on data for the public good, while the idea of data as a currency is still in its infancy, it’s important to think about where the future is taking us and our personal data.
If the Obama administration’s smart disclosure initiatives gather steam, more citizens will be able to do more than think about personal data: they’ll be able to access their financial, health, education, or energy data. In the U.S. federal government, the Blue Button initiative, which initially enabled veterans to download personal health data, is now spreading to all federal employees, and it also earned adoption at private institutions like Aetna and Kaiser Permanente. Putting health data to work stands to benefit hundreds of millions of people. The Locker Project, which provides people with the ability to move and store personal data, is another approach to watch.
The promise of more access to personal data, however, is balanced by accompanying risks. Smartphones, tablets, and flash drives, after all, are lost or stolen every day. Given the potential of mhealth, and big data and health care information technology, researchers and policy makers alike are moving forward with their applications. As they do so, conversations and rulemaking about health care privacy will need to take into account not just data collection or retention but context and use.
Put simply, businesses must confront the ethical issues tied to massive aggregation and data analysis. Given that context, Fred Trotter’s post on who owns health data is a crucial read. As Fred highlights, the real issue is not ownership, per se, but “What rights do patients have regarding health care data that refers to them?”
Would, for instance, those rights include the ability to donate personal data to a data commons, much in the same way organs are donated now for research? That question isn’t exactly hypothetical, as the following interview with John Wilbanks highlights.
Wilbanks, a senior fellow at the Kauffman Foundation and director of the Consent to Research Project, has been an advocate for open data and open access for years, including a stint at Creative Commons; a fellowship at the World Wide Web Consortium; and experience in the academic, business, and legislative worlds. Wilbanks will be speaking at the Strata Rx Conference in October.
Our interview, lightly edited for content and clarity, follows.
Dynamic pricing angers some Uber users, Hadoop hits 1.0, a possible set back for open-access research.
Uber's dynamic pricing worked as intended on New Year's Eve, but not everyone is happy about that. Elsewhere, Hadoop reaches the 1.0 milestone and proposed legislation seeks to repeal an open-access research policy.