The common thread among the Knight Foundation's latest grants: practical application of open data.
Data, on its own, locked up or muddled with errors, does little good. Cleaned up, structured, analyzed and layered into stories, data can enhance our understanding of the most basic questions about our world, helping journalists to explain who, what, where, how and why changes are happening.
Last week, the Knight Foundation announced the winners of its first news challenge on data. These projects are each excellent examples of working on stuff that matters: they’re collective investments in our digital civic infrastructure. In the 20th century, civil society and media published the first websites. In the 21st century, civil society is creating, cleaning and publishing open data.
The grants not only support open data but validate its place in the media ecosystem of 2012. The Knight Foundation is funding data science, accelerating innovation in the journalism and media space to help inform and engage communities, a project that they consider “vital to democracy.”
Why? Consider the projects. Safecast creates networked accountability using sensors, citizen science and open source hardware. LocalData is a mobile method for communities to collect information about themselves and make sense of it. Open Elections will create a free, standardized database stream of election results. Development Seed will develop better tools to contribute to and use OpenStreetMap, the “Wikipedia of maps.” Pop Up Archive will develop an easier way to publish and archive multimedia data to the Internet. And Census.IRE.org will improve the ability of a connected nation and its data editors to access and use the work of the U.S. Census Bureau.
The projects hint at a future of digital open government, journalism and society founded upon the principles that built the Internet and World Wide Web and strengthened by peer networks between data journalists and civil society. A river of open data flows through them all. The elements and code in them — small pieces, loosely joined by APIs, feeds and the social web — will extend the plumbing of digital democracy in the 21st century.
The United States National Institutes of Health (NIH) wants to tie development of mobile health apps to evidence-based research, and it hopes to do that with a new grant program. The imperative to align developers with research is urgent, given the strong interest in health IT, mobile health and health data. There are significant challenges for the space, from consumer concerns over privacy and mobile applications to the broader question of balancing health data innovation with patient rights.
To learn more about what’s happening with mobile health apps, health data, behavioral change and cancer research, I recently interviewed Dr. Abdul Sheikh. Our interview, lightly edited for content and clarity, follows.
What led you to your current work at NIH?
Dr. Abdul Sheikh: I’ve always had a strong grounding in public health and population health, but I also have a real passion for technology and informatics. What’s beautiful is, in my current position here as a program director at the National Cancer Institute (NCI), I have a chance to meld these worlds of public health, behavior and communication science with my passion for technology and informatics. Some of the work I did before coming to the NIH was related to the early telemedicine and web-based health promotion efforts that the government of Canada was involved in.
At NCI, I direct a portfolio of research on technology-mediated communication. I’ve also had the chance to get involved and provide leadership on two very cool efforts. One of them is leadership for our division’s Small Business Innovation Research Program (SBIR). I’ve led the first NIH developer challenge competitions as well.
Dr. Stephen Friend on open science and the need for a "GitHub for scientists."
To unlock the potential of health data for the public good, balancing health privacy with innovation will rely on improving informed consent. If the power of big data is to be applied to scientific inquiry in health care, unlocking genetic secrets, finding a cure for breast cancer or “preemptive health care,” changes in scientific culture and technology will both need to occur.
One element of that change could include a health data commons. Another is open access in the research community. Dr. Stephen Friend, the founder of Sage Bionetworks, is one of the foremost advocates of what I think of as “open science.” Earlier in his career, Dr. Friend was a senior vice president at Merck & Co., Inc., where he led the pharmaceutical company’s basic cancer research program.
In a recent interview, Dr. Friend explained what open science means to him and what he’s working on today. For more on the synthesis of open source with genetics, watch Andy Oram’s interview with Dr. Friend and read his series on recombinant research and Sage Congress.
In the age of big data, Deven McGraw emphasizes trust, education and transparency in assuring health privacy.
Society is now faced with how to balance the privacy of the individual patient with the immense social good that could come through greater health data sharing. Making health data more open and fluid has the potential to be both hugely beneficial and enormously harmful for patients. As my colleague Alistair Croll put it this summer, big data may well be a civil rights issue that much of the world doesn’t know about yet.
This will likely be a tension that persists throughout my lifetime as technology spreads around the world. While big data breaches are likely to make headlines, more subtle uses of health data have the potential to enable employers, insurers or governments to discriminate, or worse. Figuring out shopping habits can also allow a company to determine that a teenager is pregnant before her father knows. People simply don’t realize how much about their lives can be intuited through analysis of their data exhaust.
To unlock the potential of health data for the public good, informed consent must mean something. Patients must be given the information and context for how and why their health data will be used in clear, transparent ways. To do otherwise is to duck the responsibility that comes with the immense power of big data.
In search of an informed opinion on all of these issues, I called up Deven McGraw (@HealthPrivacy), the director of the Health Privacy Project at the Center for Democracy and Technology (CDT). Our interview, lightly edited for content and clarity, follows.
John Wilbanks on health data donation, contextual privacy, and open networks.
As I wrote earlier this year in an ebook on data for the public good, while the idea of data as a currency is still in its infancy, it’s important to think about where the future is taking us and our personal data.
If the Obama administration’s smart disclosure initiatives gather steam, more citizens will be able to do more than think about personal data: they’ll be able to access their financial, health, education, or energy data. In the U.S. federal government, the Blue Button initiative, which initially enabled veterans to download personal health data, is now spreading to all federal employees, and it also earned adoption at private institutions like Aetna and Kaiser Permanente. Putting health data to work stands to benefit hundreds of millions of people. The Locker Project, which provides people with the ability to move and store personal data, is another approach to watch.
The promise of more access to personal data, however, is balanced by accompanying risks. Smartphones, tablets, and flash drives, after all, are lost or stolen every day. Given the potential of mHealth, big data, and health care information technology, researchers and policy makers alike are moving forward with their applications. As they do so, conversations and rulemaking about health care privacy will need to take into account not just data collection or retention but context and use.
Put simply, businesses must confront the ethical issues tied to massive aggregation and data analysis. Given that context, Fred Trotter’s post on who owns health data is a crucial read. As Fred highlights, the real issue is not ownership, per se, but “What rights do patients have regarding health care data that refers to them?”
Would, for instance, those rights include the ability to donate personal data to a data commons, much in the same way organs are donated now for research? That question isn’t exactly hypothetical, as the following interview with John Wilbanks highlights.
Wilbanks, a senior fellow at the Kauffman Foundation and director of the Consent to Research Project, has been an advocate for open data and open access for years, including a stint at Creative Commons; a fellowship at the World Wide Web Consortium; and experience in the academic, business, and legislative worlds. Wilbanks will be speaking at the Strata Rx Conference in October.
Our interview, lightly edited for content and clarity, follows.
Dyson says it's time to focus on maintaining good health, as opposed to healthcare.
If we look ahead to the next decade, it’s worth wondering whether the way we think about health and health care will have shifted. Will health care technology be a panacea? Will it drive even higher costs, creating a broader divide between digital haves and have-nots? Will opening health data empower patients or empower companies?
As ever, there will be good outcomes and bad outcomes, and not just in the medical sense. There’s a great deal of thought around the potential for mobile applications right now, from the FDA’s potential decision to regulate them to a reported high abandonment rate. There are also significant questions about privacy, patient empowerment and meaningful use of electronic health care records.
When I’ve talked to U.S. CTO Todd Park or Dr. Farzad Mostashari, they’ve been excited about the prospect for health data to fuel better dashboards and algorithms to give frontline caregivers access to critical information about people they’re looking after, providing critical insight at the point of contact.
Kathleen Sebelius, the U.S. Secretary of Health and Human Services, said at this year’s Health Datapalooza that venture capital investment in the health care IT area is up 60% since 2009.
Rep. Issa expressed support for reforming FOIA to include personal data held by companies.
The Freedom of Information Act (FOIA), which gives the people and press the right to access information from government, is one of the pillars of open government in the modern age. In the United States, FOIA is relatively new — it was originally enacted on July 4, 1966. As other countries around the world enshrine the principle into their legal systems, new questions about FOIA are arising, particularly when private industry takes on services that previously were delivered by government.
In that context, one of the federal open government initiatives worth watching in 2012 is ‘smart disclosure,’ the targeted release of information about citizens or about services they consume by government and by private industry. Smart disclosure is notable because there’s some “there there.” It’s not just a matter of it being one of the “flagship open government initiatives” under the U.S. National Plan for open government or that a White House Smart Disclosure Summit in March featured a standing room only audience at the National Archives. When compared to other initiatives, there has been relatively strong uptake of data from government and the private sector and its use in the consumer finance sector. Citizens can download their bank records and use them to make different decisions.
Earlier this summer, I interviewed Representative Darrell Issa (R-CA) about a number of issues related to open government, including what he thought of “smart disclosure” initiatives.
If legislative efforts to standardize federal government spending data founder in the U.S. Senate, it's a missed opportunity.
The old adage that “you can’t manage what you can’t measure” is often applied to organizations in today’s data-drenched world. Given the enormity of the United States federal government, breaking down the estimated $3.7 trillion in the 2012 budget into its individual allocations, much less drilling down to individual outlays to specific programs and subsequent performance, is no easy task. There are several sources for policy wonks to turn to for applying open data to journalism, but the flagship database of federal government spending at USASpending.gov simply isn’t anywhere near as accurate as it needs to be to source stories. The issues with USASpending.gov have been extensively chronicled by the Sunlight Foundation in its ClearSpending project, which found that nearly $1.3 trillion of federal spending as reported on the open data website was inaccurate.
If the people are to gain more insight into how their taxes are being spent, Congress will need to send President Obama a bill to sign to improve the quality of federal spending data. In the spring of 2012, the U.S. House passed by unanimous voice vote the DATA Act, a signature piece of legislation from Representative Darrell Issa (R-CA). H.R. 2146 requires every United States federal government agency to report its spending data in a standardized way and establishes uniform reporting standards for recipients of federal funds.
The British government further embraces open data as a means to transparency and "prosperity."
The Cabinet Office of the United Kingdom released a notable new white paper on open data and relaunched its flagship open data platform, Data.gov.uk. This post features interviews on open data with Cabinet Minister Francis Maude, Tim Berners-Lee and Rufus Pollock.
Michael Flowers explains why applying data science to regulatory data is necessary to use city resources better.
A predictive data analytics team in the Mayor's Office of New York City has been quietly using data science to find patterns in regulatory data that can then be applied to law enforcement, public safety, public health and better allocation of taxpayer resources.