Hypothesis-free data analysis turns up unexpected incidences of illness
This posting was written by guest author Arijit Sengupta, CEO of BeyondCore. Arijit will speak at Strata Rx 2013 on the kinds of data analysis discussed in this post.
Much of the effort in health reform in the United States has gone toward recruiting 18-to-35 year olds into the insurance pool so that the US economy and insurers can afford the Affordable Care Act (ACA). The assumption is that health care costs will be lower for this young population than for other people, but is this true? Our recent analysis of 6.8 million insured young adults, across 200,000 variable combinations, suggests that young adults may be more expensive to insure than we realize.
Our study shows a high occurrence of mental health diseases among 18-to-35 year olds who have insurance and therefore more affordable access to medical care. Moreover, expenses associated with mental health conditions are very high, especially when coupled with a physical ailment. As the previously uninsured 18-to-35 year olds get access to affordable care, we may see a similarly high rate of mental health diagnoses among this population. The bad news is that the true costs of insuring 18-to-35 year olds might be much higher than previously suspected. The good news is that previously undiagnosed and untreated mental health conditions may now actually get diagnosed and treated, creating a significant societal benefit.
A video interview with Colin Hill
Last month, Strata Rx Program Chair Colin Hill, of GNS Healthcare, sat down with Dr. Dennis Ausiello, Jackson Professor of Clinical Medicine at the Harvard Medical School, Co-Director at CATCH, Pfizer Board of Directors Member, and Former Chief of Medicine at the Massachusetts General Hospital (MGH), for a fireside chat at a private reception hosted by GNS. Their insightful conversation covered a range of topics that all touched on or intersected with the need to create smaller and more precise cohorts, as well as the need to focus on phenotypic data as much as we do on genotypic data.
The full video appears below.
An Interview with Julie Steele
A week or two ago, I got to correspond with Danielle Brooks of Disruptive Women in Health Care about the work I do here at O’Reilly. The following interview is reprinted here with their kind permission.
Tell us about your work. What drew you to the area?
I have mostly worked as a book editor, until just a year or two ago. I was working on books about databases, machine learning, visualization, and other relevant topics when O’Reilly launched its Strata conference on data science, and so I became involved in that conference. But as Strata took off, it became apparent to us that certain communities — and certain types of data — were special. Health care is one of those areas: the insights that data analysis can give us about ourselves and the things that ail us are enormous, but the risks of over-sharing and the resulting constraints such as HIPAA also present very real challenges.
In 2012, O’Reilly decided to launch a new edition of its data science conference to focus on health care, and that’s how Strata Rx was born. I was asked to become its Program Chair, along with Colin Hill, CEO of GNS Healthcare, and so I have spent the last 18 months learning everything I can about the (very complicated!) health care industry. Colin and I are great partners because of the complementary backgrounds we bring together: Colin from the health care industry side and myself from the technology side. Ultimately, that’s what Strata Rx aims to do, too: we hope that by bringing together professionals from all parts of the industry (payers, providers, researchers, analysts, advocates, developers, investors, and caregivers, just to name a few) we can begin to solve some of the large and complex problems facing us in this area.
How our vision for this important conference is shaping the program we hope to present, and how you can get involved
After a strong inaugural event in October 2012, Strata Rx is heading into its second year. My fellow chair, Colin Hill, and I have spent a lot of time thinking about and discussing what we’d like to see on the program this year, and I thought I’d share some of those thoughts for anyone considering submitting a proposal or attending the event. (The Call for Proposals is currently open until April 10.)
One of the most interesting challenges in creating a program about data science in healthcare has been deciding what to leave out. Topics like genomics and cancer research are so vast and complex that entire conferences are devoted to them. While we won’t reject a talk for centering on a topic like this, it also has to be relevant to one of our larger goals.
What we hope to accomplish with Strata Rx
So what are those larger goals? Well, here are a few of the key ones.
Promote dialog across silos
Right now, there are already a lot of niche conferences for specific groups in healthcare. There are events for specific areas of research, such as oncology and genomics, as previously mentioned. There are also events for specific kinds of people, like pharmaceutical reps, or insurance providers. Those conferences that do cut across the industry are only for one level of people, such as Chief Officers.
We want Strata Rx to convene a broad swath of people with an interest and a stake in the healthcare system: researchers, funders, providers, application developers, patient advocates, board members, insurers, IT staff, legislators, and everyone in between. By starting conversations among these different specialists, and by combining their respective expertise, we believe we can build a stronger community that is better able to solve problems.
We aim to be fire-starters, igniting connections and conversations.
An interview with Fred Smith of the CDC on their open content APIs.
Health care data liquidity (the ability of data to move freely and securely through the system) is an increasingly crucial topic in the era of big data. Most conversations about data liquidity focus on patient data, but other kinds of information need to be able to move freely and securely, too. Enter several government initiatives, including efforts at agencies within the Department of Health and Human Services (HHS) to make their content more easily available.
Fred Smith is team lead for the Interactive Media Technology Team in the Division of News and Electronic Media in the Office of the Associate Director for Communication for the U.S. Centers for Disease Control and Prevention (CDC) in Atlanta. We recently spoke by phone to discuss ways in which the CDC is working to make their information more “liquid”: easier to access, easier to repurpose, and easier to combine with other data sources.
Which data is available from the CDC APIs?
Fred Smith: In essence, what we’re doing is taking our unstructured web content and turning it into a structured database, so we can call an API into it for reuse. It’s making our content available for our partners to build into their websites or applications or whatever they’re building.
Todd Park likes to talk about “liberating data” — well, this is liberating content. What is a more high-value dataset than our own public health messaging? It incorporates not only HTML-based text, but also we’re building this to include multimedia — whether it’s podcasts, images, web badges, or other content — and have all that content be aware of other content based on category or taxonomy. So it will be easy to query, for example: “What content does the CDC have on smoking prevention?”
Five ways we can improve the information we collect to help us solve hard problems in health care.
I was honored to chair O’Reilly’s inaugural edition of Strata Rx, our conference on data science in health care, this past October along with Colin Hill. As we’re beginning to plan this year’s event, I find myself thinking a lot about a theme that emerged from some of the keynotes last fall: in order to solve the problems we’re facing in health care — to lower costs and provide more personal, targeted treatments to patients — we don’t just need more data; we need better data.
Much has been made about the era of big data we find ourselves in. But though the data we collect is straining the limits of our tools and models, we’re still not making the kind of headway we hoped for in areas like health care. So big data isn’t enough. We need better data.
What does it mean to have better data in health care? Here are some things on my list; perhaps you can think of others.
At its best, 3D printing can make us more human by making us whole.
Tim O’Reilly recently asked me and some other colleagues which technology seems most like magic to us. There was a thoughtful pause as we each considered the amazing innovations we read about and interact with every day.
My answer was 3D printing, but my reasons are different than you might think. Yes, it’s amazing that, with very little skill, we can manufacture complex objects in our homes and workshops that are made from things like plastic or wood or chocolate or even titanium. This seems an amazing act of conjuring that, just a short time ago, would have been difficult to imagine outside of the “Star Trek” set.
But the thing that makes 3D printing really special is the magic it allows us to perform: the technology is capable of making us more human.
The best data visualizations expose something new.
Effective data visualizations go beyond aesthetics; they also allow organizations to make quick and correct decisions from massive amounts of information.
Opera Solutions' Arnab Gupta says human plus machine always trumps human vs machine.
Managing data and extracting meaning require new approaches, new education, and even a new language. Opera Solutions CEO Arnab Gupta discusses each of these areas in the following interview.