ENTRIES TAGGED "data"
What is needed for successful reform of the health care system?
Here’s what we all know: a data-rich health care future is coming our way, and we know in broad outline what it will look like. Health care reformers have learned that no single practice will improve the system. All of the following, which were discussed at O’Reilly’s recent Strata Rx conference, must fall into place.
One of the frequently-asked questions over at the statistics subreddit (reddit.com/r/statistics) is how to test whether a dataset is drawn from a particular distribution, most often the normal distribution.
There are standard tests for this sort of thing, many with double-barreled names like Anderson-Darling, Kolmogorov-Smirnov, Shapiro-Wilk, Ryan-Joiner, etc.
But these tests are almost never what you really want. When people ask these questions, what they really want to know (most of the time) is whether a particular distribution is a good model for a dataset. And that’s not a statistical test; it is a modeling decision.
All statistical analysis is based on models, and all models are based on simplifications. Models are only useful if they are simpler than the real world, which means you have to decide which aspects of the real world to include in the model, and which things you can leave out.
For example, the normal distribution is a good model for many physical quantities. The distribution of human height is approximately normal (see this previous blog post). But human heights are not normally distributed. For one thing, human heights are bounded within a narrow range, and the normal distribution goes to infinity in both directions. But even ignoring the non-physical tails (which have very low probability anyway), the distribution of human heights deviates in systematic ways from a normal distribution.
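One way to treat this as a modeling question rather than a test is to measure how far the data stray from the model, then decide whether that gap matters for your purpose. The sketch below is a minimal illustration, not the method from the post: the function name `max_cdf_gap`, the simulated samples, and their parameters are all assumptions made for the example. It fits a normal model to a sample and reports the largest gap between the empirical CDF and the model's CDF.

```python
import random
from statistics import NormalDist


def max_cdf_gap(sample):
    """Largest vertical gap between the empirical CDF and a fitted normal CDF."""
    model = NormalDist.from_samples(sample)  # fit mean and standard deviation
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        p = model.cdf(x)
        # The empirical CDF steps from i/n to (i+1)/n at x; check both sides.
        gap = max(gap, abs(p - i / n), abs((i + 1) / n - p))
    return gap


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: one roughly bell-shaped sample, one strongly skewed sample.
bell_shaped = [rng.gauss(170, 7) for _ in range(2000)]
skewed = [rng.expovariate(1.0) for _ in range(2000)]

gap_bell = max_cdf_gap(bell_shaped)
gap_skewed = max_cdf_gap(skewed)

print(f"bell-shaped sample: max CDF gap = {gap_bell:.3f}")
print(f"skewed sample:      max CDF gap = {gap_skewed:.3f}")
```

A gap of a couple of percentage points may be fine for one application and unacceptable for another. The number informs the modeling decision but does not make it.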
Archimedes advances evidence-based medicine to foster model-based medicine
This posting is by guest author Tuan Dinh, who will speak about this topic at the Strata Rx conference.
Legendary Silicon Valley investor Vinod Khosla caused quite a stir last year when he predicted at Strata Rx that “Dr. Algorithm”–artificial intelligence driven by large data sets and computational power–would replace doctors in the not-too-distant future. At that point, he said, technology will be cheaper, more accurate and objective, and will ultimately do a better job than the average human doctor at delivering routine diagnoses with standard treatments.
I not only support Khosla’s provocative prophecy, I’ll add one of my own: that Dr. Algorithm (aka Dr. A) will “come to life” in three to five years, by the time today’s first-year med school students are pulling 30-hour shifts as new interns. But what will it take to build the brain of Dr. A? And how can we teach Dr. A to account for increasingly complex medical inputs, such as laboratory test results, genomic/genetic information, family and personal history, co-morbidities and patient preferences, so he can make optimal clinical decisions for living, breathing patients?
Evolution from a research tool to a platform for patient engagement
Bruce Springer of OneHealth will speak about this topic at the Strata Rx conference. This article was written by Patrick Bane of OneHealth in coordination with Bruce Springer.
According to a recent study performed by the Jesse Brown VA Medical Center and University of Illinois at Chicago, patient-centered care has demonstrated positive outcomes on patients’ health, patients’ self-report of health, and reduced healthcare utilization. The study’s results are consistent with previous research that the patient-centered care model improves the quality of care while simultaneously lowering the cost of care.
OneHealth’s behavior change platform extends the patient-centered model by connecting members anytime, anywhere through mobile and web applications. Members generate data in their daily lives, outside of a clinical setting, which creates a much richer dataset of the behaviors needed to understand the patients’ condition(s) and their readiness to change. Members freely choose what to do, and their choices actively generate data in five classes of information:
Data that matters to patients
This article is by guest author Amik Ahmad. He is speaking on this topic at Strata Rx.
Distractions didn’t have a chance. My phone was devoid of reception. The New York Times mobile application searched impossibly for a Wi-Fi connection. Conditions perfect for focus: away from a world always on and connected, noisy, and belligerent with information overload. I could have found joy in a single byte. But instead, I was pushed to the limit of sensory deprivation, and I teetered on the edge of insanity. I spent nine hours of my life in a hospital waiting room.
Hypothesis-free data analysis turns up unexpected incidences of illness
This posting was written by guest author Arijit Sengupta, CEO of BeyondCore. Arijit will speak at Strata Rx 2013 on the kinds of data analysis discussed in this post.
Much of the effort in health reform in the United States has gone toward recruiting 18-to-35 year olds into the insurance pool so that the US economy and insurers can afford the Affordable Care Act (ACA). The assumption here is that health care costs will be less for this young population than for other people, but is this true? Our recent analysis of 6.8 million insured young adults, across 200,000 variable combinations, suggests that young adults may be more expensive to insure than we realize.
Our study shows a high occurrence of mental health diseases among 18-to-35 year olds who have insurance and therefore more affordable access to medical care. Moreover, expenses associated with mental health conditions are very high, especially when coupled with a physical ailment. As the previously uninsured 18-to-35 year olds get access to affordable care, we may see a similarly high rate of mental health diagnoses among this population. The bad news is that the true costs of insuring 18-to-35 year olds might be much higher than previously suspected. The good news is that previously undiagnosed and untreated mental health conditions may now actually get diagnosed and treated, creating a significant societal benefit.
A tool for outreach to patients produces unexpected benefits
The traditional, office-based model for health care is episodic. The provider-patient relationship exists almost completely within the walls of the exam room, with little or no follow-up between visits. Data is primarily episodic as well, based on a blood pressure reading taken at a specific time or surveys administered there and then, with little collected out of the office. And even the existing data collection tools (paper diaries or clunky meters) are focused more on storing data than on connecting the patient and provider through that data in real time.
There is no way to get in touch when, for instance, a patient’s blood sugar starts varying wildly or pain levels change. The provider often depends on the patient reaching out to them. And even when a provider does put into place an outreach protocol, it is usually very crude, based on a general approach to managing a population as opposed to an understanding of a patient. The end result is a system that, while doing its best within a difficult setting, is by default reactive instead of proactive.
Data journalism’s ‘secret weapon’, data newswires, and the newest data-scraping tools for journalists.
When investigative reporter and journalism instructor Chad Skelton needed help writing a curriculum for a data journalism course, he turned to NICAR-L, the email listserv for the National Institute for Computer-Assisted Reporting, for advice. Skelton says that virtually every data journalist in North America is plugged in to the NICAR listserv, making it data journalism’s “secret weapon.”
In 5 tips for a data journalism workflow, the Online Journalism Blog advises newsrooms to find and tap into “data newswires” in the same way newsrooms have used traditional newswires like AP and Reuters.
An Interview with Julie Steele
A week or two ago, I got to correspond with Danielle Brooks of Disruptive Women in Health Care about the work I do here at O’Reilly. The following interview is reprinted here with their kind permission.
Tell us about your work. What drew you to the area?
I have mostly worked as a book editor, until just a year or two ago. I was working on books about databases, machine learning, visualization, and other relevant topics when O’Reilly launched its Strata conference on data science, and so I became involved in that conference. But as Strata took off, it became apparent to us that certain communities — and certain types of data — were special. Health care is one of those areas: the insights that data analysis can give us about ourselves and the things that ail us are enormous, but the risks of over-sharing and the resulting constraints such as HIPAA also present very real challenges.
In 2012, O’Reilly decided to launch a new edition of its data science conference to focus on health care, and that’s how Strata Rx was born. I was asked to become its Program Chair, along with Colin Hill, CEO of GNS Healthcare, and so I have spent the last 18 months learning everything I can about the (very complicated!) health care industry. Colin and I are great partners because of the complementary backgrounds we bring together: Colin from the health care industry side and myself from the technology side. Ultimately, that’s what Strata Rx aims to do, too: we hope that by bringing together professionals from all parts of the industry (payers, providers, researchers, analysts, advocates, developers, investors, and caregivers, just to name a few) we can begin to solve some of the large and complex problems facing us in this area.
Imagining what the post-big data world might look like.
This post was originally published as a chapter from the free Radar report, Disruptive Possibilities: How Big Data Changes Everything.
The Demise of Counting
So far, I have been discussing how big data differs from previous methods of computing–how it provides benefits and creates disruptions. Even at this early stage, it is safe to predict that big data will become a multibillion-dollar analytics and BI business and possibly subsume the entire existing commercial ecosystem. During that process, it will have disrupted the economics, behavior, and understanding of everything it analyzes and everyone who touches it–from those who use it to model the biochemistry of personality disorders to agencies that know the color of your underwear.
Big data is going to lay enough groundwork that it will initiate another set of much larger changes to the economics and science of computing. (But the future will always contain elements from the past, so mainframes, tape, and disks will still be with us for a while.) This chapter is going to take a trip into the future and imagine what the post-big data world might look like. The future will require us to process zettabytes and yottabytes of data on million-node clusters. In this world, individual haystacks will be thousands of times the size of the largest Hadoop clusters that will be built in the next decade. We are going to discover what the end of computing might look like, or more precisely, the end of counting.
The first electronic computers were calculators on steroids, but still just calculators. When you had something to calculate, you programmed the machinery, fed it some data, and it did the counting. Early computers that solved mathematical equations for missile trajectory still had to solve these equations using simple math. Solving an equation the way a theoretical physicist might is how human brains solve equations, but computers don’t work like brains. There have been attempts at building computers that mimic the way brains solve equations, but engineering constraints make it more practical to build a hyperactive calculator that solves equations through brute force and ignorance.
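To make the contrast concrete, here is a minimal sketch of a trajectory solved both ways. It is an illustration under simplifying assumptions (no air resistance, flat ground, constant gravity), and the function names and launch values are hypothetical. The first function does what a hyperactive calculator does: it repeats simple arithmetic over thousands of tiny time steps. The second uses the closed-form answer a physicist would derive on paper.

```python
import math


def range_by_stepping(speed, angle_deg, dt=1e-3, g=9.81):
    """Brute force: advance a drag-free projectile in small time steps until it lands."""
    angle = math.radians(angle_deg)
    x, y = 0.0, 0.0
    vx, vy = speed * math.cos(angle), speed * math.sin(angle)
    while True:
        # Nothing but additions and multiplications, repeated many times.
        x_next = x + vx * dt
        y_next = y + vy * dt
        vy -= g * dt
        if y_next < 0.0:
            # Interpolate within the last step to find where the path crosses the ground.
            frac = y / (y - y_next)
            return x + frac * (x_next - x)
        x, y = x_next, y_next


def range_closed_form(speed, angle_deg, g=9.81):
    """Analytic range of a drag-free projectile: v^2 * sin(2 * theta) / g."""
    return speed ** 2 * math.sin(2 * math.radians(angle_deg)) / g


# Hypothetical launch: 100 m/s at 45 degrees.
stepped = range_by_stepping(100.0, 45.0)
exact = range_closed_form(100.0, 45.0)
print(f"stepping: {stepped:.2f} m, closed form: {exact:.2f} m")
```

The stepping version knows nothing about the structure of the equation. It gets close to the analytic answer only by doing a great deal of counting, and a smaller `dt` buys more accuracy at the cost of more steps.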