The last time I spoke with Stephen Goldsmith, he was the Deputy Mayor of New York City, advocating for increased use of “citizensourcing,” where government uses technology tools to tap into the distributed intelligence of residents to understand – and fix – issues on its streets, with its services and even within its institutions. In the years since, as a professor at the Ash Center for Democratic Governance and Innovation
at the John F. Kennedy School of Government at Harvard University, the former mayor of Indianapolis has advanced the notion of “preemptive government.”
That focus caught my attention, given that my colleague, Alistair Croll, had published several posts on Radar looking at the ethics around big data. The increasing use of data mining and algorithms by government to make decisions based upon pattern recognition and assumptions regarding future actions is a trend worth watching. Will guaranteeing government data quality become mission-critical once high-profile mistakes are made? Any assessment of the perils and promise of a data-driven society will have to include a reasoned examination of the growing power of these applied algorithms over the markets and the lives of our fellow citizens.
Given some of those concerns, I called Goldsmith up this winter to learn more about what he meant.
Our interview, lightly edited for content and clarity, follows.
When you say “preemptive government,” what do you mean?
Stephen Goldsmith: I’m thinking about the intersection of trends here. One is what we might talk about as big data and data analytics. Internally, government has massive amounts of information located in all sorts of different places. If one looked at it analytically, we could figure out which restaurants are most likely to have problems, which contractors are most likely to build bad buildings and the like.
For the first time, through the combination of digital processes, mobile tools and big data analytics, government can deliver preemptive solutions. Government generally responds to problems and then measures its performance by the number of activities it conducts, as contrasted with the problems it solves. New York City and Chicago have begun to take the lead in this area in specific places. When I was in New York City, we were trying to figure out how to set up a data analytics center. New York has started to do some of that, so that we can predict where the next event’s going to occur and then solve it. That eventually needs to be merged with community sentiment mining, but it’s a slightly different issue.
What substantive examples exist of this kind of approach making a difference?
Stephen Goldsmith: We are now operating a mayoral performance analytics initiative at the Kennedy School, trying to create energy around the solutions. We are featuring people who are doing it well, bringing folks together.
New York City, through a fellow named Mike Flowers, has begun to solve specific problems in building violations and restaurant inspections. He’s overcoming the obstacles where all of the legacy CIOs say they can’t share data. He’s showing that you can.
Chicago is just doing remarkable stuff in a range of areas, from land use planning to crime control, like deciding how to intervene earlier in solving crimes.
Indiana has begun working on child welfare, using analytics to figure out the best outcomes for children in tough circumstances. They’re using analytics to measure which providers are best for which young adults who are in trouble, and what type of kid is most successful with what type of mental health treatment, drug treatment, mentoring or the like.
I think these all just scratch the surface. They need to be highlighted so that city and state leaders can understand how they can have dramatically better returns on their investments by identifying these issues in advance.
Who else is sharing and consuming data to do predictive data analytics in the way that you’re describing?
Stephen Goldsmith: A lot of well-known stat programs, like CompStat or CitiStat, do a really good job of measuring activities. When combined with analytics, they’ll be able to move into managing outcomes more effectively. There are a lot of folks, like San Francisco, beginning to look at these issues, but I think really, New York City and Chicago are in the lead.
Based upon their example, what kinds of budget, staffing and policy decisions would other cities need to make to realize similar returns?
Stephen Goldsmith: The most restrictive element in government today is the no-longer-accurate impression that legacy data can’t be easily integrated. Every agency has a CIO who often believes it’s his or her job to protect that data. I’m not talking about privacy; I’m just talking about data integrity. We know that there’s a range of tools that will allow relatively easy integration and data mining.
Another lesson is that this really needs to be driven by the mayor or the governor. The answers to problems come from picking up data across agencies, not just managing the data inside your agency. Without city hall or gubernatorial leadership, it’s very difficult to drive data analytics.
What about the risks of ‘preemptive government’ leading to false positives or worse?
Stephen Goldsmith: There is a risk, but let me talk about it in the following way: Government can no longer afford to operate the way it operates. You cannot afford to regulate every business as if it’s equally bad or equally good. Every restaurant is not equally good or equally bad. Every contractor’s not bad or good. There are bad guys and good guys, and good performers and bad performers. There are families that need help and families that don’t need help. We need to allocate our resources most effectively to create solutions. That means we need to look at which solutions work for which problems.
What do we know about which contractors have a history of being bad? I don’t mean “bad” like just how they build — I mean have they paid their taxes right, do they discriminate in the marketplace, whatever those factors are in order to target our resources.
That means that when Flowers did this in New York, we got several hundred buildings that were the most likely to burn down. We knew that from analytics. We’re going to go out and mitigate those buildings. Could we make a mistake and say that ten of those 300 buildings really aren’t that bad? Absolutely, but it’s a much better targeting of resources and it’s the only way government can afford to effectively operate.
There are other issues, too, like personalization, where we have a lot of privacy issues, and “opt-in” and “opt out” where people may want a personal relationship with their government. That’s a little different than predictive analytics, but it raises privacy issues.
Then we have a fascinating question, one that social work communities and criminal justice communities worry about, which is, “Okay, you can predict the likelihood that somebody can be hurt, or that somebody will commit a crime and adjust resources accordingly – but we better be pretty careful because it raises a lot of ethics questions and profiling questions.”
My short answer is that these are important, legitimate questions. We can’t ignore them, but continuing to do business the way we do has more negative tradeoffs than not.
Speaking of personalization and privacy, has mining social media for situational awareness during national disasters or other times of crisis become a part of the toolkit for cities?
Stephen Goldsmith: The conversation we’ve had has been about how to use enterprise data to make better decisions, right? That’s basically going to open up a lot of insight, but that model is pretty arrogant. It basically ignores crowdsourcing. It assumes that really smart people in government with a lot of data will make better solutions. But we know that people in communities can co-produce information together. They can crowdsource solutions.
In New York City, we actually had some experience with this. One thread was the work that Rachel [Haot] was doing to communicate, but we were also using social media on the operations side. I think we’re barely getting started on how to mine community sentiment, how to integrate that with 311 data for better solutions, how to prioritize information where people have problems, and how to anticipate the problems early.
You may know that Indianapolis, in the 2012 Super Bowl, had a group of college students and a couple of local providers looking at Twitter conversations in order to intervene earlier. The conversations were geotagged by name and curated to figure out where there was a crime problem, where somebody needed parking, where they were looking for tickets and where there was too much trash on the ground. It didn’t require them to call government. Government was watching the conversation, participating in it and solving the problem.
I think that where we are has lots of potential but is still a little bit immature. The work now is to incorporate the community sentiment into the analytics and the mobile tools.