Here are the data stories that caught my attention this week.
BigQuery for everyone
Google has released its big data analytics service BigQuery to the public. The service was initially made available to a small number of developers late last year; now anyone can sign up. A free account lets you query up to 100 GB of data per month, with the option to pay for additional queries and/or storage.
“Google’s aim may be to sell data storage in the cloud, as much as it is to sell analytic software,” says The New York Times’ Quentin Hardy. “A company using BigQuery has to have data stored in the cloud data system, which costs 12 cents a gigabyte a month, for up to two terabytes, or 2,000 gigabytes. Above that, prices are negotiated with Google. BigQuery analysis costs 3.5 cents a gigabyte of data processed.”
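To make those numbers concrete, here is a rough Python sketch of the arithmetic using only the prices Hardy quotes. The function names are my own, and the sketch ignores the free 100 GB monthly query quota, since the article does not say how that interacts with the paid rates. Treat it as a back-of-the-envelope estimate, not Google's billing logic.

```python
# Launch-era prices as quoted in the Times piece; not current pricing.
STORAGE_USD_PER_GB_MONTH = 0.12      # "12 cents a gigabyte a month"
QUERY_USD_PER_GB_PROCESSED = 0.035   # "3.5 cents a gigabyte of data processed"
STORAGE_LIST_PRICE_LIMIT_GB = 2000   # "two terabytes, or 2,000 gigabytes"


def monthly_storage_cost(stored_gb):
    """Return the list-price storage cost in USD for one month."""
    if stored_gb < 0:
        raise ValueError("stored_gb must be non-negative")
    if stored_gb > STORAGE_LIST_PRICE_LIMIT_GB:
        # Above two terabytes the article says prices are negotiated,
        # so there is no list price to compute.
        raise ValueError("above 2,000 GB, pricing is negotiated with Google")
    return stored_gb * STORAGE_USD_PER_GB_MONTH


def query_cost(processed_gb):
    """Return the analysis cost in USD for the data processed by queries."""
    if processed_gb < 0:
        raise ValueError("processed_gb must be non-negative")
    return processed_gb * QUERY_USD_PER_GB_PROCESSED


def estimate_monthly_bill(stored_gb, processed_gb):
    """Storage plus query cost for one month, rounded to whole cents."""
    return round(monthly_storage_cost(stored_gb) + query_cost(processed_gb), 2)


if __name__ == "__main__":
    # Hypothetical workload: 500 GB stored, 1,000 GB processed in a month.
    print(estimate_monthly_bill(500, 1000))  # 60.00 storage + 35.00 queries
```

At these rates, storing 500 GB costs more per month than processing 1,000 GB, which lines up with Hardy's point that storage may be the real product.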
BigQuery's interface is meant to lower the bar for these sorts of analytics: there's a UI and a REST interface. In the Times article, Google project manager Ju-kay Kwek says Google is hoping developers build tools that encourage widespread use of the product by executives and other non-developers.
If folks are looking for something to cut their teeth on with BigQuery, GitHub’s public timeline is now a publicly available dataset. The data is being synced regularly, so you can query things like popular languages and popular repos. To that end, GitHub is running a data visualization contest.
The Data Journalism Handbook
The Data Journalism Handbook had its release this week at the 2012 International Journalism Festival in Italy. The book, which is freely available and openly licensed, was a joint effort of the European Journalism Centre and the Open Knowledge Foundation. It’s meant to serve as a reference for those interested in the field of data journalism.
In the introduction, Deutsche Welle’s Mirko Lorenz writes:
“Today, news stories are flowing in as they happen, from multiple sources, eye-witnesses, blogs, and what has happened is filtered through a vast network of social connections, being ranked, commented and more often than not, ignored. This is why data journalism is so important. Gathering, filtering and visualizing what is happening beyond what the eye can see has a growing value.”
Open data is a joke?
Tim Slee fired a shot across the bow of the open data movement with a post this week arguing that “the open data movement is a joke.” Moreover, it’s not a movement at all, he contends. Slee turns a critical eye to the Canadian government’s open data efforts in particular, noting that: “The Harper government’s actions around ‘open government,’ and the lack of any significant consequences for those actions, show just how empty the word ‘open’ has become.”
Slee is also critical of open data efforts outside the government, calling the open data movement “a phrase dragged out by media-oriented personalities to cloak a private-sector initiative in the mantle of progressive politics.”
Open data activist David Eaves responded strongly to Slee’s post with one of his own, acknowledging his own frustrations with “one of the most — if not the most — closed and controlling [governments] in Canada’s history.” But Eaves takes exception to the ways in which Slee characterizes the open data movement. He contends that many of the corporations involved with the open data movement, whose involvement Slee charges has corrupted open data, are U.S. corporations (and points out that in Canada, “most companies don’t even know what open data is”). Eaves adds, too, that many of these corporations are led by geeks. He writes:
“Just as an authoritarian regime can run on open-source software, so too might it engage in open data. Open data is not the solution for Open Government (I don’t believe there is a single solution, or that Open Government is an achievable state of being — just a goal to pursue consistently), and I don’t believe anyone has made the case that it is. I know I haven’t. But I do believe open data can help. Like many others, I believe access to government information can lead to better informed public policy debates and hopefully some improved services for citizens (such as access to transit information). I’m not deluded into thinking that open data is going to provide a steady stream of obvious ‘gotcha moments’ where government malfeasance is discovered, but I am hopeful that government data can arm citizens with information that the government is using to inform its decisions so that they can better challenge, and ultimately help hold accountable, said government.”
Got data news?
Feel free to email me.