Because of the size, complexity, and density of big data, it’s not always easy to find the important insights hiding in all that information. That’s where data visualization comes into play. A great visualization creates meaning where none existed.
Bitsy Bentley (@bitsybot) is the director of data visualization at GfK Custom Research, where she works with information designers to craft meaningful data experiences for a variety of business audiences. In the following interview, she discusses the space between a “wow” response and an “aha” moment, how her team addresses privacy concerns, and why practice is vital for both visualization creators and viewers.
Bentley will explore related visualization topics during her presentation at Strata Conference + Hadoop World in New York City later this month.
Why are data visualizations an effective way to understand the underlying data?
Bitsy Bentley: There is so much beauty and richness in big datasets, and now that we have enough processing power to harness that richness, it’s little wonder that interest in data visualization is exploding. To quote John Tukey: “The greatest value of a picture is when it forces us to notice what we never expected to see.” My clients find that, whether they’re more concerned with numbers or more concerned with stories, an appropriate visual is integral to their understanding of the data.
Visualization unlocks the serendipity of data analysis. It provides a language that is less intimidating than an overwhelming array of digits. Something as simple as a set of histograms breaking down the distribution of a data store makes it easy to find irregularities and outliers in the data.
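As a rough illustration of the histogram point above, here is a minimal sketch in Python. The sample data, the bin width, and the function names are assumptions made for this example and do not come from the interview. Values are counted into fixed-width bins, and empty bins are drawn too, so an isolated value shows up as a lone bar far from the rest.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count values into fixed-width bins keyed by each bin's lower edge."""
    return Counter((v // bin_width) * bin_width for v in values)


def render(counts, bin_width):
    """Return one text row per bin, including empty bins, so gaps stay visible.

    The range labels assume integer values.
    """
    rows = []
    edge = min(counts)
    last = max(counts)
    while edge <= last:
        bar = "#" * counts.get(edge, 0)
        rows.append(f"{edge:>4}-{edge + bin_width - 1:<4} {bar}")
        edge += bin_width
    return rows


# Hypothetical sample: session lengths in minutes, with one suspicious value.
session_minutes = [12, 15, 11, 14, 18, 13, 16, 12, 17, 95]
counts = histogram(session_minutes, bin_width=10)
for row in render(counts, 10):
    print(row)
```

In the printed rows, the long run of empty bins between the main cluster and the single far-off bar is what draws the eye to the irregular value. A real dataset would need a bin width chosen to suit its range.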
How do you balance technical versus non-technical requirements when creating data visualization applications?
Bitsy Bentley: From my perspective, it all comes down to communication. I enjoy the luxury of working with a brilliant team of developers that is very open to working in an agile, iterative process. It’s not possible to craft meaningful data experiences in a vacuum. Multi-disciplinary teams are essential to building the best data interactions.
When we’re developing new products, we include the business and technical teams in our user profile and wireframing discussions. These workshops are key to helping the entire team take ownership of the business problems we’re addressing in our data analysis applications. I find that by keeping the end users at the center of all our discussions, we are able to avoid the common pitfall of allowing the technology to wag the design and vice versa. All the requirements hinge on meaningful use cases. That focus allows the data interactions to push the boundaries of the technical solution, and it allows the technical solution to push the boundaries of the data interactions.
While it’s clear that creating data visualizations provides a way to understand big data in a non-threatening way, what are the challenges around creating a good one?
Bitsy Bentley: The best data visualization is the one that meets the audience’s need, and identifying that need is often a big challenge. I subscribe to the Tukey philosophy that it’s better to approximate an answer to the right question than provide an exact answer to the wrong question. It’s easy to get distracted by all the possible ways of describing a big dataset and forget that we’re often dealing with an audience of statistical novices.
My goal is that the users of products I design will take the data interactions for granted. It’s difficult to balance the “wow” of a visually striking presentation (which tends to sell well) with the “aha” moment when the user is able to easily and simply understand the problem at hand. At the beginning of my design process, I like to do a problem discovery phase to articulate the specific user needs we want to meet. That phase provides a good framework for measuring the success of the data experience and challenges us to keep the user at the center of the solution.
One of the big issues around using data is privacy. How does visualization provide a way to present big data without revealing details about the underlying data store? Do you typically provide access to source data when showing a visualization?
Bitsy Bentley: The market research industry has a long history of being concerned with data privacy. My company adheres to the Council of American Survey Research Organizations’ (CASRO) code of standards and ethics. These standards outline our responsibilities to our respondents, to our clients, and to the public. With those responsibilities in mind, my team and I look to show patterns, not individual people, as we visualize things like a generalized customer journey or overall foot traffic in an area.
There are times when I like to use deliberately fuzzy visualization methods, such as heatmaps, to keep the focus on high-level patterns. Much of our work is proprietary, so data access is usually restricted to our clients. In our visual data analysis products, we allow users to download the underlying aggregated data of the charts and graphs in the application. Occasionally our clients request SPSS files, and every so often we provide data through an API.
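One way to picture the "patterns, not individual people" idea is to aggregate positions into coarse grid cells before anything is drawn or downloaded. The Python sketch below is an illustration only: the coordinates, the cell size, and the minimum-count threshold are hypothetical, and the small-cell suppression step is a general aggregation practice assumed for this example, not something the interview or the CASRO code is quoted as prescribing.

```python
from collections import Counter


def grid_counts(points, cell_size):
    """Aggregate (x, y) points into square cells; only per-cell counts are kept."""
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)


def suppress_small_cells(counts, min_count):
    """Drop cells holding fewer than min_count points (an assumed threshold)."""
    return {cell: n for cell, n in counts.items() if n >= min_count}


# Hypothetical foot-traffic positions in metres.
visits = [(1.0, 1.5), (2.2, 3.1), (4.0, 4.9), (3.3, 0.4), (12.5, 7.0), (1.1, 2.0)]
cells = grid_counts(visits, cell_size=5)
heat = suppress_small_cells(cells, min_count=3)
print(heat)
```

The resulting dictionary of cell counts is the kind of aggregated data a heatmap could be drawn from. The individual coordinates are no longer present in it, and the lone visit in the sparse cell is left out entirely.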
We are clearly at the beginning of this journey when it comes to understanding big data and illustrating it through visualizations. How can we become better at creating and understanding visualizations?
Bitsy Bentley: Like many things in life, it’s all about practice. I find the visual and statistical literacy of many audiences to be very low, but with just a little exposure, their standards rapidly change.
The best ideas come from dialogue. Every data graphic we make is an opportunity to learn something new and challenge our audience to see something unexpected. Every data interaction we design is an opportunity to change the way our users think about the role data plays in their decision-making process. Every program we write helps us gain a deeper understanding of the relationship between data management and data visualization. If we want to be better at creating and understanding visualizations, we need to make more of them.
This interview was edited and condensed.