<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Strata &#187; Ron Miller</title>
	<atom:link href="http://strata.oreilly.com/ronm/feed" rel="self" type="application/rss+xml" />
	<link>http://strata.oreilly.com</link>
	<description>Making Data Work</description>
	<lastBuildDate>Thu, 23 May 2013 16:47:41 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Data journalism: From eccentric to mainstream in five years</title>
		<link>http://strata.oreilly.com/2012/12/simon-rogers-data-journalism.html</link>
		<comments>http://strata.oreilly.com/2012/12/simon-rogers-data-journalism.html#comments</comments>
		<pubDate>Fri, 21 Dec 2012 14:00:33 +0000</pubDate>
		<dc:creator>Ron Miller</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[data journalism]]></category>
		<category><![CDATA[data journalism tools]]></category>
		<category><![CDATA[data reporting]]></category>
		<category><![CDATA[information]]></category>
		<category><![CDATA[media]]></category>
		<category><![CDATA[reporters]]></category>

		<guid isPermaLink="false">http://strata.oreilly.com/?p=53713</guid>
		<description><![CDATA[Simon Rogers (@smfrogers), editor of The Guardian&#8217;s Datablog and Datastore, and a speaker at the upcoming Strata Conference in California, was one of the first data journalists at The Guardian. In the following interview, Rogers discusses the changes he&#8217;s seen &#8230; ]]></description>
				<content:encoded><![CDATA[<p>Simon Rogers (<a href="http://twitter.com/smfrogers">@smfrogers</a>), editor of The Guardian&#8217;s <a href="http://www.guardian.co.uk/datablog">Datablog</a> and <a href="http://www.twitter.com/datastore">Datastore</a>, and a speaker at the upcoming <a href="http://strataconf.com/strata2013/public/schedule/speaker/102764">Strata Conference in California,</a> was one of the first data journalists at The Guardian. </p>
<p>In the following interview, Rogers discusses the changes he&#8217;s seen in data journalism over the last five years and how new tools and increased <a href="http://fivethirtyeight.blogs.nytimes.com">notoriety</a> will shape the data journalism space.</p>
<h2>Why has data become the story for some journalists like yourself?</h2>
<p><strong>Simon Rogers:</strong> It&#8217;s a big change for reporters to go from being suspicious of numbers to noticing that data journalism is often the only way to get stories from them. I think it&#8217;s a combination &mdash; the huge growth in published data, combined with things like <a href="http://wikileaks.org">WikiLeaks</a>, which changed the game by making news editors realize this was a new way to get stories.<span id="more-53713"></span></p>
<h2>Why do you think readers like data-centric stories?</h2>
<p><strong>Simon Rogers:</strong> I think it&#8217;s about trust. The public doesn&#8217;t trust reporters any more, but if you can show the workings behind your stories, to be transparent, then it makes your stories stronger. After years of unfettered comment online, there&#8217;s a real desire for facts.</p>
<h2>You&#8217;ve been at data journalism for a while. How has it changed over the last five years?</h2>
<p><strong>Simon Rogers:</strong> It&#8217;s become much more mainstream. The work that I do used to be regarded as a little eccentric by the news desk; now it&#8217;s part of the fabric of The Guardian.</p>
<h2>Are the tools of data journalism getting better?</h2>
<p><strong>Simon Rogers:</strong> Yes and no. If you can code, they are becoming brilliant, with things like <a href="http://misoproject.com">The Miso Project</a> really changing how we can work. But if you&#8217;re a reporter in a hurry, I&#8217;m getting a sense of stalling. We had a flurry of great tools a couple of years ago &mdash; <a href="https://sites.google.com/site/fusiontablestalks/stories">Google Fusion Tables</a>, <a href="http://code.google.com/p/google-refine/">Refine</a> and so on &mdash; but we&#8217;ve also lost previously helpful things like <a href="http://www-958.ibm.com/software/analytics/manyeyes/">Many Eyes</a>. <a href="http://datawrapper.de">Datawrapper</a> is a great new way to generate charts, and I&#8217;m liking <a href="http://cartodb.com">CartoDB</a>, but we still need more that anyone can use.</p>
<h2>How can journalism schools prepare student journalists for our increasingly data-centric world?</h2>
<p><strong>Simon Rogers:</strong> I think it&#8217;s less about tools, as these are always changing. It&#8217;s more about helping create an attitude that looks for stories in numbers in a journalistic way &mdash; asking the same questions as you would in person to a contact.</p>
<h2>Given the role of data analysis in this year&#8217;s US election, and the near mythic status of <a href="http://en.wikipedia.org/wiki/Nate_Silver">Nate Silver</a>, what does this mean for data journalism moving forward?</h2>
<p><strong>Simon Rogers:</strong> It helps with the general feel that data is something to be embraced rather than avoided in journalism. And not all data work has to be as amazing as Nate&#8217;s. Much of what we do is very simple analysis: Has something become bigger or smaller? How does it compare? That sort of thing. But the key point holds: it helps you get stories.</p>
<div style="float: left;border-top: thin gray solid;border-bottom: thin gray solid;padding: 20px;margin: 20px 2px;clear: both"><p><a href="http://strataconf.com/strata2013?_discount=STRATA20&amp;intcmp=il-strata-stsc13-simon-rogers-data-journalism"><img style="float: left;border: none;padding-right: 10px" src="http://cdn.oreilly.com/radar/images/promos/strataca13-148x178.jpg" /></a><a href="http://strataconf.com/strata2013?_discount=STRATA20&amp;intcmp=il-strata-stsc13-simon-rogers-data-journalism"><strong>Strata Conference Santa Clara</strong></a> &mdash; Strata Conference Santa Clara, being held Feb. 26-28, 2013 in California, gives you the skills, tools, and technologies you need to make data work today.</p>
<p><a href="http://strataconf.com/strata2013?_discount=STRATA20&amp;intcmp=il-strata-stsc13-simon-rogers-data-journalism"><strong>Save 20% on registration with the code STRATA20</strong></a></p></div>
<p><strong>Related:</strong></p>
<ul>
<li> <a href="http://radar.oreilly.com/2012/11/investigating-data-journalism.html">Investigating data journalism: O&#8217;Reilly Radar series</a></li>
<li> <a href="http://strata.oreilly.com/2011/09/data-journalism-process-guardian.html">The work of data journalism: Find, clean, analyze, create … repeat</a></li>
<li> <a href="http://strata.oreilly.com/2011/03/simon-rogers-guardian-wikileaks.html">Before you interrogate data, you must tame it</a></li>
<li> <a href="http://strata.oreilly.com/tag/data-journalism">Data journalism profiles and coverage</a></li>
<li> <a href="http://shop.oreilly.com/product/0636920025603.do">The Data Journalism Handbook</a> (book)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://strata.oreilly.com/2012/12/simon-rogers-data-journalism.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Big data is helping EA level up</title>
		<link>http://strata.oreilly.com/2012/12/big-data-is-helping-ea-level-up.html</link>
		<comments>http://strata.oreilly.com/2012/12/big-data-is-helping-ea-level-up.html#comments</comments>
		<pubDate>Wed, 12 Dec 2012 14:00:39 +0000</pubDate>
		<dc:creator>Ron Miller</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[data company]]></category>
		<category><![CDATA[data product]]></category>
		<category><![CDATA[Electronic Arts]]></category>
		<category><![CDATA[Strata CA 13]]></category>
		<category><![CDATA[strataconf]]></category>
		<category><![CDATA[video games]]></category>

		<guid isPermaLink="false">http://strata.oreilly.com/?p=53370</guid>
		<description><![CDATA[Electronic Arts (EA) isn&#8217;t the first company that comes to mind when you think of big data. Yet the gaming company is collecting increasing amounts of data about its online players, and as this data accumulates and gains steam, it &#8230; ]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.ea.com/">Electronic Arts (EA)</a> isn&#8217;t the first company that comes to mind when you think of big data. Yet the gaming company is collecting increasing amounts of data about its online players, and as this data accumulates and gains steam, it falls under the big data category. </p>
<p>If a game maker like EA is considered a big data company, it could have implications for other companies we might not think of as typical big data generators. With that in mind, I got in touch with <a href="http://www.linkedin.com/pub/rajat-taneja/3/79a/276">Rajat Taneja</a>, chief technology officer at EA and a keynote speaker at the upcoming <a href="http://strataconf.com/strata2013/public/schedule/detail/27603?intcmp=il-strata-stsc13-rajat-taneja-ea-interview">Strata Conference in California</a>. Since Taneja came on board with EA in 2011, he&#8217;s helped steer the company&#8217;s technological initiatives, including understanding the impact this growing data store will have on the firm &mdash; both from a processing standpoint and how to use it to provide games and services customers want most. He says no matter what your company does, if you have constantly connected online services, you are very likely going to be dealing with lots of data.</p>
<p>Our interview follows.<span id="more-53370"></span></p>
<h2>How does big data apply to a game company like EA?</h2>
<p><strong>Rajat Taneja:</strong> When thinking about big data, video games may not immediately come to mind, but in reality, video games today are always-on, digital interactive entertainment that transmits huge amounts of data across the network as you play. Our network processes terabytes of data every day just from gameplay events. As our games become even more interconnected and cross-device enabled, the experiences become even richer and telemetry will only increase.</p>
<h2>Where does your data come from and how do you collect it?</h2>
<p><a href="http://strataconf.com/strata2013/public/schedule/detail/27603"><img src="http://s.radar.oreilly.com/wp-files/5/2012/12/1212-rajat-taneja.jpg" alt="Rajat Taneja" width="75" height="100" class="alignright size-full wp-image-53371" /></a><strong>Rajat Taneja:</strong> Gaming is a multi-user online experience now, where friends and the gaming community come together across the globe to play. Everything from the game&#8217;s content and a player&#8217;s progress to social features, payment processes and account information contributes to the amount of data that we have to process on our EA network.<br />
 <br />
We have a very sophisticated internal data platform that ingests, processes and efficiently stores the data across a wide variety of systems spanning structured and unstructured databases. The platform leverages open-source technologies (like the Hadoop stack) as well as vendor technology. This data is then made available through a variety of analytic tools for automated actions, deeper analytics and reporting. This is where data turns into information and insight.</p>
<h2>What kinds of tools do you use to process and view the data?</h2>
<p><strong>Rajat Taneja:</strong> Our data platform was predominantly architected around a very structured pipeline and conventional data warehouse systems. This was primarily geared for complex OLAP and OLTP workloads. However, the nature of data we collect is now changing dramatically, and its use and importance in personalizing game play experiences is becoming critical. We, therefore, have to rethink fundamentally how the pipeline is architected and the scale it has to support. Our new architecture is now focused on far more agile, near real-time processes that are now being predominantly built using off-the-shelf open source tools for storage, compute, modeling and analytics.</p>
<h2>How do you make data accessible to your employees, partners and even your customers? And how can presenting this data enhance your relationships with these parties?</h2>
<p><strong>Rajat Taneja:</strong> <a href="http://store.origin.com">Origin</a>, our direct-to-consumer gaming platform, brings the customer&#8217;s gaming experience into one single application, so no matter what game or device you play on, you can connect all of your experiences. Your friend list lives there, your achievements, and your payment information. This makes it easy for our consumers to control their own data and track their progress, managed securely on our network.<br />
 <br />
Inside the company, the data and derived insight are made available to game developers and studios so they can gain deep insight into playing patterns and feature usage. This helps them tailor the game experiences and capabilities to enhance the enjoyment for our consumers. Our live operations team gets a near real-time view of various game functionality, which provides them with critical information that is used to manage the day-to-day operations of the game. Our customer experience and marketing teams get relevant snapshots of data critical to their efforts and initiatives.<br />
 <br />
Overall, data is a key connective tissue that allows all functions to operate with more confidence and deeper insight into their specific areas.</p>
<h2>If EA is a big data company, what does that mean for other companies? What lessons can other companies take from yours?</h2>
<p><strong>Rajat Taneja:</strong> So much of our everyday lives is now connected to online data and interactions, and games are no different. Our industry is going through a massive transition &mdash; a shift from packaged goods to digital delivery and constantly connected online services &mdash; that has vaulted us into the sphere of big data.</p>
<p>We have all the big data challenges, but also many, many opportunities to transform games from something you buy to a place that you go. The big key for us is looking at our customer base and wanting to meet and exceed their high expectations for an entertainment experience. They won&#8217;t tolerate network downtime, their data must be secure, and they don&#8217;t want to have multiple logins and passwords. We&#8217;re building the infrastructure for seamless experiences that take them from device to device, platform to platform, where their experience and connection is consistent across time and devices. By putting the consumers at the center of their gaming world, we have set into motion a technical challenge where using big data is a critical component of the solution. Think about your customers, put their needs first, and evaluate if big data is part of your solution as well.</p>
<p><em>This interview was edited and condensed.</em></p>
<div style="float: left;border-top: thin gray solid;border-bottom: thin gray solid;padding: 20px;margin: 20px 2px;clear: both"><p><a href="http://strataconf.com/strata2013?_discount=STRATA20&amp;intcmp=il-strata-stsc13-rajat-taneja-ea-interview"><img style="float: left;border: none;padding-right: 10px" src="http://cdn.oreilly.com/radar/images/promos/strataca13-148x178.jpg" /></a><a href="http://strataconf.com/strata2013?_discount=STRATA20&amp;intcmp=il-strata-stsc13-rajat-taneja-ea-interview"><strong>Strata Conference Santa Clara</strong></a> &mdash; Strata Conference Santa Clara, being held Feb. 26-28, 2013 in California, gives you the skills, tools, and technologies you need to make data work today.</p>
<p><a href="http://strataconf.com/strata2013?_discount=STRATA20&amp;intcmp=il-strata-stsc13-rajat-taneja-ea-interview"><strong>Save 20% on registration with the code STRATA20</strong></a></p></div>
<p><strong>Related:</strong></p>
<ul>
<li> <a href="http://strata.oreilly.com/2011/11/big-data-business-enterprise.html">Big data goes to work</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://strata.oreilly.com/2012/12/big-data-is-helping-ea-level-up.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Every company has a big data issue</title>
		<link>http://strata.oreilly.com/2012/11/big-data-all-companies.html</link>
		<comments>http://strata.oreilly.com/2012/11/big-data-all-companies.html#comments</comments>
		<pubDate>Thu, 01 Nov 2012 13:00:49 +0000</pubDate>
		<dc:creator>Ron Miller</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[business data]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[data democratization]]></category>
		<category><![CDATA[democratization of data]]></category>
		<category><![CDATA[small business]]></category>

		<guid isPermaLink="false">http://strata.oreilly.com/?p=52730</guid>
		<description><![CDATA[When you bandy about a term like &#8220;big data&#8221; often enough, it tends to lose its meaning. But big data is much more than a marketing term, although it is that, too &#8212; it&#8217;s a means of trying to understand &#8230; ]]></description>
				<content:encoded><![CDATA[<p>When you bandy about a term like &#8220;big data&#8221; often enough, it tends to lose its meaning. But big data is much more than a marketing term, although it is that, too &mdash; it&#8217;s a means of trying to understand and control the sheer volume of information we are seeing inside and outside our organizations.</p>
<p>It&#8217;s easy to dismiss this as a problem for companies like Google and Facebook, which are gathering mountains of data from users. However, as <a href="http://www.gooddata.com/">GoodData</a> CEO Roman Stanek (<a href="https://twitter.com/RomanStanek">@RomanStanek</a>) points out in the following interview, the growing amount of data from a variety of sources makes big data an issue that has an impact on every company, regardless of size.</p>
<p>Stanek, who has been an entrepreneur for more than 20 years, started GoodData in 2007 as a way to simplify business intelligence by putting it in the cloud. Today, he sees big data as more than a business intelligence problem, and as he has watched his business evolve, he believes companies like his can take big data out of the realm of data scientists and put it into the hands of ordinary business users.</p>
<h2>There is a perception that big data is a big company problem. What role does big data have in small- to medium-size organizations?</h2>
<p><img src="http://s.radar.oreilly.com/wp-files/5/2012/10/1012-roman-stanek.jpg" alt="Roman Stanek" width="92" height="130" class="alignright size-full wp-image-52737" /><strong>Roman Stanek:</strong> Big data comes from hundreds of sources, most of which are outside a company&#8217;s firewalls, such as customer interactions, social media and emails. A company&#8217;s size is irrelevant to the volume of big data it has to manage and understand. For example, a company with 100 employees may have to answer thousands of customer-support calls coming in from Facebook, Twitter, email and telephone. That&#8217;s a massive amount of data it has to deal with. </p>
<p>In addition, big data represents tremendous potential wealth for all companies, no matter how small or large those enterprises are. When businesses are smart about leveraging data, they can make better and faster business decisions.<span id="more-52730"></span> </p>
<h2>What factors are contributing to the growing amount of business data?</h2>
<p><strong>Roman Stanek:</strong> To name just a few: inventory levels, sales results, negative comments on Facebook, positive comments on Twitter, shopping habits on Amazon, playlists on Pandora and online search habits. No matter what you call the information or what it describes, it&#8217;s all data being collected. </p>
<p>IDC predicts digital data will grow to <a href="http://cdn.idc.com/research/Predictions12/Main/downloads/IDCTOP10Predictions2012.pdf">2.7 zettabytes in 2012 (PDF)</a>. Thanks to new technologies like Hadoop, once-unquantifiable data &mdash; like Facebook conversations and tweets &mdash; can now be quantified. Nearly everything is measurable. The result is that companies are spending big dollars to collect, store and measure astronomical amounts of data.</p>
<h2>What&#8217;s the difference between today&#8217;s big data and yesterday&#8217;s business intelligence (BI)?</h2>
<p><strong>Roman Stanek:</strong> Traditional BI is antiquated and broken. Current tools cannot cope with the massive amounts of unstructured data coming in from social networks and the cloud. Commonly used BI tools left a long trail of failed implementations and frustrated customers because they required such heavy lifting from IT departments. </p>
<p>In my opinion, big data&#8217;s only real value lies in the ability of businesses to transform data into insights they can act on. Sales managers, for example, can quickly analyze sales reps&#8217; results, view new and lost contracts and compare team performance to the plan they set months earlier. </p>
<p>Help desk staff can see how individual customers affect sales and profit, so they know when to go above and beyond to retain certain customers while allowing the low fliers to churn. Insurance agents can predict the cost and nature of impending damage as hurricanes hurtle toward their region.</p>
<h2>How do you get to a point where big data is not just in the realm of data scientists asking the big questions, but where business users find the answers to do their jobs and drive business growth?</h2>
<p><strong>Roman Stanek:</strong> You shouldn&#8217;t need a PhD in statistics to interpret data. I believe people already know what data they need to dive into to make strategic decisions. If I&#8217;m a chief marketing officer, for example, I&#8217;m dying to learn if Facebook is <em>really</em> driving my sales and, if so, to what extent. To find that out, I need a modern app that pulls in data from myriad sources and then presents that information in a simple, visually intuitive way that lets anyone inside a company make sense of his or her data. The growth and maturity of cloud computing technologies have finally made this combination possible.</p>
<h2>How does business get to the point where big data is driving business strategy looking forward as opposed to what has happened, looking back?</h2>
<p><strong>Roman Stanek:</strong> Using big data to drive business strategy is the next stage of maturity. No company will jump from being anecdotal to analytical in one day. Companies must focus on becoming metrics driven so that they reach that level of maturity in their data analytics. </p>
<p>Traditional BI tools look at historical trends, allowing you to analyze what&#8217;s already happened. For example, I can see how my sales trended in the previous two to three quarters.  The challenge with this approach is it doesn&#8217;t enable you to react to today&#8217;s information so you can influence tomorrow&#8217;s business performance.</p>
<p>In contrast, next-generation analytics leverage the scalability and processing power of the cloud to find insight people can use now. I recently heard about HR apps that analyze big data to determine which personality types are best suited for different types of jobs. That kind of information enables HR departments to create simple but revealing questions to find &mdash; and retain &mdash; the best employees for each position. That reduces churn, which has a direct impact on a company&#8217;s bottom line. </p>
<p>That&#8217;s just an example. The point is that a company&#8217;s size truly doesn&#8217;t matter. Big data offers the key for any company, regardless of size, to find new sources of revenue, increase profit and make smarter decisions, faster. Thanks to new real-time, cloud-based technologies, that ability is already a reality.</p>
<p><em>This interview was edited and condensed.</em></p>
<div style="float: left;border-top: thin gray solid;border-bottom: thin gray solid;padding: 20px;margin: 20px 2px;clear: both"><p><a href="http://strataconf.com/strata2013/public/regwith/strata20?intcmp=il-strata-stsc13-roman-stanek-interview"><img style="float: left;border: none;padding-right: 10px" src="http://cdn.oreilly.com/radar/images/promos/strataca13-148x178.jpg" /></a><a href="http://strataconf.com/strata2013/public/regwith/strata20?intcmp=il-strata-stsc13-roman-stanek-interview"><strong>Strata Conference Santa Clara</strong></a> &mdash; Strata Conference Santa Clara, being held Feb. 26-28, 2013 in California, gives you the skills, tools, and technologies you need to make data work today.</p>
<p><a href="http://strataconf.com/strata2013/public/regwith/strata20?intcmp=il-strata-stsc13-roman-stanek-interview"><strong>Save 20% on registration with the code STRATA20</strong></a></p></div>
]]></content:encoded>
			<wfw:commentRss>http://strata.oreilly.com/2012/11/big-data-all-companies.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A search for balance between the &#8220;wow&#8221; and &#8220;aha&#8221; in visualizations</title>
		<link>http://strata.oreilly.com/2012/10/visualization-balance-audience-need.html</link>
		<comments>http://strata.oreilly.com/2012/10/visualization-balance-audience-need.html#comments</comments>
		<pubDate>Wed, 03 Oct 2012 13:00:49 +0000</pubDate>
		<dc:creator>Ron Miller</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[customer]]></category>
		<category><![CDATA[data product]]></category>
		<category><![CDATA[data set]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[use case]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[visualization creation]]></category>
		<category><![CDATA[visualizations]]></category>

		<guid isPermaLink="false">http://strata.oreilly.com/?p=52208</guid>
		<description><![CDATA[Because of the size, complexity and density of big data, it&#8217;s not always easy to find the important insights hiding in all that information. That&#8217;s where data visualization comes into play. A great visualization creates meaning where none existed. Bitsy &#8230; ]]></description>
				<content:encoded><![CDATA[<p>Because of the size, complexity and density of big data, it&#8217;s not always easy to find the important insights hiding in all that information. That&#8217;s where data visualization comes into play. A great visualization creates meaning where none existed.</p>
<p>Bitsy Bentley (<a href="https://twitter.com/bitsybot">@bitsybot</a>) is the director of data visualization at <a href="http://www.gfkamerica.com">GfK Custom Research</a>, where she works with information designers to craft meaningful data experiences for a variety of business audiences. In the following interview, she discusses the space between a &#8220;wow&#8221; response and an &#8220;aha&#8221; moment, how her team addresses privacy concerns, and why practice is vital for both visualization creators and viewers.</p>
<p>Bentley will explore related visualization topics during her <a href="http://strataconf.com/stratany2012/public/schedule/detail/25493?intcmp=il-strata-stny12-bitsy-hansen-interview">presentation</a> at Strata Conference + Hadoop World in New York City later this month.</p>
<h2>Why are data visualizations an effective way to understand the underlying data?</h2>
<p><a href="http://strataconf.com/stratany2012/public/schedule/detail/25493?intcmp=il-strata-stny12-bitsy-hansen-interview"><img src="http://cdn.oreillystatic.com/en/assets/1/eventprovider/1/_@user_107244.jpg" border="0" alt="Bitsy Bentley" style="float: right;margin: 5px 0 10px 15px" /></a><strong>Bitsy Bentley:</strong> There is so much beauty and richness in big datasets, and now that we have enough processing power to harness that richness, it&#8217;s little wonder that interest in data visualization is exploding. To quote <a href="http://en.wikipedia.org/wiki/John_Tukey">John Tukey</a>: &#8220;The greatest value of a picture is when it forces us to notice what we never expected to see.&#8221; My clients find that, whether they&#8217;re more concerned with numbers or more concerned with stories, an appropriate visual is integral to their understanding of the data. </p>
<p>Visualization unlocks the serendipity of data analysis. It provides a language that is less intimidating than an overwhelming array of digits. Something as simple as a set of histograms breaking down the distribution of a data store makes it easy to find irregularities and outliers in the data. <span id="more-52208"></span></p>
<h2>How do you balance technical versus non-technical requirements when creating data visualization applications?</h2>
<p><strong>Bitsy Bentley:</strong> From my perspective, it all comes down to communication. I enjoy the luxury of working with a brilliant team of developers that is very open to working in an agile, iterative process. It&#8217;s not possible to craft meaningful data experiences in a vacuum. Multi-disciplinary teams are essential to building the best data interactions. </p>
<p>When we&#8217;re developing new products, we include the business and technical teams in our user profile and wireframing discussions. These workshops are key to helping the entire team take ownership of the business problems we&#8217;re addressing in our data analysis applications. I find that by keeping the end users at the center of all our discussions, we are able to avoid the common pitfall of allowing the technology to wag the design and vice versa. All the requirements hinge on meaningful use cases. It allows the data interactions to push the boundaries of the technical solution, and it allows the technical solution to push the boundaries of the data interactions. </p>
<h2>While it&#8217;s clear that creating data visualizations provides a way to understand big data in a non-threatening way, what are the challenges around creating a good one?</h2>
<p><strong>Bitsy Bentley:</strong> The best data visualization is the one that meets the audience&#8217;s need, and identifying that need is often a big challenge. I subscribe to the Tukey philosophy that it&#8217;s better to approximate an answer to the right question than provide an exact answer to the wrong question. It&#8217;s easy to get distracted by all the possible ways of describing a big dataset and forget that we&#8217;re often dealing with an audience of statistical novices. </p>
<p>My goal is that the users of products I design will take the data interactions for granted. It&#8217;s difficult to balance the &#8220;wow&#8221; of a visually striking presentation (that tends to sell well) with the &#8220;aha&#8221; moment when the user is able to easily and simply understand the problem at hand. At the beginning of my design process, I like to do a problem discovery phase to articulate the specific user needs we want to meet. It provides a good framework for measuring the success of the data experience and challenges us to keep the user at the center of the solution.</p>
<h2>One of the big issues around using data is privacy. How does visualization provide a way to present big data without revealing details about the underlying data store? Do you typically provide access to source data when showing a visualization?</h2>
<p><strong>Bitsy Bentley:</strong> The market research industry has a long history of being concerned with data privacy. My company adheres to the Council of American Survey Research Organizations&#8217; (CASRO) <a href="http://www.casro.org/codeofstandards.cfm">code of standards and ethics</a>. These standards outline our responsibilities to our respondents, our clients, and the public. With those responsibilities in mind, my team and I look to show patterns, not individual people, as we visualize things like a generalized customer journey or overall foot traffic in an area. </p>
<p>There are times that I like using deliberately fuzzy visualization methods, such as heatmaps, to keep the focus on high-level patterns. Much of our work is proprietary, so data access is usually restricted to our clients. In our visual data analysis products, we allow users to download the underlying aggregated data of the charts and graphs in the application. Occasionally our clients request <a href="http://en.wikipedia.org/wiki/SPSS">SPSS files</a>, and every so often we provide data through an API.</p>
<h2>We are clearly at the beginning of this journey when it comes to understanding big data and illustrating it through visualizations. How can we become better at creating and understanding visualizations?</h2>
<p><strong>Bitsy Bentley:</strong> Like many things in life, it&#8217;s all about practice. I find the visual and statistical literacy of many audiences to be very low, but with just a little exposure, their standards rapidly change. </p>
<p>The best ideas come from dialogue.  Every data graphic we make is an opportunity to learn something new and challenge our audience to see something unexpected. Every data interaction we design is an opportunity to change the way our users think about the role data plays in their decision-making process. Every program we write helps us gain a deeper understanding of the relationship between data management and data visualization. If we want to be better at creating and understanding visualizations, we need to make more of them.</p>
<div style="float: left;border-top: thin gray solid;border-bottom: thin gray solid;padding: 20px;margin: 20px 2px;clear: both"><a href="https://en.oreilly.com/stratany2012/public/regwith/RADAR20?intcmp=il-strata-stny12-bitsy-hansen-interview"><img style="float: left;border: none;padding-right: 10px" src="http://cdn.oreilly.com/radar/images/promos/2012-strata-ny-promo.gif" /></a><a href="https://en.oreilly.com/stratany2012/public/regwith/RADAR20?intcmp=il-strata-stny12-bitsy-hansen-interview"><strong>Strata Conference + Hadoop World</strong></a> &mdash;  The O&#8217;Reilly Strata Conference, being held Oct. 23-25 in New York City, explores the changes brought to technology and business by big data, data science, and pervasive computing. This year, Strata has joined forces with Hadoop World.</p>
<p><a href="https://en.oreilly.com/stratany2012/public/regwith/RADAR20?intcmp=il-strata-stny12-bitsy-hansen-interview"><strong>Save 20% on registration with the code RADAR20</strong></a></div>
<p><em>This interview was edited and condensed.</em></p>
<p><strong>Related:</strong></p>
<ul>
<li> <a href="http://strata.oreilly.com/2012/07/visualization-criticism.html">Walking the tightrope of visualization criticism</a></li>
<li> <a href="http://strata.oreilly.com/2012/02/why-data-visualization-matters.html">Why data visualization matters</a></li>
<li> <a href="http://strata.oreilly.com/2012/02/how-to-create-visualization-facebook-vacation.html">How to create a visualization</a></li>
<li> <a href="http://radar.oreilly.com/2012/06/narrative-spreadsheets-data-reporting-analytics.html">Stories over spreadsheets</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://strata.oreilly.com/2012/10/visualization-balance-audience-need.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data is the real business model for social</title>
		<link>http://strata.oreilly.com/2012/09/social-data-business-model.html</link>
		<comments>http://strata.oreilly.com/2012/09/social-data-business-model.html#comments</comments>
		<pubDate>Mon, 17 Sep 2012 13:00:42 +0000</pubDate>
		<dc:creator>Ron Miller</dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[personalization]]></category>
		<category><![CDATA[social ads]]></category>
		<category><![CDATA[social data]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[social search]]></category>
		<category><![CDATA[targeting]]></category>

		<guid isPermaLink="false">http://strata.oreilly.com/?p=51981</guid>
		<description><![CDATA[As social media websites gather ever-growing data stores, they might be better served by finding ways to make profitable use of that data instead of serving ads as their chief means of raising revenue. While the data might give them the &#8230; ]]></description>
				<content:encoded><![CDATA[<p>As social media websites gather ever-growing data stores, they might be better served by finding ways to make profitable use of that data instead of serving ads as their chief means of raising revenue. While the data might give them the information they need to serve more targeted ads &mdash; although in my experience they still have a ways to go with that &mdash; the real value in the site could be the data itself. </p>
<p>Of course, if social sites start selling data to the highest bidder, that raises open questions about data ownership and privacy, and about how to strip personal identifiers.</p>
<p><a href="http://about.me/mariewallace">Marie Wallace</a> (<a href="http://www.twitter.com/marie_wallace">@marie_wallace</a>) is social analytics strategist for the IBM Collaboration Solutions division. She has spent more than a decade at IBM working on content analytics, and her experience uniquely positions her to address questions regarding big data, social media and analytics. Our interview follows.</p>
<h2>Social media&#8217;s real value might not be in selling ads, but in the data they are collecting. Why do you think that is?</h2>
<p><img src="http://s.radar.oreilly.com/wp-files/5/2012/09/0912-marie-wallace.jpg" alt="" title="0912-marie-wallace" width="190" height="191" class="alignright size-full wp-image-51980" /><strong>Marie Wallace:</strong> Ad targeting has worked so well for search because it&#8217;s aligned with and supportive of that particular activity; when I am searching for information about products or services, I am happy to get ads that may help direct my search. Ads are somewhat analogous to a value-added service, and social search makes the ads more personalized and relevant, which is why Google has invested so heavily in Google+. </p>
<p>The key is that in most cases ads only work in a search-like context; however, people are not going to most social media sites to search. They are going to converse with friends and family, which makes ads interruptive and frequently invasive. This is further exacerbated by mobile, where limited real estate makes ads even more offensive as they are distracting and clutter the screen. Social search is one example of a service that sits on top of social data, but there is a whole plethora of other services that social data can drive &mdash; from market research to consumer/brand engagement, social recommenders, information filtering, or expertise location. <span id="more-51981"></span></p>
<h2>It&#8217;s one thing to recognize the value of data, but how do you <em>extract</em> that value?</h2>
<p><strong>Marie Wallace:</strong> Extracting value from data requires a well-described set of scenarios with a clear understanding of what facts would be considered valuable for those scenarios. For example, when looking for a job, there is a very specific set of questions that people want asked and answered: employee sentiment, corporate success (revenue, customers, products, growth), location, demographics, technologies, industries, skills, competitors, values, culture. </p>
<p>These are very different to the questions (and hence analysis) that might be pertinent to a different scenario. For example, when deciding where to go on holidays, people are likely more interested in the location, activities, accommodation, weather, cost, demographics, or visitor sentiment. The key here is that analysis has to be not only domain-specific but scenario-specific, which is why targeted specialist services like <a href="http://www.linkedin.com">LinkedIn</a> or <a href="http://www.tripadvisor.com">Tripadvisor</a> are always going to be able to deliver greater analytics value for the specific scenarios they support.</p>
<h2>There are concerns on social networks about the sites abusing the data users are contributing. Is there a reliable way to anonymize data and deliver it in aggregate form that strips out individual user information?</h2>
<p><strong>Marie Wallace:</strong> I think the issue of privacy is a more complex problem, and while anonymizing user information is part of the solution, I don&#8217;t believe it&#8217;s at the heart of the problem. I believe the key social media challenges moving forward will be those of permission, trust, and transparency. People need to know exactly how their data is being used so that they can give permission for that use and that use only. For example, if I have a Tesco loyalty card and I trust them to respect my data, then I might be happy for them to see my Facebook Likes so they can provide me with more relevant special offers. Or if I register on LinkedIn, I know that my data is going to be provided to recruiters and hiring companies, but I most definitely don&#8217;t want them to use it for any other undisclosed purpose. </p>
<p>It is also likely that in the future we will see information brokers emerge, providing a level of indirection (perhaps even obfuscation or anonymization) by acting as mediators on our behalf. This simplifies the authorization model, but it does assume that we trust the information brokers and the models that they use for controlling access to our information.</p>
<h2>Have the tools caught up with the amount and variety of data so that services like social networks can begin to manipulate the data they collect?</h2>
<p><strong>Marie Wallace:</strong> Having spent the last decade working on content analytics and semantic technologies, I can confidently say that many of the required tools have been around for years waiting for demand to catch up with supply. The advent of social media, alongside the growth of a new generation of big data platforms, now gives them the perfect business problem, dataset, and execution platform through which to shine. However, I believe the industry does have one significant gap in this otherwise rich landscape of technologies, and it&#8217;s a gap that I believe will impact the value that we can derive from these social networks. </p>
<p>It&#8217;s our handling of massive-scale networks that I believe is going to become a technological challenge as we move rapidly toward massive-scale graphs with social, semantic, temporal, and geospatial characteristics and as we look to apply complex analytics across these networks. There are a number of existing technologies from the linked data world that could morph to fill this gap, or alternatively there is a new generation of graph databases and analytics algorithms emerging focused on tackling this specialized problem. Only time will tell which technologies will emerge as the winners.</p>
<h2>What kinds of uses could you envision social sites finding for their data?</h2>
<p><strong>Marie Wallace:</strong> For the medium-term, I suspect that we will continue to see social analysis being driven by the marketing, sales, and support organizations. Social data will be used for market research, to help expand sales channels, and to improve how brands interact with customers. </p>
<p>As we move from marketing to sales to support, the type of analysis becomes more complex, and this will put pressure on the algorithms being used to evaluate the data and derive insights: identity and entity disambiguation, micro-segmentation, influence analysis, sentiment, intent, network information flow, and community dynamics. A growing number of social applications will emerge, each delivering niche value to consumers and generating specialist data for brands. This ecosystem of social networks will drive consumer-brand engagement: everything from consumer feedback systems and customer support to product and service innovation. Brands will move away from a focus on passive listening/monitoring to one of active engagement, and this will require a broader range of analytics in order to optimize and operationalize those interactions. </p>
<p>Further out, I see us expanding the personalization that can be realized. Social data will become increasingly important for personalizing every search and navigation experience, from Google, Amazon, and Netflix to Expedia; however, search is only the tip of the iceberg. I anticipate that in the longer term social data will be used to personalize a whole range of experiences that cross the physical/digital divide, transforming how we shop, what we think, how we learn, and ultimately how we live. </p>
<p>Just imagine what will happen when we intersect the social web and the semantic web with the web of data. Then we will really see personalization take on a whole new form!</p>
<p><em>This interview was edited and condensed.</em></p>
<div style="float: left; border-top: thin gray solid; border-bottom: thin gray solid; padding: 20px; margin: 20px 2px; clear: both;"><a href="https://en.oreilly.com/stratany2012/public/regwith/RADAR20?intcmp=il-strata-stny12-marie-wallace-interview-social-media-data"><img style="float: left; border: none; padding-right: 10px;" src="http://cdn.oreilly.com/radar/images/promos/2012-strata-ny-promo.gif" /></a><a href="https://en.oreilly.com/stratany2012/public/regwith/RADAR20?intcmp=il-strata-stny12-marie-wallace-interview-social-media-data"><strong>Strata Conference + Hadoop World</strong></a> &mdash;  The O&#8217;Reilly Strata Conference, being held Oct. 23-25 in New York City, explores the changes brought to technology and business by big data, data science, and pervasive computing. This year, Strata has joined forces with Hadoop World.</p>
<p><a href="https://en.oreilly.com/stratany2012/public/regwith/RADAR20?intcmp=il-strata-stny12-marie-wallace-interview-social-media-data"><strong>Save 20% on registration with the code RADAR20</strong></a></div>
<p><strong>Related:</strong></p>
<ul>
<li> <a href="http://strata.oreilly.com/2011/03/social-data-tools-application.html">Social data is an oracle waiting for a question</a></li>
<li> <a href="http://strata.oreilly.com/2011/11/social-network-analysis.html">Social network analysis isn’t just for social networks</a></li>
<li> <a href="http://strata.oreilly.com/2011/01/data-markets-resellers-gnip.html">Data markets aren’t coming. They’re already here</a></li>
<li> <a href="http://radar.oreilly.com/2010/05/my-contrarian-stance-on-facebook-privacy.html">Tim O&#8217;Reilly: My Contrarian Stance on Facebook and Privacy</a></li>
<li> <a href="http://www.internetevolution.com/author.asp?section_id=1047&#038;doc_id=249815&#038;&#038;utm_source=buffer&#038;buffer_share=f6528">Data, Not Ads, Will Make Social Networks Profitable</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://strata.oreilly.com/2012/09/social-data-business-model.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
