ENTRIES TAGGED "social network analysis"
Graph data is an area that has attracted many enthusiastic entrepreneurs and developers
The popular open source project GraphLab received a major boost early this week when a new company comprised of its founding developers, raised funding to develop analytic tools for graph data sets. GraphLab Inc. will continue to use the open source GraphLab to “push the limits of graph computation and develop new ideas”, but having a commercial company will accelerate development, and allow the hiring of resources dedicated to improving usability and documentation.
While social media placed graph data on the radar of many companies, similar data sets can be found in many domains including the life and health sciences, security, and financial services. Graph data is different enough that it necessitates special tools and techniques. Because tools were a bit too complex for casual users, in the past this meant graph data analytics was the province of specialists. Fortunately graph data is an area that has attracted many enthusiastic entrepreneurs and developers. The tools have improved and I expect things to get much easier for users in the future. A great place to learn more about tools for graph data, is at the upcoming GraphLab Workshop (on July 1st in SF).
Data wrangling: creating graphs
Before you can take advantage of the other tools mentioned in this post, you’ll need to turn your data (e.g., web pages) into graphs. GraphBuilder is an open source project from Intel, that uses Hadoop MapReduce1 to build graphs out of large data sets. Another option is the combination of GraphX/Spark described below. (A startup called Trifacta is building a general-purpose, data wrangling tool, that could help as well. )
WolframAlpha's updated Facebook tool visualizes your network (and makes outliers obvious).
WolframAlpha announced an upgrade to its Personal Analytics for Facebook platform that allows users to visualize a number of aspects of their Facebook ecosystem. John Burnham outlines the updates on the WolframAlpha blog, noting that the popular visualization from the tool’s first release has a number of enhancements that allow users to better visualize how their social network fits together.
A disk-based, single-node, graph analytics system that scales to massive graphs
Designed specifically to run on a single computer with limited memory1 (DRAM), since its release a few months ago GraphChi has been used to analyze graphs with billions of edges. Running on a single machine means deployment and debugging are simpler. In addition it is no longer necessary to find (optimal) graph partitions that minimize communication between compute nodes – the starting point for many distributed graph computations.
The stated goal of GraphChi is to “Compute on graphs with billions of edges, in a reasonable time, on a single PC.” One way to define “reasonable amount of computation time” is to compare against the results produced by other graph processing systems. That’s exactly what GraphChi’s creators did in a recent paper. They found that GraphChi compared favorably to graph analytics packages such as Pegasus and Stanford GPS. While GraphChi was 2-3X slower2 in some cases, it is easier to deploy, easier to debug, and way more energy efficient. Read more…
Social network analysis (SNA) finds meaningful patterns in relationship data.
The scientific methodology of social network analysis (SNA) helps explain not just how people connect, but why they come together as well. Here, "Social Network Analysis for Startups" co-author Maksim Tsvetovat offers a primer on SNA.