Strata Gems: Make beautiful graphs of your Twitter network

Use Gephi and Python to find your personal communities

We’re publishing a new Strata Gem each day all the way through to December 24. Yesterday’s Gem: Explore and visualize graphs with Gephi.

Where better to start analyzing social networks than with your own? Using the graphing tool Gephi and a little bit of Python script, you can analyze your own Twitter network, revealing the inherent structure among those you follow. It’s also a fun way to learn more about network analysis.

Inspired by the LinkedIn Gephi graphs, I analyzed my Twitter friend network. I took everybody that I followed on Twitter, and found out who among them followed each other. I’ve shared the Python code I used to do this on gist.github.com.
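To make the "who among them followed each other" step concrete, here is a minimal sketch of the filtering logic only. It is not the script from the gist: it assumes the follow lists have already been fetched from Twitter and stored in a dictionary, and the function name `follow_edges` and the sample accounts are made up for illustration.

```python
def follow_edges(friends_of):
    """Return (follower, followed) pairs restricted to the accounts you follow.

    friends_of maps each account you follow to the set of accounts that
    account follows (hypothetical data, assumed to be fetched already).
    """
    members = set(friends_of)
    edges = []
    for follower in sorted(members):
        # Keep only "follows" links whose target is also in your network.
        for followed in sorted(friends_of[follower] & members):
            if followed != follower:
                edges.append((follower, followed))
    return edges


# Made-up sample: three accounts you follow, and who each of them follows.
sample = {
    "alice": {"bob", "carol", "zed"},  # "zed" is outside your network
    "bob": {"alice"},
    "carol": set(),
}
edges = follow_edges(sample)
```

The link from alice to zed is dropped because zed is not someone you follow, which is what keeps the graph limited to your own network.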

To use the script, you need to create a Twitter application and use command-line OAuth authentication to get the tokens to plug into the script. Writing about that is a bit gnarly for this post, but the easiest way I’ve found to authenticate a script with OAuth is by using the oauth command-line tool that ships with the Ruby OAuth gem.

The output of my Twitter-reading tool is a graph, in GraphML, suitable for import into Gephi. The graph has a node for each person, and an edge for each “follows” relationship. On initial load into Gephi, the graph looks a bit like a pile of spider webs, not showing much information.
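As a rough illustration of that output format, the sketch below writes a node per person and a directed edge per “follows” relationship using only Python's standard library. The function name `to_graphml` and the sample names are assumptions for this example, and the real script may lay out its GraphML differently.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def to_graphml(nodes, edges):
    """Serialize a directed graph to a GraphML string."""
    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    graph = ET.SubElement(root, "graph", id="G", edgedefault="directed")
    for name in nodes:
        # One node element per person.
        ET.SubElement(graph, "node", id=name)
    for source, target in edges:
        # One edge element per "follows" relationship.
        ET.SubElement(graph, "edge", source=source, target=target)
    return ET.tostring(root, encoding="unicode")


# Made-up sample network.
xml_text = to_graphml(
    ["alice", "bob", "carol"],
    [("alice", "bob"), ("bob", "alice")],
)
```

Saving `xml_text` to a file with a `.graphml` extension should give you something Gephi can open.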

I wanted to show two things in the graph: clusters of closely related people, and the people who are well connected. To find related groups of people, you can use Gephi to analyze the modularity of the network, and then color nodes according to the discovered communities. To find the well-connected people, run the “Degree Power Law” statistic in Gephi, which calculates the betweenness centrality for each person: essentially, a measure of how much of a hub they are.
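If you are curious what betweenness centrality measures, the sketch below computes it for a small directed graph using Brandes' algorithm. This is an illustration of the idea, not Gephi's code: Gephi may normalize the scores differently, and the function name `betweenness` and the sample graph are assumptions. Every node must appear as a key in the adjacency dictionary, even if it follows nobody.

```python
from collections import deque


def betweenness(adjacency):
    """Unnormalized betweenness centrality for a directed, unweighted graph.

    adjacency maps each node to a list of the nodes it points to.
    """
    score = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search from source, counting shortest paths.
        order = []
        preds = {node: [] for node in adjacency}
        paths = {node: 0 for node in adjacency}
        dist = {node: -1 for node in adjacency}
        paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, crediting intermediaries.
        delta = {node: 0.0 for node in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    return score


# Made-up sample: every shortest path between the outer nodes runs via "hub".
sample = {"a": ["hub"], "b": ["hub"], "hub": ["c", "d"], "c": [], "d": []}
scores = betweenness(sample)
```

In this sample the hub sits on all four shortest paths between the outer nodes (a to c, a to d, b to c, b to d), so it gets the only non-zero score.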

These steps are neatly laid out in a great slide deck from Sociomantic Labs on analyzing Facebook social networks. Follow the tips there and you’ll end up with a beautiful graph of your network that you can export to PDF from Gephi.

Overview of my social graph: click to view the full PDF version

The final result for my network is shown above. If you download the full PDF, you’ll notice there are several communities, which I’ll explain for interest. The mass of pink is predominantly my O’Reilly contacts, dark green shows the Strata and data community, lime green the Mono and GNOME worlds, and mustard the XML and open source communities. The balance of purple is assorted technologist friends.

Finally, my sporting interests are revealed: the light blue are cricket fans and commentators, the red Formula 1 motor racing. Unsurprisingly, Tim O’Reilly, Stephen Fry and Miguel de Icaza are big hubs in my network. Your own graphs will reveal similar clusters of people and interests.

If this has whetted your appetite, you can discover more about mining social networks at Matthew Russell’s Strata session, Unleashing Twitter Data For Fun And Insight.

  • Peter Clarke

    Hi,

    This looks great. One point I would make is that the graph handling and output code could be made far simpler using the NetworkX library. This will output your network in GraphML without having to hard code it.

  • http://twitter.com/emileifrem Emil Eifrem

    So if you want to visualize HUGE graphs (billions of nodes) then there’s a project to combine Gephi with http://neo4j.org (Neo4j is open source and the most widely deployed graph database in the world):

    http://bit.ly/dLMOJV

    It’s still a work in progress but very interesting stuff if you have graphy data.

    Disclaimer: /me involved.

    -EE

  • http://blog.ouseful.info Tony Hirst

    I’ve been using a similar approach to generate exactly the same sort of graph for what I originally called “hashtag communities” but which I’ve more recently started referring to as “hashtag echo chambers”.

    So rather than just look through my list of friends, and plot the relationships between them, I’ve been looking for folk using a particular hashtag and plotting the relationships between them.

    For example: http://blog.ouseful.info/tag/gephi/

  • http://www.fredtrotter.com Fred Trotter

    Any reason not to offer this through a web interface? I would be willing to allow an app that generated a graph like this to access my twitter account, and then just give me a data dump so that I could skip to the stage where I am using Gephi.

    The only reason for everyone to set up their own Twitter application would be if we all wanted to run slightly different versions of the code…

    -FT

  • http://radar.oreilly.com/edd Edd Dumbill

    Fred, that would be neat. The main reason I’ve not done it is that the way my scraper is written right now, it could take a long time to do its work. I do an API call per person you’re following, and Twitter rate-limits those to 300 an hour. If you follow 600 people it’s going to take at least 2 hours to generate your graph file, and then if others use the system, it’s easy to see how you could be waiting weeks.

  • http://blog.ouseful.info Tony Hirst

    Edd,

    You can always use the Google Social Graph API rather than the Twitter API to get friends and followers lists (e.g. I use it to find common friends in the script posted here: http://blog.ouseful.info/2010/12/13/common-friends-on-twitter/ )

    I’m not sure if the Goog API is limited in any way though? Certainly no key is required…

  • drewp

    https://gist.github.com/857151 has a fix to make the twitter module’s return values picklable, and it encodes usernames as utf8.