Anybody else missing the hashtag #SQLBits in their timeline?  I started having a look back over some of the tweets today…  Don’t tell anyone else in the office, but I may have got a little distracted by it!

I started thinking how amazing it is that so many like-minded people can be united in discussion by Twitter.  Deep thinking, I know.  More questions sprang to mind…  How many #SQLBits tweets were there?  Who tweeted the most?  Which tweets were most popular?  Most useful?  Which Twitter users interacted the most with others?  Who was the most influential?

Then I started wondering if there was any way of visually mapping how we all interacted on Twitter over the space of the conference.  It turns out you can.

My internet meandering brought me to the Social Media Research Foundation, who state their mission as being ‘mapping, measuring and understanding the landscape of social media’.  Their primary project is NodeXL, a free, open network discovery add-in for Excel that extends the familiar spreadsheet so that it can collect, analyse and visualise complex social networks.  Further information can also be found on Microsoft Research.

It’s a fantastic tool, very easy to install and start working with straight away.  Initially I loaded all the tweets that used the hashtag #SQLBits during the week of SQLBits.  There were an impressive 5,539!

From there it allowed me to analyse and visually map out the network.  It starts by mapping out all interactions between network nodes (Twitter users) for the whole network, and then clusters them into smaller groups of users that communicate with each other most frequently.  The three main metrics provided by NodeXL are In-Degree, Out-Degree and Betweenness Centrality.

In-Degree = The number of tweets mentioning that node (user)

Out-Degree = The number of tweets from that node mentioning another node

Betweenness Centrality = An indicator of a node’s centrality in a network.  It is equal to the number of shortest paths from all vertices to all others that pass through that node.  A node with high betweenness centrality has a large influence on the transfer of items through the network, under the assumption that item transfer follows the shortest path.
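
For anyone who wants to see how these three metrics fall out of raw tweets, here is a minimal Python sketch.  It is not how NodeXL works internally, just an illustration of the definitions above.  The sample tweets and user names are made up, the helper names (mention_edges, degrees, betweenness) are my own, and it assumes one directed edge per tweet per mentioned user, which may not match how NodeXL counts or merges edges.  Betweenness is computed with Brandes' algorithm on an unweighted, directed graph and is left unnormalised.

```python
import re
from collections import defaultdict, deque

# Hypothetical sample data (author, tweet text); not from the real #SQLBits data set.
tweets = [
    ("alice", "Great session by @bob at #SQLBits"),
    ("bob", "Thanks @alice! Also see @carol's talk"),
    ("carol", "@dave are you at #SQLBits?"),
    ("dave", "@carol yes, see you there"),
]

def mention_edges(tweets):
    """One directed edge (author -> mentioned user) per tweet per mentioned user."""
    edges = []
    for author, text in tweets:
        author = author.lower()
        for mentioned in {m.lower() for m in re.findall(r"@(\w+)", text)}:
            if mentioned != author:  # ignore self-mentions
                edges.append((author, mentioned))
    return edges

def degrees(edges):
    """In-degree: times a user is mentioned. Out-degree: times a user mentions others."""
    in_deg, out_deg = defaultdict(int), defaultdict(int)
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    return dict(in_deg), dict(out_deg)

def betweenness(edges):
    """Brandes' algorithm: count shortest paths passing through each node."""
    adj, nodes = defaultdict(set), set()
    for src, dst in edges:
        adj[src].add(dst)
        nodes.update((src, dst))
    score = dict.fromkeys(nodes, 0.0)
    for s in nodes:
        stack, queue = [], deque([s])
        preds = {v: [] for v in nodes}
        sigma = dict.fromkeys(nodes, 0)   # number of shortest paths from s
        dist = dict.fromkeys(nodes, -1)   # -1 means not reached yet
        sigma[s], dist[s] = 1, 0
        while queue:                      # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(nodes, 0.0)
        while stack:                      # accumulate back from the farthest nodes
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

edges = mention_edges(tweets)
in_deg, out_deg = degrees(edges)
scores = betweenness(edges)
for user in sorted(scores):
    print(user, in_deg.get(user, 0), out_deg.get(user, 0), scores[user])
```

In this toy network the users who sit between the two little conversations pick up the betweenness score, even though they are not necessarily the ones mentioned most often, which is why the metric is a handy stand-in for "influence".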

I’ve used a ‘Group-in-a-box’ layout of a clustered graph to visually map out the #SQLBits network below.


Figure 1 – #SQLBits clustered network graph



Highlighted are the top ten users with the largest Betweenness Centrality, as seen below.

Figure 2 – Top ten #SQLBits network influencers.



As the network graph in Figure 1 shows, the network splits into five main groups of users who interact with each other the most: dark blue, light blue, dark green, light green and red.  The users in the red group are mainly companies; the others are smaller social groups that seem to be separated by geography and SQL user groups.

I’ve produced data for the dark blue, light blue and dark green groups below.  If anyone has any ideas or suggestions they would like to look into further, get in touch with @dataidols on Twitter.


Figure 3 – Dark Blue Group




Figure 4 – Light Blue Group




Figure 5 – Dark Green Group





This blog was written by David Loughlan at Data Idols.  David has worked in the IT industry for 18 years, originally as a Digital Unix mainframe operator, before becoming a University webmaster, Windows sys admin and finally a Microsoft certified contract SQL Server DBA.  Nowadays David works with Data Idols, a knowledge-led recruitment agency focussed on revolutionising the data recruitment industry.  If you would like to get in touch with David to discuss this blog, your career or a requirement within your company, he can be reached on @DataIdols.