20120326-NodeXL-Twitter hadoop network

The graph represents a network of up to 1000 Twitter users whose recent tweets contained "hadoop". The network was obtained on Monday, 26 March 2012 at 22:31 UTC. There is an edge for each "follows" relationship, for each "replies-to" relationship in a tweet, and for each "mentions" relationship in a tweet. Each tweet that is neither a "replies-to" nor a "mentions" is represented by a self-loop edge. The tweets in the network span from Friday, 23 March 2012 at 18:55 UTC to Monday, 26 March 2012 at 19:46 UTC.
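The edge rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not NodeXL's implementation: the Tweet record, its field names, the relationship labels, and the sample handles are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

Edge = Tuple[str, str, str]  # (source, target, relationship)


@dataclass
class Tweet:
    # Hypothetical record; the source does not describe NodeXL's data model.
    author: str
    reply_to: Optional[str] = None
    mentions: List[str] = field(default_factory=list)


def build_edges(tweets: List[Tweet], follows: Set[Tuple[str, str]]) -> List[Edge]:
    """Return one directed edge per relationship, following the rules above."""
    # One edge for each follows relationship.
    edges: List[Edge] = [(src, dst, "follows") for src, dst in sorted(follows)]
    for t in tweets:
        # One edge for a reply, pointing from the author to the user replied to.
        if t.reply_to is not None:
            edges.append((t.author, t.reply_to, "replies-to"))
        # One edge for each mentioned user.
        for user in t.mentions:
            edges.append((t.author, user, "mentions"))
        # A tweet that is neither a reply nor a mention becomes a self-loop.
        if t.reply_to is None and not t.mentions:
            edges.append((t.author, t.author, "tweet"))
    return edges


# Made-up sample data, for illustration only.
tweets = [
    Tweet("alice", reply_to="bob"),
    Tweet("bob", mentions=["carol", "dave"]),
    Tweet("carol"),
]
follows = {("alice", "bob")}
edges = build_edges(tweets, follows)
for edge in edges:
    print(edge)
```

Duplicate edges are kept on purpose in this sketch, since the metrics below count unique edges and edges with duplicates separately.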

 

The graph is directed.

 

The graph's vertices were grouped by cluster using the Clauset-Newman-Moore cluster algorithm.

 

The graph was laid out using the Harel-Koren Fast Multiscale layout algorithm.

 

The edge colors are based on relationship values. The vertex sizes are based on followers values.

 

Overall Graph Metrics:

Vertices: 1000

Unique Edges: 6078

Edges With Duplicates: 1006

Total Edges: 7084

Self-Loops: 886

Connected Components: 237

Single-Vertex Connected Components: 223

Maximum Vertices in a Connected Component: 747

Maximum Edges in a Connected Component: 6752

Maximum Geodesic Distance (Diameter): 9

Average Geodesic Distance: 3.249811

Graph Density: 0.00584284284284284

Modularity: 0.380253
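
Several of the metrics above can be reproduced on a small example. The sketch below assumes the usual definition of density for a directed graph (distinct non-loop edges divided by n(n-1)) and treats connected components as ignoring edge direction; the source does not state which definitions NodeXL uses, and the function names and toy graph are hypothetical. Under that assumed definition, the reported density corresponds to 5837 distinct non-loop edges among 1000 vertices, a figure inferred here by arithmetic rather than stated in the source.

```python
from collections import defaultdict


def directed_density(vertex_count, edge_count):
    """Edges present divided by edges possible in a directed graph without self-loops."""
    return edge_count / (vertex_count * (vertex_count - 1))


def weak_components(vertices, edges):
    """Group vertices into connected components, ignoring edge direction."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    seen, components = set(), []
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        # Depth-first walk over everything reachable from this start vertex.
        while stack:
            v = stack.pop()
            component.add(v)
            for w in neighbors[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(component)
    return components


# Toy graph, for illustration only: a 3-cycle, one self-loop, one isolated vertex.
vertices = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "d")]

self_loops = sum(1 for src, dst in edges if src == dst)
distinct_non_loop = {(src, dst) for src, dst in edges if src != dst}
components = weak_components(vertices, edges)
single_vertex = sum(1 for c in components if len(c) == 1)
largest = max(len(c) for c in components)

print("Self-loops:", self_loops)
print("Connected components:", len(components))
print("Single-vertex components:", single_vertex)
print("Maximum vertices in a component:", largest)
print("Density:", directed_density(len(vertices), len(distinct_non_loop)))

# Arithmetic check against the reported density, under the assumed definition.
print("Density for 5837 edges, 1000 vertices:", directed_density(1000, 5837))
```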

 

Top 10 Vertices, Ranked by Betweenness Centrality:

@cloudera

@hackingdata

@mikeolson

@al3xandru

@bigdata

@tlipcon

@infochimps

@allcloudnews

@merv

@twitteross

 

Top keyword pairs by frequency of mention

V1               V2               Weight
big              data             219
adds             hadoop           120
mapr             adds             100
hadoop           connectors       100
move             highlights       50
amazon           move             49
highlights       hadoop           47
hadoop           hurdles          47
cloud            computing        41
@ulitzer         #cloud           40
#cloud           #cloudexpo       40
#cloudexpo       #cloudcomputing  40
#cloudcomputing  #bigdata         40
#bigdata         @cloudexpo       40
@cloudexpo       @bigdataexpo     40
apache           #hbase           39
data             processing       33
open             source           32
#codemotion      #es              30
definitive       guide            29
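
Pair counts like those in this table can be approximated by counting adjacent words across tweets. The source does not say how NodeXL extracts keyword pairs, so the sketch below rests on assumptions: pairs are adjacent lowercase tokens split on whitespace, with no stop-word filtering. The function name and the sample tweets are hypothetical.

```python
from collections import Counter


def top_word_pairs(texts, limit=20):
    """Count adjacent word pairs across all texts and return the most frequent."""
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        # zip(words, words[1:]) yields each word together with its successor.
        counts.update(zip(words, words[1:]))
    return counts.most_common(limit)


# Made-up sample tweets, for illustration only.
sample = [
    "MapR adds Hadoop connectors",
    "mapr adds hadoop support",
    "Big data hurdles",
]
pairs = top_word_pairs(sample, limit=5)
for (first, second), weight in pairs:
    print(first, second, weight)
```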

 

More NodeXL network visualizations are here: www.flickr.com/photos/marc_smith/sets/72157622437066929/ and here: www.nodexlgraphgallery.org/Pages/Default.aspx

 

A gallery of NodeXL network data sets is available here: nodexlgraphgallery.org/Pages/Default.aspx?search=twitter

 

NodeXL is free and open and available from www.codeplex.com/nodexl

 

NodeXL is developed by the Social Media Research Foundation (www.smrfoundation.org), which is dedicated to open tools, open data, and open scholarship.

 

Donations to support NodeXL are welcome through PayPal: www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_bu...

 

The book Analyzing Social Media Networks with NodeXL: Insights from a Connected World is available from Morgan Kaufmann and from Amazon:

www.amazon.com/gp/product/0123822297?ie=utf8&tag=conn...

Uploaded on March 29, 2012