Visual Exploration of Complex Networks
jweebo writes "Seed magazine has a story on complexity, and how it can be visually represented with fascinating results. From the article:
'Complexity is everywhere. It's a structural and organizational principle that reaches almost every field imaginable, from genetics and social networks to food webs and stock markets ...Collected here are a few of the many intriguing, and often beautiful, images that illustrate how the whole is more than the sum of its parts.'"
- pixels on the display: 2 million or so.
- insufficiency of the clustering algorithms: do you show one pixel per node with random placement, or placement by DFS traversal? for trees, or for graphs where classification is the primary concern, tree-map or "Csoft" views scale relatively well in this regard, but what about more general problems?
- implementations (or algorithms) that don't scale: e.g. graphviz uses n^2 (n=#nodes) space for its graph layout!
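To put rough numbers on the points above, here is a back-of-the-envelope sketch. It takes the n^2 space claim at face value and assumes 4 bytes per node pair and a 2-million-pixel display; the entry size and the function names are illustrative assumptions, not anything taken from graphviz itself.

```python
def dense_matrix_bytes(n_nodes: int, bytes_per_entry: int = 4) -> int:
    """Bytes for an n x n table holding one fixed-size entry per node pair.

    bytes_per_entry=4 is an assumed entry size, chosen only for illustration.
    """
    return n_nodes * n_nodes * bytes_per_entry


def human(n_bytes: int) -> str:
    """Format a byte count using binary units (KiB, MiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    size = float(n_bytes)
    for unit in units:
        if size < 1024 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024


def nodes_per_pixel(n_nodes: int, pixels: int = 2_000_000) -> float:
    """How many nodes would have to share each pixel if all were drawn."""
    return n_nodes / pixels


for n in (1_000, 100_000, 10_000_000):
    print(f"{n:>12,} nodes: {human(dense_matrix_bytes(n))} of pair data, "
          f"{nodes_per_pixel(n):.4f} nodes per pixel")
```

Under these assumptions the quadratic term stops fitting in memory long before the node count reaches the millions, and the node count passes the pixel budget at around the same scale.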
one must always think about the summarization criteria: what aren't you going to show? how will you indicate that summarization has occurred? how do you denote drill-down capability? what will the form of drill-down be? what heuristics should you use to selectively deaggregate, in order to highlight potentially interesting subgraphs? for large-scale info, this is as important as what you will be showing, and how it will be shown! for our stuff, we have graphs with tens of millions of nodes.
...than a thousand words.
Really, you'd be amazed at how even the simplest graphical interpretation of complex data can show up points of interest. And it's not difficult to see why: humans' primary sense is visual, and we have evolved some seriously complex neural algorithms to interpret visual data.
A simple graph is a case in point. Now take a large amount of complex data and apply just about any process you care to name to present a graphical representation and you can easily see the overall picture.
Here's a very simple example that illustrates statistical clustering. Even with totally random numbers, you *will* find islands of apparently significant populations. This is a common counter-claim to action groups who claim, say, a correlation between mobile 'phone masts and incidents of child leukaemia*. Anyway:
Generate a stream of random numbers n between 0 and 1, assign one symbol for n < 0.5 and another for n >= 0.5, display the symbols in a grid and, hey presto! Look at those clusters!
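If you want to try this yourself, here is a minimal sketch of the experiment. It assumes uniform random numbers in [0, 1) split at 0.5 into two symbols; the grid size, the symbols, and the function names are my own choices for illustration. It also measures the largest 4-connected blob, which is just one possible way to put a number on "look at those clusters".

```python
import random


def random_grid(rows=20, cols=60, threshold=0.5, seed=None):
    """Grid of '#' (n < threshold) and '.' (otherwise) from uniform random n."""
    rng = random.Random(seed)
    return [
        "".join("#" if rng.random() < threshold else "." for _ in range(cols))
        for _ in range(rows)
    ]


def largest_cluster(grid, symbol="#"):
    """Size of the largest group of `symbol` cells joined up/down/left/right."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    best = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != symbol or (r, c) in seen:
                continue
            # Iterative flood fill starting from this unvisited cell.
            stack = [(r, c)]
            seen.add((r, c))
            size = 0
            while stack:
                y, x = stack.pop()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == symbol
                            and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            best = max(best, size)
    return best


grid = random_grid(seed=42)
print("\n".join(grid))
print("largest '#' cluster:", largest_cluster(grid), "cells")
```

Change the seed and the islands move around, but there are always some, which is the whole point.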
On a more positive note:
We often use graphical representation in our work. This ranges from CTK representations of molecules we're looking at (xlation - pretty pictures with balls and lines) to grid-based, colour-indexed representations of multi-dimensional data sets. In each case the point is to present data in a way that lets us humans quickly spot potential areas of interest and get a "feel" for the data we're looking at.
It's all good stuff. (Sometimes very pretty, too)
* Actually, this is a good example of why I'm always wary of purely statistical "proofs". In this case the *science* (i.e. the proposed mechanisms for this) doesn't hold up to current understanding.