Detecting Patterns in Complex Social Networks
Roland Piquepaille writes "So-called social networking is very popular these days, as shown by the proliferation of services like Friendster, Orkut and dozens of others. But do the companies behind these services have any idea of what is hidden inside their complicated networks? When these networks reach a size of millions of users, it's not an easy task. A researcher at the University of Michigan is trying to help, with a new method for uncovering patterns in complicated networks, from football conferences to food webs. This overview contains more details and references about this non-traditional method. It also includes a spectacular representation of the Internet and another image showing a food web at Little Rock Lake."
I wonder if this will improve search results? All the fake porn sites will be lumped together, thus, hopefully, taking them out of regular, useful searches.
EVERYDAY IS CATURDAY
We see and understand patterns based on the amount of data we can digest (which has gone much further with computers). Knowing that you could always be one data set off defining a pattern makes you wonder if chaos exists at all, hence the replacement of words like chaos with words like "complex".
From football conferences to food webs: U-M researcher uncovers patterns in complicated networks
SEATTLE---The world is full of complicated networks that scientists would like to better understand---human social systems, for example, or food webs in nature. But discerning patterns of organization in such vast, complex systems is no easy task.
"The structure of those networks can tell you quite a lot about how the systems work, but they're far too big to analyze by just putting dots on a piece of paper and drawing lines to connect them," said Mark Newman, an assistant professor of physics and complex systems at the University of Michigan.
One challenge in making sense of a large network is finding clumps---or communities---of members that have something in common, such as Web pages that are all about the same topic, people that socialize together or animals that eat the same kind of food. Newman and collaborator Michelle Girvan, a postdoctoral fellow at the Santa Fe Institute in Santa Fe, New Mexico, have developed a new method for finding communities that reveals a lot about the structure of large, complex networks. Newman will discuss the method and its applications Feb. 15 at the annual meeting of the American Association for the Advancement of Science in Seattle.
"The way most people have approached the problem is to look for the clumps themselves---to look for things that are joined together strongly," said Newman. "We decided to approach it from the other end," by searching out and then eliminating the links that join clumps together. "When we remove those from the network, what we're left with is the clumps."
The researchers tested their method on several networks for which the structure was already known---college football conferences, for example. In college football, teams in the same conference face off more frequently than teams in different conferences. When inter-conference games do occur, they're more likely to be between teams that are geographically close together than between teams that are far apart. Plugging in information on frequency of games between pairs of teams in the 2000 regular season, Newman and Girvan tested their method to see if it could correctly sort the colleges into conferences. "There were a few cases where it made mistakes, but it got well over 90 percent of them right," said Newman. "It gave us the structure we were expecting, so that was encouraging."
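The "remove the links that join clumps" idea can be sketched in a few lines of Python. The press release does not say how the joining links are scored; this sketch assumes edge betweenness (how many shortest paths run over a link), which is the score used in Girvan and Newman's published method. The function names and the toy network are made up for illustration, and the code assumes an unweighted network with no self-links. It is not the researchers' own code.

```python
from collections import defaultdict, deque


def edge_betweenness(adj):
    """Count shortest paths crossing each undirected edge (Brandes-style).

    Every pair of nodes is counted from both ends, so scores are doubled;
    that does not change which edge scores highest.
    """
    score = defaultdict(float)
    for source in adj:
        # Breadth-first search from source, counting shortest paths.
        dist = {source: 0}
        paths = defaultdict(float)
        paths[source] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, sharing credit along each edge.
        credit = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = paths[v] / paths[w] * (1.0 + credit[w])
                score[frozenset((v, w))] += share
                credit[v] += share
    return score


def components(adj):
    """Return the connected pieces of the network as a list of sets."""
    seen, groups = set(), []
    for start in adj:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v in group:
                continue
            group.add(v)
            stack.extend(adj[v] - group)
        seen |= group
        groups.append(group)
    return groups


def split_into_clumps(edges, wanted):
    """Remove the highest-scoring link until `wanted` clumps remain."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    adj = dict(adj)
    groups = components(adj)
    while len(groups) < wanted:
        score = edge_betweenness(adj)  # recomputed after every removal
        if not score:
            break  # no links left to remove
        a, b = max(score, key=score.get)
        adj[a].discard(b)
        adj[b].discard(a)
        groups = components(adj)
    return groups


# Toy network: two triangles joined by the single link 3-4.
bridge_example = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
clumps = split_into_clumps(bridge_example, wanted=2)
print(sorted(sorted(c) for c in clumps))
```

On the toy network the 3-4 link carries every path between the two triangles, so it is removed first and the two triangles are what is left. Recomputing the scores after each removal is slow on large networks, so treat this as an illustration rather than something to run on millions of users.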
Newman and Girvan---and other researchers who've learned about their work---have gone on to apply the technique to systems where the structure is not as well understood, looking at everything from networks of Spanish language web logs to communities of early jazz musicians to a food web of marine organisms living in Chesapeake Bay.
"Networks and other systems that we study are becoming increasingly large and complicated these days," said Newman. "New methods like this help us to make sense of what we see and to understand better how things work."
###
For more information:
Mark Newman -- http://www-personal.umich.edu/~mejn/
American Association for the Advancement of Science -- http://www.aaas.org/
Santa Fe Institute -- http://www.santafe.edu/
The blue node (left center) in this diagram was gettin' some action!
TK
The uses for this software are astounding. It is, essentially, a breed of software designed to recognize and manipulate social class systems.
... imagine that ... a means of actually targeting campaigns and capers directly to the primary delivery mechanisms of word of mouth among a large group. This software can give you that.
... put this in the hands of the right (wrong?) people, and we could see social revolutions targeted and executed with such blinding accuracy and predictability that most of us simply won't know what hit us ...
... maybe it's time to unplug.
Imagine a system which tells you, easily enough, who the 'most popular person for subject ___Y___' is, in your neighborhood? Target a campaign of computer-buying to only -3- folks in an area, and end up blanketing the entire region with tuber-like memes...
PR agencies could use this data to identify the core 'gossip leaders', the ones who have massive impact on multiple peers, and then they could target only those people with their campaigns
There are numerous religious theories, also, on the strengths of individuals and groups and the effect that these social connections have on a movement
This is the danger zone. The moment we start using computers to do qualitative analysis of social dynamics, and then using the data for commercial/religious/nefarious purposes, well
; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
On the other hand, if one is interested in science. . .
I'd be more interested in seeing the data that gets deleted, not the clumps. This isn't to say that the clumps aren't important, especially if you're trying to rebuild oyster populations in the Chesapeake or some such, but plenty of people will be focusing on those. People have an attraction to like objects and group mechanism.
I have an attraction to the exceptions. That's where the really interesting scientific stuff is likely to be happening, and where the Nobels are most likely to be hiding.
Why is this star off the main sequence? How did it get there, what makes it tick? What relevance might that have to stars that are on the main sequence?
KFG
I have an idea. Phone books of mobile phones form another kind of network. Imagine, A has number of B in his/her phone book. B has number of C. E knows both A and B. Chances are, most of GSM users in Latvia are nodes of this network. But this network can be fragmented as well. I think we could study interesting things about society this way.
:)
We have 7-digit phone numbers and two mobile networks here in Latvia. Data can be stored this way:
6787026 -> 9131415
9131415 -> 5956564
etc...
All we need is one hashtable (or MySQL table) and data collection interface
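A minimal sketch of that hashtable, assuming a plain Python dict in place of a MySQL table. The first two pairs are the ones listed above; the third pair is invented purely to show a second fragment, and the function names are hypothetical. A stored number is treated as a link in both directions when looking for fragments, which is an assumption, since A having B's number does not mean B has A's.

```python
from collections import defaultdict

# (owner, stored number) pairs. The last pair is made up for illustration.
entries = [
    ("6787026", "9131415"),
    ("9131415", "5956564"),
    ("6111111", "6222222"),
]


def build_phone_book(pairs):
    """Map each owner's number to the set of numbers stored in their phone."""
    book = defaultdict(set)
    for owner, contact in pairs:
        book[owner].add(contact)
    return dict(book)


def fragments(book):
    """Group numbers into fragments, ignoring the direction of each link."""
    neighbours = defaultdict(set)
    for owner, contacts in book.items():
        for contact in contacts:
            neighbours[owner].add(contact)
            neighbours[contact].add(owner)
    neighbours = dict(neighbours)

    seen, groups = set(), []
    for start in neighbours:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            number = stack.pop()
            if number in group:
                continue
            group.add(number)
            stack.extend(neighbours[number] - group)
        seen |= group
        groups.append(group)
    return groups


book = build_phone_book(entries)
for group in fragments(book):
    print(sorted(group))
```

With the sample data this finds two fragments: the three numbers chained together above, and the invented pair on its own.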
The problem is more complicated, and you touch on one of the main weaknesses of any system where reputation and feedback are involved.
One aspect of the problem is the granularity by which relationships are defined. In many of the sites there is only one state: "friend or non friend". The real world encompasses a number of shades and types, from business acquaintance to personal friend, intimate lover, etc.
Another aspect is the incentive to "game" these systems by increasing your friend count. This inevitably leads people to loosen their interpretation so that they increase their visible friend count. If the number of friends you were linked to were not public, there would be less of this (but you can't do that without breaking some of the functionality of the sites).
People have talked about "winning" at friendster or tribe or orkut - but there is no "winning" in these systems, as there should not be competition.
Last, there is no method for verification of any status between peers. Can you "prove" that so and so is really a friend?
There are others, but these are the main three, and not likely to be solved or addressed any time soon.
I remember the first maps of the Internet showed that certain nodes concentrate power in terms of the number of connections they make. Google, perhaps.
A quick reading on Zipf's Law shows that many natural systems (and many artificial ones that obey similar laws of construction and equilibrium) observe 'power rules' where the distribution of power is inverse to the number of entities at any level.
Surprising that earthquakes, cities, businesses, follow the same rules. And yet quite meaningless in any direct sense because we can't manipulate these rules, only observe them.
Human social networks also follow rules that I suspect are quite simple and possibly similar to Zipf's Law. For instance, a person can only maintain a finite number of contacts (technology may increase this number but it remains finite at any given time). Any new contact coming in displaces an existing contact. So a single person's contact list will follow a power law: twice as many contacts used half as often, ten times as many contacts used a tenth as often...
Mapping a contact network would need to take the importance of each contact into account. I may have my grandmother in my list, but I speak to her once a year. My accountant - every week. My wife - twice a day. My girlfriend - every hour.
Next: the differences between individuals in terms of how much time/skill they invest in networking. Gender differences... women do this much more and better than men, in general. Age differences... younger men do it less well than older men. Wealth differences... richer people do more networking, I'd suspect, until a certain point when they start to delegate it. Very poor people do very little networking.
So, the network is not a flat map. It's got two dimensions for the lines, but each line has a thickness, and each node (individual) has a size.
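A rough sketch of such a map, assuming line thickness means contacts per year and node size means the total thickness of a node's lines. Neither definition is fixed above, so both are assumptions, and the rates are loose conversions of the frequencies mentioned earlier (once a year, every week, twice a day, every hour around the clock).

```python
from collections import defaultdict

# (person, person, contacts per year): the "thickness" of each line.
ties = [
    ("me", "grandmother", 1),
    ("me", "accountant", 52),
    ("me", "wife", 2 * 365),
    ("me", "girlfriend", 24 * 365),
]


def node_sizes(weighted_ties):
    """Size of each node = sum of the thickness of the lines touching it."""
    size = defaultdict(float)
    for a, b, weight in weighted_ties:
        size[a] += weight
        size[b] += weight
    return dict(size)


sizes = node_sizes(ties)
for name, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{name:12s} {size:8.0f}")
```

Other choices of node size (time invested, skill, wealth) would need data that a contact list alone does not provide.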
Finally, I'd suspect that the network also maps power in terms of social success. Those people with the most powerful networks (a recursive definition: the networks which involve the most powerful people) will also be the most successful socially / financially.
But they may not be the happiest.
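The recursive definition of a powerful network (one that involves powerful people) can be sketched as a repeated averaging, which is the usual way to compute what network researchers call eigenvector centrality. Using that measure here is an assumption, as are the function name and the toy network; nothing above says this is how social success would actually be measured.

```python
def network_power(adj, rounds=100):
    """Give each person a score based on the scores of their contacts.

    Each round, a person's new score is their own score plus the scores of
    their contacts. Keeping the person's own score in the sum stops the
    values from flip-flopping on networks such as a simple star.
    """
    power = {person: 1.0 for person in adj}
    for _ in range(rounds):
        new = {
            person: power[person] + sum(power[c] for c in adj[person])
            for person in adj
        }
        top = max(new.values())
        power = {person: value / top for person, value in new.items()}
    return power


# Toy network: "hub" knows a, b and c; a and b know each other; c knows d.
contacts = {
    "hub": {"a", "b", "c"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub", "d"},
    "d": {"c"},
}

power = network_power(contacts)
for person, value in sorted(power.items(), key=lambda item: -item[1]):
    print(f"{person:4s} {value:.3f}")
```

In the toy network a and c both have two contacts, but a scores higher because its second contact (b) is better connected than c's (d). That is the recursive part of the definition at work.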
I'd be more interested in seeing the data that gets deleted, not the clumps.
Following data clumping, it's really the interactions or the nexus of contact that is interesting. For instance, from a computer science or informational processing perspective, what draws someone to a piece of information? How does one direct information to be most useful? In biological systems, the nexus points are where life happens. For instance, the small molecular fluxes that are constantly providing for molecular signaling, protein synthesis etc.... Information is not lost per se, rather there are information fluxes.
So, to answer your question of stars, it could simply be that a particular star is off the main sequence because of earlier, smaller phenomena that resulted in its appearance much later off the main sequence. Alterations in gravity? Interactions with a binary star? Alterations of proton-proton chains?
Visit Jonesblog and say hello.
Well, on Slashdot, I get fans because people see and like what I post. (Except for one guy, I think he's just trying to max out his friends list.) I set friends based on whether I like and appreciate what they say, and would like to be reminded that I have them set as "friends" whenever they say something I don't necessarily agree with. It helps me consider other points of view.
Granted, it's a set of small steps towards understanding the opposing point of view, but it does help broaden my horizons.
It's actually a very useful system.
tasks(723) drafts(105) languages(484) examples(29106)
Check out the "highschool friendships" diagram.
I think I was the yellow dot on the far left.
In all matters of opinion, our adversaries are insane. -Oscar Wilde
Unfortunately, the only pattern in my social network is the singleton pattern.
Ok, this is the graph of STD transmission among high school students.
Check out that stud on the left who is banging like 8 different girls.
Zoot!
These results have long been known to social network researchers. They are now being "discovered" by physicists, complex systems scientists, and computer scientists.
What is interesting actually is NOT the clumps (the paper is wrong), but the (possibly heterogeneous, multi-modal and dynamic) networks and their various measurements that could reveal lots of things.
The parent is right in pointing out a possible method of extracting the results, but ignores how one constructs the data warehouse in the first place, and the significance of networks (especially social and dynamic ones) as opposed to a data warehouse. Neither is a trivial problem.
Several websites may enlighten those who are interested in probing social networks deeper:
http://www.sfu.ca/~insna/
INSNA is the professional association for researchers interested in social network analysis.
http://www.casos.cs.cmu.edu/
CASOS brings together computer science, dynamic network analysis and the empirical study of complex socio-technical systems. Computational and social network techniques are combined to develop a better understanding of the fundamental principles of organizing, coordinating, managing and destabilizing systems of intelligent adaptive agents (human and artificial) engaged in real tasks at the team, organizational or social level. Whether the research involves the development of metrics, theories, computer simulations, toolkits, or new data analysis techniques, advances in computer science are combined with a deep understanding of the underlying cognitive, social, political, business and policy issues.
http://www.cmu.edu/joss/
The Journal of Social Structure (JoSS) is an electronic journal of the International Network for Social Network Analysis (INSNA). It is designed to facilitate timely dissemination of state-of-the-art results in the interdisciplinary research area of social structure. It publishes empirical, theoretical and methodological articles.
JoSS publishes manuscripts that are focused on social structure -- on the patterning of social linkages among actors. These actors could be comprised of different types or levels of analysis, such as animals, humans, artificial agents, groups or organizations. INSNA was founded on the premise that the behavior and lives of social entities are affected by their position in the overall social structure. By examining the etiology and consequences of structural forms overall, of the location of entities within these structures, and of the formation and dynamics of ties that make up these structures, INSNA hopes to learn about the parts of behavior that are uniquely social.
http://www.sciencedirect.com/science?_ob=JournalU
Publication of social networks papers.