Slashdot Mirror


Microsoft Tracking Behavior of Newsgroup Posters

theodp writes "Ever get the feeling your Usenet newsgroup list is being watched? By Microsoft? If so, consider yourself right. An interesting but troubling CNET interview with Microsoft's in-house sociologist goes into how the software giant is keeping a close eye on newsgroups and other public e-mail lists, tracking and rating contributors' social habits and determining "people who the system has shown to have value." Those concerned that it's not a good idea for computers to track their belongings and whereabouts are advised that they may ultimately have to fragment their identities, keeping multiple IDs and e-mail addresses."

2 of 543 comments

  1. Give it a break by The+Bungi · · Score: 5, Informative
    • This monitoring goes on exclusively in the msnews.microsoft.com domain, plus a few others that are also run by the company. While NetScan is sometimes pointed to MS-oriented news servers (news.devx.com is an example), Microsoft is not "monitoring USENET".
    • Marc Smith is a very sharp guy who has done a lot of interesting work with the social dynamics of online communities. Google him for more info. And if you have questions about what NetScan does, give it a whirl and form your own conclusions.
    • At the moment, NetScan is used by the MVP program to follow members' posting history. The MVP program is not exclusive to NNTP, however.
    • I can't see how this goes into the "YRO" section - if Microsoft is monitoring the news servers it operates and that bothers you - don't post there. This is hardly the land of the Microsoft advocate or even user for that matter. This is like reporting that I'm painting my bedroom bright red - WTF do the neighbors care about that?
    Yet another hysterical ad revenue generating headline, brought to you by the Slashdot "editors".
  2. Re:Multiple addresses won't work by pclminion · · Score: 5, Informative
    "Bayesian analysis can match writers to messages regardless of the email address."

    You just pulled that out of your ass, and you know it. There are so many gigantic misunderstandings underlying that statement that I can't even begin to attack it, so suffice it to say, a simple Bayesian analysis more than likely cannot identify people based solely on what they write.

    Ok, I'll give you a hint. Suppose we apply this method to Slashdot. There are about 650000 Slashdot readers. You are talking about calculating the class-conditional probability for every user on Slashdot. The differences in class-conditional probability (per user) are going to be unbelievably small -- so small that any results you achieve are going to be statistically meaningless.

    Bayesian techniques work okay for classifying when you've only got two or three buckets. But when you try to apply it to say, thirty buckets (much less 650000!!) it breaks down really quickly.
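
    For anyone wondering what "calculating the class-conditional probability for every user" looks like in practice, here is a rough sketch of a naive Bayes scorer with one bucket per author. Everything in it is an assumption made for illustration: the function names, the whitespace tokenizer, the uniform prior, the add-one smoothing, and the two-author toy data. It only shows the mechanics and says nothing about how well the approach holds up with thousands of buckets.

```python
import math
from collections import Counter

def train(docs_by_author):
    """Count words per author from a dict {author: [text, ...]}."""
    counts = {
        author: Counter(w for text in texts for w in text.lower().split())
        for author, texts in docs_by_author.items()
    }
    vocab = {w for c in counts.values() for w in c}
    return counts, vocab

def log_posteriors(text, counts, vocab):
    """Unnormalized log posterior per author: log prior + sum of log P(word | author)."""
    # Words never seen in training are skipped (an illustrative simplification).
    words = [w for w in text.lower().split() if w in vocab]
    log_prior = -math.log(len(counts))  # uniform prior over authors (assumption)
    scores = {}
    for author, c in counts.items():
        total = sum(c.values()) + len(vocab)  # add-one (Laplace) smoothing
        scores[author] = log_prior + sum(
            math.log((c[w] + 1) / total) for w in words
        )
    return scores

def classify(text, counts, vocab):
    """Pick the author bucket with the highest score."""
    scores = log_posteriors(text, counts, vocab)
    return max(scores, key=scores.get)

# Hypothetical toy data: two authors, two short posts each.
TOY = {
    "alice": ["the kernel patch fixes the scheduler bug",
              "kernel scheduler patch again"],
    "bob": ["spam filters need better tokenizers",
            "the spam filter tokenizer is slow"],
}

if __name__ == "__main__":
    counts, vocab = train(TOY)
    print(log_posteriors("another scheduler patch", counts, vocab))
    print(classify("another scheduler patch", counts, vocab))
```

    Note that the scorer loops over every bucket for every message, so the number of scores to compare grows with the number of authors, and the winner is simply whichever score is largest, however narrow the margin.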

    Also, remember that the true name for the technique is "Naive Bayesian inference." In this case (heh, in most cases) the term "naive" doesn't mean "clever and infallible."

    Yes, I do research on text analysis algorithms with applications to anti-spam filters, so I do have some clue what I'm talking about.