When Metadata Analytics Goes Awry
jfruh writes "When blogger Dan Tynan started seeing lots of Latvians in his LinkedIn People You May Know list, it was pretty funny, considering he'd never been to Latvia or ever met anyone from there. But now that shadowy spy agencies are using algorithms similar to LinkedIn's to see if we're terrorists, mistakes like this are a lot scarier. From the article: 'More than ever -- and online in particular -- who you know can be more important than who you are. In fact, who somebody thinks you know may be more important than who you are, especially if that somebody is a faceless government bureaucracy with limitless power to izjaukt savu dzīvi (mess up your life).'"
I created a new Gmail ID to get price quotes from auto dealers. And now Google keeps telling me I might know someone named Steve Lexus and wants me to add him to my circles. Well, at least they seem to have filtered out Jane Honda and Palvayantheeswaran Toyota and Poponopoulous Mitsubishi.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
If the NSA just reverses a similar algorithm, what happens when it says that Mahmoud Ahmadinejad may know me? Especially if I have access to centrifuges.
Then I have to prove a negative, that I do not know this person. All their evidence points to the opposite. "He was in New York at the same time!" (BUT I LIVE THERE) "Doesn't matter". "Your father's cousin's uncle's former roommate went to Iran as an exchange student", etc., etc.
Silence is a state of mime.
Time to worry about the real problems affecting people's lives.
much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
When FB or Amazon recommends something/someone, I can usually see some sense behind it. LinkedIn is just plain random. I don't know 95% of the people it seems to want to connect me with. It is a joke.
Stop using social media. Some of the crud I've seen on LinkedIn is as bad as Facebook and I do not want to be associated with it.
Side Note: If you do use LinkedIn, it is not a dating site. Some of my female colleagues have started complaining about unwanted attention. Just because she met you at that training class last month, and accepted your connection, does not mean she is interested in 'knowing' you. Sheesh.
Originally, when the concept of "degrees of separation" was invented, the idea was that everybody was connected to everybody else through 6 degrees of separation.
At the same time, people think that they are "the good guy" who does not keep "bad company".
With social media, the length of the separation chain has shrunk considerably.
Add to this that most people who "do something interesting" (like making really nice flower arrangements, for instance) will tend to travel and meet a "much smaller" crowd of people who "move around".
In this "smaller world" you can make "very short chains" to quite shady people. Actually, it is trivial to create a chain from any US politician to the "big list of officially evil guys" that is at most 4 levels deep. (For instance, ex-HP head Carly Fiorina went to KSA and met large HP clients, including the heads of SBL, managed by the brother of that really bad guy who did get some support from the ex-President(s) Bush when he was against the Soviets...)
And now comes the "suspicion creep": if you know Fiorina and one or two of the Bushes, then you know 2 suspicious characters who are 3 or fewer levels away from a Really Suspicious guy.
So "one" could be OK, but 2, hmm, very bad...
So unless you take great pains to avoid anybody who might "be out of the ordinary", you are immediately 100% sure to become somehow "in contact" with somebody "suspicious".
Or, seen another way, being not completely boring gets you something like 200 contacts, among which you can expect at least 3 "super connectors" who do not really overlap, particularly if you are travelling, so taking diminishing returns into account it is hard to have fewer than 3M "level 3 contacts",
or 1/1000 of all adults in the world.
The probability that fewer than 2 of them are "bad guys" is quite low.
So be boring or be afraid, very afraid...
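For anyone who wants to put a number on that last step, here is a rough back-of-the-envelope sketch. The 3M "level 3 contacts" figure comes from the comment above; the base rate of one "bad guy" per million adults is purely a made-up assumption for illustration, and the function name is hypothetical. It uses a Poisson approximation to the binomial, which is reasonable when the crowd is huge and the rate is tiny.

```python
import math

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), using the Poisson approximation
    with mean n * p (fine when n is large and p is small)."""
    lam = n * p
    # Sum the Poisson probabilities of seeing 0 .. k-1 hits, then take the complement.
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - below

LEVEL3_CONTACTS = 3_000_000  # figure quoted in the comment
BASE_RATE = 1e-6             # ASSUMPTION: one "bad guy" per million adults

p = prob_at_least(2, LEVEL3_CONTACTS, BASE_RATE)
print(f"P(at least 2 'bad guys' within 3 levels) = {p:.2f}")
```

Under those assumed numbers the chance of having at least 2 "bad guys" within 3 levels comes out at about 0.80. Change the base rate and the answer moves a lot, which is rather the point: the result depends far more on the model than on anything you did.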
As I used to work for an American company that has an office in Dubai (full of people with Arabic names, and lots of Muslims), a working team in India (very close to Pakistan, never mind the fact that the two countries hate each other almost as much as Chicago Bears and Green Bay Packers fans), a development/support team in the Philippines (close to China, with a relationship similar to that between India and Pakistan, and with their own domestic terrorism issues), and clients in sub-Saharan Africa, Russia and Texas, my LinkedIn and Facebook profiles are full of people in those areas.
Given that the NSA does not stop at analyzing your own contacts, I am apparently a person of interest if one of my contacts has any dubious friends, or if one of my contact's contacts has any dubious friends.
Kevin Bacon is indeed going to be screwed, we might as well just lock him up and start waterboarding him now, and save the NSA the trouble.
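A toy sketch of what "contacts of contacts of contacts" means in practice, assuming nothing about how any agency actually does it: a plain breadth-first search over a contact graph, stopping after a fixed number of hops. The graph, the names in it and the "flagged" set are all invented for illustration.

```python
from collections import deque

def within_hops(graph, start, max_hops):
    """Return {person: hop_count} for everyone reachable from start
    in at most max_hops steps (breadth-first search)."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if seen[person] == max_hops:
            continue  # do not expand past the hop limit
        for contact in graph.get(person, ()):
            if contact not in seen:
                seen[contact] = seen[person] + 1
                queue.append(contact)
    del seen[start]  # the starting person is not their own contact
    return seen

# HYPOTHETICAL contact graph, not real data.
CONTACTS = {
    "me": ["colleague_dubai", "colleague_manila"],
    "colleague_dubai": ["me", "client_a"],
    "colleague_manila": ["me"],
    "client_a": ["colleague_dubai", "dubious_friend"],
    "dubious_friend": ["client_a"],
}
FLAGGED = {"dubious_friend"}  # ASSUMPTION: some list of "dubious" people

reach = within_hops(CONTACTS, "me", 3)
hits = {name: hops for name, hops in reach.items() if name in FLAGGED}
print(hits)
```

With a 3-hop limit the invented "dubious_friend" shows up as a hit even though "me" has never heard of them; drop the limit to 2 and the hit disappears. With real contact lists of a few hundred people each, the 3-hop set gets enormous very quickly.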
If you are truly concerned about this problem, the real question to ask is why on earth you signed up with LinkedIn (or G+ or Facebook).
The real question isn't who is connected to terrorists, but rather, "Who are the terrorists and their support network?"
The intelligence agencies are not going to be much interested in accumulating every possible association, but rather in narrowing it to the people of interest for the purpose at hand. If I knew the pilot that flies Prime Minister Cameron and his guests, I could be connected to many of his guests with 2 hops. Let's say one of those guests was Angela Merkel. Anyone that cared to look would realize that I did not in fact know or have any influence with Angela Merkel. It would also become obvious pretty quickly that I don't communicate with her. The "connection" may technically exist or be possible, but as a practical matter it is pointless.
I will also point out that when traversing a hierarchical organization, 6 hops doesn't necessarily get you to the top.
much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
This is what I, and a host of others, have been screaming about for years. People are blindly using analytics and "big data" to make important decisions about health care, insurance, credit ratings, terrorist affiliations, etc. I have encountered so much bad data in my career that the thought that it is taken as "gospel" makes me sick. Bad data are out there, and cleaning up a polluted data stream, when possible, is expensive and takes a long time.
Then you add in the use of NoSQL database engines such as MongoDB which are not ACID compliant. You are virtually guaranteeing data will be corrupted. But then again, maybe I "just don't get it". But personally I think contributing to bad data is unethical.
putting the 'B' in LGBTQ+
My work requires me to keep up to date with the computer industry. This means I must be connected with the hacker sites, and ipso facto it also demands that I am 1 degree separated from Mr. Snowden and many others of whom the US Government takes a dim view. Get real, people: mere contact isn't criminality; in the case of the investigator, it is a necessity. This is why the whole concept of Probable Cause is such a necessity!
"You" might "want" to "back off" on the "quotes".
I have heard it said, perhaps apocryphally - If you look at the birth and death records for the State of Florida, you will conclude that a majority of people in that state are born Latino and die Jewish. Having reams of data is a start; but you must also have an accurate model.