Algorithms Claimed To Hunt Terrorists While Protecting the Privacy of Others (vice.com)
An anonymous reader sends this report from Motherboard:
Computer scientists at the University of Pennsylvania have developed an algorithmic framework for conducting targeted surveillance of individuals within social networks while protecting the privacy of untargeted digital bystanders. ... The algorithms are based on a few basic ideas. The first is that every member of a network (a graph) comes with a sequence of bits indicating their membership in a targeted group. If say, the number two bit was set in your personal privacy register, then you might be part of the “terrorist” target population. For an algorithm searching a network for targets, it doesn’t just get to ask to reveal every network member’s bits. It has a budget of sorts, where it can only reveal so many bits and no more. The algorithms work to optimize this scenario such that as many bits-of-interest are revealed as possible. It does this optimization via a notion known as a statistic of proximity (SOP), which is a quantification of how close a given graph node is to a targeted group of nodes. This is what guides the search algorithms.
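A rough sketch of the budgeted search the summary describes, for anyone who wants something concrete. It is not the researchers' actual algorithm. The function names, the toy graph, and the choice of "number of known-target neighbours" as the statistic of proximity are all assumptions made for illustration.

```python
def proximity(graph, node, known_targets):
    # Assumed statistic of proximity: how many known targets are direct neighbours.
    return len(graph[node] & known_targets)


def budgeted_search(graph, hidden_bits, seeds, budget):
    """Reveal at most `budget` hidden bits, always picking the unrevealed
    neighbour of a known target with the highest proximity score."""
    known = set(seeds)
    revealed = set(seeds)
    for _ in range(budget):
        candidates = {n for t in known for n in graph[t]} - revealed
        if not candidates:
            break
        # sorted() makes tie-breaking deterministic (alphabetical order).
        node = max(sorted(candidates), key=lambda n: proximity(graph, n, known))
        revealed.add(node)          # costs one unit of the budget
        if hidden_bits[node]:
            known.add(node)
    return known, revealed - set(seeds)


# Hypothetical five-person network; edges are symmetric.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "e"},
    "d": {"b"},
    "e": {"c"},
}
hidden_bits = {"a": True, "b": True, "c": False, "d": True, "e": False}

found, inspected = budgeted_search(graph, hidden_bits, seeds={"a"}, budget=3)
print(sorted(found), sorted(inspected))
```

In this toy run the bystander "c" still gets inspected because they sit next to two targets, while "e" is never looked at. The budget limits how many bystanders are exposed; it does not guarantee that none are.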
Depth-first search
When signing up for Facebook, everyone needs to either check or uncheck the "I am a terrorist" box. That way the Government can do detailed searches on terrorists only, and not invade the privacy of non-terrorists.
And I thought it was just a cool fun show.
Meaning, guilt by association. Yeah, that should work....
“He’s not deformed, he’s just drunk!”
are going to use this to find minorities to beat and arrest.
There is a difference between aggregate data and meta data. I don't have a problem with aggregate data, but I do have a problem with searching meta data. It's disgusting that the government is getting away with conducting illegal searches simply because they use the term 'meta data'. Meta data is often very personal information, sometimes as personal as the content itself. For instance, location data associated with a picture is often called meta data. However, I disagree with that. It is a key element. If I had taken a note of where a picture was taken, conducting a search of that data to identify the picture in the real world would still be an illegal search.
Sounds great, as long as the terrorists all set their "I am a terrorist" bits. Otherwise it is useless.
I'm an American. I love this country and the freedoms that we used to have.
The obvious algorithm is to vacuum up all data from every citizen, in case your other algorithm gets updated you can re-run it more quickly and without risk of some of the data having been deleted since then.
Don't waste your vote! Vote for whoever you want, unless you live in a swing state it won't matter anyways
Meaning, guilt by association. Yeah, that should work....
Not guilt, but suspicion by association, and yes it has worked. A while ago the FBI got phone bill type information ("metadata", both phone numbers, date/time, duration) for known organized crime members and built a graph of all the connections these phone calls revealed. The FBI knew most of the nodes in the graph would be innocents; even criminals make restaurant reservations, see if dry cleaning is ready, etc. However analysis of the graph helped discover people actively involved in organized crime who had been completely unknown. This graph analysis was not simply looking at proximity as you suggest, I believe it looked at the number of unique connections from known organized crime members and various other factors.
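To make the "number of unique connections from known members" idea concrete, here is a minimal sketch. It is not the FBI's method, which the comment above only describes in outline. The record format, the function name, and all the labels are hypothetical.

```python
from collections import defaultdict


def rank_unknowns(call_records, known_members):
    """Rank unknown parties by how many distinct known members they
    exchanged calls with. Repeat calls between the same pair count once."""
    contacts = defaultdict(set)
    for caller, callee in call_records:
        if caller in known_members and callee not in known_members:
            contacts[callee].add(caller)
        elif callee in known_members and caller not in known_members:
            contacts[caller].add(callee)
    ranked = [(len(members), party) for party, members in contacts.items()]
    return sorted(ranked, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical metadata: (caller, callee) pairs only, no content.
records = [
    ("m1", "pizza"), ("m2", "pizza"), ("m3", "pizza"),
    ("m1", "x"), ("m2", "x"), ("m1", "x"),
    ("m3", "cleaner"),
]
print(rank_unknowns(records, known_members={"m1", "m2", "m3"}))
```

Note that the pizza place comes out on top in this toy data, which is presumably why the real analysis needed "various other factors" and not just a raw count.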
Everyone will be profiled and rated in a method completely hidden from them.
Just like the FBI, DOJ, TSA, DHS, etc.. do now in fact. The difference here is that it leaves open the "we outsourced that to Facebook and Google, we didn't have anything to do with their bad decisions." plausible deniability option. Those people can say "We took the algorithm from some professor at some college", so they get the same benefit.
Yup, I am extremely cynical and have become so after being proven correct way too many times.
-The wise argue that there are few absolutes, the fool argues that there are no probabilities.
Ok, I work in academia publishing (other field, though) and this is obviously an interesting math problem solved before somebody declared "hey, if we named those higher potential nodes as 'terrorist' we could apply to a bunch of military funds!"
So what was it? I can think of versions of the travelling salesman (where non "terrorist" nodes act as cities that could be visited when designing a path to an "evil" one) or an electronics routing problem (where signals have to pass around heavily loaded blocks disturbing only the minimum components, i.e. "respecting the privacy" is like ignoring the state of some adjacent circuitry instead of having to know the exact load of the system)
Maybe they should refer to it as incidental data?
https://www.ietf.org/rfc/rfc35...
So we're officially codifying guilt by association now?
Why exactly would terrorist #1 be connected to terrorist #2 in 50 different ways so that they trip the threshold of this algorithm? Usually it's 1 way, e.g. they meet in their regular bar, or some other forum. Sort of the same way I am connected to the man that drives the bus I use regularly. I am connected 1 way to him, and maybe 2 because I also see him in the supermarket sometimes.
Can I suggest they're idiots trying to justify bulk data mining for association by adding a layer of obscuring crap and calling it "privacy protecting"?
Just backdoor the algorithm to show all the bits. I wonder if it is possible to toggle bits as necessary on a profile to eventually get the whole profile.
And the wabbit awuhz whwinz. Shhhhh!
You joke, but this is actually a question on the customs declaration and entry form given to everyone arriving in the United States.
Of course, they don't actually expect anyone to say 'yes' - the idea (as I understand it) is to give the authorities one more thing to charge an actual terrorist with.
If you're a US Citizen they don't make you sign that on entry, at least not normally. I don't know offhand for foreigners or if you're bringing in a lot of goods. They *do* have that on security clearance applications.
And you're right, the idea isn't that you'll answer yes, it's that if you answer no and turn out to be a terrorist or have supported terrorism, etc..., then you've committed a felony by having lied to a federal officer. (YES. Lying to feds is a crime. The First Amendment doesn't protect you from that.) So they can arrest you and throw away the key, at least for a while.
requires something that so far the gov't has shown no interest in...not invading people's privacy.
Yes. I remember hearing someone had basically developed very similar, very careful tech for the NSA that did one of the surveillance routines they wanted but was *very* careful about user privacy... and they couldn't care less and decided to go in a completely different direction, i.e. the one that didn't care about that. Maybe there was a slashdot article on it a few years ago?
Reading the article (gasp!) didn't elucidate things much beyond the summary, although it mentions infectious disease spreading as a possible application while maintaining privacy for unrelated health issues.
In essence the idea is to use artificial scarcity via technological means to create a 'bit budget', where those who access a database of personal info are only allowed a certain amount of flags to search for; this encourages more efficient searching and thus less retrieval of extraneous data. This could be used so that private entities could try to find suitable targets for medical research or advertising, while revealing as little info about as few people as possible; and it might work in that situation. However, there are two big problems with this idea:
1) It assumes the data is only accessible through this one database and can't be accessed in another, more privacy-invading way. If any analysts even suspect that the full dataset will be more useful, then they will use the full dataset if they can and this scheme will be useless. "More data better" seems to be the motto of Big Data despite the well-known haystack problem.
2) Governments are always saying that barriers need to be broken down for their investigators, that they need more/new powers, so there's no way they'll stick to their bit budget. They're gonna ask for more, enough that they have effectively full access to the full dataset, and that's in the unlikely event that they're somehow limited to this access scheme. They're one private 'request', subpoena, or NSL away from full access, anyhow, and political pressure or tax/import/regulatory pressure would make most for-profit entities like Facebook cave in. If this database were maintained by some international nonprofit then it might stand a chance of resisting this.
Corruption is convincing someone that the selfless ideal is the same as their selfish ideal.
The face? Others faces in a picture? Linguistic analysis? Terms used? Non english words? Slang? Tattoos or symbols? First hop of friends and their pics? Second hop of friends and their friends, links?
.. A real person or a friendly clandestine service setting another fake account up to fool contacts into joining a conversation?
3rd hop? Getting to the maths and scale of total collection yet? 4th hop?
What can be found in a front facing web 2.0 site without the ip logs and support from regional ISP providers to ensure the ip range is even from a real person's computer, desktop, phone or tablet? For that deep telco support is needed. Did they use a VPN for all submissions? Access to the original IP is then needed.. local wifi? CCTV is good for that
The NSA and GCHQ dropped dictionary and friends of friends as it's cheaper and much more useful to just collect it all globally.
No clandestine service is going to set boundaries with a "target population" when anyone could be interesting to any friendly nation or agency asking for help over the years.
ie the security services in 5 nations have aspects of every connection, term, scrap of information, ip, image, call collected.
With no limits anyone found to be of interest can be backtracked over any year given a request by any mil or gov or a tip from an NGO, informant or other collection method.
Limits on collection at the front end were only an issue to the US and UK in the 1950s-70s when hardware could not keep up with early attempts at collect it all.
Once enough hardware was installed the global telecommunications use was tamed, kept and could be indexed. Putting a filter on what is even considered for collection was of no use.
Too many total strangers with no connections to anyone of interest got listed as being interesting, and having information already collected on them was vital.
Also note that a lot of easy to find groups are "turned" or total fronts of Western clandestine service as tools for color revolutions, freedom fighters, politically useful moderates mentioned in the press or vast sock puppet networks to contain other advanced nations.
To bait new members to walk in they have to have all the trappings: slang, flags, music, culture, past glory... that can take years before it becomes a trusted pipeline for the Western clandestine service to collect vast numbers of unique individuals of interest.
All the West has is signals intelligence over the internet, the internet has to be free and open to get people feeling hidden enough to reach out or create profiles... start chatting.. then collect it all can work its magic
An algorithmic framework on the "net" will just alert or shut groups being tracked online and they can return to protected community face to face meetings.
Does the West have a cadre of trusted informants to cover all people of interest in shifts? It takes a few people per shift to watch just one person.
Don't let a simple rush to do "algorithms" and block accounts make the totally observable internet stop chatting.
Domestic spying is now "Benign Information Gathering"
It's dead simple: communist/terrorist/anti-American scum are anyone you've already killed. Women, children, infants, farm animals, trees, it makes no difference. As soon as a victim joins the corpse club they are automatically guilty. It's what's happening in the Middle East right now. It never went away.
This will work the same way. That whole constitutional bullshit about "innocent until found guilty" is obsolete. Based on the "certainty" of the infallible computer, the authorities will use "parallel investigation" to find (i.e. fabricate) evidence to charge you with a crime. Then they lay on criminal counts so severe a conviction means your dead body will still be doing hard time in solitary into the next century. If you plead guilty then you will only do 5 or 10 years, so you will have some life on the outside before you die. Everyone rolls over because the courts are a joke, and innocence will not save you. The game is rigged.
Any questions? There better not be, or you will end up in a Super Max prison under a different name and Social Security number.
Why is Snark Required?
I think we all agree that this "state of the art" represents an automated version of 1984. The tool is there, it just depends on what you use it for, i.e. what your target population is.
Proximity measures can be derived from anything on the Internet, and that opens the gates to widespread use (and abuse).
Take e.g. proximity to known mafia members as a distance measure, and you'll find mafia networks (even though most of their connections are offline. Adding cellphone metadata to the mix will soon cure that).
Take proximity to e.g. farmer Bundy and his son as your measure, intersect that with gun ownership, affinity to guns, right-wing ideological websites, and anti-government activism and you'll find an interesting pool of homegrown "potential terrorists".
Take proximity to known Jews, phonecalls to Israel, Israeli embassy personnel, Israeli citizens or pro-Israel-interest websites, and (sensitive defense information or political power including House membership), and you've got a pool of potential "Pollards" .
Take proximity to websites like Salon.com, Bernie Sanders, online articles that bash Tea Party politics, Trump, Cruz, Rubio, Palin, Koch brothers, NFA etc., and you've identified "militant leftists".
Take visiting of pro-Islam websites, ability to read Arabic, mosque attendance, having a beard, and owning a gun, and you've got potential Jihadi terrorists.
Great huh? Fully automated pre-screening of undesirables of all stripes. Possibilities are endless. I'm sure that Sen. McCarthy and mr. Hoover would have approved wholeheartedly.
Who needs plodding old-fashioned intelligence work and old-fashioned police work now?
All we need now is for someone to relax the standards of evidence needed to prosecute people for suspicious behaviour and we can really get to work on "terrorists". If they're truly innocent the subsequent legal process will clear them, right?
I hear your horn! I hear your horn!
It's still dictatorship and lack of freedom of thought, opinion and expression.
Why shall we be ruled by our nations and governments? Shouldn't they accept that?
I hate my government, the traitors, the media, the immigrants.
I'm open with that.
I haven't attacked shit.
If anything the problem is that we aren't allowed to speak and that people don't listen; those set the foundation for "terrorism", but what is "terrorism" anyhow?
Wikipedia: "In a narrower sense, terrorism can be understood to feature a political objective."
My government is clear that my opinion doesn't matter, they won't listen and nothing will change and they ignore people like me.
In the end what is the difference? The ruling authority use threat of violence against those who are against it too. All governments are terrorists for their own agenda?
Are Hamas terrorists? Israel? USA? Al-Assad? ISIS? Iran? France?
Set evil bit to one!
Part of the reason for disliking privacy invasion is personal risk, but part of it is structural: there are societal risks to having such data gathered even if no one intends to look at it anyway. With the incentives for legitimate and semi-legitimate users that you mention being so bad already, we can already imagine the incentives for organised crime and foreign intelligence, particularly when it comes to attacking the police regulatory services and the politicians themselves. The foreign security services are particularly an issue since the 5 eyes have "foreigners don't have rights, let's spy on each other's people and swap data because violations don't count when other people do them for us", so why should the NSA ask for permission to access a British database and vice versa? In short this is not protective at all against the worst offenders.
I may be particularly ignorant on the subject, but what terrorist in his right mind uses social media to plan operations? I suspect the real reason for spying on social media is for the state security apparatus to monitor its own citizens, who may have deluded themselves into thinking they choose their own governments.
Tuttle... CLOSE ENOUGH.
Bullshit.
The whole point of such algorithms is to determine who the terrorists are which means that if you 'associate' in some way (live near? work near? use a shooting range with? take airplane flying lessons in the same school as? share a clothing store with? visit a website with IS newsletters? ) with one or more terrorists your bits are going to flip.
blindly antisocialist = antisocial
"own" ... he said "own"
wageslave clown, you don't "own" anything, and haven't since about the middle of the last century.
maybe you pay rent for permission to use $thing according to its EULA, but you don't "own" it.
this applies to everyone except the actual ruling class (who are the *actual* owners of everything, and everyone)
I bet you think voting (for $puppet) matters, too?
it's time to cleanse the world with fire. just burn the whole fucking thing down, and start over.
The algorithm for finding criminals while protecting privacy was disclosed in an ancient process called "getting a warrant."
This is my signature. There are many like it, but this one is mine.
This is referred to as differential privacy by Cynthia Dwork. She's an expert on the techniques used to perform data mining without personally identifiable information about "other" people in the dataset. Here's a video of a fascinating talk she gave that outlines her work:
https://www.youtube.com/watch?v=vh2xfgfymHk
remove nospam. to email!
Even if it would work flawlessly, the problem is "the targeted group". For the NSA, the target is the suspected terrorist (one wrong word in a mail), his friends and the friends of his friends. As TARGETs. So even when all others are spared, it's still the average number of friends to the power of two.
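To put a number on the "friends to the power of two" point, here is a back-of-the-envelope sketch. The figure of 150 friends per person is purely an assumption for illustration, and the formula is an upper bound that ignores overlapping friend circles.

```python
def upper_bound_targets(avg_friends, hops):
    # One suspect, plus avg_friends at hop 1, plus avg_friends**2 at hop 2, ...
    # Overlap between friend circles is ignored, so this is an upper bound.
    return sum(avg_friends ** h for h in range(hops + 1))


# Assumed average of 150 friends per person.
print(upper_bound_targets(150, 2))  # 1 + 150 + 22500 = 22651
```

So a single suspect could pull in on the order of twenty thousand people as targets under those assumptions, before any bystander protection even comes into play.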