Algorithms Claimed To Hunt Terrorists While Protecting the Privacy of Others (vice.com)
An anonymous reader sends this report from Motherboard:
Computer scientists at the University of Pennsylvania have developed an algorithmic framework for conducting targeted surveillance of individuals within social networks while protecting the privacy of untargeted digital bystanders. ... The algorithms are based on a few basic ideas. The first is that every member of a network (a graph) comes with a sequence of bits indicating their membership in a targeted group. If, say, the number two bit was set in your personal privacy register, then you might be part of the “terrorist” target population. For an algorithm searching a network for targets, it doesn’t just get to ask to reveal every network member’s bits. It has a budget of sorts, where it can only reveal so many bits and no more. The algorithms work to optimize this scenario such that as many bits-of-interest are revealed as possible. It does this optimization via a notion known as a statistic of proximity (SOP), which is a quantification of how close a given graph node is to a targeted group of nodes. This is what guides the search algorithms.
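To make the summary's description concrete, here is a minimal Python sketch of a budgeted search guided by a proximity score. It is an illustration only, not the researchers' actual algorithm: the function name `targeted_search`, the toy graph, and the choice of "number of neighbors already known to be targets" as the statistic of proximity are all assumptions made for this example, and the sketch leaves out any privacy-preserving noise the real framework may add.

```python
def targeted_search(graph, hidden_bits, seeds, budget):
    """Greedy, budget-limited search for targets in a graph.

    graph:       dict mapping node -> list of neighbor nodes
    hidden_bits: dict mapping node -> bool (the private "target" bit)
    seeds:       nodes already known to be targets
    budget:      maximum number of private bits that may be revealed

    ASSUMPTION: the proximity score used here (count of known-target
    neighbors) is a stand-in chosen for illustration, not the SOP
    defined by the researchers.
    """
    known_targets = set(seeds)
    revealed = set(seeds)
    spent = 0
    while spent < budget:
        # Score every unrevealed node adjacent to a known target.
        scores = {}
        for target in known_targets:
            for neighbor in graph[target]:
                if neighbor not in revealed:
                    scores[neighbor] = scores.get(neighbor, 0) + 1
        if not scores:
            break  # nobody left who is close to the target group
        # Reveal the bit of the closest node; ties go to the smallest name.
        node = max(sorted(scores), key=lambda n: scores[n])
        revealed.add(node)
        spent += 1
        if hidden_bits[node]:
            known_targets.add(node)
    examined = revealed - set(seeds)
    return known_targets, examined, spent


# Toy network: a, b, and d are targets; c, e, and f are bystanders.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "e"],
    "d": ["b"],
    "e": ["c", "f"],
    "f": ["e"],
}
hidden_bits = {"a": True, "b": True, "c": False,
               "d": True, "e": False, "f": False}

targets, examined, spent = targeted_search(graph, hidden_bits, ["a"], budget=3)
print(sorted(targets), sorted(examined), spent)
```

In this toy run the search spends its three reveals on b, c, and d, so one bystander (c) has a bit exposed while e and f are never examined. A tighter budget exposes fewer bystanders but can also miss targets.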
Reading the article (gasp!) didn't elucidate things much beyond the summary, although it does mention tracking the spread of infectious disease as a possible application, one that would keep unrelated health issues private.
In essence, the idea is to use artificial scarcity via technological means to create a 'bit budget', where those who access a database of personal info are only allowed to search for a certain number of flags; this encourages more efficient searching and thus less retrieval of extraneous data. Private entities could use this to find suitable targets for medical research or advertising while revealing as little info about as few people as possible, and it might work in that situation. However, there are two big problems with this idea:
1) It assumes the data is only accessible through this one database and can't be accessed in another, more privacy-invading way. If any analysts even suspect that the full dataset will be more useful, then they will use the full dataset if they can, and this scheme will be useless. "More data better" seems to be the motto of Big Data, despite the well-known haystack problem.
2) Governments are always saying that barriers need to be broken down for their investigators, that they need more/new powers, so there's no way they'll stick to their bit budget. They're gonna ask for more, enough that they have effectively full access to the full dataset, and that's in the unlikely event that they're somehow limited to this access scheme. They're one private 'request', subpoena, or NSL away from full access, anyhow, and political pressure or tax/import/regulatory pressure would make most for-profit entities like Facebook cave in. If this database were maintained by some international nonprofit then it might stand a chance of resisting this.
Corruption is convincing someone that the selfless ideal is the same as their selfish ideal.
On the visa application form, https://www.schneier.com/blog/...
"When I first heard Daydream Nation it quite frankly scared the living shit out of me." -- Matthew Stearns