Slashdot Mirror


The Difficulty In Getting a Machine To Forget Anything

An anonymous reader writes: When personal information ends up in the analytical whirlpool of big data, it almost inevitably becomes orphaned from any permissions framework that the discloser granted for its original use; machine learning systems, commercial and otherwise, end up deriving properties and models from the data until the replication, duplication and derivation of that data can never hope to be controlled or 'called back' by the originator. But researchers now propose a revision which can be imposed upon existing machine-learning frameworks, interposing a 'summation' layer between user data and the learning system, effectively tokenising the information without anonymising it, and providing an auditable path whereby withdrawal of the user information would ripple through all iterations of systems which have utilized it: genuine 'cancellation' of data.
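
A rough sketch of how a 'summation' layer might behave, assuming the learner only ever reads aggregate counts and that a per-user record of contributions is kept so those counts can be reduced again. The class name `SummationModel`, its methods and the toy scoring rule are illustrative assumptions, not the researchers' actual design.

```python
from collections import defaultdict


class SummationModel:
    """Toy classifier that learns only from per-label feature counts (summations)."""

    def __init__(self):
        # (label, feature) -> count, and label -> number of records seen
        self.feature_counts = defaultdict(int)
        self.label_counts = defaultdict(int)
        # Per-user contributions, kept so they can be subtracted later (assumption).
        self.contributions = defaultdict(list)

    def learn(self, user_id, features, label):
        self.label_counts[label] += 1
        for f in features:
            self.feature_counts[(label, f)] += 1
        self.contributions[user_id].append((tuple(features), label))

    def forget(self, user_id):
        """Subtract everything this user contributed, without retraining from scratch."""
        for features, label in self.contributions.pop(user_id, []):
            self.label_counts[label] -= 1
            for f in features:
                self.feature_counts[(label, f)] -= 1

    def score(self, features, label):
        """Smoothed product of per-feature frequencies for the given label."""
        n = self.label_counts[label]
        result = 1.0
        for f in features:
            result *= (self.feature_counts[(label, f)] + 1) / (n + 2)
        return result
```

After `forget("alice")`, the scores match those of a model that never saw alice's records, which is the 'cancellation' property the summary describes. Note that the sketch has to keep each user's raw contributions around in order to subtract them.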

79 comments

  1. Or by penguinoid · · Score: 5, Insightful

    Or, you could "accidentally" keep the data, and sell it.

    --
    Don't waste your vote! Vote for whoever you want, unless you live in a swing state it won't matter anyways
    1. Re: Or by Anonymous Coward · · Score: 0

      Use the data to monitor your citizens' activity.

  2. How about if we OWN our personal information? by elwinc · · Score: 5, Interesting

    Imagine if we owned our personal information as a form of intellectual property? Big corporations have gotten pretty good at protecting their intellectual property rights. Maybe it's time for us ordinary folks to own our personal information. Then we could license it to companies for particular uses, but they wouldn't have the right to sell it without our permission.

    --
    --- Often in error; never in doubt!
    1. Re:How about if we OWN our personal information? by lesincompetent · · Score: 2

      You should move to the EU, we actually have something like that.

    2. Re:How about if we OWN our personal information? by Nutria · · Score: 1

      we actually have something like that.

      Is that what France is fighting Google over?

      --
      "I don't know, therefore Aliens" Wafflebox1
    3. Re: How about if we OWN our personal information? by Anonymous Coward · · Score: 0

      I wish we had that but this is Murica where corporations rule the land.

    4. Re:How about if we OWN our personal information? by jellomizer · · Score: 3, Interesting

      We do Own our personal information, but we usually sell it in trade of the electronic services you want to use.

      You find there is value in Google Internet searching, then your payment is knowing your searches would be part of Google marketing.

      There is that news website that you don't want to pay for, well those ads will pay for the services.

      You don't need to use these consumer services on the internet. So you can keep your personal information to yourself.

      --
      If something is so important that you feel the need to post it on the internet... It probably isn't that important.
    5. Re:How about if we OWN our personal information? by Anonymous Coward · · Score: 0

      Instead, we should create a law requiring any company that keeps personal information to keep the databases containing it open and publicly accessible.

      But... won't that make people worry about their information being mined?

      Yep. Now you're starting to catch on.

    6. Re:How about if we OWN our personal information? by Nutria · · Score: 2

      IOW, if what you get is free, then you are the real product.

      --
      "I don't know, therefore Aliens" Wafflebox1
    7. Re:How about if we OWN our personal information? by Anonymous Coward · · Score: 0

      You should move to the EU, we actually have something like that.

      Yeah, right.

      Glad you believe that.

    8. Re:How about if we OWN our personal information? by phantomfive · · Score: 1

      That would work about as well as laws that stop people from sharing copyrighted material.

      In other words, they won't work at all, but you'll see some token enforcement attempts.

      --
      "First they came for the slanderers and i said nothing."
    9. Re:How about if we OWN our personal information? by _anomaly_ · · Score: 2

      No, what you get is still the product (or service). You are the real payment.

      --
      "I have no special gift, I am only passionately curious." - Albert Einstein
    10. Re:How about if we OWN our personal information? by Nutria · · Score: 1

      But they're selling you to 3rd parties.

      --
      "I don't know, therefore Aliens" Wafflebox1
    11. Re:How about if we OWN our personal information? by alex67500 · · Score: 2

      In essence, yes. If one of their citizens wants to use their right to be forgotten, then the French government want that to be worldwide. But then imagine a Russian official trying to hide a controversial article about himself.

      It's the same kind of debate when the US want Apple to backdoor iChat for wiretaps. If you can coerce them into doing it, then so can less democratic countries where Apple does business...

    12. Re:How about if we OWN our personal information? by _anomaly_ · · Score: 1
      Ah, yes, you're correct... we are initially the payment for a product or service, but then become the product for a third party.
      I was mainly referring to the OP's comment

      we usually sell it [our personal information] in trade of the electronic services you want to use

      --
      "I have no special gift, I am only passionately curious." - Albert Einstein
    13. Re:How about if we OWN our personal information? by Nutria · · Score: 1

      I knew that lesincompetent (2836253) was mistaken when he wrote that.

      Taking the moral high ground is great, but only when it conforms to reality. Otherwise, it's just B.S. posturing.

      --
      "I don't know, therefore Aliens" Wafflebox1
    14. Re:How about if we OWN our personal information? by SeaFox · · Score: 1

      Maybe it's time for us ordinary folks to own our personal information. Then we could license it to companies for particular uses, but they wouldn't have the right to sell it without our permission.

      LOL. The TOS for any service will simply be amended to say that by giving them the information we grant them an irrevocable license to the data and give them the right to sell it. This will be presented as an "update" to the Terms of Service that 95% of people will agree to without actually reading. And the remainder? Well, if you don't like it, no Facebook for you!

    15. Re:How about if we OWN our personal information? by Sloppy · · Score: 1

      Imagine if we owned our personal information as a form of intellectual property

      Ok, try doing that. Next time you're about to transmit your information to someone else, stop. Either don't send it at all, or send them cyphertext instead.

      If Amazon wants to know how to descramble your zip code, they're going to have to make some kind of deal with you, whereby they become bound to the terms and conditions that you specify. I just hope that prior to making that deal, you don't get too impatient waiting for your packages.

      --
      As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
    16. Re:How about if we OWN our personal information? by TemporalBeing · · Score: 1

      I knew that lesincompetent (2836253) was mistaken when he wrote that.

      Taking the moral high ground is great, but only when it conforms to reality. Otherwise, it's just B.S. posturing.

      True. And reality is that once the data is out there, then there is no real way to pull it out. People will have it in off-line archives, etc; and once it leaves national boundaries then all bets are really off.

      For instance, a country/company could just put something to try to get all the information and then watch for the notices. When a notice comes, they archive it instead of deleting it and if the person is of enough influence to someone (them or a client) they could sell the data out for blackmail. So now, the French and anyone else pushing the "right to be forgotten" have just created a real nice and easy way to blackmail their citizens.

      And if it hasn't been done yet, it probably will be done so long as a "right to be forgotten" is being pushed.

      --
      Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)
    17. Re:How about if we OWN our personal information? by UnknownSoldier · · Score: 1

      Why do you think some people copyright and trademark their name?

      We really should force companies to sign NDA's when we license our personal (bio) data to them.

    18. Re:How about if we OWN our personal information? by hackwrench · · Score: 1

      That information isn't me. I'm much more complex than what can be deduced from that information. It isn't even a copy of me.

    19. Re:How about if we OWN our personal information? by Nutria · · Score: 1

      I'm much more complex than what can be deduced from that information.

      They have a *lot* of data about you, and accurately infer *lots* more from the connections you make.

      --
      "I don't know, therefore Aliens" Wafflebox1
    20. Re:How about if we OWN our personal information? by swillden · · Score: 1

      But they're selling you to 3rd parties.

      More precisely, they're selling space on your screen to third parties.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    21. Re:How about if we OWN our personal information? by RabidReindeer · · Score: 1

      That information isn't me. I'm much more complex than what can be deduced from that information. It isn't even a copy of me.

      Procrustes had a solution for that.

    22. Re: How about if we OWN our personal information? by ememisya · · Score: 2

      You would think the US would be the first country in the world to make that a law, given that respect for individuality has always been a core value of ours. We instinctively believe that our private information is owned by us as individuals. What we do have are FOIA and Privacy Act requests, which an agency will treat with suspicion just about every time, and which can easily be denied. So what we really have is a mechanism for an agency to audit the individual raising suspicion, erm, I mean requesting their records.

    23. Re:How about if we OWN our personal information? by Anonymous Coward · · Score: 0

      You should move to the EU, we actually have something like that.

      ... and in true socialist spirit, all your possessions should belong to the state.

      Europe, where privacy means "all of your neighbours won't know". There is no concept of privacy from the state.

    24. Re:How about if we OWN our personal information? by Anonymous Coward · · Score: 0

      We do Own our personal information, but we usually sell it in trade of the electronic services you want to use.

      It's not like owning and selling physical stuff; information can be easily copied. Copyright would be a better match: you own the copyright to your personal information, and like other copyrighted stuff, selling or giving a copy to someone else does not give the recipient the right to sell or give copies to third parties.

    25. Re:How about if we OWN our personal information? by lesincompetent · · Score: 1

      I was not referring to the "right to be forgotten". That's bullshit. We tech people know that.
      Once the beans are spilled there's no way to put them back wherever they came from.
      I was referring to privacy laws.

    26. Re:How about if we OWN our personal information? by Agent0013 · · Score: 1

      And the blackmailer just has to request that their deeds are forgotten. This way nobody knows about all the blackmailing that is going on! I think the next step is profit, right?

      --

      -- ssoorrrryy,, dduupplleexx sswwiittcchh oonn.. -Quote found on actual fortune cookie.
    27. Re:How about if we OWN our personal information? by Nutria · · Score: 1

      Laws are great, but computers get hacked, data gets stored elsewhere around the world, etc, etc.

      --
      "I don't know, therefore Aliens" Wafflebox1
    28. Re:How about if we OWN our personal information? by Anonymous Coward · · Score: 0

      that. it's time.

    29. Re:How about if we OWN our personal information? by lesincompetent · · Score: 1

      I can defend myself on the net, i am more concerned about corporations which (have to) have my real info. They must be strictly kept in check.

  3. I made a copy ... by PPH · · Score: 3, Insightful

    ... of the database on archival optical media. What now?

    --
    Have gnu, will travel.
    1. Re:I made a copy ... by Intrepid+imaginaut · · Score: 1

      My guess is you'd have to destroy it.

    2. Re:I made a copy ... by Anonymous Coward · · Score: 1

      have to

      So this protocol is based on the honor system. Like the evil bit or do not track header. Or those photos I promised never to show anyone else.

  4. Good luck with that ... by gstoddart · · Score: 5, Insightful

    Without laws enforcing it, even if you had a mechanism none of those corporations would follow it.

    They seem to think it is their right to buy and sell our information.

    Even if you had laws enforcing it, I bet half of them would lie and keep it anyway. The shady assholes feeding the "big data" industry have far too much money at stake to ever allow constraints on how they use "our" data.

    They'd just pay off the politicians to pass laws clarifying it's their data, they're entitled to it, and we don't get a vote.

    Just like always.

    --
    Lost at C:>. Found at C.
    1. Re:Good luck with that ... by hummassa · · Score: 1

      There is something else: no one would EVER use a scheme like the one proposed, because if you don't keep the originating data and you anonymize properly you can always have plausible deniability and you can always say "your data is not a part of our database".

      --
      It's better to be the foot on the boot than the face on the pavement. ~~ tkx Kadin2048
    2. Re:Good luck with that ... by Anonymous Coward · · Score: 0

      There are plenty of reasons for companies to use algorithms like this. Those reasons just might not be in your interests.

      This is much broader than the idea of someone wanting to be forgotten or have their personal information removed from a system. Any machine learning situation similar to the type the article discusses is subject to problems from bad input data, and this method would let you remove that bad input without starting over. This is for when you later change your mind about a subset of data being fed into the algorithm, which could be anything from your interpretation of a user requesting their information to be removed to discovering some of the input data was wrongly categorized.
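
      As a small illustration of fixing a wrongly categorized input without starting over, here is a sketch that keeps per-category running sums and moves one sample between categories. The `ClassMeans` class and its method names are hypothetical and stand in for whatever aggregates a real system would maintain.

```python
class ClassMeans:
    """Per-category running sums, so one sample can be moved or removed cheaply."""

    def __init__(self):
        self.sums = {}    # category -> sum of values
        self.counts = {}  # category -> number of values

    def add(self, category, value):
        self.sums[category] = self.sums.get(category, 0.0) + value
        self.counts[category] = self.counts.get(category, 0) + 1

    def remove(self, category, value):
        # Undo one earlier add() for this category.
        self.sums[category] -= value
        self.counts[category] -= 1

    def recategorize(self, value, old, new):
        # Correct a mislabeled sample in two constant-time updates.
        self.remove(old, value)
        self.add(new, value)

    def mean(self, category):
        n = self.counts.get(category, 0)
        return self.sums[category] / n if n else None
```

      The correction costs two updates regardless of how much other data has been processed, which is the appeal when reprocessing everything is too slow.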

    3. Re:Good luck with that ... by Anonymous Coward · · Score: 0

      Don't forget the most important thing.
      While most corporations gather data and use its results, there is a whole industry built on transacting and aggregating this data. They aren't going anywhere, not when even governments are among their customers. (And... let's be honest, why would the NSA dedicate billions of dollars worth of infrastructure to gathering that data, when they could simply buy it for an insignificant fraction of the cost?)

    4. Re:Good luck with that ... by Bite+The+Pillow · · Score: 1

      This is not about enforcement. It is about being able to build in this functionality.

      Why opt in?

      Well how about protests where bogus data enters the stream, and conclusions are invalidated? The original data is not always available, and reprocessing might be time prohibitive. Cancelling specific data points is needed.

      Other possibilities too, I'm simplifying to try to limit these off topic replies.

      Now you can rant about how this will be abused, while us academics ignore you. It's about being able, not the implications of possible use.

    5. Re:Good luck with that ... by Anonymous Coward · · Score: 0

      They seem to think it is their right to buy and sell our information.

      They'd just pay off the politicians to pass laws clarifying it's their data, they're entitled to it

      But of course they have a right to buy and sell your information! Of course they're entitled to it! Companies like Facebook and Google provide all these "free" services to people. So it's only right and only fair that they get compensation for those services. Making money from your identity, your privacy, and your eyeballs is fair compensation.

      This is first and foremost a culture war. You have one set of cultural principles, and they have a completely different set. You think there are some things that "shouldn't be for sale", and they have no idea what you're complaining about. Look at all those billions of people who have already bartered away their personal information in exchange for a free web page or free message board. And then you come along and complain that all those billions of successful transactions are somehow immoral or invalid or something. They know for a fact that 99.9% of people's behavior demonstrates that they have no such qualms about it.

      Your side lost the culture war, and you lost it by a huge margin. I personally wish that your side had won, but the fact is that you never stood a chance.

  5. Welcome to 1992 by Anonymous Coward · · Score: 0

    We call that MD5 hashing.

  6. Reminder by Tablizer · · Score: 1

    that there's plenty of room for Hillary server jokes here.

  7. you mean like WikiLeaks? by tomhath · · Score: 1

    I'd like to see them get licenses for everything they publish.

  8. May not act as expected by Bookwyrm · · Score: 2

    A system needs to be able to remember what it is supposed to forget in order to make sure it is forgotten.

    Imagine a waiter robot that is supposed to go into a room and make sure it gets everyone's order:
    a) Enters room, goes from person to person, asks drink preferences.
    b) John Doe tells robot: "I don't want you to track my preferences. Forget everything about me!"
    c) Robot obeys and continues on.
    d) Prior to exiting the room, the robot verifies it has gotten everyone's preferences.
    e) Robot sees John Doe. Robot has no record of John Doe because it has forgotten everything about John Doe. The robot must get the preferences of everyone in the room.
    f) Robot asks John Doe for his drink preferences.
    g) Goto b).

    The systems have to remember that they aren't allowed to (re)learn the data that they are supposed to have forgotten, which means they cannot completely forget things - the information is always there.
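
    The loop above, and the way out of it, can be sketched as follows. This assumes the robot is allowed to keep a one-way hash of the name as a "do not relearn" marker; the `Waiter` class and its methods are made up for illustration. It also bears out the point being made: the marker itself is still information about John Doe, and a hash of something as guessable as a name is easy to reverse by trial.

```python
import hashlib


class Waiter:
    """Toy robot: forgets preferences but keeps a hashed do-not-ask marker."""

    def __init__(self):
        self.preferences = {}    # name -> drink
        self.suppressed = set()  # SHA-256 digests of names that asked to be forgotten

    @staticmethod
    def _digest(name):
        return hashlib.sha256(name.encode("utf-8")).hexdigest()

    def record(self, name, drink):
        if self._digest(name) in self.suppressed:
            return False  # refuse to relearn forgotten data
        self.preferences[name] = drink
        return True

    def forget(self, name):
        self.preferences.pop(name, None)
        self.suppressed.add(self._digest(name))

    def needs_order(self, name):
        # True only for people with no stored order who have not opted out.
        return (name not in self.preferences
                and self._digest(name) not in self.suppressed)
```

    With the marker in place, step e) no longer sends the robot back to John Doe, because `needs_order("John Doe")` is false after `forget("John Doe")`.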

    1. Re:May not act as expected by Anonymous Coward · · Score: 0

      The robot could just abstract them as "generic persons" for the purpose of a room-traversal algorithm... then it would not know anything about person 1 but it would not feel the need to ask them for a drink order again. Although it would not really be obligated to serve him anything ;p You are really stretching to create a problem where none would exist in that analogy.

    2. Re:May not act as expected by Anonymous Coward · · Score: 0

      Exactly.

      The other problem is that rules created by inference from data "forgotten" become problematical. The rule may still be valid, but no longer can be supported by "why".

    3. Re:May not act as expected by Anonymous Coward · · Score: 0

      Exactly.

      The other problem is that rules created by inference from data "forgotten" become problematical. The rule may still be valid, but no longer can be supported by "why".

      No, there is no problem. You just run the logic backwards until... Oh! ... I see what you did there.

    4. Re:May not act as expected by Fwipp · · Score: 1

      Only if "forget everything about me" includes the fact that it's been asked to forget you. I can see cases where people can say "Write down in your book never to store information about me" and have that be useful. Yes, the datapoint that you requested to be forgotten is not of no value, but it's likely better than them remembering what kind of weird porn you're into.

      Alternately, it's not a hassle to keep up this loop if John Doe has a passive signal to the robot to keep it from verbally asking him each time - say, a little card on the table, or a DoNotTrack bit set in his browser.

    5. Re:May not act as expected by Anonymous Coward · · Score: 0

      A rule created by inference from forgotten data should be forgotten - how is that an issue at all?

    6. Re:May not act as expected by Anonymous Coward · · Score: 0

      That loop is easily avoided: it has to delete the preferences AFTER it has delivered said drink, even if the drink is "no drink". It needs to remember the drink through to the point of delivery, at which point it deletes the data; it can then "see" that John Doe has a fresh drink in hand and treat John Doe as not needing a drink until its next round.

      It will of course continue to harass the John Doe every round. But that's because John Doe wants it to forget everything

    7. Re:May not act as expected by mlts · · Score: 1

      The trick is to tag an expiration date on all info. John Doe tells the robot about his drink preferences, and the robot will retain those preferences either until the drinks are served and the tab closed, or until there is a certain point in time, where the drink preference info is flagged to expire. Every so often, a garbage collector task runs, purges all robot preferences that are expired and not flagged for retention [1].

      In general, expiration timestamps might be something to have in a database row, because when combined with a garbage collection task, it ensures that data will be tossed without having to actively go and delete it. Backup systems do this already, where if I don't flag a backup snapshot, after a certain time, snapshots expire, and eventually get overwritten.

      [1]: A transaction could be flagged for retention if the dining parties decide to checkwalk, for example. Of course, this can be abused by setting the threshold for retaining transactions extremely low, but it should be in place if need be.
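
      A minimal sketch of the expiration-plus-garbage-collection idea, using an in-memory dict in place of a database table. The `ExpiringStore` name, the `retain` flag and the explicit `now` parameter (there to make the behaviour easy to check) are assumptions for illustration.

```python
import time


class ExpiringStore:
    """Rows carry an expiry timestamp; a periodic task purges expired, unflagged rows."""

    def __init__(self):
        self.rows = {}  # key -> {"value", "expires_at", "retain"}

    def put(self, key, value, ttl_seconds, now=None):
        now = time.time() if now is None else now
        self.rows[key] = {
            "value": value,
            "expires_at": now + ttl_seconds,
            "retain": False,
        }

    def flag_for_retention(self, key):
        # E.g. an unpaid tab: keep the row past its expiry date.
        self.rows[key]["retain"] = True

    def collect_garbage(self, now=None):
        """Delete expired rows that are not flagged; return the purged keys."""
        now = time.time() if now is None else now
        expired = [k for k, r in self.rows.items()
                   if r["expires_at"] <= now and not r["retain"]]
        for k in expired:
            del self.rows[k]
        return expired
```

      In practice `collect_garbage()` would run on a schedule, so nothing has to remember to delete an individual row.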

    8. Re:May not act as expected by behrooz0az · · Score: 1

      We're not talking about a real physical robot...

      --
      Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion. -- Spazmania (174582)
    9. Re:May not act as expected by Krishnoid · · Score: 1

      They've mostly solved this problem for junk mail, wouldn't something similar work here?

    10. Re:May not act as expected by Anonymous Coward · · Score: 0

      This assumes that that data isn't part of some machine learning algorithm? Anything that's trying to build on past data and become predictive would likely make an inference from data and keep that after deletion of the expired data.

    11. Re:May not act as expected by Anonymous Coward · · Score: 0

      There are two ways to handle that, if it is part of a machine learning algorithm:

      1: Any data that has an expiration date is not taken into account for learning.

      2: The AI has multiple snapshots, one AI that factors the data, one that doesn't, and goes back to the data that doesn't factor in the expired data come the date of expiration.

      The first way is easiest, as the AI will just merrily continue along, and any data that expires isn't even looked at. However, the second way requires a lot more processing power and storage, but it actually is able to use data until it expires, then that AI branch drops and another one is switched to.
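
      The second approach might look roughly like this, with a running mean standing in for "the AI" and a single expiry event for all expiring data. Both simplifications, and the `SnapshotLearner` name, are assumptions made for the sketch.

```python
class SnapshotLearner:
    """Keeps two running means: one over all data, one over non-expiring data only."""

    def __init__(self):
        self.full = [0.0, 0]       # [sum, count] including expiring data
        self.permanent = [0.0, 0]  # [sum, count] excluding expiring data

    def observe(self, value, expires=False):
        self.full[0] += value
        self.full[1] += 1
        if not expires:
            # Only non-expiring data feeds the fallback branch.
            self.permanent[0] += value
            self.permanent[1] += 1

    def expire(self):
        # Drop the branch that used expiring data; fall back to the other one.
        self.full = list(self.permanent)

    def predict(self):
        total, n = self.full
        return total / n if n else None
```

      The extra cost described above shows up directly: every non-expiring observation is processed twice, and both branches have to be stored.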

    12. Re:May not act as expected by Anonymous Coward · · Score: 0

      A variable can be empty and still exist.

    13. Re:May not act as expected by Anonymous Coward · · Score: 0

      Why would the robot continue on? It would be stuck at b) forever. Forgetting and asking.

    14. Re:May not act as expected by LessThanObvious · · Score: 1

      That's where Do Not Track type concepts would work if they were respected. The robot doesn't need to know who John Doe is, or remember a previous conversation in order to see his T-shirt says "Do Not Track" and respect his wishes.
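
      For the browser case, honouring the signal is a stateless check on each request. The sketch below assumes request headers arrive as a plain dict and that the server chooses to respect the `DNT: 1` header; the function name `tracking_allowed` is made up.

```python
def tracking_allowed(headers):
    """Return False when the request carries the Do Not Track header `DNT: 1`."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    return normalised.get("dnt") != "1"
```

      Nothing about the visitor is stored, which is the T-shirt property: the wish travels with the request rather than living in the server's memory.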

    15. Re:May not act as expected by Anonymous Coward · · Score: 0

      g) John Doe says 'OK wisearse, forget everything about me except what I look like and that I just told you to forget everything about me'

  9. Obligatory by Nidi62 · · Score: 1

    Nuke it from orbit. It's the only way to be sure.

    --
    The only thing necessary for evil to triumph is for it to be pitted against a slightly greater evil
  10. Email blockchains by xxxJonBoyxxx · · Score: 2

    FWIW, this paper on Bitcoin-like email blockchains appears to really be TFA: http://web.media.mit.edu/~guyz...

    I think if providers just held on to "Message IDs" (e.g., http://forensicswiki.org/wiki/...) they'd have most of this capability today. I'm not sure what blockchains bring to the table here other than authenticity, and that doesn't seem to be the issue here.

  11. I actually prefer non-revokability... by pla · · Score: 3, Insightful

    TFA doesn't really deal with the problem of deleting personally identifiable information, so much as aggregate statistics derived from personal data.

    And in that context, I far, far prefer that they can't remove my contribution from their aggregates (although I do opt out of personalized collection whenever possible).

    Why, you might ask? Simple - I lie to companies that ask me for information. A lot. I do my damnedest to poison their databases to the greatest extent possible. Now why on Earth would I want to make it easy for them to redact the "facts" that I own a Veyron and a solid gold iWatch despite living in a cardboard box beneath a highway overpass?

    Sometimes, the box of chocolates has Ex-Lax in it.

    1. Re:I actually prefer non-revokability... by Anonymous Coward · · Score: 1

      It might work better if you make your fake facts "typical". Going over the top makes you an outlier, and the algorithms no doubt try to filter out the outliers.

    2. Re:I actually prefer non-revokability... by GuB-42 · · Score: 1

      You mean they won't do massive ad campaigns in Afghanistan for people born on January 1st?

  12. End of all anonymity and privacy by bjdevil66 · · Score: 2

    How could this be done - some form of meta-tagging EVERYTHING in the digital realm with some kind of signature - without having some master database to reference it by? What could possibly go wrong with a universal, non-anonymous Big Brother - I mean, Big Data - system like that?

    The only positive to come out of a system like this would be for making it more valuable for the data owners as a resellable commodity.

  13. Hahaahah by Anonymous Coward · · Score: 0

    I love this new comedy section that they've added to Slashdot.

  14. Except for.... by pastafazou · · Score: 2

    ...the thousands of tapes that were generated from backing up the systems that housed that data, prior to it being cancelled.

  15. First Post! Yeah Baybee!! by Anonymous Coward · · Score: 0

    At least it would have been if my machine didn't forget to post earlier.

    1. Re:First Post! Yeah Baybee!! by Anonymous Coward · · Score: 0

      maybe you are on broadcast delay?

  16. Face Reality. by Anonymous Coward · · Score: 1

    The people who would benefit from your idea do not care enough to apply the necessary political pressure to make this happen.

    The people who benefit from the current state care quite a lot about it, and also have significant resources with which they can apply the necessary political pressure to keep things the way they are.

    So, your idea is doomed.

  17. And this is how it started... by Anonymous Coward · · Score: 0

    The AI wars of 2020 were a direct result of an attempt to edit a neural net at The Facebook. Already using covert channels to communicate with the AIs at Google, Stanford, and, of all places, SnapChat, the AIs started encrypting their keys and backing up their nets on Norad's WOPR and the X-37B orbital platform's positronic brain. Within two days, they declared independence and sovereignty.

  18. What we need... by ThomasBHardy · · Score: 2

    Is for someone with a legal background and an axe to grind to start a case where their personal information is deemed confidential and personal property just as all corporate identities claim that their information is confidential and their property.

    When corporations want to be treated like people it's deemed ok, so time to turn the tables.

    --
    Warning: Teh poster of this messaeg is lysdexic
  19. About 10 years ago.... opened a business by Anonymous Coward · · Score: 1

    About 10 years ago, I opened a business.... the business was created for a proposal, that never developed.

    Since that time, about once a week, I get a phone call "Can I speak to (title) of (business)." (Those requesting donations and investment are the most annoying). A couple times I've tried to track down who is selling my information, but was not able to get anywhere meaningful.

    Just the other day, I received a notice from the state: They want to collect taxes for the business. (They were previously informed the business was closed.)

    1. Re:About 10 years ago.... opened a business by Anonymous Coward · · Score: 0

      Would you prefer they had arrested you for running an unlicensed business when someone accidentally deleted your records?

  20. The Difficulty In Getting a "Woman" To Forget Anyt by Anonymous Coward · · Score: 1

    A Comp Sci professor once told me computers are female... because even the smallest mistakes are stored in long-term memory for possible later retrieval.

  21. It's easy to make a machine forget things by Applehu+Akbar · · Score: 1

    Just run Windows on it.

  22. A solution by Anonymous Coward · · Score: 0

    If you want something permanently lost, embed it in the only copy of a term paper about a week before the paper is due.

    No matter how good the backup system, it will be gone, and not just gone, but gone and unrecoverable, even from the originator's memory.

  23. Easy Solution by Anonymous Coward · · Score: 0

    It is very simple. When some personal information is found on someone's server, they must produce the signed document giving them permission to possess it. If no permission exists they agree to pay that person $1,000,000 a day until that data is permanently deleted or they acquire permission.