Siri Keeps Your Data For Two Years
New submitter LeadSongDog writes with news that Apple has provided information on how long it holds onto voice search data used by its digital assistant software Siri. Speaking to Wired, an Apple representative said the data is kept for two years after the initial query.
"Here’s what happens. Whenever you speak into Apple’s voice activated personal digital assistant, it ships it off to Apple’s data farm for analysis. Apple generates a random number to represent the user and it associates the voice files with that number. This number (not your Apple user ID or email address) represents you as far as Siri’s back-end voice analysis system is concerned. Once the voice recording is six months old, Apple “disassociates” your user number from the clip, deleting the number from the voice file. But it keeps these disassociated files for up to 18 more months for testing and product improvement purposes."
This information came in response to requests for clarification of Siri's privacy policy, which was not very clear as written. The director of privacy group Big Brother Watch said, "There needs to be a very high justification for retaining such intrusive data for longer than is absolutely necessary to provide the service."
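As a rough illustration of the lifecycle described in the summary (random identifier, disassociation at about six months, deletion at about two years), here is a minimal sketch. Everything in it is an assumption made for illustration: the names `VoiceClip`, `new_user_number` and `apply_retention` are hypothetical, the day counts are approximations of the quoted periods, and nothing here reflects how Apple's back end is actually built.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Assumed approximations of the periods quoted above; not Apple's real values.
DISASSOCIATE_AFTER = timedelta(days=183)  # about six months
DELETE_AFTER = timedelta(days=730)        # about two years (6 + 18 months)


@dataclass
class VoiceClip:
    """Hypothetical record for one stored voice query."""
    recorded_at: datetime
    user_number: Optional[str]  # random identifier, not an Apple ID or email
    audio: bytes


def new_user_number() -> str:
    """Generate a random identifier that carries no account information."""
    return secrets.token_hex(16)


def apply_retention(clips: List[VoiceClip], now: datetime) -> List[VoiceClip]:
    """Return the clips that survive, with identifiers stripped from older ones."""
    kept = []
    for clip in clips:
        age = now - clip.recorded_at
        if age >= DELETE_AFTER:
            continue  # older than about two years: drop the clip entirely
        if age >= DISASSOCIATE_AFTER:
            # older than about six months: keep the audio, remove the identifier
            clip = VoiceClip(clip.recorded_at, None, clip.audio)
        kept.append(clip)
    return kept
```

Note that in this sketch the audio itself survives disassociation untouched; only the identifier field is cleared.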
A corrupt slashdot luser has penetrated the moderation system to downmod all my posts while impersonating me.
Nearly 230++ times that I know of @ this point for all of March/April 2013 so far, & others here have told you to stop - take the hint, lunatic (leave slashdot)...
Sorry folks - but whoever the nutjob is that's attempting to impersonate me, & upset the rest of you as well, has SERIOUS mental issues, no questions asked! I must've gotten the better of him + seriously "gotten his goat" in doing so in a technical debate & his "geek angst" @ losing to me has him doing the:
---
A.) $10,000 challenges, ala (where the imposter actually TRACKED + LISTED the # of times he's done this no less, & where I get the 230 or so times I noted above) -> http://it.slashdot.org/comments.pl?sid=3585795&cid=43285307
&/or
B.) Reposting OLD + possibly altered models - (this I haven't checked on as to altering the veracity of the info. being changed) of posts of mine from the past here
---
(Albeit massively repeatedly thru all threads on /. this March/April 2013 nearly in its entirety thusfar).
* Personally, I'm surprised the moderation staff here hasn't just "blocked out" his network range yet honestly!
(They know it's NOT the same as my own as well, especially after THIS post of mine, which they CAN see the IP range I am coming out of to compare with the ac spamming troll doing the above...).
APK
P.S.=> Again/Stressing it: NO guys - it is NOT me doing it, as I wouldn't waste that much time on such trivial b.s. like a kid might...
Plus, I only post where hosts file usage is on topic or appropriate for a solution & certainly NOT IN EVERY POST ON SLASHDOT (like the nutcase trying to "impersonate me" is doing for nearly all of March/April now, & 230++ times that I know of @ least)... apk
P.S.=> here is CORRECT host file information just to piss off the insane lunatic troll:
--
21++ ADVANTAGES OF CUSTOM HOSTS FILES (how/what/when/where/why):
Over AdBlock & DNS Servers ALONE 4 Security, Speed, Reliability, & Anonymity (to an extent vs. DNSBL's + DNS request logs).
1.) HOSTS files are useable for all these purposes because they are present on all Operating Systems that have a BSD based IP stack (even ANDROID) and do adblocking for ANY webbrowser, email program, etc. (any webbound program). A truly "multi-platform" UNIVERSAL solution for added speed, security, reliability, & even anonymity to an extent (vs. DNS request logs + DNSBL's you feel are unjust hosts get you past/around).
2.) Adblock blocks ads? Well, not anymore & certainly not as well by default, apparently, lol - see below:
Adblock Plus To Offer 'Acceptable Ads' Option
http://news.slashdot.org/story/11/12/12/2213233/adblock-plus-to-offer-acceptable-ads-option )
AND, in only browsers & their subprogram families (ala email like Thunderbird for FireFox/Mozilla products (use same gecko & xulrunner engines)), but not all, or, all independent email clients, like Outlook, Outlook Express, OR Window "LIVE" mail (for example(s)) - there's many more like EUDORA & others I've used over time that AdBlock just DOES NOT COVER... period.
Disclaimer: Opera now also has an AdBlock addon (now that Opera has addons above widgets), but I am not certain the same people make it as they do for FF or Chrome etc..
3.) Adblock doesn't protect email programs external to FF (non-mozilla/gecko engine based) family based wares, So AdBlock doesn't protect email programs like Outlook, Outlook Express, Windows "LIVE" mail & others like them (EUDORA etc./et al), Hosts files do. THIS IS GOOD VS. SPAM M
How long are the backups of these systems kept for? Do they require a subpoena to get those backups, or will Apple cheerfully hand it over to any agency that asks?
Big Brother will come along at least once during that period so you can rest assured knowing it's stored for eternity.
is unfortunately in the eye of the beholder... The US government's reliance on its ability to access private data has helped so much with the Boston suspects, we will wrest these gains into the intrusion of privacy from their cold, dead hands.
Happiness in intelligent people is the rarest thing I know.
Ernest Hemingway
"Siri, how much fuel oil should I mix with 25 pounds of ammonium nitrate?"
Anyone have the timeline for Google's disassociation and destruction of search queries? I'm curious how Apple's policies compare against those.
My guess is the overlap between "people who complained Siri wasn't accurate" and "people who don't want Apple keeping any Siri data so they can make it better" is pretty close to perfect.
Google reads your mail. Apple listens to your ravings. Don't like it, don't use it. And they only keep 'your' (i.e. identifiable) data 6 months.
It's becoming exceedingly difficult to keep your search history private. All the major search companies keep it, Apple keeps Siri searches, etc. DuckDuckGo I believe keeps things as anonymous as you can get. There are also some hacks you can do if you are careful, privacy mode/ incognito is a start, but even then it's easy to tip your hand. If you are truly doing something crazy, use a bootable USB and do your searches from a random public wifi hotspot.
I am getting tired of Apple's continuing privacy abuses. First they sell their customers to the highest bidder, now this.
Even Siri was ruined with advertising http://www.inquisitr.com/256025/steve-wozniak-says-apple-ruined-siri-technology-after-acquisition/ "Steve says he initially loved Siri because it could accurately answer questions such as “What are the five largest lakes in California?” and “What are the prime numbers greater than 87?”. To which Wozniak replied, “It’s incredible. It’s like it understands ‘greater than.’”
Wozniak also notes that his former question about California Lakes now brings up lakefront properties while his question about prime numbers now displays information about prime ribs."
Their EULAs have become so abusive that they are subject to ridicule by South Park in HumancentiPad: http://en.wikipedia.org/wiki/HumancentiPad
Three words "Don't be Evil"
Anyone have the timeline for Google's disassociation and destruction of search queries? I'm curious how Apple's policies compare against those.
Ironically, they currently sell Apple customers to Google for $400Billion, although they are allegedly selling their customers to Yahoo next. So for now *exactly the same* because they are the same :)
Everyone I've ever spoken to or read about in the field of voice recognition tells me that having samples of people's voices is critical to improving it... and getting those samples (mainly the raw quantity of samples) is the biggest problem they face.
So it doesn’t surprise me at all that anyone keeps a massive archive of samples... the sample data can be critical in improving voice recognition.
As an aside: Google Voice's voice mail feature does more or less the same thing... and the reasoning is the same also: More sample data means better voice recognition.
I can't help but shake my head at the comparison:
Google samples user voices, reads (and transcribes) voice mail, reads your email, your stock information and then feeds it into their advertising engine, and does this for four years and counting; reaction: Meh...
Apple samples voices, anonymizes them, uses them to improve voice recognition over a period of two years; reaction: EVIL! APPLE MUST DIE!
-- Sometimes you have to turn the lights off in order to see.
...and have since 2007. These two great blog posts cover the details: "Taking steps to further improve our privacy practices" http://googleblog.blogspot.co.uk/2007/03/taking-steps-to-further-improve-our.html and "How long should Google remember searches?" http://googleblog.blogspot.co.uk/2007/06/how-long-should-google-remember.html An example from the latter: "By anonymizing our server logs after 18-24 months, we think we’re striking the right balance between two goals: continuing to improve Google’s services for you, while providing more transparency and certainty about our retention practices." Google are surprisingly forthcoming about how and what they do with your data, which clashes sharply with Apple (pretend they don't) or Microsoft (who run hate campaigns).
See here, explains it all -> http://tech.slashdot.org/comments.pl?sid=3561925&cid=43223585
* :)
I.E./Summary: Trolls had a challenge put to them to validly disprove my points in the post I just replied to - result? Trolls FAIL... lol!
APK
P.S.=> That's what makes me LAUGH harder than ANYTHING ELSE on this forums (full of "FUD" spreading trolls) - When you hit trolls with facts & truths they CANNOT disprove validly on computing tech based grounds, this is the result - Applying unjustifiable downmods to effetely & vainly *try* to "hide" my posts & facts/truths they extoll!
Hahaha... lol, man: Happens nearly every single time I post such lists (proving how ineffectual these trolls are), only showing how solid my posts of that nature are...
Ah yes "geek angst" @ its 'finest' (not), vs. facts & truths = downmod by /. weak trolls!
... apk
So not only does she put the wrong things and is entirely useless, she remembers all the times she is wrong
In reference to an earlier question about Google's data retention policies, one of the comments provided a great link to a 2007 Google blog post that describes why Google holds onto their data for 18 months before they anonymize it. One of the interesting things that was said was:
However, we must point out that future data retention laws may obligate us to raise the retention period to 24 months.
Given that the blog post was written back in 2007, isn't it now possible that 24 months is simply the earliest that a company like Apple is allowed to delete the query, given the various data retention regulations that are in place around the world? That they disassociate it after 6 months still puts them ahead of Google's 18 months, though voice data is significantly less anonymous than the text of a query, generally speaking, so that they keep it at all is not something I like the idea of.
If I were in charge of Siri, I'd do the same thing. That kind of real-world data is vital for regression testing. If you don't have a strong corpus of sample data, when you make changes to the code, you've got no idea if what you are doing is improving the situation for some cases, while damaging them for others. You would see people complaining about things like "Well Siri used to work for X query but now it doesn't". When you have this data, you can update the code, run the test suite, and see if it fails a large number of existing cases.
If Apple do anything to mitigate this, it will probably be some form of opt-out, but they are unlikely to make it the default, because I would imagine that building a corpus of representative speech from a thousand different accents talking about tens of thousands of different subjects is nigh on impossible otherwise, especially as jargon comes and goes so quickly these days.
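To make the regression-testing point above concrete, here is a minimal sketch of comparing two recogniser versions against a stored corpus. It is purely illustrative: `run_corpus`, `regressions` and the idea of passing a sample id in place of real audio are assumptions for the example, not anything known about how Siri is tested.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical corpus format: (sample id, expected transcript).
Corpus = List[Tuple[str, str]]
# Hypothetical recogniser: takes a sample id (standing in for audio), returns text.
Recogniser = Callable[[str], str]


def run_corpus(recognise: Recogniser, corpus: Corpus) -> Dict[str, bool]:
    """Map each sample id to whether the recogniser produced the expected text."""
    return {sample_id: recognise(sample_id) == expected
            for sample_id, expected in corpus}


def regressions(old: Recogniser, new: Recogniser, corpus: Corpus) -> List[str]:
    """List the samples the old version got right and the new version gets wrong."""
    before = run_corpus(old, corpus)
    after = run_corpus(new, corpus)
    return [sample_id for sample_id, _ in corpus
            if before[sample_id] and not after[sample_id]]
```

A sample that was wrong before and is right now is an improvement, so it is not reported; only the "used to work for X but now it doesn't" cases are. The larger and more varied the corpus, the more such cases a check like this can catch.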
Don't like it? Don't use it.
"Could A Yahoo-Apple Deal Spell Trouble For Google?" http://www.webpronews.com/iphones-and-ipads-could-soon-get-a-big-dose-of-yahoo-2013-04 It's a great article about Yahoo! (who share their data with Microsoft) and Apple. From the article, although it's common news: "An analyst at Macquarie Capital estimated that Google was making $1.3 billion annually in paid search revenue from iOS devices. Macquarie speculated that Google returned about $1 billion of that to Apple as part of the agreement that made Google the default search engine on the Safari browser. Another financial analyst has come up with a similar annual estimate of the value of Google’s default iOS search deal with Apple: $1 billion. Morgan Stanley’s Scott Devitt is responsible for the new estimate. Devitt disagreed with Macquarie, arguing that the structure of the relationship is probably not a “revenue sharing” deal but instead a straight fee-per-device payment from Google to Apple. Devitt believes that Google pays Apple roughly $3.20 per iOS device, which would avoid the accounting issues arising from a revenue sharing agreement."
save them for years. So it's a whole different ballgame.
I know you're angry with Apple and confused right now. You bought an Apple phone and Apple still sold you to Google. You paid a mark-up of 50% on a $650 phone just to be sold for a measly $3.20; who would have thought you were so cheap.
A "high justification"? How about speech recognition that actually works?
Training speech recognisers requires data. The biggest reason why speech recognition has improved in recent years: lots of data.
Speech recognition in the cloud has given companies like Apple and Google a reason/excuse to gather masses of training data. They have put it to good use: speech recognition is much better than it was. If you like speech recognition, use it, meanwhile donating your data and helping the rest of us. If you don't, don't use it. As long as users are aware of this, I don't really see the problem.
somewhere in a data warehouse with only a few humans, there are millions of disassociated voices crying out to be heard. "But it keeps these disassociated files for up to 18 more months for testing and product improvement purposes."
In NSA America social networks join you!
Why does Apple need to save Siri data at all beyond processing time? This is what should be secure data, etc.
The concern with Siri vs. search engine queries is that Siri is being used to enter and query personal data. Possibilities include account balances, passwords, contracts or business deals, anything you do on an iPad or iPhone. It's not just "search for clowns" etc.
A random number is only a random number. It is still linked to your phone somewhere or the service wouldn't work. Your phone is linked to your account, so the data IS linked directly to your personal account! It's irrelevant if they store a random number with it, when that random number is linked to you anyway!
Apple can still track everything about its users, including their search queries, plans, business data and anything else they type or say into the ispy products.
Apple have an appalling record for privacy, as seen in the past with the iTunes debacle, GPS tracking data, and now personal searches, queries, data, and other
Google at least gives the option to turn off the tracking and are open about the information they have on you!
As they map your voice recordings to an ID and map that ID to your Apple ID, I find it very strange they can claim it's anonymous!
From a better article:
http://arstechnica.com/apple/2013/04/apple-remembers-where-you-wanted-to-get-drunk-for-up-to-2-years/
Muller pointed out, however, that the identifiers are deleted immediately—"along with any associated data"—when a user turns Siri off on his or her device. (You can do this by going to Settings > General > Siri on a supported iOS device.)
If you can delete the identifiers and associated data when disabling Siri, it is not anonymous.
Also voice recognition has been working fine for many years now, if they want to find your voice clips it shouldn't be much trouble.
And yet, NONE of that says anything about Apple selling their customers' data to Google or anyone else, which is what you've been alleging all along. Google paying Apple for the right to be the default search engine on iOS does not mean that Apple is giving them customer data (Apple's customer is the user, after all, so it's not in their best interests to sell that data), especially so when you consider that the user is fully capable of changing their search engine in the settings for iOS. The only data Google is getting is the data they collect themselves, which is no different than how it is if I use google.com. If I use google.com in Safari or Internet Explorer, that doesn't mean that Apple and Microsoft are being paid by Google for my information.