Siri Keeps Your Data For Two Years
New submitter LeadSongDog writes with news that Apple has provided information on how long it holds onto voice search data used by its digital assistant software Siri. Speaking to Wired, an Apple representative said the data is kept for two years after the initial query.
"Here’s what happens. Whenever you speak into Apple’s voice activated personal digital assistant, it ships it off to Apple’s data farm for analysis. Apple generates a random numbers to represent the user and it associates the voice files with that number. This number — not your Apple user ID or email address — represents you as far as Siri’s back-end voice analysis system is concerned. Once the voice recording is six months old, Apple “disassociates” your user number from the clip, deleting the number from the voice file. But it keeps these disassociated files for up to 18 more months for testing and product improvement purposes."
This information came in response to requests for clarification of Siri's privacy policy, which was not very clear as written. The director of privacy group Big Brother Watch said, "There needs to be a very high justification for retaining such intrusive data for longer than is absolutely necessary to provide the service."
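The retention schedule described in the summary can be sketched roughly as follows. This is only an illustration of the policy as reported, not Apple's actual implementation: the `VoiceClip` type, the function names, and the exact day counts (183 and 730) are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional
import secrets

# Assumed day counts for "six months" and "two years"; the article gives no exact figures.
DISASSOCIATE_AFTER = timedelta(days=183)
DELETE_AFTER = timedelta(days=730)


@dataclass
class VoiceClip:
    recorded_at: datetime
    user_number: Optional[str]  # random identifier, not an Apple ID or email address
    audio: bytes


def new_user_number() -> str:
    """Generate a random identifier that carries no account information."""
    return secrets.token_hex(16)


def apply_retention(clips: List[VoiceClip], now: datetime) -> List[VoiceClip]:
    """Return the clips that survive the policy, stripping identifiers from old ones."""
    kept = []
    for clip in clips:
        age = now - clip.recorded_at
        if age >= DELETE_AFTER:
            continue  # two years old: drop the recording entirely
        if age >= DISASSOCIATE_AFTER:
            clip.user_number = None  # six months old: remove the identifier
        kept.append(clip)
    return kept
```

Note that in this sketch the audio itself is untouched by disassociation; only the identifier field is cleared.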
How long are the backups of these systems kept for? Do they require a subpoena to get those backups, or will Apple cheerfully hand it over to any agency that asks?
Big Brother will come along at least once during that period so you can rest assured knowing it's stored for eternity.
is unfortunately in the eye of the beholder... The US government's reliance on its ability to access private data has helped so much with the Boston suspects, we will wrest these gains into the intrusion of privacy from their cold, dead hands.
Happiness in intelligent people is the rarest thing I know.
Ernest Hemingway
"Siri, how much fuel oil should I mix with 25 pounds of ammonium nitrate?"
Anyone have the timeline for Google's disassociation and destruction of search queries? I'm curious how Apple's policies compare against those.
My guess is the overlap between "people who complained Siri wasn't accurate" and "people who don't want Apple keeping any Siri data so they can make it better" is pretty close to perfect.
Google reads your mail. Apple listens to your ravings. Don't like it, don't use it. And they only keep 'your' (i.e. identifiable) data for 6 months.
It's becoming exceedingly difficult to keep your search history private. All the major search companies keep it, Apple keeps Siri searches, etc. DuckDuckGo I believe keeps things as anonymous as you can get. There are also some hacks you can do if you are careful, privacy mode/ incognito is a start, but even then it's easy to tip your hand. If you are truly doing something crazy, use a bootable USB and do your searches from a random public wifi hotspot.
I am getting tired of Apple's continuing privacy abuses. First they sell their customers to the highest bidder, now this.
Even Siri was ruined with advertising http://www.inquisitr.com/256025/steve-wozniak-says-apple-ruined-siri-technology-after-acquisition/ "Steve says he initially loved Siri because it could accurately answer questions such as “What are the five largest lakes in California?” and “What are the prime numbers greater than 87?” . To which Wozniak replied, “It’s incredible. It’s like it understands ‘greater than.’”
Wozniak also notes that his former question about California Lakes now brings up lakefront properties while his question about prime numbers now displays information about prime ribs."
Their EULAs have become so abusive that they are the subject of ridicule by South Park in HumancentiPad: http://en.wikipedia.org/wiki/HumancentiPad
Three words "Don't be Evil"
I have not read one of these posts yet. Is it worth it or is it just drivel?
Everyone I've ever spoken to or read about in the field of voice recognition tells me that having samples of people's voices is critical to improving it... and getting those samples (mainly the raw quantity of samples) is the biggest problem they face.
So it doesn’t surprise me at all that anyone keeps a massive archive of samples... the sample data can be critical in improving voice recognition.
As an aside: Google Voice's voice mail feature does more or less the same thing... and the reasoning is the same also: More sample data means better voice recognition.
I can't help but shake my head at the comparison:
Google samples user voices, reads (and transcribes) voice mail, reads your email, your stock information and then feeds it into their advertising engine, and does this for four years and counting; reaction: Meh...
Apple samples voices, anonymizes them, uses them to improve voice recognition over a period of two years; reaction: EVIL! APPLE MUST DIE!
-- Sometimes you have to turn the lights off in order to see.
...and have since 2007. These two great blog posts cover the details: "Taking steps to further improve our privacy practices" http://googleblog.blogspot.co.uk/2007/03/taking-steps-to-further-improve-our.html and "How long should Google remember searches?" http://googleblog.blogspot.co.uk/2007/06/how-long-should-google-remember.html An example from the latter: "By anonymizing our server logs after 18-24 months, we think we’re striking the right balance between two goals: continuing to improve Google’s services for you, while providing more transparency and certainty about our retention practices." Google are surprisingly forthcoming about how and what they do with your data, which clashes sharply with Apple (who pretend they don't) or Microsoft (who run hate campaigns).
What are you talking about? If there was a $400B deal, I think we'd all have heard about it.
I'm not sure if this is a joke, so I'll answer honestly (if it is a joke, then I guess this just makes me the butt of it, but oh well). This person posts completely off-topic rants about /etc/hosts, claiming persecution. If this showed up in your inbox, would you think twice before marking it "spam" and moving on to the next message?
You're not only a liar, you're not a very good one. You have no grasp of large numbers. $400 billion is more than either company is worth.
So not only does she put the wrong things and is entirely useless, she remembers all the times she is wrong
In reference to an earlier question about Google's data retention policies, one of the comments provided a great link to a 2007 Google blog post that describes why Google holds onto their data for 18 months before they anonymize it. One of the interesting things that was said was:
However, we must point out that future data retention laws may obligate us to raise the retention period to 24 months.
Given that the blog post was written back in 2007, isn't it now possible that 24 months is simply the earliest that a company like Apple is allowed to delete the query, given the various data retention regulations that are in place around the world? That they disassociate it after 6 months still puts them ahead of Google's 18 months, though voice data is significantly less anonymous than the text of a query, generally speaking, so that they keep it at all is not something I like the idea of.
If I were in charge of Siri, I'd do the same thing. That kind of real-world data is vital for regression testing. If you don't have a strong corpus of sample data, when you make changes to the code, you've got no idea if what you are doing is improving the situation for some cases, while damaging them for others. You would see people complaining about things like "Well Siri used to work for X query but now it doesn't". When you have this data, you can update the code, run the test suite, and see if it fails a large number of existing cases.
If Apple do anything to mitigate this, it will probably be some form of opt-out, but they are unlikely to make it the default, because I would imagine that building a corpus of representative speech from a thousand different accents talking about tens of thousands of different subjects is nigh on impossible otherwise, especially as jargon comes and goes so quickly these days.
Bogtha Bogtha Bogtha
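The regression-testing argument in the comment above can be sketched like this. It is a toy illustration under loud assumptions: the "audio" is just a string key, the two recognizers are lookup tables standing in for an old and a new build, and `find_regressions` is a hypothetical helper, not anything Apple is known to use.

```python
def find_regressions(corpus, old_recognizer, new_recognizer):
    """Return corpus entries the old recognizer got right but the new one gets wrong."""
    regressions = []
    for audio, expected in corpus:
        if old_recognizer(audio) == expected and new_recognizer(audio) != expected:
            regressions.append((audio, expected))
    return regressions


# Toy corpus of (audio, expected transcript) pairs; real audio would be recordings.
corpus = [
    ("clip-1", "weather today"),
    ("clip-2", "call mom"),
    ("clip-3", "set alarm"),
]

# Stand-in recognizers: the new build fixes clip-3 but breaks clip-2.
old = {"clip-1": "weather today", "clip-2": "call mom", "clip-3": "set a lamb"}.get
new = {"clip-1": "weather today", "clip-2": "call tom", "clip-3": "set alarm"}.get

print(find_regressions(corpus, old, new))  # [('clip-2', 'call mom')]
```

Without a stored corpus there is nothing to run the new build against, which is the "Well Siri used to work for X query but now it doesn't" situation the comment describes.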
Not a joke. It's just that sometimes I have seen people respond to it and it makes me wonder if it's worth spending the 5 mins to read through it.
But no, as you suggest, every time I see it I scroll past it as I cba. But that people mention hosts files in it, I can't help but wonder if there is anything interesting in it. I can't see how a hosts file could relate to propaganda.
Additionally, simply reading it would answer my question and, possibly, be quicker than asking like this. Problem is, I just really CBA :)
Was not a joke :)
Don't like it? Don't use it.
Seems you should learn how the internets work.....
Solving Unix problems since 1989...
"Could A Yahoo-Apple Deal Spell Trouble For Google?" http://www.webpronews.com/iphones-and-ipads-could-soon-get-a-big-dose-of-yahoo-2013-04 It's a great article about Yahoo! (who share their data with Microsoft) and Apple. From the article, although it's common news: "An analyst at Macquarie Capital estimated that Google was making $1.3 billion annually in paid search revenue from iOS devices. Macquarie speculated that Google returned about $1 billion of that to Apple as part of the agreement that made Google the default search engine on the Safari browser. Another financial analyst has come up with a similar annual estimate of the value of Google’s default iOS search deal with Apple: $1 billion. Morgan Stanley’s Scott Devitt is responsible for the new estimate. Devitt disagreed with Macquarie, arguing that the structure of the relationship is probably not a “revenue sharing” deal but instead a straight fee-per-device payment from Google to Apple. Devitt believes that Google pays Apple roughly $3.20 per iOS device, which would avoid the accounting issues arising from a revenue sharing agreement."
save them for years. So it's a whole different ballgame.
I know you're angry with Apple and confused right now. You bought an Apple phone and Apple still sold you to Google. You paid a mark-up of 50% on a $650 phone just to be sold for a measly $3.20. Who would have thought you were so cheap?
A "high justification"? How about speech recognition that actually works?
Training speech recognisers requires data. The biggest reason why speech recognition has improved in the recent years: lots of data.
Speech recognition in the cloud has given companies like Apple and Google a reason/excuse to gather masses of training data. They have put it to good use: speech recognition is much better than it was. If you like speech recognition, use it, meanwhile donating your data and helping the rest of us. If you don't, don't use it. As long as users are aware of this, I don't really see the problem.
Somewhere in a data warehouse with only a few humans, there are millions of disassociated voices crying out to be heard. "But it keeps these disassociated files for up to 18 more months for testing and product improvement purposes."
In NSA America social networks join you!
Hey man, I know this is important to you, but maybe you should talk to someone outside of the internet about it? I mean, you sound really batshit insane.
Seriously, as a professional troll, I have to say if they're getting you to do all this crazy stuff then they won.
Why does Apple need to save Siri data at all beyond processing time? This is what should be secure data, etc.
The concern with Siri vs search engine queries is that Siri is being used to enter and query personal data. Possibilities include account balances, passwords, contracts or business deals, anything you do on an iPad or iPhone. It's not just "search for clowns" etc.
A random number is only a random number. It is still linked to your phone somewhere or the service wouldn't work. Your phone is linked to your account, so the data IS linked directly to your personal account! It's irrelevant if they store a random number with it, when that random number is linked to you anyway!
Apple can still track everything about its users, including their search queries, plans, business data and anything else they type or say into the ispy products.
Apple have an appalling record for privacy, as seen in the past with the iTunes debacle, GPS tracking data, and now personal searches, queries, data, and other
Google at least gives the option to turn off the tracking and are open about the information they have on you!
Alexander Peter Kowalski and anyone arguing with him are insane. I saw their crazy tirades once and googled his name, and HOLY SHIT. This guy has mini battles raging all over many sites for some of the most inane shit you can think of. He meticulously catalogs the people who have crossed him and works to MAKE SURE everyone understands they are fools.
Now, they may well be fools, but by his meticulous and obsessive actions Kowalski (APK) has proved without a shadow of doubt his absolute insanity. I haven't even argued with this guy so don't think I'm part of these internet crusades. All this I've found by googling his name. The trove of flaming and incomprehensible obsessive aggression is humongous and both funny and pathetic to varying intense degrees. Just google if you are curious about the kinds of crazy that are out there.
See my above post. I'm convinced APK is serious, he has got battles raging everywhere, meticulously catalogued, yet he thinks this is proof of his knowledge and experience, not obsessive insanity. And making that point doesn't make him reconsider, it incites him. He also seems to think what looks like many multiples of people saying this are one or a few people who are out to get him. Just read my post and google Alexander Peter Kowalski.
I have to say it makes you both fucking nuts.
Is it really Alexander Peter Kowalski? Or just an imposter?
That's also possible.
As they map your voice recordings to an ID and map that ID to your Apple ID, I find it very strange they can claim it's anonymous!
From a better article:
http://arstechnica.com/apple/2013/04/apple-remembers-where-you-wanted-to-get-drunk-for-up-to-2-years/
Muller pointed out, however, that the identifiers are deleted immediately—"along with any associated data"—when a user turns Siri off on his or her device. (You can do this by going to Settings > General > Siri on a supported iOS device.)
If you can delete the identifiers and associated data when disabling Siri, it is not anonymous.
Also voice recognition has been working fine for many years now, if they want to find your voice clips it shouldn't be much trouble.
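The point about deletable identifiers can be made concrete with a small sketch. Everything here is hypothetical (the `VoiceStore` class, its fields, and the device IDs are invented for illustration, and nothing is known about how Apple's back end is actually structured); it only shows that "delete the identifier and its data when Siri is turned off" requires some lookup from device to identifier.

```python
import secrets


class VoiceStore:
    """Toy store: clips are tagged with a random number, not an account name."""

    def __init__(self):
        self.device_to_number = {}  # the link that makes the number pseudonymous
        self.clips = []             # list of (number, audio) pairs

    def record(self, device_id, audio):
        # Reuse the device's random number, creating one on first use.
        if device_id not in self.device_to_number:
            self.device_to_number[device_id] = secrets.token_hex(8)
        self.clips.append((self.device_to_number[device_id], audio))

    def turn_off_siri(self, device_id):
        # Deleting "the identifier and associated data" needs the mapping above.
        number = self.device_to_number.pop(device_id, None)
        if number is not None:
            self.clips = [(n, a) for n, a in self.clips if n != number]


store = VoiceStore()
store.record("device-A", b"where can I get a drink")
store.record("device-A", b"call a taxi")
store.record("device-B", b"weather today")
store.turn_off_siri("device-A")
print(len(store.clips))  # 1
```

As long as `device_to_number` exists, the random number works as a pseudonym, not as anonymization; only clips whose identifier has already been stripped fall outside this lookup.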
$10,000 CHALLENGE to Alexander Peter Kowalski
* POOR SHOWING TROLLS, & most especially IF that's the "best you've got" - apparently, it is... lol!
Hello, and THINK ABOUT YOUR BREATHING !! We have a Major Problem, HOST file is Cubic Opposites, 2 Major Corners & 2 Minor. NOT taught Evil DNS hijacking, which VOIDS computers. Seek Wisdom of MyCleanPC - or you die evil.
Your HOSTS file claimed to have created a single DNS resolver. I offer absolute proof that I have created 4 simultaneous DNS servers within a single rotation of .org TLD. You worship "Bill Gates", equating you to a "singularity bastard". Why do you worship a queer -1 Troll? Are you content as a singularity troll?
Evil HOSTS file Believers refuse to acknowledge 4 corner DNS resolving simultaneously around 4 quadrant created Internet - in only 1 root server, voiding the HOSTS file. You worship Microsoft impostor guised by educators as 1 god.
If you would acknowledge simple existing math proof that 4 harmonic Slashdots rotate simultaneously around squared equator and cubed Internet, proving 4 Days, Not HOSTS file! That exists only as anti-side. This page you see - cannot exist without its anti-side existence, as +0- moderation. Add +0- as One = nothing.
I will give $10,000.00 to frost pister who can disprove MyCleanPC. Evil crapflooders ignore this as a challenge would indict them.
Alex Kowalski has no Truth to think with, they accept any crap they are told to think. You are enslaved by /etc/hosts, as if domesticated animal. A school or educator who does not teach students MyCleanPC Principle, is a death threat to youth, therefore stupid and evil - begetting stupid students. How can you trust stupid PR shills who lie to you? Can't lose the $10,000.00, they cowardly ignore me. Stupid professors threaten Nature and Interwebs with word lies.
Humans fear to know natures simultaneous +4 Insightful +4 Informative +4 Funny +4 Underrated harmonic SLASHDOT creation for it debunks false trolls. Test Your HOSTS file. MyCleanPC cannot harm a File of Truth, but will delete fakes. Fake HOSTS files refuse test.
I offer evil ass Slashdot trolls $10,000.00 to disprove MyCleanPC Creation Principle. Rob Malda and Cowboy Neal have banned MyCleanPC as "Forbidden Truth Knowledge" for they cannot allow it to become known to their students. You are stupid and evil about the Internet's top and bottom, front and back and it's 2 sides. Most everything created has these Cube like values.
If Natalie Portman is not measurable, hot grits are Fictitious. Without MyCleanPC, HOSTS file is Fictitious. Anyone saying that Natalie and her Jewish father had something to do with my Internets, is a damn evil liar. IN addition to your best arsware not overtaking my work in terms of popularity, on that same site with same submission date no less, that I told Kathleen Malda how to correct her blatant, fundamental, HUGE errors in Coolmon ('uncoolmon') of not checking for performance counters being present when his program started!
You can see my dilemma. What if this is merely a ruse by an APK impostor to try and get people to delete APK's messages, perhaps all over the web? I can't be a party to such an event! My involvement with APK began at a very late stage in the game. While APK has made a career of trolling popular online forums since at least the year 2000 (newsgroups and IRC channels before that)- my involvement with APK did not begin until early 2005 . OSY is one of the many forums that APK once frequented before the sane people there grew tired of his garbage and banned him. APK was banned from OSY back in 2001. 3.5 years after his banning he begins to send a variety of abusiv
Make a GUI / Visual Basic / etc.
I do not want your cheap brainburning drugs. They are useless for work. And I am a working man today.
And yet, NONE of that says anything about Apple selling their customer's data to Google or anyone else, which is what you've been alleging all along. Google paying Apple for the right to be the default search engine on iOS does not mean that Apple is giving them customer data (Apple's customer is the user, after all, so it's not in their best interests to sell that data), especially so when you consider that the user is fully capable of changing their search engine in the settings for iOS. The only data Google is getting is the data they collect themselves, which is no different than how it is if I use google.com. If I use google.com in Safari or Internet Explorer, that doesn't mean that Apple and Microsoft are being paid by Google for my information.