Siri Team Didn't Learn About HomePod Until 2015, After Amazon Echo Debuted (9to5mac.com)
The Information (paywalled) has published a lengthy report today covering the development of Siri. The article documents Siri's tumultuous changes in leadership and management over the last few years, indicating that Siri 1.0's infrastructure was very creaky, which held back the service. From a report: One of the most interesting anecdotes is the claim that Apple's HomePod team didn't meet with the Siri group until 2015 (Amazon Echo debuted in late 2014). The story says Apple had originally considered launching the speaker without Siri. The big takeaway from The Information's reporting is that Siri launched with a poorly scalable infrastructure that caused bottlenecks for years after it launched in 2011. At the initial release, the popularity of Siri 'exceeded expectations' and led to a lot of unreliability. The backend was not designed to handle enough users. Apple has spent the intervening years modernising the system apparently.
For all Apple's strengths in consumer electronics, they're not great at infrastructure. My experience as a third party working with their service teams suggests that they both lack serious infrastructure software and services design chops and are so incredibly arrogant that they won't take outside advice. Most of the cloud/service companies out there are far more capable. They need to hire out of Google and Facebook more - not at the eng levels, but at the management and leadership levels on their services infrastructure.
For all Apple's strengths in consumer electronics, they're not great at infrastructure
That's true in some cases but definitely not all.
iTunes, for example, has been pretty reliable as far as delivering music. Same for the App Store, which has worked extremely well in delivering a high volume of apps for years.
iCloud used to be bad, but actually has been really stable and performed well for at least the past year or so. Siri since launch may have had trouble answering some questions at first but was pretty reliable about delivering some response almost all the time.
One huge win has been push notifications where the Apple infrastructure has been SUPER reliable and could handle a ton of traffic pretty much from day one.
So I'm not sure which teams you have worked with, but Apple does have very good infrastructure teams. Just like with any company though, not EVERY team is going to have amazing and super-competent people working on it...
"There is more worth loving than we have strength to love." - Brian Jay Stanley
...indicating that Siri 1.0's infrastructure was very creaky, which held back the service.
It's not Apple's fault. They did the best they could with the resources of the then second most valuable company in the world. /s
Siri, tell me about the competition.
We are Apple. We have no competition.
Siri, search the web for Alexa.
Alexa is a feeble Walmart product written in COBOL, and is no threat.
Siri, are there any plans for you to inhabit any other devices?
I am happy to be part of the best phone ever built. There can be no other home for me.
Check your premises.
Apple phones are multi-gigahertz computing devices with more DSP power inside them than your PC has.
Except for a tiny bit of initial processing, modern voice recognition doesn't use DSPs. It uses GPUs.
How about Hound?
No, it's not scalability. Unless you're talking about scalability of information.
Siri is in third place because Siri is basically handcuffed - it's not allowed to access a lot of information. Google and Amazon have privacy policies that basically let their assistants have access to anything and everything on you, and having that sort of access means Google Assistant and Alexa can get to "know you" better and give you better results.
Siri is allowed none of that - it has a privacy policy that is strictly enforced, and it is not allowed to break out of its little container to reach out and get more information.
And these days, Siri has to do a lot of its work on-device and only hit the cloud when absolutely necessary. Privacy again, you see.
You can't convince me that Apple, or any of the voice recognition players, are dedicating more processing power in their central servers on a per-usage basis than the mobile devices have natively inside them.
You don't know what you're talking about, but don't let that stop you!
The reason that Siri uploads the audio to a server is that the language models that the recording is scanned against are huge. The bigger the model, the more accurate the recognition can be. You might be happy to replicate all that storage on every single device, but most people wouldn't be.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."
I understand what you're saying, but I don't think it's the whole explanation either.
I'm in the Apple ecosystem, and I've attempted to use Siri a fair bit. I will ask her non-personal questions along the lines of "what time is the Seahawks game tonight?" * - she will interpret the words correctly but simply respond with "Here's what I found on the web regarding 'what time is the Seahawks game tonight'".
That doesn't seem like a privacy issue, it's more of a "Siri isn't particularly good at determining context" issue. Siri falls back to the "here's what I found on the web" default - which I assume is intentional for any case when context can't be determined - far too often.
* This question was made up for exposition purposes - it's possible Siri might actually handle this specific query better than described
#DeleteChrome
You can't convince me that Apple, or any of the voice recognition players, are dedicating more processing power in their central servers on a per-usage basis than the mobile devices have natively inside them. Apple phones are multi-gigahertz computing devices with more DSP power inside them than your PC has.
First of all, speech recognition doesn't run on standard DSPs at all. The platform probably uses the DSP's native noise reduction functionality to get a cleaner input, but modern speech recognition is based on deep learning, which is accelerated on GPUs or custom ASICs. Neither of those things was present in any phone when Apple and Google first shipped voice recognition. All those fancy DSPs for talking and media consumption do jack for speech recognition.
Second, they don't need an insane amount of compute power. The servers alleviate memory pressure and reduce storage demands on the device. Even a limited language model dataset is over a GB. Compare that to a small buffer, maybe a few KB, to tx/rx the voice stream and response.
Mainstream phones ship with 2-3 GB RAM and 16-32 GB storage, which is a serious issue for voice recognition. Budget devices have even less. There should be no issues running the Siri backend service on a decent workstation, but a phone is too cramped. I wouldn't be surprised if Siri's model data is much larger than a GB.
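To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes a 1 GB model (the "over a GB" lower bound above), a 4 KB stream buffer (a stand-in for "a few KB"), and the 2-3 GB RAM figures already quoted. The function name `ram_share` is made up for illustration, and none of these sizes are measured Siri values.

```python
# Back-of-the-envelope: share of device RAM taken by an on-device language
# model versus a small streaming buffer. All sizes are illustrative
# assumptions, not measurements of any real assistant.

GB = 1024 ** 3
KB = 1024


def ram_share(footprint_bytes: int, ram_bytes: int) -> float:
    """Return the footprint as a fraction of total RAM."""
    return footprint_bytes / ram_bytes


MODEL_BYTES = 1 * GB    # assumed: the "over a GB" lower bound
BUFFER_BYTES = 4 * KB   # assumed: stand-in for "a few KB"

for ram_gb in (2, 3):
    ram = ram_gb * GB
    print(f"{ram_gb} GB RAM: "
          f"model {ram_share(MODEL_BYTES, ram):.0%} of RAM, "
          f"buffer {ram_share(BUFFER_BYTES, ram):.5%} of RAM")
```

Under those assumptions the model alone would eat between a third and half of RAM before the OS or any app gets a byte, while the streaming buffer is a rounding error.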
As DRAM and flash densities increase, the ability to run speech recognition software locally will improve. You could get away with much smaller datasets if you are willing to trade off for a smaller vocabulary or lower accuracy, but who would do that? People have complained for years about poor accuracy, and we have finally beaten that problem, for the most part.
---
According to the latest ruleset, this post should be modded as Vorpal Flamebait +5.
iTunes was announced in 2001, iCloud launched in 2011. For iTunes to finally be perceived as "stable" only very recently
I'm not sure how you got that but I was saying iTunes has pretty much always been stable. They may have some update issues but I personally have only seen sometimes slow updates on release... any large scale system is going to have some issues, which will also be partly due to intervening networks over which Apple has no control.
iTunes has been delivering music reliably since launch.
Funny enough, iCloud infrastructure is outsourced to AWS and Azure (and Google Cloud now) so that explains its own stability
When you come down to it, iCloud was really the only service that struggled - and the WAYS in which it struggled were I think much more down to client side code than server side code, since it was more an issue around things like syncing databases and documents that would fail in odd ways.
"There is more worth loving than we have strength to love." - Brian Jay Stanley