Microsoft's Acoustic Caller ID Patent
theodp writes "A new patent granted to Microsoft Tuesday for automatic identification of telephone callers based on voice characteristics covers constructing acoustic models for telephone callers by identifying words or subject matter commonly used by callers and capturing the acoustic properties of any utterance. Not only that, it's done 'without alerting the caller during the call that the caller is being identified,' boasts Microsoft in the patent claims."
The only difference here (aside from what agencies have been doing since the 1960s) is that this analysis seems to be done in real time, rather than offline? I mean, haven't people doing monitoring been able to tell who is speaking based on voice analysis since forever?
Anecdotally I feel like some companies answer the phone quicker if you talk to their automated system in an irate and condescending manner. Could just be me though :)
If someone had acquired some of your personal information, and then tried to impersonate you, an automated voice recognition system could be useful by raising an alarm, or at least giving a percentage of how much their voice is like yours.
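To make the "percentage" idea concrete, here is a rough sketch, not anything taken from the patent. It assumes each voice has already been reduced to a fixed-length list of numbers (how you get those numbers is the hard part and is not shown). The function names, the cosine-similarity measure, and the 80% alarm threshold are all my own assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no information to compare
    return dot / (norm_a * norm_b)


def voice_match_percent(enrolled, candidate):
    """Similarity of a caller's features to the enrolled voice, as 0-100."""
    if len(enrolled) != len(candidate):
        raise ValueError("feature vectors must have the same length")
    similarity = cosine_similarity(enrolled, candidate)
    # Negative similarity is treated as "no match" rather than a negative percent.
    return round(max(0.0, similarity) * 100, 1)


def should_raise_alarm(enrolled, candidate, threshold_percent=80.0):
    """True when the caller sounds too unlike the account holder.

    The 80% default is an arbitrary placeholder, not a recommended value.
    """
    return voice_match_percent(enrolled, candidate) < threshold_percent
```

A real system would tune the threshold against false alarms, since the same person's voice changes with a cold or a bad line.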
The sort of processing this patent covers hasn't been possible until recently, but I think it is, in principle, absolutely necessary for robust AI: doing recognition simultaneously on both low-level and high-level features of the data, and on intersections of the two.
By "high level" I mean things like word choice, language, etc. By "low level" I imagine they mean things like the specific resonance characteristics of a voice. In voice there are intermediate levels of features too, such as the characteristics of phonemes.
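A toy sketch of what scoring on both levels at once could look like. This is my own illustration, not the patent's method: the low-level score compares hypothetical acoustic feature vectors, the high-level score compares word choice, and a weighted average fuses the two. The function names and the 0.7 weight are assumptions.

```python
import math
from collections import Counter


def acoustic_score(known_features, new_features):
    """Low level: closeness of two acoustic feature vectors, in (0, 1]."""
    if len(known_features) != len(new_features):
        raise ValueError("feature vectors must have the same length")
    distance = math.sqrt(
        sum((x - y) ** 2 for x, y in zip(known_features, new_features))
    )
    return 1.0 / (1.0 + distance)  # identical vectors score exactly 1.0


def word_choice_score(known_text, new_text):
    """High level: cosine similarity of word counts, in [0, 1]."""
    known = Counter(known_text.lower().split())
    new = Counter(new_text.lower().split())
    dot = sum(known[word] * new[word] for word in known)
    norm_known = math.sqrt(sum(c * c for c in known.values()))
    norm_new = math.sqrt(sum(c * c for c in new.values()))
    if norm_known == 0 or norm_new == 0:
        return 0.0  # one side has no words, so there is nothing to compare
    return dot / (norm_known * norm_new)


def fused_score(acoustic, lexical, acoustic_weight=0.7):
    """Weighted average of the two levels; the 0.7 weight is arbitrary."""
    return acoustic_weight * acoustic + (1.0 - acoustic_weight) * lexical
```

A weighted average is about the simplest possible fusion. The "intersections" mentioned above, such as how a particular speaker pronounces particular words, would need a joint model rather than two separate scores.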
The upshot of this is that just as algorithms and hardware begin to reach the level of power necessary to show intelligence, it will be impossible to do so without stepping on patents.
We will have patents on a machine not being stupid.
The keywords being:
'without alerting the caller during the call that the caller is being identified'
Don't we have laws against doing stuff with voices without informing people first? And since when is sampling audio, and then converting part or all of the audio to a format based on, and unique to the original, not an act of recording?
According to this:
;)
Not only that, it's done 'without alerting the caller during the call that the caller is being identified,'
They are describing a means to RECORD callers without their knowledge, and hence without their consent. So would this software be illegal in some jurisdictions? You bet yer ass it would be.
Wonder how it handles people who say "uhm" or "uhh" a lot.