VoIP Calls Double In Quality
anthm writes "From Newsforge and LinuxPR: FreeSWITCH, an open source soft-switch and IVR platform, has announced that it can support 16 kHz audio calls, thus doubling the potential voice quality. The project has had successful tests with a conference bridge, a pass-through SIP call and an IVR that reads RSS news feeds with the Cepstral Text-To-Speech Engine."
Voip-Info.org has a good list of business VoIP providers.
Everything else is stuck at 8khz, so unless your call uses this service end-to-end, there's going to be a downconversion if you're calling someone on a land line. And you'll be stuck with 8khz if you get any calls from someone not on this service.
Still, it's a good piece of news, onward and upwards.
*crosses fingers* Please nobody mention video phones. *crosses fingers*
I don't get it.
Good work there, but all you need is to get the message across. It's not like you are singing on the phone and need good voice quality. Just do what's needed.
So what? If you're going to up the sampling rate why not go directly to 44khz stereo (CD quality audio) and be done with it? Jumping from the telephony industry standard 8khz to 16 khz is thoroughly uninspired.
Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
Google gives the definition of IVR as Interactive Voice Response.
So I knew what one was, I just didn't know there was a TLA for them. This inane personal revelation brought to you by the captcha "accuse".
-theGreater.
That said, can video telephony and the kind of communication we've seen portrayed on Star Trek et al. be far behind?
I fail to see how adding one additional octave of frequency response to the 6 or 7 currently available, can be called "doubling" the quality.
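The "one additional octave" figure can be checked with a quick calculation. This is only a sketch: the 50 Hz lower edge is an assumption chosen for illustration, and the upper edges are simply half of the 8 kHz and 16 kHz sampling rates.

```python
import math

def octaves(low_hz, high_hz):
    """Number of octaves (doublings of frequency) between two frequencies."""
    return math.log2(high_hz / low_hz)

LOW_EDGE_HZ = 50.0  # assumed lower edge of the audio band, for illustration only

narrow = octaves(LOW_EDGE_HZ, 4000.0)  # 8 kHz sampling, 4 kHz of bandwidth
wide = octaves(LOW_EDGE_HZ, 8000.0)    # 16 kHz sampling, 8 kHz of bandwidth

print(f"narrowband: {narrow:.2f} octaves, wideband: {wide:.2f} octaves")
```

Whatever lower edge is assumed, doubling the upper edge always adds exactly one octave.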
We're a Cisco VOIP shop and phone conversations sound fine. I'm not sure how going from 8->16 would make it any better.
This can only mean twice as much material filling up the tubes.
Slashdot Burying Stories About Slashdot Media Owned
I wasn't aware that telephones even HAVE "definition", let alone that they are in HIGH DEFINITION now.
definition
4. a. The clarity of detail in an optically produced image, such as a photograph, effected by a combination of resolution and contrast.
b. The degree of clarity with which a televised image or broadcast signal is received.
Of course, what do I know... I didn't realize wireless networking equipment had fidelity, either (ie. WiFi).
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
Actually, I've used Asterisk to pass through 24KHz Speex encoded audio - very impressive sound quality, but only works when the SIP channel is client to client.
In theory a SIP server doesn't need to know all of the codecs a client supports - the clients themselves negotiate any compatible protocol.
Of course, if the sip server puts itself in the path (such as when it needs to pass through to PSTN or firewalled clients), then 8KHz is the (till now) maximum supported rate.
Sparks:Gadget:Beer Maker
...just like going from a 32- to 64-bit processor doubles its processing power.
will be able to clearly understand me when I say, "I can't talk now, my leg is on fire."
the submitter is the author of the code.
Move along, nothing to see here yet
The problem isn't making a software based IVR system or even a softswitch run at a better rate. Now find me a SIP phone that runs at anything other than 8Khz. No, I'm not talking about a F/OSS softphone, but a real hardphone. They have the minimum DSP power the manufacturers can get away with to support 8Khz. Now find me a PRI that can interface with it. For now that is still an issue.
Skype has been running their softphones at higher than 8Khz/8bit so their softswitch obviously was the first widely deployed one to leave 64kbit max quality behind.
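For reference, the 64 kbit figure falls straight out of 8 kHz sampling at 8 bits per sample. The sketch below covers uncompressed PCM only; the 16 kHz line is a hypothetical illustration, not a description of what Skype or FreeSWITCH actually send on the wire.

```python
def pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels=1):
    """Raw bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

g711 = pcm_bitrate_bps(8000, 8)              # the classic telephony channel
wide = pcm_bitrate_bps(16000, 8)             # hypothetical: same sample size at 16 kHz
cd = pcm_bitrate_bps(44100, 16, channels=2)  # CD audio, for comparison

print(g711, wide, cd)
```

Compressed codecs break this direct link between sampling rate and bit rate, so these numbers are an upper bound for the PCM case rather than a prediction for any particular codec.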
Yes, someday all telephony (except legacy telco stuff that will never change, which will be a shrinking market) will offer higher quality audio and an option for video. But not for a few more years until the saturation of next gen telephony products gets better.
Democrat delenda est
8khz to 16Khz is fine, but that's not usually the problem we encounter with VOIP. It's latency and dropped packets, which this will just make worse. But if you're doing this on your own network only then I can see where this would be neat.
They're just using a higher quality codec than G.711 (which is the standard for the back-end digital phone system).
The phone people (probably AT&T) chose that standard since it gave pretty good voice quality given the limitations of current technology.
People are generally happy with the voice quality of the phone system - which is different from the voice quality of the last mile - the analog copper loop to your house, or CDMA/GSM/TDMA to your cell phone.
It's highly unlikely this new codec will catch on - the installed base of G.711 phone systems out there is enormous.
This "improvement" is idiotic. The thing which most limits the quality of a VoIP call is delay and jitter, NOT the sampling rate. Guaranteeing the quality of a telephone conversation over the internet is tricky because the internet was originally designed for best-effort packet delivery, with no guarantees on packet delay, sequence, or even (at the network layer) delivery.
If anything, this feature reduces end-to-end quality by doubling the amount of data being sent down the pipe, as you'd need to buffer more data at the same transmission speed to correct for jitter. Brillant!
Procrastination Man strikes again!
Is it even a difference the human ear can notice? I mean, VoIP calls today are pretty good..
Speex is a CELP (code excited linear prediction) codec that is far more complex than the simple PCM system used by the telephone company. The resultant bit rate can be fixed or variable, and is not rigidly tied to the sampling rate used for data acquisition.
Mea navis aericumbens anguillis abundat
The VOIP to PSTN scene kind of sucks at the moment anyway. There are a lot of fly-by-night operators offering a variety of confusing plans which may or may not leave you stuck on proprietary hardware. I'm more inclined to place my bets on private DUNDi peering networks like fwdout. It might be easier just to find someone who wants to call numbers local to your phone line and trade them service in your area for service in their area. Better yet don't terminate into the PSTN at all, except that in my case neither my parents nor my surviving grandparents have high speed Internet access.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
So it's 10 times better than the Evil (tm) telcos!
And my software puts a green stripe around the edge of the data too... sucka!
Give a man a fish and you have fed him for today. Teach a man to fish, and he'll say "WHERE'S MY FISH, YOU IDIOT?"
Well sure, VoIP is digital audio over the internet. It contains all of the same properties as any digital audio. It can vary from CD quality down to unintelligible static.
Our voices don't have that wide a frequency range, there's little up in the high frequencies. A voice sample recorded at 22kHz (11kHz frequency range) is very hard to distinguish from one recorded at 44kHz (22kHz frequency range). In fact you'd need to be using a fairly good mic to really get much of the higher frequencies anyhow. 8kHz works since F1 and F2 (the frequencies of the first two peaks in the harmonic curve) fall under 4kHz for essentially all speakers. F1 and F2 are what we primarily use to determine vowel sounds and thus are what's really relevant. Well with an increase to 16kHz you get F3 and even F4 which leads to pretty natural sound as far as most listeners are concerned. Past that, there's just not a whole lot that affects your perception of speech.
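The pairing of sampling rates and frequency ranges above is just the Nyquist relation: a sampled signal can carry frequencies up to half its sampling rate. A minimal sketch, using the rates mentioned in this comment:

```python
def usable_bandwidth_hz(sample_rate_hz):
    """Highest frequency a given sampling rate can represent (the Nyquist limit)."""
    return sample_rate_hz / 2

rates = [8000, 16000, 22000, 44000]
bandwidths = {rate: usable_bandwidth_hz(rate) for rate in rates}

for rate, bandwidth in bandwidths.items():
    print(f"{rate} Hz sampling carries up to {bandwidth:.0f} Hz")
```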
The reason for choosing 16kHz is probably simply that it's twice what you have before. Thus if you are interfacing with an old system that doesn't support it, just discard every other sample, no sample rate conversion needed (which is CPU intensive).
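A sketch of the "discard every other sample" idea, with one caveat worth seeing in numbers: anything above 4 kHz in the 16 kHz signal folds back into the lower band unless it is filtered out first. The function names and the 6 kHz test tone are hypothetical, chosen only for illustration.

```python
import math

def tone(freq_hz, sample_rate_hz, n_samples):
    """Generate n_samples of a sine wave at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

def drop_every_other(samples):
    """Halve the sampling rate by keeping only the even-indexed samples."""
    return samples[::2]

# A 6 kHz tone is representable at 16 kHz but not at 8 kHz.
wide = tone(6000, 16000, 32)
narrow = drop_every_other(wide)

# After decimation it is indistinguishable from an inverted 2 kHz tone.
alias = tone(2000, 8000, 16)
worst_error = max(abs(a + b) for a, b in zip(narrow, alias))
print(f"largest difference from an inverted 2 kHz tone: {worst_error:.2e}")
```

So dropping samples is cheap, but a low-pass filter ahead of it is normally needed if the wideband signal has real content between 4 and 8 kHz.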
I wasn't aware that telephones even HAVE "definition", let alone that they are in HIGH DEFINITION now.
Apparently audio can have "definition"...
http://en.wikipedia.org/wiki/High_Definition_Radio
Of course, that's only in the same sense that networking equipment has fidelity...
Can I get an eye poke?
Dog House Forum
If you have to provide your name, address, phone number, e-mail, etc., just to download it, it should not be called "free"switch; it should be called "spam"switch.
16 kHz should be enough for anyone ;)
Procrastination Man strikes again!
I can't hear anything on the phone with my left ear, presumably because that eardrum's heavily scarred. I can hear most people talk with the left ear when they aren't on the telephone, but I get nothing over a phone.
I can't understand Loreena McKennitt's ballads using either ear (the right eardrum has much less scar tissue, and I can hear most things, including phone voices, but not including bats or Loreena, with the right ear).
No, really. I'm serious.
OMG, at this rate, we'll have 64 kHz calls in 6 years, and 128 kHz in 12 years!!!!
(Going from 8 kHz to 16 kHz isn't a "doubling of quality" :-P )
Tired of being "punished" by the Slashdot $rtbl since 2002. I'm now over at http://soylentnews.org/ .
Until the hardware terminating a call to PSTN supports the quality of these shiny new codecs, there's more or less no point to them. Seriously, that's never, EVER going to happen. Telecom hardware manufacturers are going to stick with G.711 and similar ITU standards.
The change will come when IP -> IP calling comes of age and the legacy hardware players are taken out of the loop entirely.
Theoretical maximum, may be as low as 3.
Second, this is enough to capture most of a human voice. Can you hit a high "C"? That is about one kilohertz.
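The "about one kilohertz" figure can be checked with the equal-temperament formula. This sketch assumes that "high C" means C6 (MIDI note 84) and that A4 is tuned to 440 Hz; both are assumptions, not something stated in the comment.

```python
def note_frequency_hz(midi_note, a4_hz=440.0):
    """Equal-temperament frequency of a MIDI note number (A4 is note 69)."""
    return a4_hz * 2 ** ((midi_note - 69) / 12)

HIGH_C = 84  # assumed: C6, the soprano "high C"
high_c_hz = note_frequency_hz(HIGH_C)
print(f"C6 is about {high_c_hz:.1f} Hz")
```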
Everything above 1kHz is being used to carry ever-diminishing harmonics that provide resolution for fast-rising sounds like "k" and "p". There's a slight loss of detail at 4kHz and very little at 8kHz. There is no honest way to refer to a move from 8 to 16 as "doubling the quality". Sycraft-fu's post has it right. In fact, if I were designing the system I'd put in a gentle low-pass filter to keep flyback transformers and the like out of the channel.
Besides, look at the response curve of human hearing, and notice how many dB it has dropped by 16 kHz.
For those of you who are not IRC junkies, the IRC client KVirc has built-in support for 44.1 KHz "voice chat" (not sure if it qualifies for "VoIP", but is a simple direct connection between two computers supporting real-time audio transfer). Not only does it support 44.1 KHz, but it has for at least a year (when I started using it). What's the big deal with 16KHz?
Do not meddle in the affairs of dragons for you are crunchy and good with ketchup.
Wahoo. Now I won't just hear that the slob on the other end is eating while on the phone, but maybe with this higher quality I'll actually be able to tell WHAT he's eating.
That's called wideband speech. It's been around for 10+ years and Speex supported it about 4 years ago. About time people actually use it (i.e. why people are still using narrowband in VoIP is beyond me).
Opus: the Swiss army knife of audio codec
Think of the hold music.
Now imagine that it responds to button presses so you can change songs.
"Operator... oh won't you help me make this call..."
The one case where several conversions are safe is converting from the original G.711 64kbps format down to almost any 8kbps or 6.3kbps format and back to 64kbps. The reason that mobile-to-mobile calls can work better than that is that most of them use the same codecs, so they can usually avoid conversion if you're going from one GSM phone to another. I don't know if this works when connecting a GSM carrier to a CDMA carrier or not. Also, of course, mobile phones usually have lousy little microphones and tinny little speakers, so much of the audio damage is done at the ends rather than by the codec itself, especially if you're using it in traffic.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Newsforge has no technical information, and Freeswitch is largely Slashdotted, but there's one sentence that says that they're using the Speex Wideband Codec as their 16 kHz codec. One reason Speex is using 16 kHz sampling is because it's relatively available on PC sound cards, but another reason is that they do a cute sub-band coding technique - instead of representing the 8 kHz analog waveform by directly encoding the 16k samples/second, they split the information into two bands - 0-4 kHz, which they encode using the same encoding they use for their 8 kHz codec, and 4-8 kHz, which they encode (somewhat differently, for complex technical reasons :-) to provide additional depth for receivers that support the wideband format. So if you've got a wideband codec encoding the speech at 16 kHz, you can play it on an 8 kHz player if that's all you've got. For a live conversation between two software-phones, that's not particularly useful (except for a bit of code reuse), but if you're playing recorded files or setting up a multipoint conference between some 8 kHz phones and some 16 kHz phones, it's easy to send each phone what it wants.
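To make the two-band idea concrete, here is a toy split using a two-tap sum/difference filter pair. This is not Speex's actual filter bank, just a swapped-in minimal stand-in; the function names and sample values are hypothetical. It shows the property described above: a narrowband receiver can use the low band alone, while a wideband receiver combines both bands.

```python
def split_bands(samples):
    """Split an even-length signal into half-rate (low, high) bands."""
    low, high = [], []
    for a, b in zip(samples[0::2], samples[1::2]):
        low.append((a + b) / 2)   # pairwise average: the slowly varying part
        high.append((a - b) / 2)  # pairwise difference: the fast detail
    return low, high

def merge_bands(low, high):
    """Rebuild the full-rate signal from both bands."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out

wideband = [0, 4, 8, 4, 0, -4, -8, -4]  # made-up samples
low, high = split_bands(wideband)

narrowband_view = low               # all a narrowband-only receiver would decode
restored = merge_bands(low, high)   # what a wideband receiver reconstructs
print(narrowband_view, restored)
```

A real codec would use much longer filters and then compress each band, but the layering works the same way: the low band is useful on its own and the high band only adds detail.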
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Of course vowels aren't going to have a lot of content in the upper frequencies. Now try saying "This is the eighth utterance" into a microphone and see what doesn't happen. I did it myself, using a crossover at 4 kHz to split the signal into low-pass left and high-pass right channels. Listen to the Ogg Vorbis file and play with the balance. Notice how the phoneme /s/ comes through three times clearer when you have both speakers on (8 kHz bandwidth) vs. just the left speaker (4 kHz bandwidth).
That's a bit of a retarded demo for the technology: every techie's instincts are screaming "why not just transmit the RSS and convert to speech at the client?"
Congratulations on being the guy who completely missed the point. Perhaps next time you'll try reading my entire post before replying.
"Definition" is a video term, it has NO application at all to audio. It makes no sense.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant