Voice Over IP for Linux Games?
fathom asks: "A few friends and I are attempting to move all of our gaming from the Windows platform to whatever distro of Linux we like. For the most part we've all had great success: just about everything we play is fine under Linux. However, there is one major drawback: we don't know of any software programs for Linux to do Voice Over IP like BattleCom, RogerWilco, and the GameVoice. Are there any programs out there like this for Linux?" Why limit to Linux? What Voice Over IP software can be used for any Unix that's flexible enough to work for other applications as well as games? We did a similar question about a year ago; has anything changed since then?
Loki had the same sort of problems when they ported Tribes2. They switched over to a freely available GSM encoding (from a university in Germany). It worked so well they're adding the code to the Windows version so you can chat between versions.
VOIP, or Voice Over IP, has come to mean a specific suite of protocols for providing telco integration, call setup/teardown, etc. We shouldn't use it as a generic term for 'speech over internet' anymore..
Well, for the year 2001 you may want to use `ssh' instead of `rsh', and /dev/dsp instead of /dev/audio for Linux, but the idea is still the same ...
Almost as much fun as making the Sun next to you belch while the newbie is using it!
How about Speak Freely for Unix?
I have played with it a bit, and it seemed to work, but I haven't actually used it for gaming yet.  It didn't seem as simple to configure and use as some of the windoze voice comm programs, though.
Sorry, some of us have lives on the weekends ;) My humblest apologies for having better things to do on a saturday night than to debate gun control with someone who loves to continue to dodge the simple question, "Do you think the US legal system is wrong 50 times as often as it is right?", by making pseudo-statistical arguments without overcoming the sheer numbers, and nitpick choices of examples in arguments without hitting the core of them (I.e., "just because something else is worse doesn't justify something bad"). But, regardless, this thread is about audio compression :)
;) ). Do you mean "arithmatic encoding", perhaps? I have a neat theory for using that, with scaled probabilities, to create an optimal compression ratio (predictive) for thresholded, quantized data, that I came up with after the last slashdot conversation on compression (it was video compression then).
;)
;)
When you do a block DCT or MDCT on an audio signal, you're not looking at a whole page of text's worth. You generally look at a fraction of a second. Speech has little redundancy at this level. However, that isn't what I was referring to. Do you have any background in audio compression? There are two keys in compressing audio using current methods: Frequency masking and signal response. Frequency masking is the fact that when the human ear hears a strong signal, weak signals that are near it in frequency seem to "dissapear" or "merge" with the stronger signal. Signal response (hearing response, frequency sensitivity, etc), is how good, overall, the human mind/ears are at hearing weak signals at various frequencies across the spectrum. By a careful knowledge of these, in music or voice, you can kill off many more frequencies than without it. However, it also is a big CPU consumer to do it very carefully. Cutting out some of the analysis can save you a good bit of CPU - and, in the case of human voice, which tends to be in a very audible range with few masking effects, won't affect your compression rate much.
Second, please, if you can create a good sounding speech synth - especially one that can give inflections, emotion, etc - please, please share it with us. Until then, good luck having something like this work (simply neglecting CPU issues) without sounding like a 50s robot that messes up once a second.
Oh, and to answer your theory about masking out frequencies below 500hz and above 3000hz: No. That will sound so unbelievably awful. First off, lets neglect the fact that someone with a voice like Barry White would be inaudible, and that you'd never hear a 't' or 's'. Ignoring that, that's a silly way to do it. You need a simple curve, even just a simple line graph. It takes little CPU time, and will actually be able to reproduce the original sound well. Arbitrary truncation points are unbearably bad.
Next, you seem to be of the notion that MP3 encoders are "tweaked towards music". MP3 encoding is a fairly abitrary term. MP3 is a specific format for encoding streamable, quantized, transformed data. You can use any truncation scheme you want - even the silly one you proposed. Most encoders you'll find are tweaked towards the human hearing range - an optimal choice for both voice and music (especially voice, though! Voice compresses very well, because, compared to music, it has most of its energy concentrated in a few signals at any given time).
Next, why use "logarithmic encoding" for compression? Logarithmic encoding is a (poor) way to store raw (uncompressed) audio data - it sacrifices low-level clarity for the ability to represent very loud signals - something seldom of use in normal audio compression applications (have you ever noticed how quiet signals on an 8-bit sound card are very crackly, but the loud ones are clear? Thats the sort of effect logarithmic encoding gives to sound). It is useful in efficient Pulse Code Modulation (PCM) of data for maximizing the number of transmissions over a small number of physical channels, but doesn't even begin to apply as far as storing quantized data is concerned (that would be like using a bubble sort to compute Pi or something
Please... if you're qualified to discuss audio compression, how about the basics? Do you know how to compute an FFT? Do you know why you wouldn't use an FFT for audio or video compression? What about a DCT? MDCT? What do you know about quanization schemes? The advantages/disadvantages to storing quantized data with huffman encoding vs. arithmatic encoding? Have you ever written a single signal processing function? (I've written a whole library). Do you know anything about the subject at all?
If you don't know what you're talking about, please don't be suggesting encoding schemes. There are enough bad ones out there already
- Rei
P.S. - sorry if I seem a bit bitchy. For some reason, they decided to leave us without air conditioning today at work
You know when it's okay to shout fire in a crowded theatre? When it's on fire.