Hiding Messages In VoIP Packets
Orome1 writes "A group of researchers from the Warsaw University of Technology have devised a relatively simple way of hiding information within VoIP packets exchanged during a phone conversation. They called the method TranSteg, and they have proved its effectiveness by creating a proof-of-concept implementation that allowed them to send 2.2MB (in each direction) during a 9-minute call. IP telephony allows users to make phone calls through data networks that use the Internet Protocol. The actual conversation consists of two audio streams, and the Real-Time Transport Protocol (RTP) is used to transport the voice data required for the communication to succeed. But RTP can transport different kinds of data, and the TranSteg method takes advantage of this fact."
From what I understand, steganography works if an observer (Carl) cannot tell that transmission of covert data is taking place between Alice and Bob. The proposed method results in an RTP bitstream that does not hold the payload advertised in its headers -- the audio is compressed using a more efficient codec than advertised in the packet headers, and the extra space is used to carry the "hidden" payload; Alice and Bob agree beforehand on the audio codec to use.
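To make the idea concrete, here is a rough sketch of the packing step as I understand it. The frame sizes are my own assumptions (20 ms packets, a 64 kbit/s advertised codec such as G.711, and a 32 kbit/s codec actually carrying the voice), and the function names are made up for illustration; this is not the researchers' implementation.

```python
# Assumed sizes for a 20 ms packet; not taken from the article.
OVERT_FRAME = 160    # bytes a 64 kbit/s codec (e.g. G.711) would produce
COVERT_FRAME = 80    # bytes a 32 kbit/s codec produces for the same 20 ms
HIDDEN_PER_PACKET = OVERT_FRAME - COVERT_FRAME


def pack_payload(covert_audio: bytes, hidden: bytes) -> bytes:
    """Build a payload with the size the advertised codec would have."""
    if len(covert_audio) != COVERT_FRAME:
        raise ValueError("covert audio frame has the wrong size")
    if len(hidden) > HIDDEN_PER_PACKET:
        raise ValueError("hidden chunk does not fit in the freed space")
    # Zero-pad so every packet keeps the expected overt size.
    padding = bytes(HIDDEN_PER_PACKET - len(hidden))
    return covert_audio + hidden + padding


def unpack_payload(payload: bytes) -> tuple[bytes, bytes]:
    """Split a payload back into covert audio and the hidden area."""
    return payload[:COVERT_FRAME], payload[COVERT_FRAME:]


def hidden_bytes_per_call(seconds: int, packets_per_second: int = 50) -> int:
    """Hidden capacity in one direction for a call of the given length."""
    return seconds * packets_per_second * HIDDEN_PER_PACKET
```

Under these assumptions a 9-minute call (540 seconds) gives 540 * 50 * 80 = 2,160,000 bytes in each direction, which is in the same range as the 2.2MB quoted in the summary.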
Now if Carl wants to eavesdrop on the conversation by hijacking (or owning) an intermediary network node, he would get corrupted audio data when trying to decode the packets with the (fake) advertised codec. Wouldn't this be a strong indication that covert communication is taking place?
I was thinking of another way of sending hidden messages between two locations (assuming a reasonably reliable network): one could encode messages by controlling the timing of the replies in a predictable manner (using ECC and varying the transition timings to compensate for the error rate).
Another simple one would be to force out-of-order TCP/UDP packets, with in-order versus swapped delivery representing the bits, plus similar correction routines as above.
Both hidden-message systems are slow to send any substantial amount of information, but I can't see a reasonable way to intercept them without a full dump of every packet and its timestamp, which is more laborious than capturing just the session contents (assuming a man-in-the-middle). Add further security on the payload as necessary, but the transmission of the message itself is hard to detect.
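A minimal sketch of that timing idea, with everything in it assumed by me: the two gap lengths, the threshold, and a 3x repetition code standing in for a real ECC. It only models the encoding, not the actual sending of packets.

```python
# Assumed gap lengths in seconds; real values would depend on network jitter.
SHORT_GAP = 0.020   # represents bit 0
LONG_GAP = 0.060    # represents bit 1
THRESHOLD = (SHORT_GAP + LONG_GAP) / 2


def add_repetition(bits, n=3):
    """Repeat every bit n times (a crude stand-in for a proper ECC)."""
    return [b for b in bits for _ in range(n)]


def strip_repetition(bits, n=3):
    """Majority-vote each group of n received bits."""
    groups = [bits[i:i + n] for i in range(0, len(bits), n)]
    return [1 if sum(g) * 2 > len(g) else 0 for g in groups]


def bits_to_gaps(bits):
    """Map each bit to the delay to wait before sending the next reply."""
    return [LONG_GAP if b else SHORT_GAP for b in bits]


def gaps_to_bits(gaps):
    """Recover bits from observed inter-packet gaps."""
    return [1 if g > THRESHOLD else 0 for g in gaps]
```

The receiver would measure the gaps between arrivals, run gaps_to_bits, then strip_repetition. One badly delayed packet per group of three is absorbed by the majority vote.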
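Roughly what I mean, as a sketch: take packets in pairs, send a pair in order for a 0 and swapped for a 1. The function names are mine, and this ignores sequence-number wraparound and any reordering the network adds on its own (which is where the correction routines would come in).

```python
def encode_order(seq_numbers, bits):
    """Return the send order of packets, swapping one pair per 1 bit."""
    if len(seq_numbers) < 2 * len(bits):
        raise ValueError("need two packets per hidden bit")
    out = []
    for i, bit in enumerate(bits):
        first, second = seq_numbers[2 * i], seq_numbers[2 * i + 1]
        out.extend([second, first] if bit else [first, second])
    # Packets beyond the hidden message are sent in their normal order.
    out.extend(seq_numbers[2 * len(bits):])
    return out


def decode_order(received, nbits):
    """Read one bit per pair: 1 if the pair arrived swapped, else 0."""
    return [1 if received[2 * i] > received[2 * i + 1] else 0
            for i in range(nbits)]
```

For example, encode_order(list(range(8)), [1, 0, 1]) gives the send order [1, 0, 2, 3, 5, 4, 6, 7], and decode_order on that list with nbits=3 returns [1, 0, 1].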
Bye!
Most widely used codecs use some internal ECC, so filling RTP packets with your own data will be easily recognized.
Another approach would be to run an FFT on the decoded audio. Codecs tend to produce wideband noise when fed random data, and that is very different from the usual frequency response of speech.
A much better method would be to use the least significant bits (LSBs) of the codec's parameters to carry the message. It would result in slight differences in pitch or other parameters, but it would be almost undetectable.
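A toy version of that check, to show the kind of measurement I have in mind. Spectral flatness is my own choice of metric here (geometric mean over arithmetic mean of the power spectrum: near 1 for wideband noise, near 0 for strongly peaked spectra), and a plain DFT is used instead of a real FFT so that it needs nothing beyond the standard library. A pure tone is only a crude stand-in for speech.

```python
import cmath
import math
import random


def power_spectrum(samples):
    """Naive DFT power for bins 1 .. n/2 - 1 (slow, but dependency-free)."""
    n = len(samples)
    powers = []
    for k in range(1, n // 2):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        powers.append(abs(acc) ** 2)
    return powers


def spectral_flatness(samples, eps=1e-12):
    """Geometric mean / arithmetic mean of the power spectrum."""
    powers = [p + eps for p in power_spectrum(samples)]  # eps avoids log(0)
    geometric = math.exp(sum(math.log(p) for p in powers) / len(powers))
    arithmetic = sum(powers) / len(powers)
    return geometric / arithmetic


rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(256)]
tone = [math.sin(2 * math.pi * 8 * t / 256) for t in range(256)]
```

Comparing spectral_flatness(noise) with spectral_flatness(tone) shows the gap: the noise scores far higher. A real detector would need thresholds tuned on actual speech decoded with the codec in question.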
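For illustration, a bare-bones LSB embed/extract over a list of integer codec parameters. Which parameters are safe to touch depends entirely on the codec, so treat the list here as hypothetical values rather than any real codec's frame layout.

```python
def embed_lsb(params, bits):
    """Overwrite the least significant bit of each parameter with one bit."""
    if len(bits) > len(params):
        raise ValueError("not enough parameters to carry the message")
    out = list(params)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit   # clear the LSB, then set it to the bit
    return out


def extract_lsb(params, nbits):
    """Read the hidden bits back from the first nbits parameters."""
    return [p & 1 for p in params[:nbits]]
```

Each parameter changes by at most 1, which is why the effect on the audio should be small: embed_lsb([10, 11, 12, 13], [1, 0, 1, 1]) gives [11, 10, 13, 13].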
Steganography is already widely used by the movie industry. Movies sent to movie theaters have robust watermarks hidden in them, which helps the MPAA identify the theaters where unauthorized recordings of movies are being made. Steganography is also used in laser printers, to help the FBI identify the origin of printed documents.
Like cryptography, steganography is not just limited to keeping your information private or to fighting censorship.
Palm trees and 8
Except this is not steganography. Not exactly. It is a lot more complicated and highly unlikely to work.
RTP streams can carry multiple data streams. That's how voice and video can be sent in the same connection. The summary implies that additional RTP streams are added, which is not steganographic at all. The additional streams are easily detected. It is as much steganographic as alternate data streams are in Windows files.
However, reading the article indicates something completely different from the summary. This method is not taking advantage of alternate/additional RTP streams at all. It is choosing different codecs based on a complex mapping pattern known only to the sender and receiver. The difference must allow the newly compressed, and transcoded, stream to contain extra hidden data without altering the expected size.
1) Not all VoIP systems use different codecs. It is not really required. My own systems use G.729 exclusively, from the handsets/deskphones/softphones all the way to the termination and origination providers. Without a robust codec library, the number of variations here is pretty low. Not to mention that both sides would have to support it.
2) This assumes the RTP traffic is encrypted, which means you are only using steganography as an additional layer of security.
3) If the RTP traffic is in plain text, this makes it that much easier to defeat. If you were expecting a JPEG file but, upon inspection, found a BMP file, would you not suspect something? This method seems to rely on saying you are using one codec while actually using another. That would seem to be trivial to verify for a third party intercepting packets.
The whole idea is not very workable since the value of codecs is their ability to preserve audio quality, work around iffy connections, and achieve a smaller transmission footprint.