Program Hides Secret Messages in Executables
DmuZ writes "My friend Rakan has created a new steganographic tool named Hydan which can embed messages into an executable without altering its size. He recently presented this tool to the public for the first time at codecon. This new technique was intriguing enough to get coverage on SecurityFocus.com. The code is available here."
Steganography would be more precisely defined as "information hiding". It doesn't require that it is impossible to find the data hidden, but it tries to conceal the existence of the data.
Cryptography on the other hand does not try to try to hide the existence of information, it just tries to hide what message is embedded in that information.
Cryptography != Steganography, but they can be used in conjunction.
Add -ldl to the LDFLAGS in the Makefile.
This is a well known technique that was used in the mid-80s by Eric Isaacson in his product "a86". See here: http://eji.com/a86/
Eric Isaacson used the technique to mark executables, so that he could determine if they were created with an unregistered copy of a86.
Some instructions have dont care bits in them.
You could remove these bits in order to compress the file but they occur so rarely its not worth it.
And yes the redundency would have been used up.
Mouse powered Chips, Open source Processors and Lego
It just means that you can encode certain stuff in equivalent ways (*). Like: mov eax, 0 xor eax, eax sub eax, eax are all equivalent in functionality to zero the eax register.
* = Taking into account flags and instruction size restrictions, etc.
The "redundancy" comes from these facts. So, it's not size redundancy as such, and you can't remove the redundancy. It's more like permutations of the instructions are equivalent (length stays the same).
Can someone explain to me exactly what this means? Will all i386 executable binaries have unnecessary redundancy? Could the size of the binary be harmlessly reduced by removing it? If so, then why isn't this done?
It means that if you want to add 50 to a number, you can choose to do (+50) or (-(-50)). They both take up the same amount of space and do the same thing. But if you first process a program to ensure that all additions and subtractions are actually additions, then you can encode data into the list of additions by making some of them into subtractions.
er...
Steganography requires that it is impossible to prove that data is being hidden there. (Without reference to other material, etc etc).
From The Free On-line Dictionary of Computing (09 FEB 02):
steganography
Hiding a secret message within a larger one in such a way that others can not discern the presence or contents of the hidden message. For example, a message might be hidden within an image by changing the least significant bits to be the message bits.
Yeah, I know another unchecked perpetual motion machine story from timothy. But no, in this case, the story is not wrong, its just 15 years old (the technique was used 15 year ago, I mean.)
The key point is to exploit x86 instruction set redundancy to find a few bits of entropy here and there strewn throughout the executable code. RISC instructions have the same potential. For example:
add r0, r1, r2
add r0, r2, r1 # not much different
You are both correct and incorrect. While it's obviously not possible to simply go through changing instructions, operators and operands without consideration as to the effect on the program, it is possible to leverage functionally identical instructions to represent a bit.
If you read the article, a trivial example would be subtracting -5, rather than adding +5. The presence of a subtraction operation, rather than an addition operation can signify a binary digit.
Unfortunately, due to the consistent output from compilers, this is not steganography - you can both tell that the executable has been altered, and read the message! His plans for the future (parameter organisation, etc.) may be more relevant, but at the moment this is a proof of concept implementation, not a usable system.
Anyone interested in other forms of steganography could do worse than to read Andrew Tanenbaum's page on the subject.
- It means that if you want to add 50 to a number, you can choose to do (+50) or (-(-50)).
Actually on the x86, those two are not equivalent. They set the carry flag in opposite directions.Now what the author does is, alter the original binary string to that bit-string data of our interest (of the same length). This process requires flipping of instructions. For example, if some instruction is addition (1), but your data requires it to be (0) bit, you change the instruction to subtraction, and change the operand to a negative of the original value. Same applies to flipping a '0' to '1'.
Addition-subtraction works because there are no overflow issues (atleast with signed ints). Since this is also a very common operation, your executable is likely to be large enough to "hold" sizeable data.
Never. The information, though contained in an executable file, is not itself executable (unless you went and took that information out and then executed it separately. The whole point is it does not affect the execution of the program that you hide the information into. So you can put whatever information you want in there (even the code for a virus) and it will still not be a virus, because that information will never get executed unless you actively go in there, extract it, paste it as an executable file somewhere else (eg in memory) and then execute it - so you'd need another virus to do this, basically.
Daniel
Carpe Diem
Hiding messages within messages are used often in many contexts, like the radio broadcasts in WW2 sending "birthday greetings" among other things
This is no hoax
I has the same properties as:
a*b gives the same result as b*a.
You have options on what instructions to use which yields the same results.
Lets say a*b is a 1 and b*a is a 0. You could describe a byte with eight occurancies of the (a*b || b*a) operation.
a*b b*a b*a a*b a*b a*b b*a a*b == 10011101
A common practice with x86 is to use XOR AX, AX instead of MOV AX, 0 to clear the AX register.
However, this is not interchangeable since they do not have the same size. The XOR method is often used because it is faster and have less size IIRC.
Did you read the article?
First, executables are called executables because the computer interprets them. They are made of instructions, and unlike a document you cannot simply tamper with things because it will confuse the computer when it tries to run the executable.
Of course you can tamper with executables! As long as your modified version does the same thing, there is no harm done. If you change the addition of a positive number to the subtraction of a negative number, you get the same result if you run it. You run through the binary and if the current bit of data to be hidden is a 0, you don't modify that particular addition instruction and if the data bit is 1 then you *do* modify it. If you compare the modified binary to an original, you can see all the changes and extract the hidden data.
Second, and most importantly, the size of the file is dependent on the size of the bytes within the file. Because the bytes in the file have differing values depending on the instructions they encode, altering the data will alter the size unless you're borrowing from one byte to inflate another -- and in this case, again, you run afoul of the first problem.
This makes no sense to me. The replacement instruction is the same size as the original.
I'm surprised the editors didn't review this before approving it for posting. This is really pretty elementary to anyone who understands object code.
I don't doubt that you understand object code but you don't seem to understand this technique.
PUSH/POP is significantly slower than two MOV instructions on an x86, though...
-- Ed Carp, N7EKG erc@pobox.com PGP KeyID: 0x0BD32C9B What I'm up to: http://intuitives.mine.nu
The first vorbis plugin for Nero is out.
One less thing for the mp3-lUsers to complain abou
You're confusing "byte" and "char". "char" is related to character sets, "byte" has nothing to do with them. Just because you're using 16 bit unicode does not change the size of the byte, it simply means that your "char" is two "bytes" (if your bytes are 8 bits). Why would a unicode system half the resolution of memory just because of the character set used? You could have a byte of 8 bits, a character of two bytes, or a byte of 128 bits and a character of 256 bits. No connection between the two.
Stenography is the art of writing in Shorthand. :-)
Follow me
It should be easy enough to get around this. The statistical telltale is only due to the fact that El-Khalil consistently uses the same type of instruction to encode a certain bit value. Have Hydan XOR the hidden message with a secret key that produces the right distribution of ones and zeros prior to encoding the message and the problem disappears.
I'm afraid this will not work.
Problem is: 'normal' programs will do 'sub 50' instead of 'add -50'. If you don't want to be visible that a message is contained, you cannot change that. But if you don't change that (in about 50% of the cases), you can't hide the information! The only key that would work here would be as long as the message itself!
The technique you propose will work to get a more even distribution of ones and zeros, but not the 'all zeros' (sub 50) distribution that is present in 'standard' programs.
You're confusing redundancy in the program (extra instructions executed) with redundancy in the instruction set (extra instructions available).
The i386 set has add and subtract instructions where only one is strictly needed. From what I've read, this tool works by changing a sub 50 to an add -50, taking advantage of this. (Or a add 30 to sub -30.)
The problem is, no person or complier would write code this way unless they had a particular reason to. Such as hiding something.
if the answer isn't violence, neither is your silence / freedom of expression doesn't make it alright
There is no "byte" data type in C
There are distinct "byte" and "char" data types in the Java programming language. The "byte" is 8-bit as expected in PC-type and RISC architectures, but because the Java programming language's native character encoding is UTF-16 Unicode, "char" is 16-bit.
Will I retire or break 10K?
All stenographic methods that I've heard of leave some signs of tampering. For instance, the common method of hiding information in an image file by fiddling with the least signifigant bits in the RPG values is completely undetectable to the eye, however a statistical analysis of those low bits will reveal an unnatural amount of randomness. Really this is unavoidable since most any innocent looking data is going to have some natural order to it.
We hope your rules and wisdom choke you / Now we are one in everlasting peace
Unfortunately, David Pridie, aka. Martial Artist, the programmer that wrote the first message, "passed away very suddenly on the morning of Friday, January 12, 2001. He died in his home, listening to music and playing a computer game. An attack of bronchial asthma was established as the cause, something which he had complained of the past week or so before. "
"At the time he got himself and H2O in quite a bit of hot water with Nintendo. He figured it was his small piece of immortality"
He was right
http://www.pridie.org
Hydan doesn't give you any deniability, does it? I just read the artice; I haven't tried the program, but if you use a well-known method of embedding info, it's not very steganographic anymore. The bad guys can just run hydan on your executables and see what comes out.
If you want deniability even in the face of torture, you want rubber hose crypto. You might also want to use an authentication method more complicated than a password, so they'll have to torture you in the computer room instead of the dungeon, and they can't break your fingers or damage your higher brain functions.
#define X(x,y) x##y
Peter Cordes ; e-mail: X(peter@cordes ,
Not in this day and age, because everyone uses strong hashes. I suppose the error-detection code that they preserved was CRC-32, or an checksum (add up all the bytes). There is no known way to efficiently figure out how to change a file without changing its MD5 or SHA1 hash. Any cryptographically strong hash will make undetectable modification computationally infeasible.
#define X(x,y) x##y
Peter Cordes ; e-mail: X(peter@cordes ,