Hydan: Steganography in Executables
An anonymous reader says "Ever wanted to hide a message into an executable? Now you can with Hydan. Presented recently by Rakan El-Khalil at Defcon and Blackhat, this tool lets you embed data into an application without changing its functionality or filesize! Check it out. Use includes steganography as well as embedding a program's signature into itself to verify it's not been tampered with."
You guys make me proud!
"What are you doing?"
"Oh, hydan out."
If steganography is now in the hands of joe user, how useful is it really? It's not exactly a secret anymore, is it? ;P
Un-news
...the next generation of ASCII art coming our way.
Discovered a copyright string that was also executable 68k code .... and included it in my main initialization routines
I am 1337.
"That's the sort of blinkered, philistine pig ignorance I've come to expect from you non-creative garbage."-Monty Python
if you blurt something like that out in the blurb maybe it would be nice to mention how the hell it happens. especially when the site gets slashed so fast.
executable packing or actually increasing the filesize? either one has to happen.
world was created 5 seconds before this post as it is.
it looks like the information is being hidden by a slashdotted executable.
NOOOOOOOOOOO!
So, when's the first lawsuit against this?
especially if the OS goes off and double checks the executable is legit before executing it...
Donald 'Duck' Dunn: We had a band powerful enough to turn goat piss into gasoline.
I can see how this would be interesting, well, to all the terrorists out there. What are some honest or dishonest uses for steganography?
If you embed a signature of the file into the file, this by definition changes the file's signature. At best you can append the signature. However, if the file can be modified, so can its signature.
If these folks have figured out a way of circumventing this innate paradox, I'm impressed and am dying to hear more about the technology/mathematics behind it! Can you say Nobel Prize nomination?
"lets you embed data into an application without changing its functionality or filesize" Oh well, it was just a theory anyway ;)
At least not without a top down Orwellian society where all hardware and software is controlled.
Free Mac Mini Yeah, it's
Not really :)
But I'd like to make that dog downstairs stop barking.
Get your own free personal location tracker
without changing file sizes... let me stick my pirated version of War and Peace in my Hello world application.
sometimes you don't even have to rtfa to rip on a topic...
Nuttles
Christian and proud of it
Not only a dupe, but a link to the original story is listed on the referenced page.
Wow.
The message retrieval method should be called "Hydan Seek"
Hydan: Hiding Information in Program Binaries
Rakan El-Khalil and Angelos D. Keromytis
Department of Computer Science, Columbia University in the City of New York
{rfe3,angelos}@cs.columbia.edu
Abstract. We present a scheme to steganographically embed information in x86
program binaries. We define sets of functionally-equivalent instructions, and use
a key-derived selection process to encode information in machine code by using
the appropriate instructions from each set. Such a scheme can be used to watermark
(or fingerprint) code, sign executables, or simply create a covert communication
channel. We experimentally measure the capacity of the covert channel by
determining the distribution of equivalent instructions in several popular operating
system distributions. Our analysis shows that we can embed only a limited
amount of information in each executable (approximately 1/110 bit encoding rate),
although this amount is sufficient for some of the potential applications mentioned.
We conclude by discussing potential improvements to the capacity of the
channel and other future work.
1 Introduction
Traditional information-hiding techniques encode ancillary information inside data such
as still images, video, or audio. They typically do so in a way that an observer does not
notice them, by using redundant bits in the medium. The definition of "redundancy"
depends on the medium under consideration (cover medium). Because of their invasive
nature, information-hiding systems are often easy to detect, although considerable work
has gone into hiding any patterns [1]. In modern steganography, a secret key is used to
both encrypt the information-to-be-encoded and select a subset of the redundant bits
to be used for the encoding process. The goal is to make it difficult for an attacker to
detect the presence of secret information. This is practical only if the cover medium has
a large enough capacity that, even ignoring a significant number of redundant bits, we
can still encode enough useful information.
Aside from its use in secret communications, an information-hiding process [2] can
be used for watermarking and fingerprinting, whereby information describing properties
of the data (e.g., its source, the user that purchased it, access control information,
etc.) is encoded in the data itself. The "secret" information is encoded in such a manner
that removing it is intended to damage the data and render it unusable (e.g., introduce
noise to an audio track), with various degrees of success.
In this paper, we describe the application of information-hiding techniques to arbitrary
program binaries. Using our system, named Hydan, we can embed information
using functionally-equivalent instructions (i.e., i386 machine code instructions). To determine
the available capacity, we analyze the binaries of several operating system distributions
(OpenBSD 3.4, FreeBSD 4.4, NetBSD 1.6.1, Red Hat Linux 9, and Windows
XP Professional). Our tests show that the available capacity, given the sets of equivalent
instructions we currently use, is approximately 1/110 bits (i.e., we can encode 1 bit
of information for every 110 bits of program code). Note that we make a distinction
between the overall program size and the code size. The overall program size includes
various data, relocation, and BSS sections, in addition to the code sections. Experimentally,
we have found that the code sections take up 75% of the total size of executables,
on average. For example, a 210KB statically linked executable contains about 158KB
of code, in which we can embed 1.44KB (11,766 bits) of data.
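The arithmetic behind that example can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes "KB" means 1024 bytes, and the function name `capacity_bits` is made up for illustration.

```python
# Capacity estimate from the paper's own figures (1 bit per 110 bits of code).
KB = 1024  # assumption: the paper's "KB" means 1024 bytes


def capacity_bits(code_bytes: int, rate: int = 110) -> int:
    """Whole bits that fit when 1 bit is embedded per `rate` bits of code."""
    return (code_bytes * 8) // rate


code_bytes = 158 * KB              # code section of the 210KB example
bits = capacity_bits(code_bytes)
print(bits, "bits =", round(bits / 8 / KB, 2), "KB")
```

With these assumptions the result agrees with the quoted 11,766 bits and roughly 1.44KB.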
In comparison, other tools such as Outguess [1] are able to achieve a 1/17 bit encoding
rate in images, and are thus better suited for covert communications, where data-rate
is an important consideration. The 1/110 encoding rate achieved by the currently implemented
version of Hydan is obtained when we only use instruction
Interesting. Although I didn't get a chance to RTFA, hiding encrypted data in an executable does not seem all that practical to me. It may not change the filesize or functionality, but would it not also change other signature methods (like md5sums)? From my understanding, the main strength of steganography is that the file with the encrypted data is indistinguishable from regular files. Since the difference can be detected with CRC or MD5, wouldn't that defeat the main purpose?
They should have put their message in the web servers executable so that when it gets slashdotted it could just shit itself and we could still get how it works.
as soon as OBL gets a hold of this program he's going to start secretly x/f'ing his messages through this program. Yep, won't be long and the NSA/CIA is going to can this program from all the rest of us. GET IT WHILE YOU CAN!
http://dont.spam.me.anymore.com
You mean they have an entire scientific discipline dedicated to studying stegosauruses now? Personally I'd rather specialize in Tyrannography instead...
MD5 Hash is my friend...
Given that it embeds itself in the program without changing it, how would you recover the data while being sure to prevent false positives?
Since when has this country used intellectual elite as a pejorative term?
The gist of it is that there are many instructions in x86 that have the same result. You can replace these, and based on which instructions you encounter you can find a hidden message.
So much for theory. Here's an example; let's say we have a couple of synonyms, like so
car, automobile; Robert, Bob; crashed, trashed; beer, whisky.
Let's say we have a little story like so;
"Bob got in his car. He crashed it, because he had been drinking too much beer. His car is now a total loss."
Let's say we want to send a secret binary message "0111". Cunningly, we substitute the first of each pair of synonyms if we want to encode a zero, and the second for a one. So the story is now
"Robert got in his automobile. He trashed it, because he had been drinking too much whisky. His car is now a total loss." (notice how not all key words changed).
This is a bit harder with natural language, as many words aren't quite right to use in place of the other ("got in his automobile" just doesn't sound right), so it's actually easier to do for machine code.
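For anyone who wants to play with the synonym trick, here is a minimal Python sketch using the same four word pairs. The function names `encode` and `decode` are made up for this illustration, and it handles only whole-word matches; it is a toy for the analogy, not how Hydan itself works.

```python
import re

# Synonym pairs from the example: the first word encodes 0, the second encodes 1.
PAIRS = [("car", "automobile"), ("Robert", "Bob"),
         ("crashed", "trashed"), ("beer", "whisky")]
LOOKUP = {word: pair for pair in PAIRS for word in pair}
WORD = re.compile(r"\b(" + "|".join(LOOKUP) + r")\b")


def encode(text: str, bits: str) -> str:
    """Rewrite each keyword to the synonym selected by the next message bit."""
    remaining = iter(bits)

    def pick(match):
        bit = next(remaining, None)
        if bit is None:                 # message exhausted: leave the word alone
            return match.group(0)
        return LOOKUP[match.group(0)][int(bit)]

    return WORD.sub(pick, text)


def decode(text: str) -> str:
    """Read one bit per keyword: its position within its synonym pair."""
    return "".join(str(LOOKUP[m.group(0)].index(m.group(0)))
                   for m in WORD.finditer(text))


story = ("Bob got in his car. He crashed it, because he had been "
         "drinking too much beer. His car is now a total loss.")
hidden = encode(story, "0111")
print(hidden)
print(decode(hidden))
```

Note that `decode` returns one bit for every keyword it sees, so the untouched final "car" shows up as a trailing 0. The receiver needs some way to know where the message ends.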
SCO employee? Check out the bounty
Isn't this idea (in fact, this exact program) a few years old? I've definitely heard about it before...
The idea is that, in executable code, there is more than one way to accomplish exactly the same end result. For example, asking whether an integer is >=1 or >0 is effectively the same question, but might be coded differently in an executable.
Hydan probably searches for a certain subset of such "optional" commands, and encodes data in binary by using either one type or the other. Thus, the functionality and size are unchanged (assuming both commands take the same number of bits to encode), but you can store data which can be read by appropriate decoding.
Note: I didn't RTFA so this is just what I recall, and the actual implementation might be quite different... but you should get the idea of how they can claim to leave filesize and functionality untouched.
OK, you place the hash in the executable; the file is changed. Now the hash should be different...
Problem.
If you mod this up, your slashdot background will turn into a beautiful sunset!
No...
Read my blog.
steganography: the hiding of a secret message within an ordinary message and the extraction of it at its destination.
The nice part of steganography is that you don't know there is a hidden message.
To make sure people can't detect any changes to a file, preferably there is no reference material to compare the file with. Reference material like other unchanged executables.
So this doesn't work unless you write a program for each message you want to hide.... Not? Ok. I'd think so too.
So I'd rather take my digital camera, take a picture of a whatever and use that as an original.
But I have to admit it's way cool to make something like this, that changes an executable without breaking it.
Privacy is terrorism.
How do you know there's information in a given executable?
If you know what compiler it was compiled under you could look for opcodes that aren't generated by that compiler. But what if you don't know what compiler generated the executable?
And what if the information isn't hidden in the opcodes at all, but merely in the ordering of rearrangeable instructions?
Take the following two instructions for example:
mov ax, 5
mov bx, 6
What if your steganography program would set them in alphabetical order by register for 0 and reverse alphabetical order by register for 1? Or what if it was instead based on the numerical order of the immediate values? Or heck, we don't really have to use bx here, we could use cx instead.
But then again, all of those outputs could be generated from the same source file by different compilers (and possibly even with the same compiler -- just slightly different source files).
This means that it would be impossible to know if a given file contains steganographic information, and even then, you would have to look at the exact right combination of features in order to decode the message, and anything else would give you garbage.
Though as I was writing that, I thought of a way to possibly detect steganographic executables: Compilers generally do the same thing the same way every time. If the executable appears to alternate or shift the way it does things, then it probably contains steganographic information. Images are definitely much better for information hiding (more randomness to hide in...). Yes the precious jpg's in my po^H^H art collection would be perfect...
A shareware x86 assembler some years back claimed that the author was able to tell if anyone used his assembler to distribute binaries in violation of his license. While it apparently didn't scare enough people into paying for his program (maybe they used MASM instead?), the program might be useful for busting any patents that come up around this technology.
CEE5210S The signal SIGHUP was received.
So I can embed an entire copy of the unabridged oxford english dictionary into something like vi? That's some serious data compression!
Already done this one.
Can you say college algebra?
The only thing they have to solve is f(X+S) = S, where f is the algorithm for calculating the signature, X is the exe code, and S is the signature. Depending on f, it can either be completely trivial to calculate S or impossible.
The site is slashdotted so I don't know if this is how it works, but...
Some 8086 opcodes contain a bit that reverses the operation. For example, with the bit set in the instruction "mov bx,cx", bx would be copied to cx instead of cx to bx. By switching the registers AND setting the bit, you effectively reverse the operation twice, creating different machine code that does exactly the same thing.
The A86 assembler used this bit to create a fingerprint that would make it easier to detect non-paying users.
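To make the direction-bit trick concrete, here is a small Python sketch that builds both machine-code forms of a 16-bit register-to-register mov. It assumes 16-bit mode (no operand-size prefix), and the helper name `mov_reg_reg` is made up for illustration; it says nothing about which form A86 actually preferred.

```python
# 16-bit register numbers as used in the x86 ModRM byte.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}


def mov_reg_reg(dst: str, src: str, direction_bit: int) -> bytes:
    """Encode `mov dst, src` for 16-bit registers in either of its two forms.

    direction_bit=0 -> opcode 0x89 (MOV r/m16, r16): reg field holds the source.
    direction_bit=1 -> opcode 0x8B (MOV r16, r/m16): reg field holds the destination.
    """
    d, s = REG16[dst], REG16[src]
    if direction_bit == 0:
        reg, rm = s, d
    else:
        reg, rm = d, s
    opcode = 0x89 | (direction_bit << 1)
    modrm = 0xC0 | (reg << 3) | rm      # mod=11: both operands are registers
    return bytes([opcode, modrm])


print(mov_reg_reg("bx", "cx", 0).hex())   # one encoding of mov bx, cx
print(mov_reg_reg("bx", "cx", 1).hex())   # the other encoding, same effect
```

Both byte sequences are the same length and copy cx into bx, so the choice between them can carry one bit per instruction.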
Now that messages can be hidden in executable files, I feel a lot better about opening .exe files that are mailed to me!
Very informative.
By the way, to decode, you would simply scan for keywords: Robert, automobile, trashed, whisky, car = the 1st, 2nd, 2nd, 2nd, and 1st words in their respective 2-word sets. Thus, 01110.
However, this adds extra data onto the end of the encoded message. But to get around that, just have it be well-known that the first n bits you encode indicate the length of the message, not counting those bits.
Drill baby drill - on Mars
If the program has been tampered with, the most obvious thing to tamper with would be the validation mechanism.
I'm going to stick with a separate md5sum, thanks.
-- perl -e'print pack"H*","6e656d6f406d38792e6f7267"'
app that does not change the md5.. then i'll be impressed.
Since a lot of people are asking, here it goes:
- How it works
--------------
Overview: Hydan finds sets of equivalent instructions in the binary,
and uses that redundancy to embed data. The larger the set of
equivalent instructions, the more bits can be embedded. For example,
if instructions {a, b, c, d} are all equivalent, then we can embed two
bits of information when any of those instructions are encountered.
Embedding: Hydan goes through the application sequentially, and
whenever it finds an instruction that it has equivalents to, it
substitutes in the instruction that represents the bit(s) of data
hydan is currently embedding. A simple example: "add %eax, 50" is
equivalent to "sub %eax, -50". So this set is {"add %reg, $imm", "sub
%reg, $imm"}. Whenever an instruction of the form "add %reg, $imm" is
encountered, hydan can embed one bit of the message. If the bit is 0,
it leaves it as an add instruction. Else it substitutes it to "sub
%reg, -$imm". (and vice versa)
Decoding: When it is time to extract the embedded message, every
"add %reg, $imm" is taken to mean bit 0, and every sub instruction
encodes the bit 1, and the embedded message is reconstructed that way.
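The embedding and decoding steps described above can be sketched in Python on a toy instruction list. This is an illustrative model only: instructions are plain (mnemonic, register, immediate) tuples rather than real machine code, the names `embed`, `extract` and `effect` are made up, and it ignores the fact that add and sub set the CPU flags differently, which a real tool has to account for.

```python
from typing import List, Tuple

Insn = Tuple[str, str, int]   # (mnemonic, register, immediate): a toy model


def embed(insns: List[Insn], bits: str) -> List[Insn]:
    """Write one bit into each add/sub-immediate: add = 0, sub = 1."""
    out, i = [], 0
    for op, reg, imm in insns:
        if op in ("add", "sub") and i < len(bits):
            want = "sub" if bits[i] == "1" else "add"
            if want != op:
                op, imm = want, -imm      # add r, n  ==  sub r, -n
            i += 1
        out.append((op, reg, imm))
    if i < len(bits):
        raise ValueError("not enough add/sub instructions for the message")
    return out


def extract(insns: List[Insn]) -> str:
    """Read every add as 0 and every sub as 1, in program order."""
    return "".join("1" if op == "sub" else "0"
                   for op, _, _ in insns if op in ("add", "sub"))


def effect(insn: Insn) -> int:
    """Net change to the register value (flags are ignored here)."""
    op, _, imm = insn
    return imm if op == "add" else -imm if op == "sub" else 0


prog = [("mov", "eax", 1), ("add", "eax", 50), ("sub", "ebx", 4), ("add", "ecx", 1)]
marked = embed(prog, "10")
print(marked)
print(extract(marked))
```

As in the description, extraction returns a bit for every eligible instruction, including those after the end of the message, so the output here is longer than the two bits that were embedded.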
Encryption: Hydan first prompts the user for a passphrase before
embedding or decoding the contents of the application. In the case of
embedding, hydan prepends the length of the message to the message,
encrypts that with blowfish in cbc mode, and embeds the result into
the application. When decoding, hydan extracts all the possible bits
from the application (since it does not know how long the message is
a-priori; that information is encrypted). Hydan then decrypts the
message properly since it is in CBC mode and need not know the total
length first. The length is then used to truncate the message to
size.
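The length-prefix framing can be sketched as follows. This is a simplified illustration: the 4-byte big-endian length field and the names `frame` and `unframe` are assumptions, and the Blowfish/CBC encryption step is left out entirely (in the scheme described above, the framed message would be encrypted before embedding and decrypted after extraction).

```python
import struct


def frame(message: bytes) -> bytes:
    """Prepend a 4-byte big-endian length (the field width is an assumption)."""
    return struct.pack(">I", len(message)) + message


def unframe(stream: bytes) -> bytes:
    """Recover the message from extracted data that has junk after it."""
    if len(stream) < 4:
        raise ValueError("stream too short to hold a length prefix")
    (length,) = struct.unpack(">I", stream[:4])
    if length > len(stream) - 4:
        raise ValueError("length prefix exceeds extracted data")
    return stream[4:4 + length]


# Everything extracted past the end of the message is simply discarded.
extracted = frame(b"meet at noon") + b"\x13\x37leftover bits"
print(unframe(extracted))
```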
Instructions: For a complete list of the sets of equivalent
instructions, please refer to hdn_insns.c.
- Attacks
---------
There are three classes of attacks that are applicable to hydan:
overwriting, detection, and extraction. The overwriting attack refers
to the ability to overwrite the message embedded in the application,
whether its presence was detected or not. An attacker should also not
be able to detect the presence of a message in the application, nor
decrypt it.
The overwriting attack: hydan currently has no means to protect
against this type of attack. Since hydan embeds the message
sequentially, starting from the top of the application, an attacker
could re-run hydan with a bogus text and embed that on top of the
original message. The intended recipient of the application would
thus be unable to retrieve the original message. One way this could
be solved is to add an error correcting code to the encoding of the
message, and distribute the message throughout the application in a
passphrase specific manner. This way only parts of the original
message would be overwritten, and the original may still be
reconstructed. Of course, there is nothing that can be done if the
attacker insists on overwriting with a message size that is the
maximum embeddable in the given application. However, the computation
required to overwrite each application on a large scale might be
prohibitive enough to discourage this as a routine behaviour, at an
ISP for example.
Detection: Binaries produced by hydan should not exhibit obvious
patterns. At the most superficial level, this is accomplished by not
embedding any marker or other easily recognizable token. At best, the
embedded data looks random (which is why it is bf encrypted). At the
assembly level however, the current version of hydan makes no attempt
at mimicking the original distribution of instructions in the
application, and is thus vulnerable to statistical analysis. Indeed,
although all the instructions are equivalent, some may appear more
frequent
Mentioned here about a couple of years ago. Looks like it's about time I cut down on my /.
The lunatic is in my head
But then I started thinking about how effectively viruses are distributed by non-techies who do click on the attachments in their EMAILs. Perhaps viruses or spyware could be used to "broadcast" a message this way to different cells in a covert organization (terrorists, organized crime, chess club members, whatever). All you'd need is an unprotected PC to act as a tethered goat and catch all those infections for later reading.
For that matter, a sender could "neuter" a virus by disabling its reproductive code and then embed a message in it and send it through some anonymizer (either a formal anonymizer or using a shell account). When the recipient stores it in a quarantine directory, it would look just like an infected EMAIL that had been cleaned up by your antivirus program, not a covert message. Some variation of this using spyware infection would be even more effective as they tend (in my limited experience) to have even more variants than viruses - the obfuscated message would be more readily confused with normal variation. Instead of posting your tampered executables to some usenet forum, you would simply have the recipient visit a site running the spyware. New messages would be sent and old ones sterilized when the spyware reinstalls itself.
Just my 20 mills.
"Prepare for the worst - hope for the best."
This guy wrote his assembler to generate unusual form of MOV instructions at least 10 years ago. In this way, he can find out if a program is generated using an unregistered version of A86.
Any CPU that has an instruction to exchange two registers will have some redundancy, but for X86 even basic mov (as well as add, sub, cmp and so on) specifies both two operands and a flag that specifies which one is source and which one is destination. The significance is that both operands can be registers, but only one can be a memory reference.
A much more impressive use would be a program that reads its own code as data to save the last few bytes, especially if it has a real purpose, like fitting a game into a fixed-size ROM.
This same idea of reorganizing the message without changing the message to conceal a more covert message could be applied to a lot of things. I guess it was only a matter of time before we had this in software. I wonder what other mediums this can be applied to..
Meet new people, and kill them.
Note that as far as I remember, steganography by definition is supposed to make it impossible to prove that there is data hidden there - one step further than normal encryption. It's not so much about hiding the data as being able to deny its existence.
One reason for this is if you have encrypted data on your disk, then courts can demand the password for it. Steganography allows you to insist there is no hidden data.
A new virus is quickly spreading across the internet. Experts say it started at Defcon with a demonstration of a program that allows users to add a secret text to an executable file without altering its filesize. Apparently the program also attached a message of its own... don't run programs demonstrated at defcon!
I can count to 1023 on my hands. Ask me about #132.
If you owned a company and were concerned about this kind of practice, surely you could just run the same program over every executable that comes in... destroying the original, but preserving the behaviour
He's right.
Now, that "block all executables" setting that I can't find or turn off in Outlook will prevent terrorists from exchanging secret messages embedded in trojan executables that are attached to emails purporting to be great pornography!
It's not an annoyance; it's a *feature*!
This is a fascinating approach. One thing I didn't see mentioned at all in the documents is the possible change in performance characteristics by changing instructions which have the same effect but which have different pipeline, execution unit, or cache properties.
Modern optimizing compilers spend an awful lot of effort generating efficient combinations of instructions which try to make the most out of CPUs having complicated rules. For example, add eax,eax and shl eax,1 might both produce the same desired effect but yield significantly different runtimes depending on the presence / absence of barrel shifters or the ability of particular instructions to pair in a given CPU.
Naturally the above would only matter if the modified code is in an inner loop, but it could happen.
'Copyright SCO' and 'Darl Rules!'
All steganographic methods that I've heard of leave some signs of tampering. For instance, the common method of hiding information in an image file by fiddling with the least significant bits in the RGB values is completely undetectable to the eye, however a statistical analysis of those low bits will reveal an unnatural amount of randomness. Really this is unavoidable since most any innocent looking data is going to have some natural order to it.
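For reference, the least-significant-bit method mentioned here boils down to something like the Python sketch below. It works on a raw byte buffer standing in for pixel channel values; the function names are made up, and real tools spread the bits around under a key rather than writing them sequentially.

```python
def embed_lsb(pixels: bytes, bits: str) -> bytes:
    """Overwrite the lowest bit of each byte with one message bit."""
    if len(bits) > len(pixels):
        raise ValueError("message longer than cover data")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | int(bit)
    return bytes(out)


def extract_lsb(pixels: bytes, count: int) -> str:
    """Read back the lowest bit of the first `count` bytes."""
    return "".join(str(p & 1) for p in pixels[:count])


cover = bytes([200, 201, 202, 203])      # stand-in for RGB channel values
stego = embed_lsb(cover, "1010")
print(list(stego), extract_lsb(stego, 4))
```

Each byte changes by at most 1, which is why the eye can't see it, but the low bits now follow the message rather than the picture.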
When you decode my message, it will say, "FIRST P0ST w00+"
Wh47 d1d j00 541, 31337 15n't t3h r0xor5 ne m0r3???
inserting the signature into the exe will change the signature of the file, so don't you need to know the (exe + embedded sig)'s signature before embedding anything? how?
thanks in advance for any explanations.
How about a virus that does no harm to your exe files, but only signs them and counts how many times they have been signed.
In cryptography, steganography has a particular meaning. In the same way that the goal of encryption is to prevent the message from being read, the goal of steganography is to prevent the message from being detected. A successful steganographic embedding is one in which a third party would not be able to find out if it is there. If you gave him two files, one with an embedded message and the other unprocessed, he should not be able to tell them apart.
For a method to truly be steganography, it's not enough just to embed some data into another. That's possible any time there's redundancy. The requirement is to make it so clever and/or subtle that there is no way to distinguish a processed file from an unprocessed one.
I doubt that this new method passes the test. Generally, while there are many synonyms possible in code, both in single instructions and in short sequences of instructions, the statistics of how these are distributed in unprocessed files are probably not random. Chances are that one synonym is used more than another. If you embed random data in a straightforward way, you will then have equal usages of both alternatives. This is a highly unusual condition, and to someone in the know, files like these will be easily distinguished.
Only if they have found a kind of synonym which already has purely random statistics, or where they are careful to precisely mimic the statistics of the original file as they add their data, can this truly be considered a form of steganography.
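A rough version of that statistical test, sketched in Python: count how often each member of a synonym pair appears and check how far the split is from 50/50. The function name `balance_z_score` and the instruction counts below are invented for illustration, not measurements of real binaries.

```python
from collections import Counter
from math import sqrt


def balance_z_score(mnemonics, pair=("add", "sub")) -> float:
    """z-score of the observed count of pair[0] against a fair 50/50 split."""
    counts = Counter(m for m in mnemonics if m in pair)
    n = counts[pair[0]] + counts[pair[1]]
    if n == 0:
        raise ValueError("no instructions from this pair")
    # Under a fair coin the count has mean n/2 and variance n/4.
    return (counts[pair[0]] - n / 2) / sqrt(n / 4)


# Made-up instruction streams: a skewed "compiler-like" one and a balanced one.
untouched = ["add"] * 90 + ["sub"] * 10
embedded = ["add"] * 52 + ["sub"] * 48
print(balance_z_score(untouched), balance_z_score(embedded))
```

A stream whose score sits near zero looks like coin flips, which is what embedded random data would produce; a strongly skewed score looks more like one compiler's habits. A real detector would need a baseline from known-clean binaries built with the same compiler.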
Acronyms are abbreviations which are pronounced like normal words (e.g., SCUBA, RADAR, GNU). MD5 is pronounced as a series of letters and numbers ("Em Dee Five"); therefore, it is not an acronym.
-1, Horrible Animal Cruelty
nt
I do steganography research and this technique is entirely unimpressive.
The whole point of steganography is to make the hidden message undetectable. Making a message unsuspicious is being clever, but it's not steganography. This is like renaming your pirated divx movies to "shareware.exe". Sure it makes it less suspicious, but it's easy to check.
Like the article mentions, the technique is completely vulnerable to statistical attacks. But these might not even be necessary: simply overwrite the embedded information with the default compiler values (surely the compiler would pick the same equivalent expression every time). It's only a matter of effort to create a detector.
Real steganography is undetectable (at least below a certain probability of detection).
Hey, I was disassembling all of my executables looking for hidden messages, and I found one. It reads "0xDEADBEEF"! Does anyone know what that means?!
Not only, as some other posters have already pointed out, do several of these instructions do different things, they're also different sizes. inc ax (or eax, depending on what mode the processor is in) is only one byte, whereas the add instructions will need an extra byte or four for their operands.
Pretend that something especially witty is here. Thanks.
the statistics of how these are distributed in unprocessed files are probably not random.
Sounds to me like my home computer system needs to be compiled with anti-optimization to throw in some more of these synonyms randomly into my executables.
"Provided by the management for your protection."
The documentation for the shareware DOS assembler, A86, claimed that the set of opcodes it chose to emit for various instructions was unique (i.e. those exact choices weren't made by any other assembler). Therefore if you released software assembled with A86 without registering it, if the A86 author ever got hold of that software, he'd know you had used his assembler to produce it. So the steganography in this case encoded a one bit value: "I used A86".
Pretend that something especially witty is here. Thanks.
http://it.slashdot.org/article.pl?sid=03/03/02/0455200&tid=93&tid=172
http://developers.slashdot.org/article.pl?sid=03/01/21/2322222&tid=162&tid=8
KISS
Oh well, what the hell...
I have lots of messages hidden in executables. Observe:
pth@ph9:31:~/work/projects/.linuxx$ strings /usr/bin/*
/lib/ld-linux.so.2
/tmp
libc.so.6
textdomain
printf
stdout
geteuid
getopt_long
__fpending
__ctype_b
getenv
[...] (sorry, I have to remove 1099503 lines because of stupid lame filter)
can't create a temporary filename
%s exited with status %d
open decompressed file: %s
%s:%d: %s
TMPDIR
XXXXXX
memory exhausted
%s:
: %s
pth@ph9:31:~/work/projects/.linuxx$ _
Some of them even seem to be encryptioned. Old news.
Sincerely,
Pan Tarhei Hosé, PhD.
"Homo sum et cogito ergo odi profanum vulgus et libido."
I used to do this a long time ago, take some text, encrypt it, and pad the .exe with the data. The program ran fine, the filesize may or may not go up, depending on where in the program I hid it (some MS apps have a quarter meg or so of empty space right in the middle of the .exe...).
I usually padded the .exe through a hex editor...it was slow and painful, but what better way to encrypt and hide data did you have at the time....steganography in a .gif? Please...in most cases you could tell it was tampered with...given in some you couldn't, but at the time the technology just wasn't all that good.
Oh, well...back to talking about how when I was you guys' age, I used to have to climb mt.everest to get my CPUs, and how I had to wrestle alligators to get money for floppy disks...
So now I can take a virus, embed a secret message, and unleash it. The virus may get through virus scanners since its signature changed, and my secret message gets distributed worldwide in a few hours. The existence of the secret message would not be common knowledge, so the virus scanners would get updated quickly to stop it, but those expecting the message would allow it in, and then decode the secret message. Totally anonymous mass distributed communications.
Now who would use something like that?
Salon had an article awhile back on embedding messages in photos.
That appears to be the more common use of the technique of steganography, lots of synonyms in media files.
Why wouldn't Microsoft or any other mega-corporation do this with their executables? They could embed your product key in the MSOffice.exe when you "activated" your product and if it ever got out they could send the goons in black helicopters?
Get my tin foil hat.
...But I digress. TREMBLE PUNY HUMANS!ONE DAY MY SPECIES WILL DESTROY YOU ALL!
This is OK as an academic exercise or to demonstrate the basics of steganography, but as has already been partly covered by others, this fails to be a practical or useful example of steganography for several reasons:
Firstly, as has already been addressed, the changes are detectable. This is very important for plausible deniability which legally speaking means the ability to deny there is anything stored within the "media" (in this case the executable). In the UK you are required to provide encryption keys or anything else required to "decode" data for a court of law. So if it's absolute security you want, you won't get it in the UK!
Secondly, the coverage at 1/110 is extremely poor. A much more traditional form of steganography is to hide information in graphical images by playing around with the least significant bit and changing the palette. This can typically achieve a coverage of as much as 1 in 8 which whilst better is still painfully low for serious use. This is often mentioned in the urban myths about 9/11 and AQ supposedly hiding messages in images. Another common usage is in .wav files as the human ear is unable to detect changes to the lower frequencies.
Coverage and plausible deniability are both very important for field operatives (or secret agents if you prefer). Firstly, 1/110 means they would need to start shipping megs if not gigs of executables across networks to pass on the simplest of messages, and trying to hide or transport secret documents becomes impossible. Secondly, the plausible deniability extends far beyond the realms of the UK justice system to the realities of field work - when they're strung up by bits they shouldn't be strung up by, they can hardly plead ignorance to avoid revealing their hidden secrets if their captors already know the data is there. The whole point of steganography is that it's undetectable! Agent X would not be a happy bunny!
Now I'm not saying this example or image/.wav files aren't fun or interesting because they are. But there are far more serious uses for, and much better applications of this technology. One example would be:
http://www.stegostik.com/
-Mark.
embedding a program's signature into itself to verify it's not been tampered with
If I wanted to use this for the above, I could simply encode a serial number for the steganography part. I could have an online server with a database of accepted strings.
User> Pays online. The users 'username' and 'password' is also registered on the server. His/her executable is encoded with the encrypted serial number of the 'username' and 'password'. Then the user can download the executable. Everytime the user wants to use the executable, they have to enter their 'username' and 'password' The 'username' and 'password' is encrypted to a serial number which must equal the serial encoded in the executable itself and voila, Microsoft can now begin charging for use of whatever they are not yet charging you for...
Interesting that this topic has emerged on /.. The plot of William Gibson's latest novel, "Pattern Recognition", uses the concept in an interesting context, in a fascinating departure from the mostly hypothetical technology in his previous works. I'm not finished the text yet, but will submit a review to /. when I do (if I can find a handle that isn't already taken).
In short, unencrypted steganography isn't particularly useful, but encrypted, you can really hide things.
Would mass use of Ron Rivest's cleartext 'chaffing' technique offer suitable 'deniable encryption' to the masses?
Chaffing and Winnowing: Confidentiality without Encryption
This is the Ron Rivest of RSA fame...
In particular, read section 6 in this part of the original documentation.
Hydan merely takes this technique that was invented by Eric in 1986, and turns it into a general-purpose Steganography algorithm. It is not anything new.
How would something like this affect the MD5 Sum of a file?
Correct me if I am wrong, but doesn't it [MD5] take into account the NUL(0) bytes of the file? So like, if this method of steganography was used to embed something into a program - be it message or virus, would it not change the MD5 Sum, meaning it was thus possible to detect an anomaly [eg the message]?
Founder & COO, Hayai India (hayai.in) / USA (hayaibroadband.com)