Researchers Seek Help Cracking Gauss Mystery Payload
An anonymous reader writes "Researchers at Kaspersky Lab are asking the public for help in cracking an encrypted warhead that gets delivered to infected machines by the recently discovered Gauss malware toolkit. They're publishing encrypted sections and hashes in the hope that cryptographers will be able to help them out."
Adds reader DavidGilbert99: "The so-called Godel module is targeting a specific machine with specific system configurations, and Kaspersky believes the victim is likely a high-profile target. The decryption key, Kaspersky believes, will be derived from these specific system configurations, and so far it has been unable to find out what they are."
The trick in this case is that the key is already available at the targeted machine - the virus tries to combine various pairs of %PATH% paths and names from %PROGRAMFILES% and if some combination has an expected checksum, that's the key. To make cryptanalysis a bit more difficult, it seems that the second part of the key is not in plain ASCII. Therefore the "key distribution problem" is nicely solved - if the code runs on targeted system, the key will be easily generated. On any other machine you won't obtain any information about the key.
The program doesn't have the key, the target computer does! When it runs, it collects various information about the computer's configuration and uses that to generate a possible key. It tries to decrypt its payload with that, and if the decryption works, the payload runs. If the decryption doesn't, then the key was wrong, and it's not the target computer, and the payload doesn't run.
It's a very clever approach, and depending on how specific the target configuration is, we may never see the decrypted payload in the public world.
According to Kaspersky, the way it works is:
1) Enumerate all directories in the computers PATH variable
2) Enumerate all files in the %PROGRAMFILES% directory whose file name starts with a non-latin-alphabet unicode character (i.e. arabic)
3) Hash every pair from the previous two lists with MD5 and check against a known hash
If the hashes match, then it has found the correct configuration. This means it is looking for a computer with a specific directory or file in the %PROGRAMFILES% directory, in combination with a specific directory in its path variable. This hash is salted and stretched so they obviously knew what they were doing.
Once it knows it has the correct configuration, it rehashes that pair with a different salt to get an RC4 encryption key which unlocks the payload. Different salts are used in the validation and decryption stages so that the validation hash (which is stored in the binary and known to everybody) does not give any information about the target configuration or the encryption key. Given the number of possible combinations of known files that could be in %PROGRAMFILES% and directories that could be in %PATH%, combined with the fact that the target configuration is likely one that is not publicly known, it will be very difficult to break this unless the targeted party comes forth.
No, the key isn't in there. The algorithm to generate the key from specific information on the host system is in there, but the key can only be correctly generated from the host system having the right information for which the algorithm can properly derive the correct key.
This is not at all how it works. Nobody has the key, the key is derived from local configuration values using a cryptographic hash function. Just as your hard drive may be encrypted with a key that is generated from your password, this payload is encrypted with a key that is generated from a very long password which is a combination of specific settings on the machine. If you run it on a machine with the settings exactly right, it will unlock. If you run it on any other machine, it will not and you will get no information about what they key is. Since there are so many possible combinations of settings (particularly it is looking at all the programs in your program files folder in combination with all the directories in your path variable) it is unlikely that people will just stumble across the correct one.
If they probably are using a GPL library for decoding/uncompressing, they could be sued to release the code to be compliant with the license.
That seems to be a common misconception. That's not how the GPL works. They need to make the code available to their customers on demand. You aren't their customer, you can't demand anything.
Google it.
Last time I did, it's basically believed to be a vector for detecting infection by simply making a target navigate to a web page that tries to load the font. If it's there, you can tell the PC has the font and (therefore) the infection. If it's not, it just gets substituted and you can tell from the CSS etc. what's happened.
Probably a way for the author to see if their target machine actually ended up getting infected or not.