Mystery of Duqu Programming Language Solved
wiredmikey writes "Earlier this month, researchers from Kaspersky Lab reached out to the security and programming community in an effort to help solve a mystery related to 'Duqu,' the Trojan often referred to as 'Son of Stuxnet,' which surfaced in October 2011. The mystery rested in a section of code written in an unknown programming language and used in the Duqu Framework, a portion of the Payload DLL used by the Trojan to interact with Command & Control (C&C) servers after the malware infects a system. Less than two weeks later, Kaspersky Lab experts now say with a high degree of certainty that the Duqu Framework was written using a custom object-oriented extension to C, generally called 'OO C,' and compiled with Microsoft Visual Studio Compiler 2008 (MSVC 2008) with special options for optimizing code size and inline expansion."
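For readers who have not met the term, here is a minimal sketch of the general pattern usually meant by "object-oriented C": a struct carries a pointer to a hand-written table of function pointers, and every "method" takes the object as an explicit first argument. All names here (Shape, ShapeOps, rect_new, and so on) are made up for illustration. This is not code from Duqu, and it does not claim to reproduce whatever custom extension its authors actually used.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names throughout; nothing here is taken from the Duqu binary. */

struct Shape;

/* Hand-written method table: one function pointer per "virtual" method. */
struct ShapeOps {
    int  (*area)(const struct Shape *self);
    void (*destroy)(struct Shape *self);
};

/* The "class": a pointer to its method table followed by instance data. */
struct Shape {
    const struct ShapeOps *ops;
    int width;
    int height;
};

/* Method implementations; the object is always the explicit first argument. */
static int rect_area(const struct Shape *self) { return self->width * self->height; }
static int tri_area(const struct Shape *self)  { return self->width * self->height / 2; }
static void shape_destroy(struct Shape *self)  { free(self); }

/* One shared, read-only table per "class". */
static const struct ShapeOps rect_ops = { rect_area, shape_destroy };
static const struct ShapeOps tri_ops  = { tri_area,  shape_destroy };

/* Shared "constructor": allocate, bind the method table, set the fields. */
static struct Shape *shape_new(const struct ShapeOps *ops, int w, int h)
{
    struct Shape *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->ops = ops;
    s->width = w;
    s->height = h;
    return s;
}

struct Shape *rect_new(int w, int h) { return shape_new(&rect_ops, w, h); }
struct Shape *tri_new(int w, int h)  { return shape_new(&tri_ops, w, h); }

/* Dynamic dispatch: the call goes through the object's own table. */
int shape_area(const struct Shape *s)
{
    assert(s != NULL);
    return s->ops->area(s);
}

/* "Destructor" call, safe on NULL like free(). */
void shape_free(struct Shape *s)
{
    if (s != NULL)
        s->ops->destroy(s);
}
```

A caller would create an object with rect_new(3, 4), query it with shape_area(), and release it with shape_free(). In the compiled output, a call like shape_area() becomes an indirect call through a pointer loaded from the object, which looks a lot like C++ virtual dispatch but without the layout and name-mangling conventions a C++ compiler would leave behind.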
They may have learned MASM to avoid detection.
Oh no, Allens do exist. Although he spells it Alan.
Here you go: http://www.securelist.com/en/blog/667/The_Mystery_of_the_Duqu_Framework
"None can love freedom heartily, but good men; the rest love not freedom, but license." --John Milton
How did they deduce it was an unknown programming language? By looking at the compiled machine code? How could they tell this wasn't just regular C?
A well-publicized article featuring Microsoft development products, of all things. I think they should use that PR in their Microsoft Visual Studio ads...
"Enjoy what you're doing! If it becomes drudgery, you're doing it wrong!" - Jim Butterfield
They are trying to do the forensics. If you know the tools used, you have a much better idea where to look for the people who did it. It was almost certainly NOT a matter of determining what it was doing; they wanted to figure out something that would help them track it back to the source.
I'd call it "subjugative C".
FTFA:
Why did the authors of Duqu use OO C? While there is no easy explanation why OO C was used instead of C++ for the Duqu Framework, Kaspersky experts say there are two reasonable causes that support its use [More control over the code & Extreme portability]. These two reasons indicate that the code was written by a team of experienced ‘old-school’ developers.
Why OO C? Because it worked, because they knew how to use it, because they knew it would throw Kaspersky for a loop, because they thought it was cool. There are many, many reasons and they do not all have to be logical.
Kaspersky experts might want to consider that the programming wheel of life may have turned and that what was once old-school is now new-school. Who's to say that the underestimated script kiddies cannot grow up to be formidable adults with a whole new bag of tricks?
This is already done to a large degree, at least with what matters in binary code. The "script kiddie" tools are extremely well documented. This goes way back in time to when a tool came out called (I hope I'm remembering the name right) VCL, or Virus Creation Laboratory. It became pretty easy to identify VCL-based code, and the tool set pretty much evaporated.
What editor you use is really unimportant. The compiler is what counts, and the compiler never sees your editor.
-The wise argue that there are few absolutes, the fool argues that there are no probabilities.
The 'mystery' involved has nothing to do with the actual code. The actual code itself is probably not really all that advanced.
The reverse compiler UnStux says the program is called Hello World.
For O'Reilly's "Mastering Duqu"?
Who keeps Atlantis off the maps?
Who keeps the Martians under wraps?
We Do, We Do...
"Happy families are all alike; every unhappy family is unhappy in its own way." -- Anna Karenina by Leo Tolstoy
It was too consistent to be compiler intrinsics, but not consistent enough to be straight assembly. That's the impression I got from the original blog post.
No question it would have been possible, but given that the rest of the code was compiled in MSVC, it made sense that some sort of macro, framework, toolkit, or something was in between the source and the output.
Smarter than you think. I remember reading somewhere that US radio operators in WWII used a Native American language to communicate with each other. No amount of analysis will give you any insight if the other party is careful not to leave any trails. To translate one language into another mechanically requires deep knowledge of both languages.
If you roll your own language with its own grammar, you can be secure in the fact that *even* deep analysis will not yield any clues, at least not with current technology. I am not sure such a thing can even be done by a Turing machine. People with better knowledge of it are welcome to correct me if I am wrong. All the current technology is concentrated on modifying bits for security, but if you do it at a sufficiently high level (i.e., another language), there is no way to crack it.
This case, however, has an Achilles' heel: you can still modify the binary and see what the results are by running it. After a sufficient number of trials, you should be able to decode it.
You will never have experience until after you needed it.