Reversing a Checksum Algorithm?
Todd asks: "Does anyone have good suggestions or advice regarding techniques for determining the checksum of a serial data protocol? I am currently working on a fun little project using a digital display sign which uses an unknown serial protocol. I cannot move on with the project until I figure this dreaded checksum out."
really.
I don't think you have much choice but brute-force it.
Collect as many as possible checksum -generators, and try them on portions of message.
Some more-or less trivial are 'LRC', CRC, CRC16, and CRC32 ones. Of course XOR too. difficulty is to determine which characters go to checksum, as it's not always that 'STX'/'ETX' are included.
fucktard is a tenderhearted description
Is it a very simple machine with a very simple processor and limited memory? If so, they may well have written their own with shifts and xors. On the other hand, if it is a modern processor with megabytes of memory, they probably have used a library function. Can you get access to the standard libraries for that chip?
How long is the checksum? What happens if you feed a wrong one? If it is just a single byte, and the box doesn't lock up completely on error, try all possible values for every packet!
Why do you think they put the checksum in the protocol? To catch a few dropped bits over a bad line, or to prevent tampering? If the later, watch out for some real cryptography. If only the former, assume they took the path of least resistance.
In Murphy We Turst
you said you have a DOS application that generates the data.
Well run it in a debugger and locate the part the does generates the checksum, you could rip the code out directly(or load the application up into memory and call the checksum bit), this will probably violate a few laws though.
So the best way is use there code to work out the algorythm and then write you own code(even better get someone else to do it) based on the algorythm
thank God the internet isn't a human right.
I played around with the sample data you provided. I came across some funny business which I am almost certain is an error in your spreadsheet, which throws the whole thing into doubt. Did you mean to repeat the slot encoding 0x05 for both slots 5 and 6? I'm going to assume it was a typo. If so, I learned the following about the impact the slot number has on the checksum:
8-bit-number
Bits == 76543210
Csum(x) is the value of the checksum with slot #x and all else held constant.
Csum(x) = Csum(0) +
(Bit0(x) ? -0x0F : 0) +
(Bit1(x) ? -0x1E : 0) +
(Bit2(x) ? +0x3C : 0) +
(Bit3(x) ? -0x78 : 0) +
(Bit4(x) ? -0xF0 : 0)
Note that the values added/subtracted for each bit follow a pattern:
0x0F = 00001111
0x1E = 00011110
0x3C = 00111100
0x78 = 01111000
0xF0 = 11110000
More data might shed some light on the pattern. Whatever the case, I think this is reminiscent of a CRC16, since I don't think a checksum would have this kind of behavior -- a standard addition checksum or a XOR-based checksum (even with bit rotation) would make a bit always add/subtract the same amount, but it would be a power of 2 (I think).
So now you need to find the CRC's polynomial, and I don't know enough about that kind of thing to help you. (And there is a chance that everything I've said here is wrong, since this is not my specialty!)
Time flies like an arrow. Fruit flies like a banana.
Have you tried emailing this guy? According to his resume, he worked at SilentRadio from 1980-1997, eventually ending up as vice president. And if he doesn't know the answer, maybe he can get you in touch with Mike Levin, the guy who started the company.
I put sign.exe thru IDA and identified the checksum algorithm. I found out that the only thing that goes thru the checksum is 35 35 00 then 0E DA is skipped and then the rest is put thru. The algorithm is a simple crc alike algo that adds the chars xors with the length and rotates some bits. You can find a perl program i wrote to calculate the checksum for a given range at: this location.
Good luck with your project,
Gijs
Heh - I was just working on this exact same thing after finding someone's post on piclist looking to reverse-engineer the same program. And to think, you only beat me to the solution by a few hours...
Except that you didn't, completely. The trickiest part of this checksum is that it shows up TWICE in the packet format. As you state, the checksum is initialized after the ff attention byte, and then most of the packet is run through it. However, looking at the initial packet:
FF 35 35 00 0E DA BC 01 00 00 08 00 19 02 54 45 53 54 1F 8C 52
^^^^^ ^^^^^
both of the indicated portions are checksums. The first checksum covers the string "35 35 00 0E" and the second covers the string "35 35 00 0E 01 00 00 08 00 19 02 54 45 53 54 1F". The checksum is not re-initialized after it's output. Incidentally, here's my perl version - it uses bit operations in preference to the %-heavy code that you use, but it's the same algorithm:
#!perl
$check = 0xa55a;
$sum=0;
@bytes = (@ARGV);
for $pos (0..$#bytes) {
$sum += (hex($bytes[$i]) ^ ($pos & 0xff));
$sum = $sum & 0xffff;
$check += $sum;
$check = (($check >> 1) & 0x7fff) | (($check & 1) << 15);
}
printf "%04x\n", $check;
__END__