Slashdot Mirror


Reversing a Checksum Algorithm?

Todd asks: "Does anyone have good suggestions or advice regarding techniques for determining the checksum of a serial data protocol? I am currently working on a fun little project using a digital display sign which uses an unknown serial protocol. I cannot move on with the project until I figure this dreaded checksum out."

4 of 35 comments (clear)

  1. Need more info by heikkile · · Score: 4, Interesting
    It all depends what you have access to. Can you capture a number of packets with a valid checksum? Can you feed your own guesses to the box, and get a decent feedback? Do you know the rest of the protocol? Do you have access to the drivers or other software (in either end)?

    Is it a very simple machine with a very simple processor and limited memory? If so, they may well have written their own with shifts and xors. On the other hand, if it is a modern processor with megabytes of memory, they probably have used a library function. Can you get access to the standard libraries for that chip?

    How long is the checksum? What happens if you feed a wrong one? If it is just a single byte, and the box doesn't lock up completely on error, try all possible values for every packet!

    Why do you think they put the checksum in the protocol? To catch a few dropped bits over a bad line, or to prevent tampering? If the later, watch out for some real cryptography. If only the former, assume they took the path of least resistance.

    --

    In Murphy We Turst

  2. My process by hackwrench · · Score: 2, Interesting

    looking at your page http://www.dickwick.com/sign/sign.htm The first three values differ from each other by 15 and the next differs from the others by a multiple of 15, the only number from this set that is missing is 61, which would appear to might have been generated by 0 if that was a valid value. The ones with bit seven set are also 15 apart but offset 1 from the first group. If you keep repeating the process for other bytes you should find a connection with every bit postition/group and the resulting values of the checksum.

  3. Solved it by sepulcrum · · Score: 5, Interesting

    I put sign.exe thru IDA and identified the checksum algorithm. I found out that the only thing that goes thru the checksum is 35 35 00 then 0E DA is skipped and then the rest is put thru. The algorithm is a simple crc alike algo that adds the chars xors with the length and rotates some bits. You can find a perl program i wrote to calculate the checksum for a given range at: this location.

    Good luck with your project,
    Gijs

  4. Have you checked... by arb · · Score: 2, Interesting

    I assume you've found the following resources, but just in case...

    There's a thread the 'The LED Forum' about this sign. The people there might be able to help... http://www.eio.com/public/led/0219.html

    Also, one message in that thread is for a message at SignIndustry.com http://www.signindustry.com/mb/read.php?f=19&i=138 &t=69 which might also provide some people who could help...

    While the information you are after is not on these boards, at least you will be able to contact some people who have the boards - maybe they've found some more information.

    Other than that, I can only repeat what a couple of other people have said here - do some very methodical testing, starting with all single character messages from a-z, A-Z, 0-9 and punctuation. Then try a fair range of two character messages with the same options as the first run. (aa-az, aA-aZ, a0-a9, ba-bz, etc, but there's no real need to try ALL combinations.) Then try a fair sampling of 3 char messages. Next try some longer strings, modifying one character at a time and the same string modifying each property. After that, try a range of single characters (say 10-15 different characters) with each of the options. Finally, do some 2 char, 3 char and a longer message with a reasonable set of different options.

    If you gather a truck-load of data (several hundred sets, not just 100) you stand a better chance of figuring it out. You really need the trivial cases (single and two character strings) and some non-trivial cases (a phrase, not just 'TEST') to make it easier to reverse-engineer the algorithm.

    As others have mentioned, given the time frame and the fact that it is 6502-based, the algorithm is not likely to be particularly complex, but working it out can take a lot of time!

    Mean-while, I might have a look at the 6502 code and your test data to see if anything jumps out at me...

    Good luck!