finding CRC polynomial from data

Go To Last Post
23 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Hi freaks

I want to find the 16-bit generetor polynomial used to compute the crc of some data. I tried a few common ones (correctly I hope) but dont seem to get anything.

Sample data is:

header + data crc ?!?
7D7523210408206CF0F1390A 61AD
7D7523220408206CF0F1390A 57B6
7D7523230408206CF0F1390A 4DBF
7D7523240408206CF0F1390A 43C8
7D7523250408206CF0F1390A 39D1
7D7523260408206CF0F1390A 2FDA
7D7523270408206CF0F1390A 25E3
7D7523280408206CF0F1390A 1BEC

I chose very similar data packets of the same length in the hope that it will make it easier.
Its possible that a final XOR is performed on the data...
Are there any other types of "checksums" that are not CRC but have the same function?

Thanks for any help you can provide

Regards
alex

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

If it's only a 16-bit CRC, you can easily brute-force it.

XOR the results of 2 lines together to eliminate the effect of final XOR.

Quote:
Are there any other types of "checksums" that are not CRC but have the same function?

Sure. Depends on how you define "the same function". Simple checksums/xorsums do in many cases; yet in others, a much more sophisticated checksum called cryptographic hash has to be used.

JW

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

You have not only the polynomial, but the initial value to deal with.

Jim

Jim Wagner Oregon Research Electronics, Consulting Div. Tangent, OR, USA http://www.orelectronics.net

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Don't forget the reverse of common polynomials!

Math is cool.
jevinskie.com

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Jevin wrote:
Don't forget the reverse of common polynomials!
Further, if an implementation is incorrect, your code has to have the same errors to arrive at the same CRC value for a given data stream.

Don Kinzer
ZBasic Microcontrollers
http://www.zbasic.net

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Can you not get the documentation for the protocol?

As you can see, there are MANY variables here, making a total brute force search impractical.

Having said that, you could try common polys, msb first, lsb first, and with various common start values. If you don't get a match in there, your search is going to be much more difficult.

Writing code is like having sex.... make one little mistake, and you're supporting it for life.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I guess you could do it with brute force - there are 16 bits in the polynomial and 16 bits in the starting value so it's just 4,294,967,296 combinations to be tried. There's also a pretty strong bet that the start is 0x0000 or 0xFFFF - so possibly only 131,072 combinations. Here's a typical CRC function:

    Optimized CRC-16 calculation.

    Polynomial: x^16 + x^15 + x^2 + 1 (0xa001)
Initial value: 0xffff This CRC is normally used in disk-drive controllers. The following is the equivalent functionality written in C. \code uint16_t crc16_update(uint16_t crc, uint8_t a) { int i; crc ^= a; for (i = 0; i < 8; ++i) { if (crc & 1) crc = (crc >> 1) ^ 0xA001; else crc = (crc >> 1); } return crc; }

One would just need to parameterize the fixed 0xA001 in this then call the thing 65536 times with every different possibility.

Cliff

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Did you notice that the difference of two consecutive results in your example is constant?

This doesn't look like a CRC for me...

JW

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Given that amount of data, we cannot say how to calculate the checksum, except for the examples you gave.

So when x is in set 1..8, the message

"7D75232x0408206CF0F1390A" has checksum of 6BA4-(x*9F7).

Extrapolating that, see if x=0 has checksum 6BA4 and x=9 has checksum 11F5.

Now, we don't know if it is valid for other values of x from 0 to F, and how other digits affect this. So you can figure it out by using superposition; you have done the math for one digit now, do the math for other digits too.

Definitely not CRC.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

A web based CRC utility:
http://zorc.breitbandkatze.de/crc.html.

Have Fun, Paul.

Doing magic with a USD 7 Logic Analyser: https://www.avrfreaks.net/comment/2421756#comment-2421756

Bunch of old projects with AVR's: http://www.hoevendesign.com

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Reminded me of a similar one...

http://www.8052.com/forum/thread...

:-)

JW

PS. Once I looked it up... I simply copy the questions:
1. can you provide more examples?
2. can you influence the transmitted data (i.e. chose the content or a part of it)?
3. couldn't you simply ask the author? :-)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

the difference in the CHECKSUM is constant indeed!
Very good observation! How come I missed that...

Thanks for the comments so far!

wek:
1. can you provide more examples?
> yes I have longer examples too, I can pm you some more sequences if you want.
2. can you influence the transmitted data (i.e. chose the content or a part of it)?
> no I cant...
3. couldn't you simply ask the author?
> Not a good idea :)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

d-fens wrote:
1. can you provide more examples?
> yes I have longer examples too, I can pm you some more sequences if you want.

Why not zip it up and post as an attachment here, for others to have fun, too?

d-fens wrote:
3. couldn't you simply ask the author?
> Not a good idea :)

I presume you are aware of intellectual-right issues. IANAL, but AFAIK under European copyright law it is OK to deduce working of a box from its external behaviour, as principles like checksumming is not patentable; but in the USA they have a slightly different perception of what is normal.

JW

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Thanks for the interest

I dont know how fun it will be for the average freak here but here comes the file as requested.

I am trying to find a relationship now that you guys discovered the constant difference in the "checksum" (more appropriate name than "crc" now).

/alex

Attachment(s): 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

What a nice little lunch-time project... ;-)

Amazing, how little fantasy the programmers have: this is almost the same scheme as in the one I gave the link to.

Here you are (Pascal-style):

    for j:=1 to length do begin
      cs0:=cs0+q[i];
      cs1:=cs1+cs0;
      cs2:=cs2-cs0;
    end;
    cs2:=cs2-cs0;

cs2 is the first byte of checksum, cs1 is the second.

It fits all but the three longest strings in your file; but those won't fit the "first byte determines length" scheme either, so I assume it's some randomly recorded garbage.

Enjoy! ;-)

JW

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

wow!!! don't know how to thank you, I haven't checked it myself yet, but I'm sure this is the path.
As for

Quote:
the first byte determines packet length
for the small packets packet length is determined by the 2nd nibble of the 1st byte, and for the larger packets from the the 2nd byte.

I burned all my braincells thinking I was dealing with a crc. Will put your code to test after I recover from my headache :)

Congrats again!

/alex

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

d-fens wrote:
As for
Quote:
the first byte determines packet length
for the small packets packet length is determined by the 2nd nibble of the 1st byte, and for the larger packets from the the 2nd byte.

The large packets will most probably employ a different checksumming scheme, then.

JW

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I wrote some simple c code for the algorithm above and got it to work both for the short AND the long sequences, maybe you lost a byte on the way...

hurray!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

wek wrote:

Amazing, how little fantasy the programmers have

Aye, but it is very similar to Fletcher's checksum scheme, just without summing back overflows.

What would really make things hard is to use some kind of weird CRC nobody else uses, or running index number, maybe scrambled so that you think it is a checksum but basically generated from a random generator with known seed.

I have sometimes just used shifts or rotates and xors as a checksum, like sum=(sum<<1)+data;

Edit: ah if cs0 and cs1 were used it would be like Fletcher, but this uses cs2 and cs1.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Quote:

What would really make things hard is to use some kind of weird CRC nobody else uses

MD5 would be another good one - the whole point of it is that while each sequence generates a unique result it's close to impossible to work out the relationship. If they'd used that I'd predict that no one here could ever have cracked the algorithm (well unless they had an inspired guess and tried MD5 or SHA256 or SHA512 or whatever)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

MD5 would be good one. Actually I propably need to implement some form of SHA soon, as currently a chip does SHA verification of some values for me right now, but the chip is not always available.

But for few byte transfers, I must say MD5 or SHA is way too big :)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

d-fens wrote:
both for the short AND the long sequences, maybe you lost a byte on the way...

Yeah, you're right, I truncated the strings to 256 chars... :oops:

But, be warned: there is no guarantee it will work all the time. It's so easy to put some surprise into thigs like these... and so often the programmers puts some more inadvertently... :-P

Jepael wrote:
Aye, but it is very similar to Fletcher's checksum scheme, just without summing back overflows.

Yes, this is the simplified form. But, given the short packets, it would be easy to guess the carry loopback, too.

Jepael wrote:
What would really make things hard is to use some kind of weird CRC nobody else uses, or running index number, maybe scrambled so that you think it is a checksum but basically generated from a random generator with known seed.

Yes, these things are much harder or straight impossible to guess (as well as are the hashes, block ciphers used for hashing etc.etc.etc.).

Jepael wrote:
I have sometimes just used shifts or rotates and xors as a checksum, like sum=(sum<<1)+data;

Yes, if an uncritical checksum is to be constructed, a couple of well-though simple operations converting to single processor instructions might easily come as a surprise for the "baseline attacker" like me. It is also worth to add an initial value, to avoid the "all zeros/all ones" sort of problems.

Jepael wrote:
Edit: ah if cs0 and cs1 were used it would be like Fletcher, but this uses cs2 and cs1.
The difference is purely aesthetical - the principle thus the "level of protection" is the same.

JW

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0
unsigned int ModbusCRC(char *ptr, unsigned char  count)
{
    unsigned int crc;
    char i;

    crc = 0xffff;
    while (count--)
    {
        crc = crc ^ (unsigned int) *ptr++;
        i = 8;
        do
        {
            if (crc & 0x0001)
                crc = (crc >> 1) ^ 0xA001;
            else
                crc = (crc >> 1);
        } while(--i);
    }
    return (crc);
}

This is the same as Clawson's example which is right out of the modbus manual. This implementation assumes the data is sitting in a buffer. You pass it a pointer to the first element and the number of bytes and it returns the 16 bit CRC.

official AVR Consultant
www.veruslogic.com