Good advice at the end, plus a link to an unrelated but equally interesting hardware debugging/reverse-engineering write-up:
Lessons for aspiring reverse engineers
Spend a few hours/days reading before you start doing.
Prior knowledge helps. You'll get better at reverse engineering as you do it more.
Code shape is recognizable. SPI drivers all look the same, I2C drivers all look similar, circular buffers all look the same. Code shape is a great hint about code function.
Assume people who wrote the code and designed the chip are sane (until proven otherwise).
Newton's first law of software and hardware design: without a significant outside force, things will keep being designed as they always have been. Assume most designs are similar, and what you saw before is likely what you'll see again.
Defaults are not changed most of the time.
Every bit of knowledge helps eliminate possibilities in other places. When something confuses you, leave it alone and go analyze something else. Come back to this one later when you know more.
Weird-looking constants mean things. The weirder the number, the more meaning it probably carries.
Have a theory before you rush to try things. An experiment with no theory is meaningless.
Do try things. A theory with no experiment is pointless.
Take notes as you try/figure out things, since your "trying things" binary will quickly become an unmanageable mess and you'll forget things.
> ...Take notes as you try/figure out things, since your "trying things" binary will quickly become an unmanageable mess and you'll forget things.
It's also worth mentioning that keeping the analysis artifacts under version control helps you retrace the path you took. It also makes it easier to try out ideas as you refine your understanding.
These were written by a different person. In his linked article (reverse engineering an eInk tag) he writes:
Low Power Sleep
The thing about humans is: they're human. Humans like nice round numbers. They like exactness, even when it is unnecessary. This helps a lot in reverse engineering. For example, what response does the constant 0x380000 elicit in you? None? Probably. What about 0x36EE80. Now that one just catches the eye. What the hell does that mean? So you convert that to decimal, and you see: 3,600,000. Well now, that's an hour's worth of milliseconds. That length is probably only useful for long-term low power sleep. I have lost track of how many things I've reverse engineered where constants of this variety lit the way to finding where sleep is being done!
In this device, the constants passed to the function in question were: 1,5000 2,000 5,000 10,000 3,600,000 1,800,000 0xffffffff. Pretty clear that those are time durations in milliseconds. The last one is probably a placeholder for "forever, or as close as we can get to it"
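The "round in decimal" test described above is easy to automate. Here is a minimal sketch, assuming you have already extracted candidate 32-bit constants from the binary; the function names and the three-trailing-zeros threshold are my own choices, not something from the article:

```python
def decimal_roundness(value: int) -> int:
    """Return the number of trailing decimal zeros in value (0 for value == 0)."""
    if value == 0:
        return 0
    zeros = 0
    while value % 10 == 0:
        value //= 10
        zeros += 1
    return zeros


def looks_like_duration_ms(value: int, min_zeros: int = 3) -> bool:
    """Heuristic only: flag constants that are round in decimal.

    0xFFFFFFFF is excluded because it is usually a "forever" sentinel,
    not a real duration. The threshold of 3 zeros is an arbitrary assumption.
    """
    return value != 0xFFFFFFFF and decimal_roundness(value) >= min_zeros


# 0x380000 is round in hex only; 0x36EE80 and 0x1B7740 are round in decimal.
for constant in (0x380000, 0x36EE80, 0x1B7740, 0xFFFFFFFF):
    flag = looks_like_duration_ms(constant)
    print(f"{constant:#010x} = {constant:>13,}  round in decimal: {flag}")
```

A constant that is round in hex is usually a mask or an address; one that is round in decimal was probably typed by a human thinking in seconds or milliseconds.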
Here, there was little chance to understand what many of the regs do, as they are only used by the sleep code. Some were in SFR and some were in MMIO space. I was able to copy the code and replicate it. One thing that was interesting was that the sleep timer has two speeds: 32 kHz and 1 Hz. It is a 24-bit timer, making the shortest sleep possible approximately 30 µs (one tick at 32 kHz) and the longest possible sleep around 194 days! More details in the SDK.
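The range of a 24-bit timer follows from simple arithmetic. A quick sketch, assuming the fast clock is a 32.768 kHz watch crystal (the text only says 32 kHz) and that the timer can count a full 2^24 ticks:

```python
TIMER_BITS = 24
MAX_TICKS = 2 ** TIMER_BITS  # 16,777,216 counts

FAST_HZ = 32_768  # assumption: "32 kHz" means a 32.768 kHz crystal
SLOW_HZ = 1


def tick_seconds(clock_hz: float) -> float:
    """Duration of a single timer tick, in seconds."""
    return 1.0 / clock_hz


def max_sleep_seconds(clock_hz: float) -> float:
    """Longest interval the timer can count at this clock, in seconds."""
    return MAX_TICKS / clock_hz


print(f"one tick at {FAST_HZ} Hz: {tick_seconds(FAST_HZ) * 1e6:.1f} us")
print(f"max sleep at {FAST_HZ} Hz: {max_sleep_seconds(FAST_HZ):.0f} s")
print(f"max sleep at {SLOW_HZ} Hz: {max_sleep_seconds(SLOW_HZ) / 86_400:.1f} days")
```

Under those assumptions, one fast tick is about 30.5 µs, the fast clock tops out at 512 seconds, and the 1 Hz clock reaches roughly 194 days.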
-----
If one browses through a disassembly or hex dump of Arm Cortex code and sees C520 and D928 in adjacent blocks, this is 99.9% watchdog-related code for a handful of Arm licensees, mostly Kinetis/Freescale/NXP. Same deal with various NVM or debug port unlock keys.
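Searching a raw dump for that key pair can be sketched as follows. This is only an illustration: it assumes a little-endian image, and it only finds keys stored as literal data (for example in a literal pool), not keys encoded inside instruction immediates, which would need a disassembler. The function names and the 64-byte window are hypothetical choices.

```python
import struct

WDOG_KEYS = (0xC520, 0xD928)  # the unlock pair mentioned above


def find_halfword(blob: bytes, value: int) -> list[int]:
    """Return every byte offset where value appears as a little-endian halfword."""
    needle = struct.pack("<H", value)
    hits, start = [], 0
    while (pos := blob.find(needle, start)) != -1:
        hits.append(pos)
        start = pos + 1
    return hits


def find_key_pairs(blob: bytes, window: int = 64) -> list[tuple[int, int]]:
    """Return (first_key_offset, second_key_offset) pairs within `window` bytes."""
    firsts = find_halfword(blob, WDOG_KEYS[0])
    seconds = find_halfword(blob, WDOG_KEYS[1])
    return [(a, b) for a in firsts for b in seconds if 0 < b - a <= window]


# Synthetic example: both keys placed 8 bytes apart in otherwise empty data.
sample = bytes(16) + struct.pack("<H", 0xC520) + bytes(6) + struct.pack("<H", 0xD928) + bytes(8)
print(find_key_pairs(sample))
```

On a real image you would expect a few false positives, so treat each hit as a place to start reading, not as proof.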
Plotting the data section of the ODrive motor driver binary, you can easily find the 2048-entry sine lookup table.
I think looking for weird constants is a good idea. Even better is to have the tools at hand to identify them more easily: a hexdump shown in parallel with ASCII/int/float representations, data plots, automatic lookup of register names from an SVD while parsing I2C/SPI/CAN data streams, etc.
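As a rough idea of what the "parallel representations" view could look like, here is a small sketch using only the standard library. It assumes little-endian 32-bit words; `describe_word` is a made-up helper name, not part of any existing tool.

```python
import struct


def describe_word(raw: bytes) -> dict:
    """Interpret 4 bytes as hex, unsigned, signed, float32 and printable ASCII.

    Little-endian byte order is an assumption; swap "<" for ">" if the
    target is big-endian.
    """
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    (as_uint,) = struct.unpack("<I", raw)
    (as_int,) = struct.unpack("<i", raw)
    (as_float,) = struct.unpack("<f", raw)
    as_ascii = "".join(chr(b) if 32 <= b < 127 else "." for b in raw)
    return {
        "hex": f"{as_uint:#010x}",
        "uint": as_uint,
        "int": as_int,
        "float": as_float,
        "ascii": as_ascii,
    }


for word in (struct.pack("<I", 3_600_000), struct.pack("<f", 1.0), b"boot"):
    print(describe_word(word))
```

Seeing all interpretations side by side makes it much more likely that one of them jumps out, whether it is a round decimal number, a sensible float, or a four-character tag.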
One example that immediately comes to mind is how division by a constant divisor is usually optimized by the compiler into something like "dividend * 21378123891231 >> 5". Meanwhile a value like 0xFF0000 is probably an unremarkable mask for the third byte.
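For a concrete instance of the multiply-and-shift pattern, unsigned 32-bit division by 10 is commonly compiled to a multiplication by 0xCCCCCCCD followed by a right shift of 35 bits in total. The sketch below reproduces that in Python; the names are mine, and real compilers typically split the shift into a high-half multiply plus a smaller shift.

```python
MAGIC_DIV10 = 0xCCCCCCCD  # ceil(2**35 / 10)
SHIFT_DIV10 = 35


def div10_by_multiply(x: int) -> int:
    """Compute x // 10 for an unsigned 32-bit x without a divide instruction."""
    if not 0 <= x <= 0xFFFFFFFF:
        raise ValueError("x must fit in an unsigned 32-bit integer")
    return (x * MAGIC_DIV10) >> SHIFT_DIV10


for x in (0, 9, 10, 99, 12345, 0xFFFFFFFF):
    print(x, div10_by_multiply(x), x // 10)
```

So when a strange-looking multiplier sits next to a shift, it is worth checking whether 2^shift divided by the constant comes out close to a small integer: that integer is the original divisor.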
Depending on what device you're reverse-engineering, the random integers can be:
* Hard-coded encryption keys
* Hard-coded IP addresses (remember, IPv4 addresses are 32 bits long, which is convenient for many embedded CPUs and even some MCUs)
* Hard-coded (instead of random) hash salts
...and generally all manner of things that shouldn't have been hard-coded. That's not to say all constants are bad, but I can see 0xFF000000 and know it's probably just an uninteresting bitmask, then see 0x22C1211E, and maybe with some training (I can't do it by sight alone, for sure) see that it's an IP address in AWS address space.
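Decoding a suspicious 32-bit constant as an IPv4 address takes one call to Python's standard `ipaddress` module. This sketch assumes the constant is in network (big-endian) order as written; on a little-endian target the bytes may need reversing first. `as_ipv4` is just an illustrative wrapper.

```python
import ipaddress


def as_ipv4(constant: int) -> ipaddress.IPv4Address:
    """Interpret a 32-bit integer as an IPv4 address (most significant byte first)."""
    return ipaddress.IPv4Address(constant)


for constant in (0xFF000000, 0x22C1211E):
    addr = as_ipv4(constant)
    print(f"{constant:#010x} -> {addr}  globally routable: {addr.is_global}")
```

The second constant decodes to 34.193.33.30. Confirming that an address belongs to a particular cloud provider is a separate step, for example by checking it against the provider's published IP ranges.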
On the other side of the spectrum (compared with what others have written) are constants in control loops. For example, the code could be a PID controller whose P, I and D values were chosen very carefully. Changing them even slightly may render the application useless.