030903-Automatic Checksum Generator
Automating Checksums
You promised you were going to stop two hours ago, but you've finally fixed that one last
bug. The ROM simulator, emulator, and logic analyzer have been put away. The
embedded EPROM in your microcontroller has been programmed for the first time. You
carefully insert it into the board, apply power--and nothing happens.
Does this scenario sound familiar? It's something many embedded systems programmers
have had to deal with. When it happens, one of the first things to check is whether or not
the system's EPROM was programmed correctly. A common and simple way to verify that
an EPROM contains valid code and the microcontroller is at least partially working is to
put a checksum somewhere in the EPROM and add startup code to perform a checksum
test. When the test passes, some form of output (an LED, for example) is enabled to show
that everything is all right.
The problem is checksum initialization. All too often the checksum test fails because the
correct checksum was never put into EPROM. If entering the checksum is a manual task
in the list of steps to generate an EPROM, it will sometimes be forgotten or done
incorrectly. However, the checksum-generation task can be automated. This article
discusses a program to accomplish this task.
A PROPER HEX
The hexsum program calculates the checksum of an Intel-format hex file and outputs a
new, updated hex file containing the checksum. The program uses the output of compilers,
assemblers, linkers, or hex-file generators as its input. A makefile might have an entry like:
program.hex: $(OBJECTS)
	link $(OBJECTS) -o temp.hex
	hexsum temp.hex program.hex
	remove temp.hex
PROMs are then created from the updated hex file, using a utility appropriate for your
device programmer. (For an example hex-to-binary converter utility, see the LOAD
program described in Ray Duncan's "Microsoft's ROMable DOS," When in ROM, June
1989, pp. 13-15.)
The hexsum program works with standard Intel-format hex files created by many
MCS-51-family and Z80 assemblers and compilers. These hex files are composed of a series of
records. Each record starts with a colon and ends with a checksum of all bytes within the
record except the colon. Records may be separated by either the MS-DOS-style
terminator (a carriage return and line feed character pair) or the UNIX-style newline (a
line feed).
Table 1 shows the format for Intel hex-file records. Only two record types are of interest
to developers of 8-bit systems: data records (type 00) and the end-of-file record (type 01).
In data records, the address field (hhll) is the data field's starting address. The end-of-file
record signals the end of input processing; its address field contains the execution-start
address. Additional data, such as symbol information for systems that support symbolic
debugging, may follow the end-of-file record and must be ignored. A short example file
looks like this:
:0100FF0002FE
:03010000010203F6
:00000001FF
This sequence of records represents a very short file consisting of a single data byte, 02, at
address 0FFH, followed by three data bytes, 01, 02, 03, at address 0100H, and the
end-of-file record with an execution-start address of 0000.
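To make the record layout concrete, here is a minimal sketch that splits one record into its count, address, type, data, and checksum fields. It is not taken from hexsum: the `HexRecord` structure and the `decodeRecord` function are hypothetical names, and the sketch assumes the line contains nothing but a colon and hex digits (plus an optional line terminator).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one decoded record (not from hexsum). */
typedef struct {
    unsigned count;           /* number of data bytes in the record */
    unsigned address;         /* 16-bit starting address (hhll) */
    unsigned type;            /* 00 = data, 01 = end of file */
    unsigned char data[255];  /* decoded data bytes */
    unsigned checksum;        /* trailing record checksum, as read */
} HexRecord;

/* Split one Intel hex record into its fields.
   Returns 1 on success, 0 if the record is malformed or too short. */
int decodeRecord(const char *line, HexRecord *rec)
{
    unsigned i, byte;

    if (line[0] != ':')
        return 0;
    if (sscanf(line + 1, "%2x%4x%2x",
               &rec->count, &rec->address, &rec->type) != 3)
        return 0;
    /* ':' + 8 header digits + 2 digits per data byte + 2 checksum digits */
    if (strlen(line) < 1 + 8 + 2 * (size_t)rec->count + 2)
        return 0;
    for (i = 0; i < rec->count; i++) {
        if (sscanf(line + 9 + 2 * i, "%2x", &byte) != 1)
            return 0;
        rec->data[i] = (unsigned char)byte;
    }
    if (sscanf(line + 9 + 2 * rec->count, "%2x", &rec->checksum) != 1)
        return 0;
    return 1;
}
```

Passing a data record such as `:03010000010203F6` fills in a count of 3, an address of 0100H, a type of 00, and the three data bytes. The sketch only separates the fields; it does not verify the checksum.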
PROGRAM OPERATION
Checksum generation is controlled by four command-line options. These options are used
to control the range of addresses on which the checksum is performed, the address to
store the calculated checksum, and the unreferenced memory-fill value. You set the low
and high bounds with the options -- l and -- h respectively, specifying the addresses with a
following value.
The checksum address option -- a allows you to store the calculated checksum at a
particular address. The default is OFFFFH. When the given checksum address is within
the -- l and -- h bounds, it will not be included in the calculation. To make things easier,
I've included a fill-value option, --f, which allows you to initialize unused ROM locations.
The default value is OFFH, but for diagnostic purposes you might want to use the value of
a 1-byte instruction instead.
Command-line values and addresses can be in decimal, octal, or hexadecimal. If the first
character is 0 and the second is x or X, the number is interpreted as a hex integer. If the
first digit is 0 and the second digit is in the range 0 through 7, the number is interpreted as
an octal integer. Numbers starting with the digits 1 through 9 are interpreted as decimal
integers. Spaces between the option letter, value, and address are optional. A typical
command line gives the options first, then the input and output file names in the same
order as in the makefile entry shown earlier.
Given its command line, hexsum performs the following steps:
* Calculate the size of memory to checksum based upon the low and high bounds. Use
default bounds unless new bounds were on the command line.
* Allocate memory.
* Fill memory with the fill value. Use the default value unless a new fill value was on the
command line.
* Read the input hex file, expanding into the memory array. Ignore out-of-bounds data.
Stop when an end-of-file record is read. All input except the end-of-file record is copied to
the output file.
* Calculate the checksum by summing all the data in the array modulo 256. Do not include
the location of the checksum if it is within the array bounds.
* Create a hex record for the checksum and write the hex record to the output file.
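The checksum step in the list above can be sketched as follows. This is an illustration rather than hexsum's own code: the function name `calcChecksum` and its parameter list are assumptions, with `lowAdrs` and `sumAdrs` borrowed from the names used in the text.

```c
#include <stddef.h>

/* Sum every byte of mem[0..size-1] modulo 256. mem[0] holds the byte at
   address lowAdrs. If sumAdrs falls inside the array, that one location
   is left out of the sum. */
unsigned char calcChecksum(const unsigned char *mem, size_t size,
                           unsigned long lowAdrs, unsigned long sumAdrs)
{
    unsigned sum = 0;
    size_t i;

    for (i = 0; i < size; i++) {
        if (lowAdrs + i == sumAdrs)
            continue;                  /* skip the checksum's own location */
        sum = (sum + mem[i]) & 0xFF;   /* keep the running sum modulo 256 */
    }
    return (unsigned char)sum;
}
```

Skipping the checksum location means the result does not depend on whatever fill value or stale data happens to sit at that address.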
Most of these steps are easy to implement. The UNIX getopt() library function is used to
parse the command line. Public domain versions of getopt() are available for compiling
under DOS.
The memory-allocation step needs to verify that the array size does not exceed 65,535
bytes. (65,535 bytes is the largest amount of memory that can be allocated for a single
object by some DOS compilers.) The largest hex file processed by hexsum can only
contain 65,535 bytes of data and 1 byte of checksum. The hex-file handling routines, in
addition to memory allocation, will have to change if hexsum is modified to support larger
files.
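The size check, allocation, and fill steps might be combined along these lines. The function name `allocImage` and its signature are assumptions made for this sketch; only the 65,535-byte limit and the fill step come from the text.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate the array for addresses lowAdrs..hiAdrs and set every byte to
   the fill value. Returns NULL if the bounds are reversed, the array
   would exceed 65,535 bytes, or malloc() fails. */
unsigned char *allocImage(unsigned long lowAdrs, unsigned long hiAdrs,
                          unsigned char fill, size_t *size)
{
    unsigned long span;
    unsigned char *mem;

    if (hiAdrs < lowAdrs)
        return NULL;
    if (hiAdrs - lowAdrs >= 65535UL)   /* span would be more than 65,535 */
        return NULL;
    span = hiAdrs - lowAdrs + 1;

    mem = malloc((size_t)span);
    if (mem == NULL)
        return NULL;
    memset(mem, fill, (size_t)span);
    *size = (size_t)span;
    return mem;
}
```

Testing the difference before adding 1 keeps the size calculation from wrapping around when the bounds are far apart.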
One of the design goals of hexsum was to work in a limited-memory DOS environment.
For this reason, the hex-file expansion routine is careful to expand only the parts of hex
records that contain data within the array bounds. An easier but less memory-efficient
implementation would have always created a 64-kbyte array and used the bounds to limit
checksum calculation. A harder but definitely more memory-efficient implementation
would not use the array at all. Listing 1 contains the hex-file expansion function used in
the hexsum program.
The expandFile function loops until it reaches the end of the input file or finds an
end-of-file record. Reaching the end of the input file without finding an end-of-file hex
record is an error. Each input record is checked, and the program is aborted if the leading
colon is not found. The checksum of the input record is also verified using a conversion
function, toInt(). The toInt function performs hex-to-integer conversion given one nibble of
a hex byte and a flag set to a nonzero value when the passed nibble is the high-order nibble
of a byte. If the hex input record is bad, the program is aborted with an error message.
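One plausible shape for toInt(), and for a record check built on it, is sketched below. The real function's signature and error handling are not shown in the text, so the -1 return for a bad character, the acceptance of lowercase digits, and the helper name `recordSumOk` are all assumptions. The check relies on the Intel hex rule that every byte of a record, including the trailing checksum, sums to zero modulo 256.

```c
/* Convert one hex digit to its value. When hiNibble is nonzero the digit
   is the high-order nibble of a byte, so its value is shifted left four
   bits. Returns -1 for a character that is not a hex digit (assumed). */
int toInt(int ch, int hiNibble)
{
    int value;

    if (ch >= '0' && ch <= '9')
        value = ch - '0';
    else if (ch >= 'A' && ch <= 'F')
        value = ch - 'A' + 10;
    else if (ch >= 'a' && ch <= 'f')
        value = ch - 'a' + 10;
    else
        return -1;
    return hiNibble ? value << 4 : value;
}

/* Returns 1 if rec starts with a colon, holds at least the five bytes of
   an empty record, and all of its bytes sum to zero modulo 256. */
int recordSumOk(const char *rec)
{
    int sum = 0, bytes = 0, hi, lo;
    const char *p;

    if (rec[0] != ':')
        return 0;
    for (p = rec + 1; *p != '\0' && *p != '\r' && *p != '\n'; p += 2) {
        hi = toInt(p[0], 1);
        lo = (hi < 0) ? -1 : toInt(p[1], 0);   /* read p[1] only if p[0] is a digit */
        if (lo < 0)
            return 0;
        sum += hi + lo;
        bytes++;
    }
    return bytes >= 5 && (sum & 0xFF) == 0;
}
```

Shifting inside toInt() lets the caller add the two results for a nibble pair directly, which keeps the summing loop short.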
Validated hex-file records are divided into their component fields using sscanf(). The
end-of-file test is performed at this point. When the end-of-file record is found, the
checksum is calculated and written to the output file ahead of the end-of-file record.
Data records are compared with the memory bounds. If any part of a record is within the
bounds, the record is passed to fillMem(). The fillMem function adds only the portion of
the record within the array bounds to the array. The values lowAdrs and hiAdrs are global
variables describing the array bounds. The fillMem function uses toInt to convert each
nibble pair to its byte value.
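The clipping that fillMem() performs could look something like the sketch below. The parameter list is an assumption, the bounds are given example values, and the small `nibble` helper merely stands in for the program's toInt function so that the example is self-contained.

```c
/* Array bounds, named as in the article. The values are examples only;
   hexsum would set them from the -l and -h options. */
unsigned long lowAdrs = 0x0100;
unsigned long hiAdrs  = 0x01FF;

/* Stand-in for toInt(): value of one hex digit, 0 if it is not one. */
static unsigned nibble(char ch)
{
    if (ch >= '0' && ch <= '9') return (unsigned)(ch - '0');
    if (ch >= 'A' && ch <= 'F') return (unsigned)(ch - 'A' + 10);
    if (ch >= 'a' && ch <= 'f') return (unsigned)(ch - 'a' + 10);
    return 0;
}

/* Store the part of a data record that lies inside lowAdrs..hiAdrs.
   adrs is the record's starting address, count its number of data bytes,
   and hex its data field as text. Returns the number of bytes stored. */
unsigned fillMem(unsigned char *mem, unsigned long adrs,
                 unsigned count, const char *hex)
{
    unsigned i, stored = 0;

    for (i = 0; i < count && hex[2 * i] && hex[2 * i + 1]; i++) {
        unsigned long a = adrs + i;
        if (a < lowAdrs || a > hiAdrs)
            continue;                      /* outside the array: ignore */
        mem[a - lowAdrs] = (unsigned char)
            ((nibble(hex[2 * i]) << 4) | nibble(hex[2 * i + 1]));
        stored++;
    }
    return stored;
}
```

With the example bounds, a four-byte record starting at 00FEH stores only its last two bytes, at array offsets 0 and 1.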
The calculated checksum must be converted to a hex-file record and written to the output
file before the end-of-file record. The code to perform this operation is trivial. Given
sumAdrs, the address where the checksum will be placed, and checksum, the calculated
checksum, only two lines of code are needed:
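A sketch of what those two lines might look like follows. It is not the program's actual code: the function wrapper and the names `formatSumRecord`, `buf`, and `recSum` are assumptions, and only `sumAdrs` and `checksum` come from the text. The sketch formats into a buffer so the result can be inspected; writing it with fputs() or using fprintf() directly would work equally well.

```c
#include <stdio.h>

/* Build a one-byte data record that places checksum at address sumAdrs.
   buf needs room for at least 15 characters, including the terminator. */
void formatSumRecord(char *buf, unsigned sumAdrs, unsigned checksum)
{
    /* Record checksum: two's complement of count + address bytes + type + data. */
    unsigned recSum = (0x100 - ((0x01 + ((sumAdrs >> 8) & 0xFF) + (sumAdrs & 0xFF)
                                 + 0x00 + (checksum & 0xFF)) & 0xFF)) & 0xFF;
    sprintf(buf, ":01%04X00%02X%02X\n", sumAdrs & 0xFFFF, checksum & 0xFF, recSum);
}
```

For a checksum of 02 stored at address 00FFH, this produces the record `:0100FF0002FE`.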
The C source code for hexsum is available on the Embedded Systems Programming
bulletin board service at (415) 267-7674 and on CompuServe in Library 12 of
CLMFORUM. The code has been successfully compiled under DOS and several flavors of
UNIX, so it should be fairly portable. By integrating the program into your development
cycle, you should be able to reduce if not eliminate a common source of program error--
the programmer.
BY MARCO S. HYMAN
Marco S. Hyman has been a source of program error for almost 20 years. He is currently
working as a principal engineer for Ascend Communications Inc. in San Francisco, Calif.
He can be reached on Internet as marc@dumbcat.sf.ca.us or on uucp as
pacbell!dumbcat!marc.
Listing 1
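void
expandFile(mem, inFile, outFile)    /* header reconstructed from the article text */
char *mem;
FILE *inFile;
FILE *outFile;
{
    int done = 0;
    char inStr[maxLineLen];
    int count;
    int adrs;       /* assumed: the address declaration is missing from the listing */
    int type;
    int i;
    int tempSum;
    char data[maxLineLen];

    /* Loop header assumed; the surviving listing resumes inside the loop. */
    while (fgets(inStr, maxLineLen, inFile) != NULL) {
        /* A record starts with a colon, followed by hex digits
           0 thru 9 and A thru F only. */
        if (inStr[0] != ':') {
            fprintf( stderr,
                     "hexsum: bad record: %s\n", inStr);    /* message text assumed */
            exit(2);
        }
        /* Assumed from the text: add up every byte of the record with
           toInt(); the low 8 bits of the total are zero for a good record. */
        tempSum = 0;
        for (i = 1; inStr[i] > ' ' && inStr[i + 1] > ' '; i += 2)
            tempSum += toInt(inStr[i], 1) + toInt(inStr[i + 1], 0);
        if ((tempSum & 0xFF) != 0) {
            fprintf( stderr,
                     "hexsum: bad record checksum: %s\n", inStr);   /* message text assumed */
            exit(2);
        }
        /* Split the record into fields. An end-of-file record ends the
           loop; every other record is copied to the output file. */
        data[0] = '\0';
        sscanf(inStr, ":%2x%4x%2x%s", &count, &adrs,    /* format string assumed */
               &type, data);
        if (type == 01) {
            /* The checksum record is written here, before the end-of-file record. */
            fputs(inStr, outFile);
            done = 1;
            break;
        }
        /* Data records that overlap the array bounds are passed to fillMem() here. */
        fputs(inStr, outFile);
    }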
if (! done) {
fprintf( stderr,