
This is documentation for the object file and executable file formats to be used in the linking tasks in the linking+ISA tradeoffs lab and homework.

Parsing tool

To help you avoid typos and other syntax problems, we have supplied a web-based parsing tool for the object file and executable formats described below. This tool checks that your answers are formatted correctly. (It does not reformat them.)

Object file format

Each file has 3 parts, separated by “***”:

  1. Machine code/data. Each file’s machine code is written in hexadecimal as a sequence of bytes, each byte written as two hexadecimal digits (“nibbles”), even if its value is less than 16. The bytes are space-separated in our examples, but this is not required. Each line of hexadecimal is prefixed with its starting offset in hexadecimal followed by a “:”. Relocations to be filled in are represented with 0 bytes. Omitted bytes are assumed to be zeroes.

    Optionally, you may place comments on lines of machine code and data by writing a | followed by the comment. (This, incidentally, makes it easier to paste machine code generated by yas.)

  2. A list of relocations. Each relocation is a comma-separated line containing:

    • the offset (as a hexadecimal number) of the place in the machine code to be filled in (this should be the location of the first byte of the address placeholder to fill in, which usually will not be the first byte of the instruction containing that address), and
    • the symbol to fill in at that location
  3. A list of symbols. Each symbol is a comma-separated line containing:

    • the offset (as a hexadecimal number) of the machine code/data for the symbol (that is, where the corresponding label was declared), and
    • the name of the symbol

Example:

      0x0: 11 22 33 44
      0x4: 55 66 77 88
      0x8: 99 AA BB CC
      0xC: 00 00 00 00
      0x10: 00 00 00 00
      ***
      0xC,foo
      ***
      0x0,start

    

specifies an object file containing 20 bytes of machine code and/or data, with a ‘start’ symbol at offset 0x0 and a relocation (whose corresponding machine code/data bytes happen to all be 0s before the replacement, though this is not required) to be replaced with the address of ‘foo’ at offset 0xC.

This would be the object file one would expect from assembling

      start:
          .quad 0x8877665544332211
          .byte 0x99
          .byte 0xAA
          .byte 0xBB
          .byte 0xCC
          .quad foo

    

where foo is a label defined in some other file.
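The byte order in the example object file can be checked with a short Python sketch. It is only an illustration: the variable names are made up here, and it assumes (as the first eight bytes of the example show) that multi-byte values are stored little-endian.

```python
import struct

# .quad 0x8877665544332211 -> eight bytes, least significant byte first
quad = struct.pack("<Q", 0x8877665544332211)

# the four .byte directives, in order
single_bytes = bytes([0x99, 0xAA, 0xBB, 0xCC])

# .quad foo -> eight zero bytes; the address is unknown until link time
placeholder = bytes(8)

data = quad + single_bytes + placeholder
relocation_offset = len(quad) + len(single_bytes)  # where the placeholder starts

print(data.hex(" ").upper())
print(hex(relocation_offset))
```

The placeholder starts right after the first twelve bytes, which is why the relocation for foo is listed at offset 0xC.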

The object file for the above example could also be written with less whitespace and different numbers of bytes per line, like:

      0x00: 11223344556677
      0x07: 8899AABBCC00000000
      0x10: 00000000
      ***
      0x0C,foo
      ***
      0x00,start

    

Or with a comment to aid reading, as in:

      0x00: 112233445566778899AABBCC
      0x0c: 0000000000000000 | placeholder for foo
      ***
      0x0C,foo
      ***
      0x00,start

    

Executable file format

Executable files (for the homework only) consist of lines of hexadecimal byte values, each prefixed with the address to which they should be loaded, in hexadecimal, followed by a “:”. For example:

      0x0: 11 22 33 44
      0x4: 55 66 77 88
      0x8: 99 AA BB CC
      0xC: EE FF 00 11

    

specifies an executable that would load 16 bytes into memory starting at address 0x0.

As with object files, it is also okay for this to be written with less whitespace and different numbers of bytes per line, like:

      0x00: 1122334455
      0x05: 66778899AABBCCEE
      0x0D: FF0011

    

The whitespace between the hexadecimal for each byte can be omitted, and more than four bytes can be placed per line.
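To see that the two layouts above describe the same executable, one can load each into a simple model of memory. The following Python sketch is an assumption-laden illustration (the function name is made up, and it does not handle comments):

```python
def load_executable(text):
    """Map each address to the byte value that would be loaded there."""
    memory = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        address_text, hex_text = line.split(":", 1)
        address = int(address_text, 16)
        values = bytes.fromhex("".join(hex_text.split()))
        for i, value in enumerate(values):
            memory[address + i] = value
    return memory


four_per_line = """0x0: 11 22 33 44
0x4: 55 66 77 88
0x8: 99 AA BB CC
0xC: EE FF 00 11
"""

packed = """0x00: 1122334455
0x05: 66778899AABBCCEE
0x0D: FF0011
"""

# Both layouts load the same 16 bytes at addresses 0x0 through 0xF.
assert load_executable(four_per_line) == load_executable(packed)
assert sorted(load_executable(packed)) == list(range(16))
```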

This is essentially the same as the format that tools/yas produces.

As in the code we use for HCLRS, executables should arrange for main to be at address 0x0, and then include the machine code/data from the other object files used to construct them in any order.
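One possible way to picture this layout rule is the Python sketch below. It is a simplified illustration under several assumptions that this page does not state: each object file is already parsed into (data, relocations, symbols), the first object defines main at offset 0, objects are placed back to back with no padding, and every relocation is an 8-byte little-endian address (as with .quad foo in the example above). The function name link is hypothetical.

```python
import struct


def link(objects):
    """Combine parsed object files into the bytes of an executable loaded at 0x0.

    objects is a list of (data, relocations, symbols) tuples, where
    relocations and symbols are lists of (offset, name) pairs.
    """
    # Pass 1: pick a base address for each object and record symbol addresses.
    bases = []
    addresses = {}
    next_base = 0
    for data, _relocations, symbols in objects:
        bases.append(next_base)
        for offset, name in symbols:
            if name in addresses:
                raise ValueError(f"symbol defined twice: {name}")
            addresses[name] = next_base + offset
        next_base += len(data)
    if addresses.get("main") != 0:
        raise ValueError("main must end up at address 0x0")

    # Pass 2: copy each object's bytes and overwrite its relocation placeholders.
    image = bytearray()
    for base, (data, relocations, _symbols) in zip(bases, objects):
        image.extend(data)
        for offset, name in relocations:
            if name not in addresses:
                raise ValueError(f"undefined symbol: {name}")
            start = base + offset
            # Assumption: addresses are 8 bytes, little-endian.
            image[start:start + 8] = struct.pack("<Q", addresses[name])
    return bytes(image)


# Hypothetical inputs: main holds only a placeholder for foo's address.
main_object = (bytes(8), [(0x0, "foo")], [(0x0, "main")])
foo_object = (bytes([0xAA, 0xBB]), [], [(0x0, "foo")])

executable = link([main_object, foo_object])
print(executable.hex(" ").upper())
```

Here foo lands at address 0x8, immediately after main's eight bytes, so the placeholder is overwritten with 08 followed by seven zero bytes.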