Changelog:

  • 23 March 2023: add missing size_t in the translate stub in tips
  • 24 March 2023: note that example assumes a POBITS of 12
  • 24 March 2023: note that we don’t want main() or translate() implementations submitted since that makes testing hard
  • 24 March 2023: update defunct references to PA02/PA03 to reference pagetable instead
  • 28 March 2023: add note that we don’t care about unused most significant virtual address bits

Implement a TLB wrapper around the page table interface from previous assignments. Note you will not need to implement a page table for this assignment; you’ll merely use its API, as defined in mlpt.h

1 Task

Implement a 4-way set-associative 16-set translation lookaside buffer with a single VPN → PA mapping per block1. In addition to its block, each cache line should also store

You should design your own data structures, etc, to make this work properly.

The public API should be the following tlb.h

#include "config.h" /* see pagtable guidance on this file */
#include "mlpt.h"   /* see pagetable this file */

/** invalidate all cache lines in the TLB */
void tlb_clear();

/**
 * return 0 if this virtual address does not have a valid
 * mapping in the TLB. Otherwise, return its LRU status: 1
 * if it is the most-recently used, 2 if the next-to-most,
 * etc.
 */
int tlb_peek(size_t va);

/**
 * If this virtual address is in the TLB, return its
 * corresponding physical address. If not, use
 * `translate(va)` to find that address, store the result
 * in the TLB, and return it. In either case, make its
 * cache line the most-recently used in its set.
 *
 * As an exception, if translate(va) returns -1, do not
 * update the TLB: just return -1.
 */
size_t tlb_translate(size_t va);

Note that we do not have a tlb_allocate. This is by design: allocation is performed by software (a part of the operating system, accessed through a system call) and not by the hardware.

You should submit tlb.h, and other .h and .c files you create for your implementation. You may also submit a Makefile if you wish, but do not need to do so. (Please, however, do not submit .c files that include main() or translate() functions (not disabled via an #ifdef or similar) since that will interfere with us testing your code.)

2 Separation of Concerns

Your TLB code must not depend on any implementation detail of your page tables. It may use translate, plus the #defines in config.h, and should append the page offset to the PPN it finds cached itself, but should not invoke any code or use any variables defined in your implementations of pagetable besides translate, LEVELS, and POBITS. Each call to translate must be to a page-beginning address, and that only if it is not in the cache.

If POBITS is 12 and tlb_translate is invoke with addresses 0x12345, 0x12468, and 0x13579 in that order,

  • tlb_translate(0x12345) should invoke translate(0x12000),
  • tlb_translate(0x12468) should not invoke translate at all (it’s a cache hit), and
  • tlb_translate(0x13579) should invoke translate(0x13000)

2.1 On unused virtual address bits

Depending on the value of LEVELS and POBITS, some of the most significant bits of a virtual address may be unused. We do not care how your TLB implementation deals with these bits being set, so it permissible for you to not use the value of LEVELS.

3 Tips

You can test this code without a working implementation of multi-level page tables, possibly even more easily than you can with a working MLPT, by creating a stub: a special implementation of translate that returns specific values designed to let you test the tlb_ functions. Even something as simple as

/** stub for the purpose of testing tlb_* functions */
size_t translate(size_t va) { return va; }

gives enough behavior to test most TLB behaviors; adding some logging code to the stub can help make sure that translate is only called as needed.

TLBs typically do not have a CPU-write-to-TLB functionality, so they do not need to be write-through or write-back.

4 Examples

With a POBITS of 12, if you stub translate(va) as return va < 0x1234000 ? va + 0x20000 : -1; then the following code should work

tlb_clear();
assert(tlb_peek(0) == 0);
assert(tlb_translate(0) == 0x20000);
assert(tlb_peek(0) == 1);
assert(tlb_translate(0x200) == 0x20200);
assert(tlb_peek(0) == 1);
assert(tlb_peek(0x200) == 1);
assert(tlb_translate(0x1200) == 0x21200);
assert(tlb_translate(0x5200) == 0x25200);
assert(tlb_translate(0x8200) == 0x28200);
assert(tlb_translate(0x2200) == 0x22200);
assert(tlb_peek(0x1000) == 1);
assert(tlb_peek(0x5000) == 1);
assert(tlb_peek(0x8000) == 1);
assert(tlb_peek(0x2000) == 1);
assert(tlb_peek(0x0000) == 1);
assert(tlb_translate(0x101200) == 0x121200);
assert(tlb_translate(0x801200) == 0x821200);
assert(tlb_translate(0x301200) == 0x321200);
assert(tlb_translate(0x501200) == 0x521200);
assert(tlb_translate(0xA01200) == 0xA21200);
assert(tlb_translate(0xA0001200) == -1);
assert(tlb_peek(0x001200) == 0);
assert(tlb_peek(0x101200) == 0);
assert(tlb_peek(0x301200) == 3);
assert(tlb_peek(0x501200) == 2);
assert(tlb_peek(0x801200) == 4);
assert(tlb_peek(0xA01200) == 1);
assert(tlb_translate(0x301800) == 0x321800);
assert(tlb_peek(0x001000) == 0);
assert(tlb_peek(0x101000) == 0);
assert(tlb_peek(0x301000) == 1);
assert(tlb_peek(0x501000) == 3);
assert(tlb_peek(0x801000) == 4);
assert(tlb_peek(0xA01000) == 2);

  1. You may hard-code 4 and 16 in your implementation, though a design that has these in #defines may actually help you keep your code organized.↩︎