Changelog:

27 March 2021: rename the copy-on-write tests to have shorter filenames to avoid them clashing in the xv6 filesystem if the given names are used
27 March 2021: add alloc_large2.c, alloc_large2_repeat.c, read_high1.c and read_high2.c tests
27 March 2021: add note to copy-on-write testing with freeing disabled; rearrange hint sections
27 March 2021: merge reminders about bitwise operators into hint section on checking and editing page table entries
27 March 2021: add several copy-on-write tests were specifically parent and child write and updating pagingtest.c to include more of the standalone tests; correct submission site link
1 April 2021: provide version of read_high1.c that avoids integer overflow compilation issue
9 April 2021: modify “the only reason for a page to be marked as read-only” to “the only reason for a user page to be marked as read-only”
14 April 2021: add note on read: Success message and make failing to build a .img file indicating that there’s no more space in the xv6 filesystem image

Your Task

(required for checkpoint) Add a new system call int dumppagetable(int pid) that outputs the page table of the process with pid pid to the console (like with cprintf()) as follows:
- Only last-level page table entries should be shown;
- Only page table entries for the user part of memory (bytes 0 through p->sz where p is the corresponding struct proc*) should be output;
- Output a line starting with START PAGE TABLE before the page table entry information, and one starting with END PAGE TABLE immediately after. You may, if you choose, put additional text afterwards on these lines;
- Output a line for each page table entry, with fields seperated by whitespace, in order from lowest virtual address to highest (you may use any kind of ASCII whitespace, even if it (like tabs) doesn’t display correctly in qemu’s graphical console);
- Optionally, output a header line describing the fields.
- The following fields (in this order) should be output for each present (valid) page table entry:
  1. the virtual page number used to lookup the page table entry in hexadecimal or the virtual address corresponding to the first byte of that virtual page number in hexadecimal (your choice which)
  2. the text ‘P’
  3. the text ‘U’ if the page is marked as user-mode accessible and ‘-‘ otherwise
  4. the text ‘W’ if the page is marked as writable and ‘-‘ otherwise
  5. the physical page number contained in the page table entry in hexadecimal or the physical address of the first byte of that physical page in hexadecimal (your choice which)
  6. any other fields of your choice (for example, information that will help you debug the later steps)
- For non-present pages, you may either omit them entirely from the output or output the following fields in this order:
  1. the virtual page number in hexadecimal or address in hexadecimal (your choice which)
  2. the text ‘-‘ (instead of ‘P’)
  3. any other fields of your choice (for example, information that will help you debug the later steps)
For example, one possible output might look like:
```
START PAGE TABLE (pid 54)
0 P U W 8a
1 P U W 8b
2 P - W 8c
3 P U W 8d
4 P U W 89
END PAGE TABLE
```
indicating that a process has five pages allocated, with virtual page numbers 0 through 4, to various physical page numbers in the range 0x89 through 0x8d, and virtual page 2 is marked as non-user-accessible.

We do not care what your system call does if the pid supplied is invalid or if pid supplied has its page table modified or freed while the system call occurs.

When successful, your system call should return 0.
(required for checkpoint) xv6 currently allocates stack and heap memory immediately. More commonly, OSs will allocate this memory on demand, saving memory when not all of the stack or heap is used immediately. Modify xv6 to allocate heap memory based on page faults rather than immediately. It is okay if non-heap memory is still allocated statically.

Your automatic allocation scheme:
- Should initialize all newly allocated memory to zeroes, like happens with a stock version of xv6.
- Should kill a process when it attempts to access memory outside of its allocation, including memory beyond the end of heap ( except that heap allocations should be made in whole pages, so it’s okay to allow processes can access slightly beyond the end of their heap if the end of their heap is in the middle of page) or kernel memory, like happens with a stock version of xv6.
- Should kill a process when it attempts to access memory requiring an allocation on demand and no more memory can be allocated. Make sure a message is printed to the console when this happens.
- Should never make the “guard page” xv6 allocates below the user stack accessible in user mode.
- Must make system calls that attempt to read or write to not-yet-allocated parts of the heap work. For example, malloc()ing a very large buffer, then using read() to fill it must work. We do not care, however, what happens if the process runs out of memory during such a system call and must be killed. (For the most part, this should just require treating page faults triggered by the kernel the same as page faults triggered from user mode.)
- Should not leak memory.
For suggestions on how to accomplish this, see the “allocating pages on demand” section of the hints below.
Add copy-on-write support for xv6’s fork() system call. xv6 currently makes a copy of each page of a process when it forks. Instead, you should not copy the page and instead mark each page as read-only. Then, when a protection fault happens, actually make a copy of the page, update the corresponding page table entry, and mark it as writeable. Your copy-on-write scheme:
- Should kill a process when it attempts to write to memory, but there is not enough memory to allocate a copy-on-write page. Make sure a message is printed to the console when it happens.
- Should not leak memory.
- Should never make the “guard page” xv6 allocates below the user stack accessible.
- Must make system calls that attempt to read or write to not-yet-allocated parts of the heap work. For example, malloc()ing a very large buffer, then using read() to fill it must work. We do not care, however, what happens if the process runs out of memory during such a system call. (For the most part, this should just require treating page faults triggered by the kernel the same as page faults triggered from user mode.)
For suggestions on how to accomplish this, see the “adding copy-on-write support” section of the hints below.
Run make submit to create an archive and upload the result to the submission site.

Testing

dumppagetable test program

dumppt.c is a sample program that calls dumppagetable(). If supplied no arguments, it calls it on its own pid, otherwise, you can pass a pid to it. To see the pids of running processes in xv6, try typing “control-P” (which triggers a built-in process list).

standalone tests for allocate-on-demand and copy-on-write

Since the xv6 shell uses fork and sbrk and you might break the implementation of these, it may be very difficult to debug when you break one of these. One strategy for debugging is to run xv6 with a test program acting as init (the first program the xv6 kernel runs). We have supplied some suitable test programs that work this way, or, alternately, can be run as normal programs (like the test programs you used for the lottery assignent).

Each of them uses this pagingtestlib.h header file [last updated 27 March 2021].

xv6’s file system only uses the first 14 characters of the filename. As of 27 March 2021, the test file names below should differ in the first 14 characters, so you won’t run into this issue if you use the files as is. But if you make new variants you may run into this. For example, if you had both alloc_small_then_fork and alloc_small_then_fork_changed listed in UPROGS, xv6 would only include one of them and running any command starting with alloc_small_th would run that program.

[Added 14 April 2021:] If, after adding these tests, you get a message about read: Success followed by the Makefile saying there was an error constructing some file ending in .img, that probably means that you exceeded the maximum allowed size for the xv6 filesystem image file. One workaround would be to include less files on the UPROGS list in the Makefile. Another would be to increase the FSSIZE constant in params.h

tests of heap allocation:
- alloc_small.c (allocate 4K and use some of it), also available with extra debugging output:
  - alloc_small_dump.c (allocate 4K and use some of it, but also call dumppagetable() some to aid debugging)
- alloc_small_then_fork.c (allocate 4K, use some of it, then fork())
- alloc_large.c (allocate a lot of memory and use some of it)
- alloc_large2.c (allocate a lot of memory and use a lot of it; if you pass alloc_large and fail this, you might not be handling cases where a new second-level page table is needed (wrong alloc argument to walkpgdir()?))
- alloc_large_then_fork.c (allocate a lot of memory, use some of it, then fork())
- alloc_large2_repeat.c (repeat 5 times: fork, then allocate and use lots of memory; if you pass all the other alloc tests and fail this, it is likely you are not freeing memory properly)
small tests to help debug copy on write with two processes:
- cow_minimal.c (allocate 4K, then use it, fork(), then use it in child, then use it in parent); also available with extra debugging output:
  - cow_minimal_dump.c (same as above, but also call dumppagetable() some to aid debugging)
- cow_minimal_parent_write.c (allocate 4K, then use it, fork(), then write+read it in parent, read it in child; requires version of pagingtestlib.h after 27 March 2021)
- cow_minimal_both_write.c (same as cow_minimal_parent_write, but both write)
small tests of copy on write with more than two processes:
- cow_minimal_repeat_mixed.c (repeat several times: allocate 4K, fork(), parent and/or child write, then both read)
- cow_minimal_parallel.c (cow_minimal, but running 5 children concurrently)
larger tests to help debug/test copy on write:
- cow_medium.c (allocate several MB, modify it in child processes after forking; if you pass the minimal tests and fail this one, most likely you are not handling needing a new second-level page table correctly (wrong alloc argument to walkpgdir()?))
- cow_medium_many_forks.c (same as above but with many child proceesses)
- cow_large_repeat_plus_alloc.c (allocate some unused heap, as well as using a large amount of memory before and after forking, and repeat this several times to maybe trigger memory leaks)
miscellaneous tests:
- read_guard_page.c (attempts to read from guard page, which should crash)
- read_high1.c [updated 1 April 2021 to fix potential compilation issue] (attempts to read from very large address, which should crash)
- read_high2.c (attempts to read from an address a 8KB past the end of the heap, which should crash)
- read_alloc.c (attempts to read() from file into heap memory that might need to be allocated on demand for the read)
- read_cow.c (attempts to read() from file into copy-on-write memory that might need to be copied for the read)

running tests as init

To run these files as init, put them in your xv6 directory and instead of running make qemu or make qemu-nox run something like

      make qemu-nox FS=fs-alloc_small-as-init.img

This will create a virtual disk fs-alloc_small-as-init.img which is a copy of the xv6 filesystem, but with init program replaced by alloc_small, then boot xv6 using this virtual disk instead of the normal fs.img virtual disk.

(When these tests work they should print a message like Test passed: .... before exiting.)

tests that dump page tables

The alloc_small_dump.c test calls dumppagetable several times showing you the contents of page before and after sbrk() is called and before and after the memory allocation is actually used. I suggest looking at this first if you are experiencing poor behavior with the larger tests.

The cow_minimal_dump.c test calls dumppagetable several times showing you the contents of page before and after forking and before and after writing to what should be copy-on-write memory. I strongly recommend looking at this output manually to see if what is happening matches your expectations — especially before trying to jump into debugging failures that only occur with much larger test cases.

The generic testing infrastructure in pagingtestlib.h will dump page tables several if you have enable_dump = 1; near the begining of the main() function in many of the standalone tests above. However, this can produce a very large amount of output.

combined test for allocate on demand and copy-on-write

We have supplied a program called pagingtest.c [last updated 27 March 2021] to help you test your implementation. For the most part, this is a combination of a selection of the standalone tests above. This test requires the same pagingtestlib.h header file [last updated 27 March 2021] that tests above use. This test is intended to be run to test a complete implementation; we do not recommend using (at least without modification) it to test implementations you know to be partial. We will run other and/or additional tests when grading your submission, so this is not a substitute for doing additional testing.

This includes tests for both allocate on demand and copy-on-write. The tests do a mix of things that should work regardless of whether you’ve implemented these features, and things which will run out of memory unless you’ve implemented allocate-on-demand and/or copy-on-write. When fork()ing fails (such as due to running out of memory), some of the copy-on-write tests may print a message about this and then a hang rather than printing a clear failure message. (Yes, this is probably a bug.)

Hints

xv6 book reading

Chapter 2 of the xv6 book describes how xv6 manages its page table structures.

Note on flushing the TLB

You will need to flush the TLB after a changing any valid page table entries that might be cached in the TLB. One way to do this is by reloading the page table base register CR3 using:
```
lcr3(V2P(myproc()->pgdir));
```
(It would be better to use an instruction that only flushes the TLB entries for a particular page, like invlpg, but xv6 doesn’t have a convenient way to run that.)
Not flushing the TLB is one cause of particular werid behavior with copy-on-write, where the TLB (strictly speaking, the virtual TLB qemu implements) may cache a page table entry marked as writeable rather than using the read-only version.

Handy xv6 functions

`walkpgdir` (find page table entry)

walkpgdir takes a page directory (first-level page table) and returns a pointer to the page table entry for a particular virtual address. It optionally will allocate any needed second-level page tables.
walkpgdir is declared as static in vm.c and is not declared in a header file like defs.h, so it can’t be used from other files. You are welcome to change this if you think that would be convenient.
You can find a page directory to pass to walkpgdir with something like myproc()->pgdir.
Make sure you check for walkpgdir returning NULL. Do not rely on a crash occuring.

On an error, like failing to allocate a second-level page table, walkpgdir return the address 0x0 (NULL) converted to a pte_t*.

Accessing address 0x0 will not crash, but instead will read memory from the code of the currently running program. Trying to interpret this memory as a page table entry is likely to lead to surprising results.

`mappages` and other ways to set page table entries

In my reference solution, I set page table entries by using the pointer returned by walkpgdir rather than using mappages. (But most students have not done this in their implementations.)
The version of mappages supplied with xv6 will panic (with the message panic: remap) if the specified virtual address is already mapped to a physical address. This is because in unmodified xv6, once a virtual address has a valid mapping, that mapping is never changed. If you don’t want this behavior, then you would need to modify mappages.
mappages takes a page directory (first-level page table), a virtual address, a physical address, and a size, and modifies that page directory so size bytes of virtual memory starting the specified virtual address point to the specified physical address.
If you pass mappages a range of address that spans multiple pages, it will remap multiple pages. For example, mappages(pgdir, 0x1800, PGSIZE, ...) will try to remap virtual page number 1 (virtual addresses 0x1000 through 0x1FFF) and virtual page number 2 (virtual addresses 0x2000 through 0x2FFF) since they both overlap with the range 0x1800 through 0x1800+PGSIZE = (0x2800). To avoid this problem, you can use PGROUNDDOWN() to convert an address in the middle of a page to an address at the beginning of the page. (For example, PGROUNDDOWN(0x1800) is 0x1000.)

`P2V` and `V2P` (converting physical to kernel virtual addresses and vice-versa)

P2V converts from a physical address to a virtual address in the kernel’s memory. The xv6 kernel makes sure that every page table maps (for the kernel’s use only) virtual address 0x80000000+x to physical address x, and P2V uses these addresses (that is, it adds 0x80000000). V2P does the opposite mapping.

`kalloc` and `kfree`: memory allocation

kalloc and kfree allocate or free a physical page. They both return pointers to the virtual address of the page in the kernel’s memory.

Checking and editing page table entries

You can check if a page table entry pointed to by a pte_t *pte is present with *pte & PTE_P. Similarly, you can check if it is writeable with *pte & PTE_W.

If you want to check if a page table entry is present and writeable use something like (*pte & PTE_P) && (*pte & PTE_W). ((*pte & PTE_P & PTE_W) has a very different effect.)
If you want to check if a page table entry is not present, you can use something like !(*pte & PTE_P). ((*pte & !PTE_P) and (*pte & ~PTE_P) have very different effects.)
You can set that page table entry to point to physical page mypage as writeable, present, and accessible in usermode using *pte = mypage | PTE_P | PTE_W | PTE_U;.
If you want to clear the PTE_W bit from a page table entry pointed to by pte, you could use something like *pte &= ~ PTE_W;

Allocating pages on demand

Currently, when a program tries to allocate more heap space it calls the sbrk() system call, which calls growproc(). This sets the sz variable to track the size of the allocation and uses allocuvm(). to create new page table entries. You can modify growproc() to still set the sz variable but avoid actually creating new page tables right away. With your modification, growproc() will no longer need to create new page table entries. Then you’d have make your page fault handler handle creating the new page table entries on demand that would have previously been created when new heap memory was allocated.

You could potentially modify allocuvm() instead of growproc(), but you would need to modify loaduvm() to handle loading data into a page table entry that is not yet allocated. (loaduvm() is used to implement exec() in xv6, for loading data from an executable into memory when the program starts.)

Adding a page fault handler

When a page fault occurs, the trap() function in traps.c is called. Currently, the default case in the switch statement is reached, which crashes the current program (or panics, if the page fault occurred in kernel mode).
traps.h defines the constant T_PGFLT, which is the code x86 uses to identify page faults. This will be placed in the trapno member variable of the trapframe.
When a page fault occurs in x86, the processor sets control register 2 (CR2) to the address the program was attempting to access. You can read this in xv6 using rcr2().
You need to make sure that your allocate-on-demand code does not trigger for out-of-range memory access, such as attempts to access kernel memory from userspace or accesses to unallocated (beyond end of heap) memory.
You should use code similar to what’s in allocuvm() to set each page table entry. But you should not actually call allocuvm() since it allocates all pages up to the maximum. If a program accesses address 0x400000, then only the page at address 0x400000 should be created, not all the pages between 0x0 and 0x400000.
To make fork() work correctly, copyuvm() should only copy pages that are actually allocated. Later on, you will probably modify this function more to implement copy-on-write.

Adding copy-on-write support

For copy-on-write, you should:
- change the code that copies the pages on fork() in copyuvm() to instead map each virtual page in the child process to the same physical page as the parent process and to mark the pages read-only in both processes,
- moving the copying (and setting of a new page table entry) that was previously done by copyuvm() to the page fault handler, which will be run whenever the process attepmts to write to a read-only page, and
- tracking how many times pages are referenced (“reference counting”) in order to delete pages only when they are no longer being used
You will need to flush the TLB whenever you change a valid page table entry that the processor may have cached. This most likely will be in both your copyuvm() implementation and your page fault handler.
Before you implement reference counting (or are sure of your reference counting implementation), I recommend testing your code with freeing (e.g. in deallocuvm) disabled. This will leak memory – which you eventually need to fix – but will allow you test the rest of your copy-on-write logic without worrying as much about your reference counting implementation.

Distinguishing copy-on-write page faults from others

In your copy of xv6, the only reason for a user page to be marked as a read-only is because it is involved in copy-on-write. (In a more featureful OS, there would be other types of read-only pages to consider, but xv6 does not presently have any other sort of read-only pages, even for code.)

So, you can use one of two ways of detecting a page fault was triggered by a page being read-only:
- Looking up the page table entry for the page that triggered the fault and is marked as present but not writable. You can use rcr2() to determine the faulting address, and then use walkpgdir() to find the corresponding page table entry (if it exists).
- Using the value of the processor’s “error code” (stored in tf->err) which will have a particular bit set if the page fault was caused by a page table entry not being writable. You can lookup the fields in this error code in the Intel manuals, volume 3A, page 6-40, section “Exception Error Code”.)
If one were to support having some memory read-only for reasons other than copy-on-write, one would need to add additional information to struct proc indicating where writable and truly read-only pages were located. (For example, it might have a table indicating where code is loaded.)

Copying data

The xv6 function memmove is handy for copying the contents of pages.

Where pages are deallocated

deallocuvm is called to deallocate user pages from a page table. Currently, it unconditionally calls kfree on each page. If pages are shared between processes, this may incorrectly free a page that other processes have a reference (via a page table entry) to.
If you make a copy of a page from the page fault handler, and, after you make this copy, there are no references to the original page, you would need to free the original page. Alternately, you could avoid making the copy in this case.

Counting references

You will need to track how many times each physical page that might be used for copy-on-write is used, commonly known as its “reference count”. While one could scan all pointers to pages to determine the reference count, this would be extremely slow, so I would recommend tracking the reference count of each page as you edit page tables and/or allocate pages and/or deallocate pages and/or etc.

A simple way to do this is an array like:
```
unsigned char cow_reference_count[PHYSTOP / PGSIZE];
```
Then, use the physical address of the page divided by PGSIZE (that is, the physical page number) to access the appropriate element of the array.

(PHYSTOP is the maximum physical memory address xv6 supports.)

If you want to store extra information for each physical page, you can use a similar approach but with an array of structs.
To aid in debugging reference count tracking, I would strongly recommend printing out reference counts from your dumppagetable functions. (You can add extra columns to the output.)
As part of tracking reference counts, you need to decide which pages you’re going to track the reference count for and what types of references you will count. There are multiple options which will work, but it is very important to be consistent:
- you could track reference counts for each page allocated with kalloc()
- you could track reference counts for pages each time they are placed in a the user part of page table
- you could track reference counts only for pages which are marked as read-only for copy-on-write
Which option you choose will change how you initialize, increment, and decrement reference counts. For example:
- for the first two options, you need to change reference counts for pages not (yet) involved in copy-on-write, which will involve modifying non-copy-on-write-related code; and
- for the last option, you need to make sure you don’t check the (presumably uninitialized) reference counts of non-copy-on-write pages, and therefore accidentally deallocate or leak memory for these pages
I strongly recommend having explicit checks that:
- you do not try to decrement a reference count that is already 0. (Note that checking if an unsigned reference count is negative after decrementing it will not do this.)
- the initial value of a “new” page’s reference count is what you expect
- the value of reference counts in copyuvm() is in the range you expect

On synchronization

When implementing copy-on-write, there is a potential for synchronization issues when modifying information about pages (like reference counts) shared between different processes from system calls. To avoid this, you can use a spinlock or disable interrupts when accessing such potentially shared data. However, it is extremely rare that I have seen this matter in our tests (and we don’t intend to take off credit if you are missing synchronization in your implementation).

xv6 disables interrupts while handling page faults — so on a single core (which we will use), you don’t need to worry about the page fault handlers being interrupted. However, system calls are run without page faults or timer interrupts being disabled.

Note on how kalloc works at boot

If you modify kalloc and kfree as part of your reference counting implementation, note that kfree is used at boot to setup the free list that kalloc reads from.

In particular, kfree is called from the functions kinit1 and kinit2 before any calls to kalloc have happened. In this way, rather than writing special code to initialize the list of free pages, xv6 gets to reuse the code it has for adding a newly free’d page to the list for this purpose.

On debugging

Refer to the notes from the xv6intro assignment on debugging. This includes information about interpreting panic messages.
It can be useful to deliberately disable deallocation based on reference counts to determine whether the problem you are experiencing is due to deallocating memory when it is still in use.

xv6’s guard page

xv6 allocates a “guard page” below the user stack to catch stack overflows. This is the only page in the user’s memory which is marked as present but not user-accessible in the stock version of xv6.

Examples of page table dumps from tests

Note that, for all of these examples, the actual output will vary depending on what options you choose when implementing your dumppagetable().

`alloc_small_dump.c`

Example output from alloc_small_dump.c might look like:

          testing allocating 4K and reading/writing to 1K segments of it
... and verifying that (at least some of) the heap is initialized to zeroes                                   
>> About to call dumppagetable() for allocation-pre-allocate
START PAGE TABLE pid=5
0       P       U       W       dfbe
1       P       U       W       df2e
2       P       U       W       df2f
3       P       U       W       dee6
4       P       -       W       dee5
5       P       U       W       dee4
>> Finished call to dumppagetable() for allocation-pre-allocate                                               
>> About to call dumppagetable() for allocation-pre-access                                                    
START PAGE TABLE pid=5
0       P       U       W       dfbe                                                                          
1       P       U       W       df2e                                                                          
2       P       U       W       df2f
3       P       U       W       dee6                                                                          
4       P       -       W       dee5                                                                          
5       P       U       W       dee4
6       -       -       -       0
>> Finished call to dumppagetable() for allocation-pre-access                                                 
>> About to call dumppagetable() for allocation-post-access
START PAGE TABLE pid=5
0       P       U       W       dfbe                                                                          
1       P       U       W       df2e
2       P       U       W       df2f                                                                          
3       P       U       W       dee6
4       P       -       W       dee5                                                                          
5       P       U       W       dee4
6       P       U       W       deeb                                                                          
>> Finished call to dumppagetable() for allocation-post-access                                                
Test passed: allocation passed (expand + write + read heap)

    
  

In the example above, initially the test program has 6 pages (virtual page numbers 0 to 5), resulting in the page table dump labeled allocation-pre-allocate.

Then, after sbrk() was called, the size of the program’s memory was increased, but nothing was added to the page table, resulting in the page table dump labeled allocation-pre-access. The dumppagetable() implementation that produced the output above choose not to omit non-present pages from the output, so one additional line is shown compared to the first dump, reflecting that heap size was increased for the process.

Then, the process accesses the memory in that newly allocated virtual address. This results in the final dump, labeled allocation-post-access. Here, a new page is made present, writable, and user-accessble,and a distinct physical page is allocated for it.

`cow_minimal_dump.c`

Example output:

          >> About to call dumppagetable() for copy-write-parent-before
START PAGE TABLE pid=3
0       P       U       W       deea
1       P       U       W       dee8
2       P       U       W       dee7
3       P       U       W       dee6
4       P       -       W       dee5
5       P       U       W       dee4
6       P       U       W       dfbd
>> Finished call to dumppagetable() for copy-write-parent-before
>> About to call dumppagetable() for copy-write-child-before-writes
START PAGE TABLE pid=4
0       P       U       -       deea
1       P       U       -       dee8
2       P       U       -       dee7
3       P       U       -       dee6
4       P       -       -       dee5
5       P       U       W       dee2
6       P       U       -       dfbd
>> Finished call to dumppagetable() for copy-write-child-before-writes
>> About to call dumppagetable() for copy-write-child-after-first-write
START PAGE TABLE pid=4
0       P       U       -       deea
1       P       U       -       dee8
2       P       U       -       dee7
3       P       U       -       dee6
4       P       -       -       dee5
5       P       U       W       dee2
6       P       U       W       dee4
>> Finished call to dumppagetable() for copy-write-child-after-first-write
>> About to call dumppagetable() for copy-write-parent-after
START PAGE TABLE pid=3
0       P       U       -       deea
1       P       U       -       dee8
2       P       U       -       dee7
3       P       U       -       dee6
4       P       -       -       dee5
5       P       U       W       dee3
6       P       U       -       dfbd
>> Finished call to dumppagetable() for copy-write-parent-after
>> About to call dumppagetable() for copy-write-child-after
START PAGE TABLE pid=4
0       P       U       -       deea
1       P       U       -       dee8
2       P       U       -       dee7
3       P       U       -       dee6
4       P       -       -       dee5
5       P       U       W       dee2
6       P       U       W       dee4
>> Finished call to dumppagetable() for copy-write-child-after
Test passed: copy on write test passed --- allocate 4096; fork 1 children; read+write small parts in each child

    
  

This test first includes a dump in the parent before forking, labeled copy-write-parent-before. In this dump, one page of heap is allocated, in this case virtual page 6.

The next two dumps are from the child. The first copy-write-child-before-writes is before the child runs anything that writes to the heap. In this case, the memory of the child is the same as the parent was before forking except:

the stack (virtual page 5 in this case) has been copied to a new physical page, because the child wrote to it as part of calling printf() and the system call wraper for dumppagetable(); and
all pages were marked read-only in the page table to allow for copy-on-write;

After that is a dump of the child’s memory, labeled child-write-child-after-first-write, after the child writes to the heap. In the example above, we see that virtual page 6, which is part of the child’s heap has been copied to a new physical page and made writable. In this case, the physical page chosen (dee4) is the same page that used to hold the parent’s stack. If the parent’s stack was still being stored at that physical page this would be a problem, but we can see from the following dump (labeled copy-write-parent-after) that the parent’s stack was copied to dee3, probably because it changed shortly after the fork call.

The final two dumps show the state of the parent and child’s memory as the test completes. This shows that nothing changed further in the child. In the parent, we see the effects of the parent needing to write to its stack, but its heap memory is unchanged (even though the parent has accessed it by reading it).

Handling/avoiding kernel page faults

For debugging, I strongly recommend having code to print out a message when a page fault occurs within your page fault handling code. Otherwise, it is extremely difficult to figure out what’s going on when you have a memory error in your page fault handling code.

The default handler code has a check (tf->cs & 3) == 0 to detect exceptions that occur in kernel mode, and triggers a panic in this case. You could use this to detect whether a page fault occurs in kernel mode for debugging purposes. Note, however, that this would be normal for a system calls like read() that’s reading into allocate-on-demand mmeory
xv6 supports nested exception handlers, but you must be careful to not exhaust the single kernel stack, which must have enough space to handle both a system call and a page fault that occurs within the system call.

When an exception occurs in kernel mode, the processor recognizes that it’s running in kernel mode already and doesn’t change the stack pointer to the top of the kernel stack, unlike an exception that occurs in user mode. This means that a page fault that occurs in the middle of a system call won’t overwrite information on the stack used by the system call.

(In contrast, when an exception occurs in user mode, the processor switches to the kernel-mode stack specified in the “Task State Segment”. This is set by switchuvm().)
If a page fault triggers a second page fault that triggers a third fault, the processor reboots the machine. This is also known as a triple fault.

Contents