quiz for week 1

Answer the following questions about the lecture material from this week.

Question 1 (6 points)

In lecture, we discussed the idea of a processor being connected to memory and I/O devices via common bus. Suppose we are executing the instruction movq (%rbx), %rax in this model. The processor will send messages over the bus in order to ___. Select all that apply.

Consider the following C code:

int x;
int *p;
x = 1077;
p = &x;

Suppose this is executed on a little-endian system where ints are 4 bytes and:

Identify the value of the byte in memory at each of the following addresses. If not enough information is given, instead write unknown and explain briefly in the comment.

Question 2 (2 points) (see above)

0x50000

Answer:
Question 3 (2 points) (see above)

0x50002

Answer:
Question 4 (2 points) (see above)

0x50004

Answer:
Question 5 (5 points)

Consider the following C snippet:

c = str[x+2];

where c is declared as a char, str is declared as a char * (pointer to char; and based on the context of the snippet a pointer to the beginning of an array of char), and x is declared as a long.

Suppose str is stored in the register %rax, x in %rdi, c in the register %bl, and %r8 through %r15 can be used for temporary values. Which (if any) of the following are correct translations to assembly? Select all that apply.

quiz for week 2

Answer the following questions about the lecture material from this week.

Question 1 (4 points)

Consider the following C snippet:

unsigned long x;
unsigned long *p;
...
*p = x * 2;

If:

then which of the following x86-64 assembly snippets are equivalent to the C statement above? Select all that apply.

Consider the following assembly snippet:

start:
    addq %r8, %r9
    cmpq %r9, %r10
    jle start
end:
Question 2 (4 points) (see above)

Suppose that while this assembly snippet executes, the computations performed by the addq and cmpq instructions do not result in integer overflow (or underflow) if we interpret their operands and results as signed integers. Then, after the snippet finishes executing, what will the values of the condition codes SF and ZF be?

Question 3 (4 points) (see above)

Assuming that the variables r8, r9, and r10 represent the values in the registers %r8, %r9, and %r10 respectively, which is equivalent C code to the above assembly snippet?

Suppose we have C code in two different files. One file, upperstr.c contains the following:

void upperstr(char *str) {
    for (int i = 0; str[i] != '\0'; i += 1) {
        if (str[i] >= 'a' && str[i] <= 'z') {
            str[i] += ('A'-'a');
        }
    }
}

And another file, printbig.c contains the following:

#include <stdio.h> /* for puts() */

void printbig(char *str) {
    upperstr(str);
    puts(str);
}

An executable is made from code in these files and others. As part of that process an object file upperstr.o is produced from upperstr.c and an object file printbig.o is produced from printbig.c.

Recall from lecture that object files generally contain:

Question 4 (4 points) (see above)

Ignoring information only present for debugging, we would expect upperstr.o to contain ____. Select all that apply.

Question 5 (4 points) (see above)

Ignoring information only present for debugging, we would expect printbig.o to contain ____. Select all that apply.

quiz for week 3

Answer the following questions about the lecture material from this week.

Question 1 (4 points)

Consider the expression

(((x & 0xFF0F) >> 2) & 0xFF0) == (((x & 0xFF0) >> 2) & 0xFF0F)

where x is a 32-bit unsigned integer. The set of values x for which the above expression is true is the same as the set of values for which the expression (x & Y) == 0 is true for some constant Y. What is the value of Y? Write your answer as a hexadecimal number.

Answer:

Consider the following incomplete C code (where ___s represent omitted constants):

const unsigned long A = ___;
const unsigned long B = ___;
unsigned long swapAndFlip(unsigned long x) {
    unsigned long low = x & 0xFF;
    unsigned long high = x >> 56;
    return (((low ^ A) << 56) | high | (x & B);
}

Suppose we want to make this function take a 64-bit integer, and return a new 64-bit integer, such that:

Question 2 (4 points) (see above)

What should the value of the constant A be? Write your answer as a hexadecimal integer.

Answer:
Question 3 (4 points) (see above)

What should the value of the constant B be? Write your answer as a hexadecimal integer.

Answer:

For each of the following pairs of instruction set design choices, identify which is more consistent with the reduced instruction set computer (RISC) design philosophy

Question 4 (2 points) (see above)
Question 5 (2 points) (see above)

quiz for week 4

Answer the following questions about the lecture material for this week.

For the following questions, consider this Y86-64 machine code. (The machine code is written as a series of bytes in hexadecimal, written in order from lowest to highest memory address.)

30 f8 08 00 00 00 00 00 00 08 60 9a 50 ba 00 00 00 00 00 00 00 00
Question 1 (2 points) (see above)

What is the mnemonic for the first instruction in the above machine code snippet?

Question 2 (4 points) (see above)

The first instruction in the above assembly snippet has a 64-bit integer constant in it. What is it? Write your answer as a hexadecimal number.

Answer:
Question 3 (4 points) (see above)

What is the mnemonic for the second instruction in the above machine code snippet?

Consider the following HCLRS snippet:

register xY {
    foo : 64 = 0x0;
    bar : 64 = 0x5;
}
x_foo = Y_foo + 1;
x_bar = x_foo + Y_bar;
pc = x_foo;
Question 4 (4 points) (see above)

Since the above snippet sets pc, it will read from the instruction memory, setting i10bytes based on the contents of memory (even though the snippet does not use that value). Before the first rising edge of the clock signal, what bytes of memory will be used to construct the value of i10bytes? Identify the bytes by their memory addresses. Select all that apply

Question 5 (4 points) (see above)

Initially, the value of the bar register is 0x5. After two rising edges of the clock signal, it will be ___. Write your answer as a hexadecimal number.

Answer:

quiz for week 5

Answer the following questions about the lecture material for this week.

Question 1 (6 points)

Suppose we start with the mov-to-register CPU we described in lecture which implements the Y86 instructions irmovq, mrmovq, and rrmovq. This processor design had MUXes (represented in HCLRS by case expressions) that chose between several inputs based primarily on what the current instruction was. When adding a new instruction, for some of these MUXes, we can use an existing input and modify when it is selected to include the new instruction. For others, we may need to add a new input to this MUX for the new instruction (or some other logic that achieves this effect). In a few cases, we may even need to add a new a MUX (or other logic that achieves the same effect).

Suppose we added the addq instruction to the mov-to-register CPU. For which of the following parts of the circuit would we need to add a new MUX or a new MUX input (or equivalent logic) rather than modifying the conditions for selecting existing MUX inputs? Select all that apply.

Question 2 (5 points)

In Y86, the popq instruction is encoded using 2 bytes. Suppose we made a variant of Y86 that instead encoded popq in one byte, where the high-order four bits of the first byte would be 0xB (like popq is in normal Y86) and the low-order four bits of the first byte would be the register index corresponding to the operand to popq.

Supporting this new encoding would likely ____. Select all that apply.

In the single-cycle processor design discussed in lecture, executing the code

pushq %rax
popq %rbx

involves the following operations:

Question 3 (7 points) (see above)

Which of the above operations could occur simulatenously or in any order in the single-cycle processor design? Select all that apply

Question 4 (4 points)

Suppose we added the instruction

immovq CONSTANT, DISPLACEMENT(rA)

to the single-cycle Y86-64 processor design we discussed in lecture. This instruction would store the 64-bit value CONSTANT in memory at an address computed by adding the value of the register rA to the 64-bit constant DISPLACEMENT.

While this instruction is executing, the value input to the data memory (called mem_input in HCLRS) will be equal to ______.

quiz for week 6

Answer the following questions about the lecture material for this week.

Suppose we are building a two-stage pipelined Y86-64 processor from the following components:

(and that other components use negligible time to make the following questions simpler.)

Most simply, each our two stages could perform the work of one or more stages from the five-stage pipelined design we discussed in lecture.

Question 1 (4 points) (see above)

If the first stage of the processor performs the work for Fetch and the second stage performs the work of Decode+Execute+Memory+Writeback, then the cycle time would be around _____ ps plus pipeline and/or program counter register delays.

Answer:
Question 2 (4 points) (see above)

To minimize the cycle time, these two stages should perform the work of _______ (for the first stage) and _______ (for the second stage).

Question 3 (4 points)

Suppose a single-cycle processor design has a cycle time of 3000 ps. If this design is modified into an 8-stage pipelined processor (using the same storage components and adding pipeline registers) and this results in a performance improvement, then the new cycle time would be greater than _____ ps and less than _____ ps. Choose the tightest bounds you can infer from the information in the question. Write your answer as two numbers of picoseconds seperated by a comma. (For example, if you think the answer is "greater than 100 ps and less than 200 ps" write "100,200".)

Answer:
Question 4 (4 points)

Suppose we construct a pipelined processor by modifying the single-cycle processor design we saw in lecture and that our pipeline processor has two stages:

With these stages, the pipeline registers should contain (at least when certain instructions are being run) ____. Select all that apply.

Question 5 (4 points)

Suppose we modified the four-stage pipelined addq processor we discussed in lecture to have three stages:

With these three stages, suppose we run the following assembly and resolve data hazards with stalling:

addq %r8, %r9  
addq %r10, %r9   
addq %r8, %r10      
addq %r8, %r9    

Suppose the first instruction is fetched during cycle 1 (and so performs its decode during cycle 2 and its writeback during cycle 3). During what cycle would the last add instruction run its writeback stage?

Answer:

quiz for week 7

Answer the following questions about the lecture material for this week.

For the following questions, consider the following assembly code:

addq %r8, %rcx                 
mrmovq 8(%rcx), %rax           
subq %rcx, %rax                
Question 1 (4 points) (see above)

Suppose the above assembly code was executed on a six-stage pipelined processor with the following pipeline stages:

  • Fetch
  • Decode
  • Execute
  • Memory 1
  • Memory 2
  • Writeback

The result of reading the data memory is not available until near the end of the Memory 2 stage. The processor uses a combination of stalling and forwarding to resolve data hazards. It supports all forwarding paths that would not result in a large increase in cycle time.

If the addq instruction completes its fetch stage during cycle 1, then the subq instruction will complete its writeback stage during cycle ___.

Answer:
Question 2 (4 points) (see above)

Suppose the above assembly code was executed on seven-stage pipelined processor, where the pipeline stages are:

  • Fetch
  • Decode
  • Execute 1
  • Execute 2
  • Execute 3
  • Memory
  • Writeback

In this processor, the result of an ALU operation (addition or subtraction) is not available until near the end of the Execute 3 stage, but the inputs to the operation must be available near the beginning of the Execute 1 stage. This processor uses a combination of stalling and forwarding to resolve hazards. It supports all forwarding paths that would not result in a large increase in cycle time.

If the addq instruction completes its fetch stage during cycle 1, then the subq instruction will complete its writeback stage during cycle ____.

Answer:
Question 3 (6 points)

Consider the following assembly code:

addq %r8, %r9
mrmovq 16(%r8), %r9
subq %r9, %r10
xorq %r9, %r11
rrmovq %r9, %r12

When the above instruction is executed on a five-stage pipelined processor which uses forwarding to resolve hazards in a way that minimizes stalling like we discussed in lecture, which of the following forwarding operations must occur? Select all that apply

quiz for week 8

Answer the following questions about the lecture material for this week.

Consider a 7-stage pipelined processor with the following stages:

  1. Fetch
  2. Decode 1
  3. Decode 2
  4. Execute 1
  5. Execute 2
  6. Memory
  7. Writeback

Assume these stages do the same thing as in the 5-stage processor we discussed in lecture, except that the decode and execute stages are split into two stages. For the split decode stage, the inputs to the original decode stage (like register indices) need to be available near the beginning of the decode 1 stage and outputs of the original decode stage (like register values) are only available near the end of the decode 2 stage. Similar is true for the execute 1 and execute 2 stages.

Assume the processor implements whatever forwarding is possible without dramatically increasing cycle time.

Question 1 (4 points) (see above)

Suppose the 7-stage processor is used to execute the following:

     addq %rax, %rcx       
     jle foo             
     rmmovq 8(%rax), %rax
     subq %rcx, %rdx
     halt
foo:
     xorq %r8, %r9
     xorq %r10, %r11
     xorq %r12, %r13
     xorq %r8, %r9
     xorq %r10, %r11
     xorq %r12, %r13

The processor implements jle using branch prediction that predicts the branch as always taken. The actual outcome of the branch prediction is computed during the conditional jump's execute 2 stage (which will use the condition codes updated by the previous instruction's execute 2 stage) and can be used to determine what instruction to fetch in the next cycle (but not earlier).

When the assembly above is executed, the jle is not taken, so some xorq instructions are fetched and then squashed. If the addq instruction is fetched during cycle 1, then the rmmovq instruction will be fetched during cycle ___.

Answer:
Question 2 (4 points)

Consider the five-stage pipelined processor design discussed in lecture but with a different branch prediction strategy: instead of predicting conditional jumps as always taken, predict them as taken if they target an address that is less than the jump instruction's and as not taken if they target an address than is greater than the jump instruction's. (This strategy is called "forward not-taken, backwards taken").

In this processor, incorrectly predicted branches require 2 extra cycles, instructions that use a value computed in memory by the previous instruction in the ALU require 1 extra cycle, and returns require 3 extra cycles. (And no other cases require extra cycles due to hazards.)

Suppose the instructions run in a program are:

About how many average cycles per instruction would the program experience on the processor described above?

Answer:

Consider the following C++ snippet:

for (int i = 0; i < 10000; ++i) {
    A[i] += B[i];
    A[i] *= C[i];
}
std::cout << "last A is " << A[9999] << std::endl;
Question 3 (2 points) (see above)

The accesses to elements of A will have _____ temporal locality than the accesses to i.

Question 4 (2 points) (see above)

Assuming about the same number of bytes of machine code are required for each, the accesses to the machine code for the += and *= statements in the loop will have _____ spatial locality than the machine code for the std::cout statement.

Question 5 (2 points) (see above)

Assuming about the same number of bytes of machine code are required for each, the accesses to the machine code for the += and *= statements in the loop will have _____ temporal locality than the machine code for the std::cout statement.

Question 6 (5 points)

Suppose the contents of a two-entry direct-mapped cache is as follows:

index (binary) validtag (binary)data bytes (hexadecimal)
00 10011F 3F 5F 7F
01 11102E 3E 4E 5E
10 100133 44 55 8F
11 10109E AE BF 01

(In the table above, data bytes are listed with the lowest address's value first (left-most).)

The byte of memory at address 0x11 (binary: 10001) is stored in this cache and has the value 0x3F.

Bytes from which of the following memory addresses are also stored in the cache shown above? (Note that addresses may be written without all leading 0s.) Select all that apply.

quiz for week 9

Answer the following questions about the lecture material for this week.

Consider a two-way set associative cache with the following contents:

set index validtagdata bytes validtagdata bytes
0 10xA201F 3F 2A 93 27 00 45 8A 10xC2033 44 55 92 91 90 0A 0C
1 0 10xC200F 1C 11 12 13 14 15 16
2 10xC2020 21 2F 2A 40 41 55 98 10x33295 33 12 59 43 3A 3C 3D
3 0 0

In the representation above, the data bytes are listed in hexadecimal, starting with the byte with the lowest memory address.

Question 1 (4 points) (see above)

The data byte 1C in the second way of the set with index 1 was retrieved from memory at address _____. Write your answer as a hexadecimal number.

Answer:
Question 2 (4 points) (see above)

Give an example of an address that, if accessed, would result in a value being replaced in the cache above. Write your answer as a hexadecimal number.

Answer:
Question 3 (4 points)

Suppose a program accesses a cache in the following pattern:

Assume the above accesses use a cache that is not also used for other accesses and that the cache is initially empty.

If the cache used for the accesses above is a 4-byte, direct-mapped cache with a write-through, write-no-allocate policy and 2 byte blocks, how many accesses (reads or writes) to the next level of cache (or main memory if there is not next level) will be performed when the above cache accesses happen?

Answer:

Consider a 4MB (4194304 byte) cache, 4-way cache with 64-byte blocks, a first-in, first-out replacement policy, and a write-back, write-allocate policy.

Suppose the cache is initially empty (all valid bits and dirty bits set to 0), then the following accesses happen:

Question 4 (4 points) (see above)

Assuming 64-bit memory addresses, how many bits does the cache have (in total, across the whole cache) available to store tags?

Answer:
Question 5 (3 points) (see above)

After the accesses above, how many dirty bits will be set to 1?

Answer:
Question 6 (3 points) (see above)

After the accesses above, how many valid bits will be set to 1?

Answer:

quiz for week 10

Answer the following questions about the lecture material for this week.

Question 1 (4 points)

Consider a system with the following caches and performance results on a benchmark:

and main memory has a 200 cycle access time.

Assume that accesses to the next level of cache are not started until the access time on the current level completes.

What is the average memory access time (in cycles) on this system?

Answer:
Question 2 (4 points)

Many caches store tags, valid bits, and other metadata separately from the data. One result of this design is that misses can be detected more quickly, after completing an access to likely faster "tag store".

Suppose a cache using this design:

Without the optimization to detect hits or misses early, this cache would have an average memory access time of 5 + 10% * 100 = 15 cycles. What would the average memory access time with this optimization be?

Answer:
Question 3 (4 points)

Suppose we have a 2-way, 2-set data cache with 8 byte blocks and an LRU replacement policy and run the following C code:

int array[12];
... /* code omitted */
int sum = 0;
for (int j = 0; j < 2; j += 1) {
    for (int i = 0; i < 4; i += 1) {
        if (sum > array[i * 3 + j]) {
            sum += array[i * 3 + j];
        }
    }
}

Assume that:

How many data cache misses will occur when the loops above run?

[Note that unlike some of the examples in lecture, the number of misses for each of the two sets of the cache may be different.]

Answer:

Consider the following C snippets:

/* version A */
for (int i = 0; i < N; i += 1) {
    for (int j = 0; j < i; j += 1) {
        A[i * N + j] = D[j * N + i] + B[i] * C[j];
    }
}

/* version B */
for (int j = 0; j < N; j += 1) {
    for (int i = 0; i < j; i += 1) {
        A[i * N + j] += D[j * N + i] + B[i] * C[j];
    }
}
Question 4 (4 points) (see above)

Which version has better spatial locality for accesses to A?

Question 5 (4 points) (see above)

If the cache can store four array elements in each cache block (for each of the arrays A, B, C, or D) and the cache is too small to store N elements of any of the arrays, approximately how many cache misses would we expect to occur in version A for accesses to the array B only? (N^2 below represents N squared.)

Question 6 (4 points) (see above)

If the cache can store four array elements in each cache block (for each of the arrays A, B, C, or D) and the cache is too small to store N elements of any of the arrays, approximately how many cache misses would we expect to occur in version A for accesses to the array D only? (N^2 below represents N squared.)

quiz for week 11

Answer the following questions about the lecture material for this week.

Question 1 (4 points)

Consider the following C code:

for (int i = 0; i < N; i += 1) {
    for (int j = 0; j < i; j += 1) {
        A[i] += B[i * N + j] * C[j * N + i];
    }
}

Consider the following attempts at optimizing the above C code:

/* version A */
for (int ii = 0; ii < N; ii += 2) {
    for (int i = 0; i < ii + 2 && i < N; i += 1) {
        for (int jj = 0; jj < N; jj += 2) {
            for (int j = 0; j < jj + 2 && j < i; j += 1) {
                A[i] += B[i * N + j] * C[j * N + i];
            }
        }
    }
}

/* version B */
for (int ii = 0; ii < N; ii += 2) {
    for (int jj = 0; jj < ii + 2; jj += 2) {
        for (int i = 0; i < ii + 2 && i < N; i += 1) {
            for (int j = 0; j < jj + 2 && j < i; j += 1) {
                A[i] += B[i * N + j] * C[j * N + i];
            }
        }
    }
}

/* version C */
for (int i = 0; i < N; i += 1) {
    int j = 0;
    for (; j + 1 < i; j += 2) {
        A[i] += B[i * N + j] * C[j * N + i];
        A[i] += B[i * N + j+1] * C[(j+1) * N + i];
    }
    if (j < i)
        A[i] += B[i * N + j] * C[j * N + i];
}

/* version D */
for (int j = 0; j < N - 1; j += 1) {
    int i;
    for (i = 0; i + 1 <= j ; i += 2) {
        A[i] += B[i * N + j] * C[j * N + i];
        A[i] += B[(i+1) * N + j] * C[j * N + i+1];
    }
    if (i <= j)
        A[i] += B[i * N + j] * C[j * N + i];
}

Assuming N is very large (relative to the cache size), which of the above attempts at such optimizations are most likely to be successful at reducing the number of data cache misses?

Question 2 (4 points)

Consider the folowing assembly code:

xorq %rbx, %r8
andq %rbx, %r9
subq %r8, %rbx
addq %rbx, %r9

after register renaming, the physical register that andq reads to obtain the value of %rbx will be the same as ____. Select all that apply.

Consider the following assembly code:

    xorq    %rax, %rax
outer_loop:
    movq    %rax, %r8           /* A */
    movq    %rax, %rcx          /* B */
    imul    $8192, %r8          /* C */
    addq    %rdi, %r8           /* D */
inner_loop:
    movq    (%rdx,%rcx,8), %r9  /* E */
    movq    (%rsi,%rax,8), %r10 /* F */
    addq    %r10, %r9           /* G */
    movq    %r9, (%r8,%rcx,8)   /* H */
    incq    %rcx                /* I */
    cmpq    $1024, %rcx         /* J */
    jne     inner_loop          /* K */
    incq    %rax                /* L */
    cmpq    $1024, %rax         /* M */
    jne     outer_loop

In this code there are two nested loops. The outer one which uses %rax as its index variable, and the inner one uses %rcx as its index varaiable.

Question 3 (4 points) (see above)

In an out-of-order processor with branch prediction and several execution units (such as ALUs and data cache read/write ports) of each type, multiple instances of addq instruction labelled G (from different iterations of the inner loop) can be performed _____.

(Assume this out of order processor can perform loads and stores out-of-order.)

Question 4 (4 points) (see above)

In order to unroll the inner loop in the code above, one would most likely duplicate some of the instructions above (possibly with changes to where they expect their operands, etc.) and/or convert some of them to a form that performs the same work that multiple copies of the instruction would perform (such as converting an incq %rax into an addq $2, %rax).

In order to unroll the inner loop, which instructions would be so duplicated or transformed? (Instructions are identified by the letters in comments above.)

Question 5 (3 points) (see above)

After loop unrolling is performed, the addq instruction labeled G would ______ from a multiple accumulators transformation.

quiz for week 12

Answer the following questions about the lecture material for this week.

Question 1 (4 points)

Consider the following assembly function 'satSum':

satSum:
    movq %rdi, %rax
    addq %rsi, %rax
    cmpq %rsi, %rax
    jb retMax
    cmpq %rdi, %rax
    jb retMax
    ret
retMax:
    movq $-1, %rax
    ret

Inlining this function would most likely allow an optimizing compiler to avoid including an instruction very similar to ___. Select all that apply.

For the following questions, consider the following C snippet:

for (int i = 0; i < N; i += 1) {
    for (int j = 0; j < N; j += 1) {
        A[i] += (A[i] - B[j]) * C[j];
    }
}

where A, B, and C are declared as int*s and N is declared as an int.

Question 2 (4 points) (see above)

On the snippet above, which of the following optimizations cannot be performed without adding extra checks due to aliasing? Select all that apply.

Question 3 (4 points) (see above)

On the snippet above, which of the following optimizations would be likely to result in substantially worse performance if N was very small (but still varied between executions of the code above)? Select all that apply.

Question 4 (4 points) (see above)

On the snippet above, which of the following optimizations would be likely to significantly increase the amount of machine code used to store the above loops? Select all that apply.

quiz for week 14

Answer the following questions about the lecture material for the two lectures before Thanksgiving break.

Question 1 (4 points)

The vector instructions we discussed in lectures supported loading a vector of values from (or storing a vector of values to) a single location in memory. In addition to these types of vector load and store instructions, some vector instruction sets include support for an instruction that takes a vector of memory locations and fills a vector by loading an element from each of those memory locations (or stores a vector by storing an element to each of those locations). These more featureful vector load and store instructions are typically substantially slower than those that load from (or store to) a single location.

When vectorizing which of the following snippets would these special vector load and store instructions be useful? Select all that apply.

Question 2 (6 points)

Consider a single-core system with an operating system that uses time multiplexing (with context switching) to share the processor between multiple processes.

Initially, process A is running the following assembly snippet:

movq $100, %rax
movq $200, %rcx
addq %rcx, %rax

After the second movq and before the addq instruction can complete, the operating system switches to process B. After process B starts running, ____. Select all that apply.

Suppose a process attempts to execute the following, whose machine code is located in memory at address 0x200000:

movq $0x100000, %r8
movq (%r8), %r9
addq %r9, %r10

Suppose the memory address 0x100000 is not accessible, so accessing it causes a crash in the second movq instruction.

[editing note: originally the above sentence had some minor typos: it erroneously duplicated "causes a crash" and had the wrong number of 0s in 0x100000]

Question 3 (4 points) (see above)

An exception will occur as part of the execution of the above. When it does part or parts of the exception table will be used. These part(s) will contain ____. Select all that apply.

Suppose an operating system with a single core is running two processes.

Initially process A is active, and it attempts to reads from a file. The data from the file is not yet available because it has not been read from the disk. While it is being read from the disk, process B runs, performing a part of a long computation. Then process A finishes reading from the file.

Question 4 (4 points) (see above)

During this procedure, there is a context switch from process A to process B. This context switch most likely occurs ____.

Question 5 (4 points) (see above)

Suppose process B stops running and process A is restarted almost immediately after the data is retrieved from the file. For this to happen, most likely ___.

quiz for week 15

Answer the following questions about the lecture material for this week.

Consider a system with 2048-byte (2 to 11th power bytes) pages, and the following page table:

virtual page #validphysical page #
0x00
0x110x34
0x210x35
0x310x36
0x410x12
0x510x10
0x610x06
0x710x04

(Since this system has 2048 byte pages, page offsets are 11 bits.)

Question 1 (4 points) (see above)

If a program running on this system accesses address 0x1432, then what physical address will be accessed? Write your answer as a hexadecimal number. If an exception will occur instead, write "fault".

Answer:

Suppose a system has the following virtual memory configuration:

Question 2 (4 points) (see above)

When a program accesses a virtual address, the processor reads a page table entry from physical address 0x402000 and then reads a value from physical address 0x201234. What virtual address did the program access? Write your answer as a hexadecimal number. If not enough information is provided, write "unknown" and explain in the comment.

Answer:
Question 3 (3 points) (see above)

Suppose a program accesses virtual address 0x1007. The processor reads a page table entry which, when interpreted as 4-byte integer, is 0x00099003 as part of trying to resolve this access. Which of the following is true about this virtual memory access? Select all that apply.

Suppose a system has the following virtual memory configuration:

Question 4 (4 points) (see above)

While executing a move instruction that copies from address 0x123456789A into a register, the processor looks up a first-level page table entry at address 0x400240. That page table entry has its valid bit and user-mode-accessible bit set, and a physical page number of 0x456. From what physical address will the processor try to read the second-level page table entry for this address from?

Remember to account for page table entries being larger than 1 byte.

Write your answer as a hexadecimal number. If not enough information is provided, write "unknown" and explain in the comments. If an exception will occur instead, write "unknown" and explain in the comments

Answer: