S2024 quizzes

quiz for week 2

Answer the following questions about the lecture material for this week.

Question 1 (4 points)

Suppose we create a library with a function foo:

int foo(int x);

This function is declared in a header file bar.h and the library implementation is linked into a file libbar.so.

If we modify the function foo to have the prototype

int foo(double x);

and update bar.h and libbar.so accordingly, we find that executables that use the library and used to work now crash (even though there are no relevant bugs in the library itself).

Which of the following would either fix this issue or have avoided this issue in the first place? Select all that apply.

Question 2 (4 points)

Consider the following erroneous Makefile.

all: executable

foo.c: foo.h
    gcc -Wall -c foo.c

main.c: foo.h
    gcc -Wall -c main.c

executable: foo.c main.c
    gcc -Wall -o executable foo.c main.c

clean:
    rm --force main.o foo.o executable

(Assume the commands in each rule are prefixed by tab characters.)

Which of the following is true about this erroneous Makefile? Select all that apply.

Suppose a program displays its logo when it starts. Rather than store that logo in a separate data file, the program includes it in its executable. To do this, multiple steps are used:

  1. First a program converts the logo file from SVG format in the file logo.svg into BMP format in the file logo.bmp:

    svg2png logo.svg logo.bmp
    
  2. Then, the BMP file logo.bmp is converted to a file logo.h file that defines an array of unsigned chars logo using the xxd utility:

    xxd -i logo.bmp logo.h
    
  3. The file logo.h is included by main.c, along with the utility header file graphics.h and the C standard library headers <stdlib.h> and <stdio.h>.

Consider the following incomplete Makefile that implements this scheme, including building the final executalbe program. In this incomplete Makefile, lines or partial lines of code that are not yet complete are replaced by ### BLANK X ### where X is some number:

all: program

### BLANK 1 ###
    svg2png logo.svg logo.bmp

### BLANK 2 ###
    xxd -i logo.bmp logo.h

main.o: ### BLANK 3 ###

graphics.o: graphics.h

%.o: %.c
    clang -c $^

program: main.o graphics.o
    clang -o program main.o graphics.o

(Assume the commands in each rule are prefixed by tab characters.)

Question 3 (4 points) (see above)

What would be appropriate to put in place of ### BLANK 2 ###?

Answer:
Question 4 (4 points) (see above)

What would be appropriate to put in place of ### BLANK 3 ###?

Answer:
Question 5 (4 points)

Suppose the file "first.txt" has the following access control list:

user:A:rw-
user:B:r--
group:C:rw-
group:D:r--
other::---

and the file "second.txt" has the following access control list:

user:B:rw-
group:C:rw-
other::---

(The other::... lines provide default permissions for cases where other entries of the access control list do not apply.)

It's possible for a running program to be able to read and write "second.txt" but not to be able to both read and write "first.txt". Which of the following is a sufficient condition for this to be true? Select all that apply.

Question 6 (4 points)

Suppose we wanted to allow users to collaborate on a shared system like the department machines (portal, etc.)

In particular, one student A wants to collect comments from their collaborator students B and C and then reply to those comments. Students B and C should not be able to see each other's comments or the replies intended for the other student. But no one else should be able to read those comments.

Suppose to implement this scheme, we have separate files for B's comments and the corresponding replies and for C's comments and the corresponding replies.

Suppose we want to set the permissions on these files to restrict access appropriately, but instead of being able to use full access control lists (where we can specify read/write/execute access individually for arbitrary numbers of users and groups), we have to use more limited chmod-style permissions (where we can only specify read/write/execute access for a single user and for a single group).

We would need to create some groups of users to enable this. (When we say a "group contains users", we mean that everytime one of those users runs a program it runs as part of that group.) What sets of new groups could enable this? Select all that apply.

click here to preview key
Return to course pageReturn to index 

quiz for week 3

Answer the following questions about the lecture material for this week.

Suppose a single-core Unix-like system is running three processes, one for a text editor, one for a compiler, and one for simulation. The following sequence of events happens:

  1. initially, the compiler is running on the processor to generate assembly in memory;
  2. then, the compiler starts waiting for more source code to be read from disk;
  3. while the compiler is waiting, the simulation runs and does computations in memory, including using many functions in <math.h>
  4. then, a keypress occurs and as a result, the text editor immediately runs to process the keypress and update the screen, and then ask for more input
  5. then, after the keypress is handled, the simulation resumes running, continuing to do computations in memory
  6. then, the compiler's read from disk finishes, and the compiler resumes generating assembly
  7. the compiler opens and write a file back to disk to store the assembly it computed and starts waiting for this to complete
  8. while the compiler is waiting for the data to finish being written by the disk, the simulation resumes running, performing computations in memory
Question 1 (4 points) (see above)

Sometimes a system call's completion will be triggered by a non-system call exception (although system calls are always started with the system call itself -- a system call exception).

During what steps in the above scenario is this likely to happen?

Write the number of steps above; for example, if it would happen in steps 4 and 7, write "47"

Answer:
Question 2 (4 points) (see above)

Some of the steps above are likely not to involve any system calls exceptions occuring. (That is steps in which no system call exception handler starts. The steps may or may not involve a system call handler that was started in a previous step resuming or continuing running.) Which ones?

Write the number of steps above; for example, if the answer is steps 4 and 7, write "47"

Answer:
Question 3 (4 points) (see above)

Some of the steps above may require multiple exceptions to occur. Identify them.

Write the number of steps above; for example, if it would happen in steps 4 and 7, write "47"

Answer:

On Linux, the sched_yield system call requests that the OS to context switch to another thread/process (if there is one to switch to). (sched_yield does not do anything else.)

For the following questions, suppose we are running a single-core Linux system with only two processes and that sched_yield calls and processes exiting are the only reason context switches occur, and that each call to sched_yield causes a context switch if another process can be run.

Suppose the first process is running the program:

int main() {
    int x = 1;
    printf("%d", x); fflush(stdout);
    x = x + 1;
    sched_yield();
    printf("%d", x); fflush(stdout);
    x = x + 1;
    sched_yield();
    printf("%d", x); fflush(stdout);
    exit(0);
}

and the second process is running the program:

int main() {
    printf("A"); fflush(stdout);
    sched_yield();
    printf("B"); fflush(stdout);
    sched_yield();
    printf("C"); fflush(stdout);
    exit(0);
}

(Assume all appropriate header files are #included.)

Suppose we start both of these programs at the same time, but arrange for the first process to initially have access to the processor.

Question 4 (4 points) (see above)

If both programs output to the same terminal, what is a possible output one could see on that terminal?

Answer:
Question 5 (4 points) (see above)

Running printf and/or fflush may require performing system calls. If so, ___. Select all that apply.

Question 6 (4 points) (see above)

Suppose the assembly for first process's code stores the value x in the callee-saved register %r15 and does not use the stack to store it.

Then, while the second process is outputting B, the value of x is most likely ____.

click here to preview key
Return to course pageReturn to index 

quiz for week 4

Answer the following questions about the lecture materials for this week.

For the following questions, suppose we run two processes. The first process runs with PID 48 and has a signal handler setup for SIGUSR1 with the following code:

int global = 0;
void handle_usr1_program1(int signum) {
    global += 1;
    int temp = global * 2 + global;
    printf("%d", temp); fflush(stdout);
    kill(54, SIGUSR1);
    printf("A"); fflush(stdout);
}

and the second process runs with PID 54 and has a signal handler setup for SIGUSR1 with the following code:

int global = 0;
void handle_usr1_program2(int signum) {
    global += 1;
    printf("%d", global); fflush(stdout);
}

Assume the code outside signal handlers in these processes does not do anything relevant to these questions (other than not exiting or exiting as described below) and that the signal handlers are setup using sigaction with an sa_flags value of SA_RESTART.

Question 1 (4 points) (see above)

If both processes are running and outputting to the same terminal, no code that would change global has run in any process, and we send SIGUSR1 to the first process, what is a possible output?

Answer:
Question 2 (4 points) (see above)

handle_usr1_program2 can run as a result of handle_usr1_program1's call to kill. When handle_usr1_program2 runs this way, ____. Select all that apply.

Question 3 (4 points) (see above)

Suppose initially both processes are running, but then the second process terminates and a message about it exiting is printed by the shell that was running it. If we wait a while after this happens, and then send SIGUSR1 to the first process, which of the following may happen depending on the details of other running programs? Select all that apply.

Consider the following program that uses the POSIX API:

pid_t orig_pid, other_pid;
void handle_signal(int signum) {
    if (signum == SIGUSR1) {
        printf("%d:SIGUSR1 ", (int) other_pid); fflush(stdout);
        if (other_pid != 0)
            kill(other_pid, SIGUSR2);
        else
            kill(orig_pid, SIGUSR2);
    } else {
        printf("%d:SIGUSR2 ", (int) other_pid); fflush(stdout);
        exit(0);
    }
}

int main() {
    struct sigaction sa;
    sa.sa_handler = &handle_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGUSR1, &sa, NULL);
    sigaction(SIGUSR2, &sa, NULL);
    orig_pid = getpid();
    other_pid = fork();
    if (other_pid == 0) {
        /* child process */
        kill(orig_pid, SIGUSR1);
        while (1) pause();
    } else {
        /* parent process */
        kill(other_pid, SIGUSR1);
        while (1) pause();
    }
}

Assume all relevant header files are #included, and that the call to fork does not fail.

For the questions below, assume that when this program runs that a signal handler is never started while a signal handler is already running in the same process.

The pause() function called by the code above waits until the current process receives a signal.

Question 4 (5 points) (see above)

Under the assumption above, which are possible outputs of this program? Select all that apply.

Consider the following program that uses the POSIX API:

int global = 0;
int main() {
    pid_t pid1, pid2;
    int status;
    pid1 = fork();
    if (pid1 == 0) {
        pid2 = fork();
        if (pid2 == 0) {
            global += 1;
        } else {
            global += 2;
            waitpid(pid2, &status, NULL);
        }
    } else {
        waitpid(pid1, &status, NULL); 
        pid2 = fork();
        if (pid2 == 0) {
            global += 4;
        } else {
            global += 8;
            waitpid(pid2, &status, NULL);
        }
    }
    printf("%d\n", global);
    return 0;
}

Assume all relevant header files are included and fork and waitpid do not fail.

Question 5 (4 points) (see above)

When we run this program, its output will include "4". This will be outputted by ____.

(For the purpose of this question, a child process of some process X is a process that was created directly by process X calling fork().)

Question 6 (4 points) (see above)

If we run this program once, what is the maximum number of simulatenously running processes that could result?

Answer:
click here to preview key
Return to course pageReturn to index 

quiz for week 5

Answer the following quesitons about the lecture material for this week.

Suppose an analyst wants to run an analysis on a large list of data files. They have a program they wrote called ./analyze, which reads a file from stdin and appends to a summary file.

Doing this by hand, they'd run commands like:

./analyze < first-file
./analyze < second-file
./analyze < third-file
...

They find that typing all these commands is tedious and fails to take advantage of their system having four cores, which could speed up the analysis by analyzing multiple files at the same time. So, they write a function to analyze a large list of files:

void analyze_all(int num_files, const char **filenames) {
    pid_t *pids = malloc(num_files * sizeof(pid_t));  // was erroneously not pointer on quiz as released
    for (int i = 0; i < num_files; i += 1) {
        pids[i] = fork();
        if (pids[i] < 0) {
            handle_error();
        } else if (pids[i] == 0) {
            int fd = open(filenames[i], O_RDONLY);
            if (fd < 0)
                handle_error();
            if (dup2(fd, STDIN_FILENO) < 0)
                handle_error();
            close(fd);
            const char *argv[] = {"./analyze", NULL};
            if (execv("./analyze", argv) < 0)
                handle_error();
        }
        /* LOCATION 1 */
    }
    /* LOCATION 2 */
}

This seems to function, but they discover that there are two problems:

The following questions ask about modifying the code to fix these issues. Throughout the questions, assume that the handle_error() function is never reached.

Question 1 (4 points) (see above)

Suppose to solve the problem of the function not waiting for the analyze programs to finish, they add the code waitpid(pids[i], NULL, 0) at the line labeled "LOCATION 1". This will solve the problem of not waiting for the analyze programs to finish, but create other problems. Which ones? Select all that apply.

Question 2 (4 points) (see above)

As a revised solution, they place code at "LOCATION 1" as follows:

if (i >= 4) {
    waitpid(pids[i-3], NULL, 0);
}

To complete this solution, what code would be best to place at LOCATION 2?

Question 3 (4 points) (see above)

After using the solution of the previous question, the analyst finds that sometimes only one instance of analyze is running, failing to take advantage of all four cores available on their system. This seems to happen even when the list of files is very long and the loop in analyze_all is near the beginning of the list of files. What is most likely happening?

Consider a system with a virtual memory system with:

and the following page table:

virtual page number valid physical page number read allowed? write allocated? execute allocated? user mode access allowed?
00
110x31011
210x41011
310x11101
410x71101
50
610x01101
710x21110
Question 4 (4 points) (see above)

What is the maximum value a physical page number could have on this system?

Answer:
Question 5 (4 points) (see above)

Consider executing the C snippet

*p += 1;

on this processor in a program running in user mode where p is a pointer to unsigned char. Assume the compiler does not optimize away the memory accesses. This will succeed without exceptions occuring if p points to a value in virtual page ____. Select all that apply.

Question 6 (4 points) (see above)

The virtual address 0x1123 on this system corresponds to what physical address? Write your answer as a hexadecimal number. If there is no corresponding physical address so accessing virtual address 0x1123 would cause an exception (such as a page fault) write "fault".

Answer:
click here to preview key
Return to course pageReturn to index 

quiz for week 6

Answer the following questions about the lecture material for this week.

Suppose a system with 256 byte pages is running a process with the following memory layout (all address virtual, all ranges inclusive):

Question 1 (4 points) (see above)

Suppose the system uses allocate-on-demand to allocate memory for its heap and stack. In this scheme, the page table initially has no valid entries for the program's heap or stack, and it is updated as the program accesses memory (in response to the exceptions triggered by the memory accesses to invalid pages). Assume that when updating the page table this way, the operating system only allocates as much memory as needed to make the access that triggered the exception work.

If the process runs code that:

  • writes to a global array at address 0x310 through 0x33f (inclusive)
  • writes to an array on the heap address 0x5f0 through 0x61f (inclusive)
  • reads the array on the heap at addresses 0x5f0 through 0x61f, computes the sum, and writes to a global variable at address 0x380

then, after that code completes, which of the following will be true? Select all that apply.

Question 2 (5 points) (see above)

Suppose:

  • initially all accessible virtual pages have a corresponding physical page (so they can be accessed without an exception occurring), and
  • the system uses copy on write and does the minimum copying possible

and the process fork()s and then:

  • the child process runs code at addresses 0x1f0 through 0x210 that:
    • copies a global variable at address 0x310 to the stack at address 0xe40
    • copies a global variable at address 0x340 to the stack at address 0xe41
    • copies a global variable at address 0x410 to the stack at address 0xe42
  • the parent process runs code at addresses 0x220 through 0x240 that:
    • modifies a value on the stack at address 0xe08

and no other relevant memory accesses occur.

After these events occur, which of the following is true? Select all that apply.

Question 3 (4 points) (see above)

In lecture, we mentioned the Unix mmap() call, which allows a program to access the contents of a file as if it is part of its own memory.

Suppose a program opens and mmap's a file as follows:

int fd = open("file.dat", O_RDWR);
char *p;
p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

This makes it so p can be used like a 4096 element array of chars that is stored in the file.

If we run two instances of the program at the same time and from the same directory, and open and mmap do not fail, then _____. Select all that apply.

Consider a system which uses single-level page tables with:

Question 4 (4 points) (see above)

While executing the instruction

movq 0x123450, %rax

which reads an 8-byte integer from 0x123450 and stores it in the register %rax, the processor accesses the page table to perform the read.

If the page table base register contains 0xA0000, then what physical address will the processor read a page table entry from? Write your answer as a hexadecimal number. If not enough information is provided, write "unknown" an explain briefly in the comments.

Answer:
Question 5 (4 points) (see above)

Consider the access described in the previous question.

Suppose the mov instruction runs in user mode, and after reading the page table entry from the address that is the answer to the previous question, the processor then reads an integer from physical address 0x903450 and copies the integer into %rax.

In order for this to happen, what is a possible value of that page table entry?

Write your answer as a hexadecimal number (interpreting the page table entry as a 64-bit integer).

Answer:

Consider a system which uses three-level page tables with:

Question 6 (4 points) (see above)

Suppose when the processor is translating the virtual address 0x00123456789ABC, the following sequence of events happens:

  • based on a page table base register of value of 0x10000000, the processor accesses a first-level page table entry at address 0x10000020
  • based on the value retrieved from the first-level page table entry, the processor discovers that the relevant second-level page table is in physical page 0x54, which starts at byte address 0x540000.
  • after looking up the appropriate entry in the second-level page table, the processor discovers that the relevant third-level page table is in physical page 0x60, which starts at byte address 0x600000.
  • in the third-level page table, the processor reads a page table entry from address 0x60b3c0 and discovers that the final translation of 0x00123456789ABC should be 0x9009ABC

From what physical address did the processor access a second-level page table entry in this lookup? Write your answer as a hexadecimal number. If not enough information is provided, write "unknown" and explain what additional information is needed in the comments.

Answer:
click here to preview key
Return to course pageReturn to index 

quiz for week 7

Answer the following questions about the lecture material for this week.

Consider the following C snippet that computes two values sum and squared_sum from an array of doubles A with a constant number N elements.

double sum = 0.0;
for (int i = 0; i < N; ++i) {
    sum += A[i];
}
double squared_sum = 0.0;
for (int i = 0; i < N; ++i) {
    squared_sum += A[i] * A[i];
}
Question 1 (4 points) (see above)

Which of the following changes would improve the temporal and/or spatial locality of the snippet's accesses to A? Select all that apply.

For the following questions, consider a direct-mapped data cache with 16 byte cache blocks.

Question 2 (4 points) (see above)

If the addresses 0x401 and 0x542 map to two different cache sets, then what is the minimum number of sets the cache could have?

Assume that the cache must have a power of two number of sets.

Answer:
Question 3 (4 points) (see above)

Suppose the cache has 4 sets (and therefore 2 set index bits) and is initially empty; and a processor accesses the cache as follows:

  • reads 2 bytes from an address with tag 0x43, set index 2, offset 4
  • reads 2 bytes from an address with tag 0x43, set index 2, offset 6
  • reads 2 bytes from an address with tag 0x43, set index 3, offset 0

When the cache performs these accesses, how many bytes will it read from memory (or the next level of cache)?

Answer:
Question 4 (4 points) (see above)

Suppose the cache has 256 sets (and therefore 8 set index bits) and addresses passed to the cache are 32 bits in size. Including space for valid bits, tags, and the actual data stored in the cache, how many bits of storage does the cache contain?

(Remember to account for bytes taking up 8 bits.)

Answer:

Consider a two-way set associative cache whose contents are as follows. Data is written as as sequence of bytes, starting with the lowest address (lowest offset), with each byte written in hexadecimal. The "LRU way" bit indicates which way is least recently used in each set.

way 0way 1
index (written in binary) validtag (written in binary)data validtag (written in binary)data LRU way
000 10000012 13 14 15 100100A0 B1 C2 D2 0
001 0 0 1
010 100100E1 F2 A3 B4 100101D4 D5 D6 D7 1
011 10100000 11 22 33 10101001 44 66 88 0
100 0 0 0
101 11111109 19 29 39 0 1
110 0 0 0
111 0 0 0
Question 5 (4 points) (see above)

The value 0xB1 has index 0, tag 4, and offset 1 (index, tag, and offset all written in decimal). What is its address? Write your answer as a binary number.

Answer:
Question 6 (4 points) (see above)

When reading a byte from the address with tag 00000 (binary), index 011 (binary), and offset 3 (decimal), the cache will ___. Select all that apply.

click here to preview key
Return to course pageReturn to index 

quiz for week 9

Answer the following questions about the lecture material for this week.

Consider a direct-mapped data cache with

Suppose the cache is initially empty and a program using this cache does the following sequence of accesses, in this order:

In the following questions ask about what happens during the accesses above. Do not include anything that will happen while processing later accesses, even if they are affected by the accesses above.

Question 1 (3 points) (see above)

How many bytes will be written to the next level of cache or main memory during the accesses above?

Answer:
Question 2 (3 points) (see above)

How many bytes will be read from the next level of cache or main memory during the accesses above?

Answer:

Consider a direct-mapped data cache with

on a system where virtual memory is not in use.

Question 3 (4 points) (see above)

Suppose a program uses this cache to access an array of 8192 four-byte integers which is located at address 0x1000000.

If a program reads from array[100], then a cache block containing array[100] will be stored in the cache. If the program then accesses array[X] for some X, then the cache block containing array[100] will be evicted from the cache. Give a possible value of X that is within the bounds of the array.

Answer:

Suppose a system has:

Question 4 (3 points) (see above)

When translating the virtual address 0xABCDEF, the system will check the TLB set with index _____.

Answer:
Question 5 (4 points) (see above)

Suppose a program running on this system has an 8192 byte array of bytes. Depending on how the program's page tables are setup, sometimes index 4096 of the array and index 0 of the array map to the same data cache set, so they cannot be stored simulatenously in the cache, even though they are both present in memory. Which of the following must be true for this situation to happen? Select all that apply.

Consider the following code that uses multiple threads using the POSIX ("pthreads") API.

int x, y;
void *thread_function(void *argument) {
    int *p = (int*) argument;
    *p = *p + x;
    return &y;
}

int main() {
    pthread_t thread;
    int z;
    x = 10; y = 20;
    if (0 != pthread_create(&thread, NULL, thread_function, &z))
        handle_error();
    printf("here\n");
    void *result;
    if (0 != pthread_join(thread, &result))
        handle_error();
    printf("%d %d %d %p\n", x, y, z, result);
    return 0;
}
Question 6 (3 points) (see above)

The *p = *p + x line modifies ____.

Question 7 (4 points) (see above)

The message here may be output ___. Select all that apply.

click here to preview key
Return to course pageReturn to index 

quiz for week 10

Answer the following questions about the lecture material for this week.

Consider the following C code that uses the POSIX API and implements a binary search tree search function that can be called from multiple threads at the same time:

struct tree_node {
    int value;
    struct tree_node *left;
    struct tree_node *right;
    pthread_mutex_t lock;
};

bool Find(struct tree_node *node, int value) {
    bool result = false;
    if (node != NULL) {
        struct tree_node *next = NULL;
        pthread_mutex_lock(&node->lock);
        if (value == node->value) {
            result = true;
        } else {
            if (value < node->value) {
                next = node->left;
            } else {
                next = node->right;
            }
        }
        pthread_mutex_unlock(&node->lock);
        if (next != NULL) {
            result = Find(next, value);
        }
    }
    return result;
}

This Find function would be called by passing the root node of the tree as node and some integer value to search for.

Question 1 (4 points) (see above)

Suppose we run multiple instances of the above Find function from multiple threads at the same time. Depending on the tree and the arugments to Find, the threads will be able to make more or less effective use of mutliple cores.

Which of the following changes would increase how effectively the threads use multiple cores? Select all that apply.

Question 2 (4 points) (see above)

The locking in the Find routine should enable writing routines that manipulate the tree in parallel with calls to Find. Consider the following code that attempts to manipulate the tree:

pthread_mutex_lock(&root->lock);
struct tree_node *left;
left = root->left;
pthread_mutex_unlock(&root->lock);
pthread_mutex_lock(&left->lock);
struct tree_node *left_left;
left_left = left->left;
left->value = left_left->value;     /* Line A */
left->left = left_left->left;       /* Line B */
left->right = left_left->right;
free(left_left);
pthread_mutex_unlock(&left->lock);

Despite the locking, some race conditions can occur when this code is run at the same time as calls to Find. Which of the following are possible? Select all that apply.

Consider the following (incomplete) C code that uses barriers:

pthread_barrier_t barrier;
int global = 0;
int N, M;
void *f1(void *ignored) {
    for (int i = 0; i < N; i += 1) {
        pthread_barrier_wait(&barrier);
        printf("f1: %d, global: %d\n", i, global);
        global += i + 1;
        pthread_barrier_wait(&barrier);
    }
    return NULL;
}

void *f2(void *ignored) {
    for (int i = 0; i < M; i += 1) {
        pthread_barrier_wait(&barrier);
        if (_____________ /* BLANK 1 */) {
            global += 10;
        }
    }
    return NULL;
}

void run() {
    pthread_barrier_init(&barrier, NULL, 2);
    pthread_t t1, t2;
    N = 100;
    M = _________ /* BLANK 2 */;
    pthread_create(&t1, NULL, f1, NULL);
    pthread_create(&t2, NULL, f2, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_barrier_destroy(&barrier);
}

The _s marked "BLANK x" (for some x) represent some omitted code which is the subject of a question below.

Question 3 (4 points) (see above)

In order to make global have the largest value possible without creating race conditions and without adding additional assignments to global, what is an appropriate expression to place in the blank marked BLANK 1?

Answer:
Question 4 (4 points) (see above)

What is an appropriate expression to place in the blank marked BLANK 2 to allow run() to terminate?

Answer:

Consider the following C code that uses the POSIX API (where identifiers in ALL_CAPITALS are integers defined elsewhere):

struct PageVersion {
    pthread_mutex_t lock;

    /* only accessed after locking 'lock' */
    char *label;
    char *data;
    struct Page *page;

    /* only accessed after locking 'page->lock' */
    struct PageVersion *next;
};

struct Page {
    pthread_mutex_t lock;
    struct PageVersion *all_versions;
    struct PageVersion *active_version;
};

struct PageVersion *add_version_to(struct Page *page, char *label, char *data) {
    struct PageVersion *version = malloc(sizeof(struct PageVersion));
    version->label = label;
    version->data = data;
    version->page = page;
    pthread_mutex_init(&version->lock);
    pthread_mutex_lock(&page->lock);
    version->next = page->all_versions;
    page->all_versions = version;
    pthread_mutex_unlock(&page->lock);
    return version;
}

void move_version_to_page(struct PageVersion *version, struct Page *new_page) {
    pthread_mutex_lock(&version->lock);
    pthread_mutex_lock(&version->page->lock);
    pthread_mutex_lock(&new_page->lock);
    struct PageVersion *iterator = version->page->all_versions;
    if (iterator == version) {
        version->page->all_versions = version->next;
    } else {
        while (iterator->next != version) {
            iterator = iterator->next;
        }
        iterator->next = iterator->next->next;
    }
    version->next = new_page->all_versions;
    new_page->all_versions = version;
    pthread_mutex_unlock(&new_page->lock);
    pthread_mutex_unlock(&version->lock);
    pthread_mutex_unlock(&version->page->lock);
}

bool set_active_by_label(struct Page *page, char *label) {
    bool found = false;
    pthread_mutex_lock(&page->lock);
    struct PageVersion *iterator = page->all_versions;
    while (iterator) {
        pthread_mutex_lock(&iterator->lock);
        if (0 == strcmp(iterator->label, label)) {
            found = true;
            page->active_version = iterator;
            break;
        }
        iterator = iterator->next;
        pthread_mutex_unlock(&iterator->lock);
    }
    pthread_mutex_unlock(&page->lock);
    return found;
}
Question 5 (5 points) (see above)

It is possible for simulatenous calls to the functions above to cause a deadlock. Which of the following calls (if run simulatenously from different threads) could result in a deadlock? Select all that apply.

In the calls below A, B, C, etc. represent distinct struct Page* values and A1, A2, A3, etc. represent distinct PageVersion* values that are on the list of all versions in A.

Question 6 (4 points) (see above)

One possible deadlock occurs when move_version_to_page(A1, B) and move_version_to_page(B1, A) are called simulatenously (using the conventions for A1, B1, A, B from the previous question).

Which of the following changes would resolve this deadlock? Select all that apply

click here to preview key
Return to course pageReturn to index 

quiz for week 11

Answer the following questions about the lecture material for this week.

Consider the following incomplete code that uses monitors:

pthread_mutex_t lock;
char *files[NUM_FILES];
bool file_is_ready_to_process[NUM_FILES];
bool file_is_processed[NUM_FILES];
pthread_cond_t file_is_ready_cv[NUM_FILES];
pthread_cond_t file_is_processed_cv[NUM_FILES];
pthread_cond_t any_file_processed_cv;

void PrepareFile(int index) {
    SetupFileToBeProcessed(files[index]);
    pthread_mutex_lock(&lock);
    file_is_ready_to_process[index] = true;
    pthread_cond_broadcast(&file_is_ready_cv[index]);
    pthread_mutex_unlock(&lock);
}

void ProcessFileWhenReady(int index) {
    pthread_mutex_lock(&lock);
    while (!file_is_ready_to_process[index]) {
        pthread_cond_wait(&file_is_ready_cv[index], &lock);
    }
    pthread_mutex_unlock(&lock);    /* LINE A */
    ProcessFile(files[index]);
    pthread_mutex_lock(&lock);      /* LINE B */
    file_is_processed[index] = true;
    pthread_cond_broadcast(&any_file_processed_cv);
    pthread_cond_broadcast(&file_is_processed_cv[index]);
    pthread_mutex_unlock(&lock);
}

void WaitForTwoFilesToBeProcessed(int index1, int index2) {
    pthread_mutex_lock(&lock);

    while (_______________________________________________ /* BLANK 1 */) {

        pthread_cond_wait(&file_is_processed_cv[_____ /* BLANK 2 */], &lock);
    }
    while (_______________________________________________ /* BLANK 3 */) {

        pthread_cond_wait(&file_is_processed_cv[_____ /* BLANK 4 */], &lock);
    }
    pthread_mutex_unlock(&lock);
}

/* returns first file_index_list[i] such that file_is_processed[file_index_list[i]]
   is false; if no such entry is exists, returns -1
 */
int FirstNonReadyFile(int *file_index_list, int files_index_list_length) {
    for (int i = 0; i < file_index_list_length; i += 1) {
        if (!file_is_processed[file_index_list[i]]))
            return file_index_list[i];
    }
    return -1;
}

void WaitForManyFilesToBeProcessed(int *file_index_list, int file_index_list_length) {
    pthread_mutex_lock(&lock);
    while (_______________________________________________ /* BLANK 5 */) {

        pthread_cond_wait(&any_file_processed_cv, &lock);  /* LINE C */
    }
    pthread_mutex_unlock(&lock);
}

Assume all relevant header files are included (and NUM_FILES is an integer constant larger than 10).

A line of _s marked with a comment like /* BLANK 1 */ indicates a place where code may need to be added to complete the code above.

When WaitForTwoFilesToBeProcessed(index1, index2) returns, file_is_processed[index1] and file_is_processed[index2] should both be true because they were set by ProecssFileWhenReady() in other threads.

Similarly, when WaitForManyFilesToBeProcessed(index_array, index_array_length) returns, file_is_processed[index_array[i]] should be true for all i from 0 to index_array_length - 1, inclusive.

Question 1 (4 points) (see above)

what code would be most appropriate to place in the blanks marked /* BLANK 1 */ and /* BLANK 3 */?

Question 2 (4 points) (see above)

Which of the following code would be most appropriate to place in the blank marked /* BLANK 5 */?

Question 3 (5 points) (see above)

Assuming the best answer to the previous question, the line marked LINE C could be replaced with which of the following without making WaitForManyFilesToBeProcessed behave incorrectly? Select all that apply.

Question 4 (4 points) (see above)

If the pthread_mutex_unlock and pthread_mutex_lock calls marked with the comments LINE A and LINE B above were removed, then ____. Select all that apply.

Suppose two machines A and B are connected via a network that implements the "mailbox model" we discussed in lecture.

To send data reliably this way, one machine sends a message with that data, and expects the receiving machine to send an acknowledgments of each message with data sent. When acknowledgments are not received after waiting a long time, the data-sending machine is responsible for resending the message with data.

Question 5 (4 points) (see above)

Suppose machine A sends machine B ten messages with data using this scheme. Machine B's network has a problem where every other acknowledgment message (starting with the first one sent) is lost. (So, for example, if machine B attempted to send four acknowledgments in total, the first and third would be lost, regardless of which messages with data they were acknowledging.) Also, the network loses the first message the first time it is sent.

Including messages that are lost, how many messages are sent in total?

Answer:
Question 6 (4 points) (see above)

Suppose it takes 1 time unit to send a message with data, 10 time units for it to cross the network and be received, and 1 time unit to send an acknowledgment of that message, and 10 time units for the acknowledgment to cross the network be received.

If machine A sends machine B ten messages with data, how long (in time units) will it take for machine B to receive all ten messages with data assuming:

  • there are no lost or corrupted messages
  • machine A waits until it receives an acknowledgment of message with data N before sending message with data N + 1
Answer:
click here to preview key
Return to course pageReturn to index 

quiz for week 12

Answer the following quesitons about the lecture material for this week.

Suppose machine A and machine B are on separate wifi networks. The routers for these wifi networks are connected to a common wired network.

Question 1 (4 points) (see above)

If machine A sends a packet to B and B sends a packet acknowledging reciept of the original packet, what is the minimum number of link-layer frames (link-layer messages) that must be sent as part of this process? (Briefly identify them in the comments.)

Answer:
Question 2 (3 points) (see above)

As part of sending the packet from machine A to machine B, the wifi router for machine B's wifi network will send a frame (link-layer message) to machine B (possibly contained within another message).

What will the source address of this frame represent?

Question 3 (3 points) (see above)

As part of sending the packet from machine A to machine B, the wifi router for machine B's wifi network will send a packet (network-layer message) to machine B (which will be contained within link-layer frames as it crosses the network).

What will the source address of this packet represent?

Question 4 (4 points)

Suppose a machine moves from a wifi network to a wired network while it has active connections. If we wanted to make this change as seamless as possible for the programs using those connections, which of the following would be the best strategy?

Suppose A, B, and C attempt to secure communication using (symmetric) cryptography. To do this, they have three shared keys:

Suppose A is sending data to B via C and wants to ensure that C cannot interfere or eavesdrop despite C's role in forwarding the message. To do this, A will encode the message using some cryptographic scheme before providing it to C to forward. Suppose (like the examples in lecture) we have an encryption and decryption functions E(key, message) (which produces a ciphertext) and D(key, ciphertext) (which recovers the original message), and a MAC function E(key, message) (which produces a MAC "tag"). In the options below, E(x, y) means the result of applying E to x and y, and similar for MAC(x, y).

The following questions ask which schemes prevent C from interfereing or eavesdropping in a particular way.

Question 5 (4 points) (see above)

Assuming C only forwards A's message unchanged, which of the following schemes would prevent C from learning information about the contents of the data other than its length that it did not already know? Select all that apply.

Question 6 (4 points) (see above)

Which of the following schemes would prevent C from changing the data without B being able to easily detect this? (For this question, you may assume that C can guess what the data A intended to send was, even if C cannot infer this from the message.) Select all that apply.

click here to preview key
Return to course pageReturn to index 

quiz for week 13

Answer the following questions about the lecture material for this week.

Consider a system for allowing students to submit course registration requests to the registrar.

The system uses the following insecure (as partly revealed in the correct answers to the questions below) scheme.

Question 1 (7 points) (see above)

An attacker with the ability to read, replace, and introduce new messages on the network can do which of the following? (You may assume that the attacker has access to all relevant public keys and that asymmetric encrypted messages cannot be meaningfully changed by transforming the ciphertext, but they can be replaced by the attacker with new messages or messages obtained from elsewhere. In addition, you may assume the attacker is aware of the scheme described above, including the choice of cryptographic functions (hash, encryption, digital signature, etc.) in use.) Select all that apply.

Question 2 (5 points)

In lecture, we mentioned that in a public-key infrastructure, often rather than just presenting one certificate to distribute a public key, an entity will present a "chain" of several certificates, where the public key used to verify the second certificate is contained with in the first certificate, the public key used to verify the third is contained within the second, and so on. The final certificate in the chain will have the public key of interest, for example the public key for the web site the browser is contacting. Which of the following are advantages of this scheme compared to just sending a single certificate? Select all that apply.

As an alternative to the asymmetric key exchange scheme we discussed in lecture (which we mentioned was commonly used in TLS), two communicating parties A and B can use asymmetric encryption or signatures in place of the special "asymmetric key exchange" operator. There is a secure way to do this, but the following questions discuss insecure ideas.

In the question below PE and PD represents asymmetric encryption and decryption. Assume that A and B have shared public keys for encryption and signature verification in advance, and that the corresponding private keys are never available to any attacker.

Question 3 (5 points) (see above)

Consider the following insecure protocol:

  • A chooses a new, secret value K
  • A sends to B: PE(B's public key, K)
  • B sends to A: PE(A's public key, K)
  • then A and B use K as a symmetric key for symmetric authenticated encryption

Which of the following are omissions of this scheme assuming no additional steps are added to mitigate them? Select all that apply.

Question 4 (3 points)

Suppose a pipelined processor divides up instruction execution into 4 stages and is running a program with instruction A followed by instruction B followed by instruction C.

(Assume nothing stops the processor from starting an instruction's first stage immediately after the previous instruction finishes that stage.)

If instruction A executes its first stage during cycle number 0, then we would expect instruction C to executing its last stage during cycle number ___.

Answer:
Question 5 (3 points)

A pipelined processor with a cycle time of 1 ns and 8 pipeline stages will take ____ to run 1000 instructions.

(Assume nothing stops the processor from starting an instruction's first stage immediately after the previous instruction finishes that stage.)

click here to preview key
Return to course pageReturn to index 

quiz for week 14

Question 1 (4 points)

Suppose a single-cycle processor takes 1000 picoseconds to execute each instruction.

Suppose it is divided into a 8 pipeline stages, using pipeline registers that have a register delay of 20 picoseconds.

Assuming that the pipelined processor can start one instruction every cycle, one would expect the pipelined processor to have a latency (time from instruction start to finish, including all stages of the instruction) of at least ____ picoseconds. (Give the tightest bound possible with the given information. If needed, state any assumptions in the comments.)

Answer:
Question 2 (4 points)

Consider a 7-stage pipelined processor with a cycle time of 500 ps.

Suppose the processor runs 1000000 instructions and:

What is the average throughput of the processor in picoseconds per instruction, rounded to the nearest picosecond per instruction?

Answer:

Consider a seven-stage pipelined processor with the following pipeline stages:

Assume the stages do the same thing as the stages we discussed in lecture, except some of the stages we discussed in lecture are divided into multiple parts. When this happens, the divided stages need the inputs near the beginning of the stage for the first part and only have the results of that stage near the end of the stage for the last part.

Assume the processor resolves data hazards with a combination of stalling and forwarding.

Question 3 (4 points) (see above)

When executing

addq %r8, %r9
subq %r9, %r10
xorq %r9, %r11
imulq %r8, %r12

if the addq completes its fetch stage in cycle 0 then the imulq will complete its writeback stage in cycle ___. Account for any stalling necessary to resolve data hazards in the code above (and remember that the stalling will be combined with forwarding).

(In each of the instructions above, the first operand is a source and the second is a source and destination of the instruction.)

Answer:
Question 4 (6 points) (see above)

Consider executing the following assembly, with the following pipeline diagram:

instruction / cycle:    0  1  2  3  4  5  6  7  8  9  10 11 12
addq %r8, %r9           F  D  E1 E2 E3 M  W
imulq %r10, %r13           F  D  E1 E2 E3 M  W
movq (%r9), %r10              F  D  E1 E2 E3 M  W
subq %r8, %r9                    F  D  E1 E2 E3 M  W
nop                                 F  D  E1 E2 E3 M  W
nop                                    F  D  E1 E2 E3 M  W
xorq %r10, %r9                            F  D  E1 E2 E3 M  W

the processor will use forwarding alone (the explicit nops are sufficient to avoid stalling) to resolved data hazards. (In each of the instructions above, the first operand is a source and the second is a source and destination of the instruction.) Which of the following forwarding operations will occur? Select all that apply.

Consider the following C code:

int x = 100;
do {
    if (5 == x % 10)
        foo();
    quux();
    x -= 1;
} while (x >= 0);
Question 5 (4 points) (see above)

Suppose the above code is executed with a branch predictor that predicts all conditional branches as taken if and only if they were taken the last time time they were run. (Branches are identified by the address of the corresponding instruction.)

If the loop above is run repeatedly (by surrounding some outer loop) then how many branch mispredictions would occur each time the loop runs?

Assume that:

  • the initial predictions for each conditional branches are based on what happened at the end of a previous run of the loop;
  • the foo and quux functions contain no conditional branches;
  • the compiler generates code that has one conditional branch instruction to implement the do-while condition and one conditional branch instruction to implement the 5 == x % 10 condition;
  • each conditional branch instruction is taken if the condition written in the C code (5 == x % 10 or x >= 0) is true;
  • each conditional branch instruction runs once per iteration of the do-while loop;
Answer:
Question 6 (4 points) (see above)

Which (if any) of the following changes to branch prediction would decrease the number of branch mispredictions versus the previous scenario? Select all that apply.

click here to preview key
Return to course pageReturn to index 

quiz for week 15

Answer the following questions about the lecture material for this week.

For the following questions, consider an out-of-order processor with a similar design to what we discussed in lecture (with register renaming and a single instruction queue).

Suppose the processor executing the following assembly code:

addq %r8, %r9
subq %r9, %r10
xorq %r8, %r12
andq %r8, %r10
imulq %r12, %r10

(In each of the instructions above, the first operand is a source and the second is a source and destination of the instruction.)

Question 1 (4 points) (see above)

The computation for addq %r8, %r9 could occur at the same time as the computation for ____. Select all that apply.

Question 2 (3 points) (see above)

The computation for xorq %r8, %r12 could occur at the same time as the computation for ____. Select all that apply.

Question 3 (4 points) (see above)

If the processor has two ALUs that each can do the computation for any instruction in one cycle, then the minimum number of cycles it would need to do the computation for the instructions above would be ____ cycles.

Answer:
Question 4 (4 points)

Consider the following C code:

unsigned char array[4096];
for (int i = 0; i < 4096; i += 1) { 
    array[i] += 1;
}
*pointer += 1;

Suppose we run this:

In that case, the for loop will fill the data cache with the elements of array, and then the addition to *pointer will evict something something from the cache.

Suppose we don't know the exact value of pointer but have limited it to two possibilities. By timing how long it takes to access parts of array after running this code, we can sometimes distinguish between two values and sometimes cannot. For which pairs of possible values could we distinguish them by timing access to the array? Select all that apply.

Consider the following C code:

unsigned char table[256] = {
    0,
    11 % 256 , (2 * 11) % 256,
    (3 * 11) % 256, (4 * 11) % 256,
    ... /* and so on */
};


int Example(int x, int y) {
    if (x < 256) {
        unsigned char result = table[x];
        return table[(result + y) % 256];
    } else {
        return -1;
    }
}

Suppose table is located at address 0x100000, and the system has an 4-way L1 cache with 64-byte blocks and 256 sets.

To simplify this problem, assume that virtual memory is not in use.

Question 5 (4 points) (see above)

Suppose this function is vulnerable to a Spectre-style attack through which an attacker who can control the values of x and y and has access the ability to detect what's evicted from the cache can determine information about parts of memory they are not supposed to be able to access.

If attacker wants to find out about the 1-byte value in memory at address 0x143000, what value would be best to supply for x to perform this attack? Write your answer as a hexadecimal number.

Answer:
Question 6 (4 points) (see above)

Suppose an attacker finds out that for the value of x that is the correct answer to the previous question, setting y to 15 results in an access that evicts a value from set index 2 of the L1 cache, and setting y to 16 results in an access that evicts a value from set index 3 of the L1 cache.

Based on this behavior, give a possible 1-byte value from memory at 0x143000.

Answer:
click here to preview key
Return to course pageReturn to index