Consider the following C snippet:
extern int strcmp(const char *, const char*);
const char *foo(const char *bar) {
if (!strcmp(bar, "special")) {
return "value 1";
} else {
return "value 2";
}
}
Suppose this snippet is contained in the file foo.c
, which is compiled into
an object file foo.o
, and then combined with a main.o
to produce an executable
called main.exe
which calls the foo()
function whenever it is run.
Question 1 (1 points): (see above) Ignoring information only present for debugging, the string "value 1"
is likely to appear
in _______.
Select all that apply
Question 2 (1 points): (see above) Ignoring information only present for debugging, the name "foo" is likely to appear in ________.
Select all that apply
Consider the following assembly function example
:
example:
movq %rdi, %rax
loop:
subq $10, %rdi
jg loop
ret
Recall that in the x86-64 calling convention, the first argument to a
function is in the register %rdi
and the return value is
in %rax
.
Question 3 (0 points): (see above) [We are dropping this question due an error discovered after we released the quiz.] Which of the following C snippets is equivalent to this function?
Question 4 (1 points): (see above) When this function returns, what are possible values of the ZF
condition code?
Select all that apply
Question 5 (1 points): (see above) When this function returns, what are possible values of the SF
condition code?
Select all that apply
Read section 2.1.6-7 of the textbook and answer the following questions.
Question 6 (1 points): (see above) Using bitwise operators, integers can be used to represent sets, where each bit being
sets indicates whether an element with a particular index is included in the set. If
this is being done in C, if a
and b
are integers representing sets, then
producing a set containing the elements in both a
and b
(sometimes called
"set intersection") can be done with the expression:
Read section 4-4.1.3 of the textbook and answer the following questions.
Question 7 (1 points): (see above) An instruction set architecture (ISA) determines
Select all that apply
Question 8 (1 points): (see above) In section 4.1.3, the textbook explains how
rmmovq %rsp, 0x123456789abcd(%rdx)
is encoded as
40 42 cd ab 89 67 45 23 01 00
. If we were to instead encode
irmovq $0x123456789abcd, %rsp
, what would its
encoding be?
Question 1 (1 points): Which of the following C expressions are always true, assuming x
and y
are 32-bit unsigned ints between 0 and 9999999, inclusive?
Select all that apply
Question 2 (1 points): Consider the following incomplete C function:
unsigned int copy_lsb(unsigned int x) {
CODE HERE
}
Which of the following could replace CODE HERE
to make this function return a 32-bit
unsigned int where each byte is a copy of the least significnat byte of the argument x
?
Select all that apply
Consider the following C snippet:
int array[4] = {0x1234, 0x5678, 0xABCD, 0xEF01};
Suppose on a system with 32-bit, little-endian integers and 64-bit pointers, the array is located at byte address 0x10000.
Question 3 (1 points): (see above) What is the contents of the byte at address 0x10005?
Question 4 (1 points): (see above) The C expression array + 3
is a pointer containing what address?
Question 5 (1 points): If a calculation is undefined behavior in C, then ______.
Select all that apply
For these questions, read section 4.2.2, 4.2.3 (but you may ignore details of HCL syntax) and 4.2.5.
Question 6 (1 points): (see above) Our textbook distinguishes combinatorial and sequential circuits. Which of the
following are true about combinatorial circuits?
Select all that apply
Question 1 (1 points): Consider the following HCLRS code:
x = [
y > 10 : 100;
y + z == 15 : 200;
z > 10 : 300;
1 : 400;
];
For which of the following values of y
and z
will the value of x
be 200?
Select all that apply
Question 2 (1 points): In the register design we discussed in lecture, the value output by a register changes
Question 3 (1 points): Which of the following are more likely to be attributes of a
processor implementing RISC-like instruction set than a CISC-like instruction set?
Select all that apply
Question 4 (1 points): In the register file design we discussed in lecture, writing to the register file is controlled by pairs of inputs. One input in each of these pairs specifies a 64-bit value and the other specifies an 8-bit register number. Suppose we have a processor which uses these inputs to change a value in the register file for some of the instructions it runs, but for other instructions it leaves the register file values unchanged. During clock cycles in which a register should not change, the processor should
For these questions, review sections 4.3.3-4.3.4 of the textbook. (If necessary for context, you may want to also review earlier parts of section 4.3.)
Question 5 (1 points): (see above) In the single-cycle processor design described by our textbook, the values stored in condition code registers are updated _______________ corresponding values are stored in the register file.
Question 6 (1 points): (see above) In the single-cycle processor design described by our textbook, during a pushq %rax
instruction,
the register file srcA
and srcB
inputs should be the register numbers corresponding to:
For these questions, consider the single-cycle processor design strategy described in our textbook and in lecture (the processor executes exactly one instruction per cycle), and the register file and memory components described lecture and our textbook.
Question 1 (1 points): (see above) When executing a call function
instruction, the 64-bit address input of the data memory should be equal
to
Question 2 (1 points): (see above) Executing a ret
instruction requires several operations:
In what order to these occur in the single-cycle hardware implementation?
Question 3 (1 points): (see above) In lecture, we described a processor design that could execute just the Y86 irmovq
, rrmovq
, mrmovq
, and rmmovq
instructions. Suppose we wanted to extend this processor to execute a new immovq
instruction
that would take a 64-bit constant and move it to a memory address specified with a 64-bit displacement
and a base register.
Which would be useful changes to make to the processor as part of adding this instruction?
Select all that apply
Question 4 (1 points): Consider a pipelined Y86 processor with two stages that performs the instruction memory and register file reads in the first stage
and data memory reads and register file writes as part of the second stage. Suppose this processor
is performing a memory read for an rmmovq
instruction. While this is happening,
the data memory's 64-bit value-to-write input (mem_input
in HCLRS) will be equal to
Question 5 (1 points): Consider a pipelined processor with ten stages, where the clock cycle has 500 ps between rising edges. When an operation starts executing in the piepline, it will usually finish executing after
Question 6 (1 points): Modifying a pipelined processor with 4 stages to instead have only 2 stages
will generally increase ________.
Select all that apply
Question 1 (1 points): Consider the following Y86 assembly snippet:
addq %rax, %rbx
subq %rbx, %rcx
xorq %rdx, %rax
rmmovq %rax, 8(%r8)
andq %r8, %rbx
When executing on a five-stage pipelined proecssor like the one we described in lecture, there are data hazards because ________.
Consider the following Y86 assembly snippet:
subq %rcx, %rdx
jle later
addq %r8, %r9
addq %r9, %r10
later:
xorq %rdx, %rcx
Assume the initial values of %rcx
and %rdx
are set such that the
jle
is taken.
Question 2 (1 points): (see above) Consider a five-stage pipelined processor using the stages we discussed in lecture
and uses stalling only to handle data hazards and control hazards.
(Assume the whether a conditional jump is taken is not determined until near
the end of the jump's execute stage.)
If the subq
instruction is fetched during cycle 0, then the xorq
will complete its writeback stage during cycle _____.
Question 3 (1 points): (see above) Consider a five-stage pipelined processor using the stages we discussed in lecture
and uses stalling and forwarding to handle data hazards, and
branch prediction that predicts conditional jumps as always taken to
handle control hazards. (Assume the processor users forwarding whenever it
would not require substantially increasing the critical path length and therefore the cycle time.)
If the subq
instruciton is fetched during cycle 0, then the xorq
will complete its writeback stage during cycle _____.
Consider the following Y86 assembly snippet:
addq %rax, %rbx
subq %rax, %rcx
xorq %rbx, %rcx
Suppose the above assembly snippet is run a six-stage (in-order) pipelined processor which does not implement forwarding and uses the following stages:
The values of registers read are not available until the "Decode 2" stage.
Question 4 (1 points): (see above) To avoid data hazards when executing the above code using stalling only (and no forwarding), for how many cycles must a stall occur in the processor? (Count each cycle during which no new instruction can be started as a single stall.)
Question 5 (1 points): (see above) To avoid data hazards when executing the above code using stalling and forwarding, for how many cycles must a stall occur in the processor? (Count each cycle during which no new instruction can be started as a single stall. Assume the processor uses forwarding whenever it would not require substantially increasing the critical path length and therefore the cycle time.)
Question 1 (1 points): Consider a 512B Cache with 4 blocks per set and 8 byte blocks. Which of the following hex values correspond to the index for address 0xABCD?
Question 2 (1 points): Assume that you have a 32B direct mapped cache with 16B blocks with a LRU replacement policy. How many misses will result from reading the following addresses 0xABC 0xACE 0xACA 0xACF 0xAAA 0xACE ? Assume the cache starts off empty.
Question 3 (1 points): Increasing the associativity of the cache while keeping the number of set the same, will decrease the number of
Question 4 (1 points): Consider the following series of read and write accesses. Where reads are marked with the letter R and writes are marked with the letter W. Assume at 64B fully associative cache with 4 blocks per set. This cache is using write-allocate policy and a LRU replacement policy. How many misses will the following access pattern generate: (W) 0xBAD (W) 0xABC (R) 0xBAC (R) 0xBOA (W) 0xBOA (W) 0xBED (W) 0xBEC? Assume the cache starts off empty.
For these quesitons, read section 5.2, the introduction to section 5.7 and the first paragraph of section 5.7.1, and section 5.7.3.
Question 5 (1 points): (see above) Suppose a superscalar, out-of-order processor can start 3 integer multiplications per cycle (using 3 pipelined integer multiplication circuits) and each integer multiplication takes 4 cycles to complete. By our textbook's terminology, what is the throughput bound on multiplications in this processor?
Consider the following C code:
char array[4 * 1024];
int sum;
...
for (int k = 0; k < 2; ++k) {
for (int i = 0; i < 4; i += 1) {
sum += array[i * 1024] * array[16 + i * 16];
}
}
For each of the questions below, consider the behavior of cache during the outermost for loop. Assume the cache is empty when the for loop starts, only accesses to the array use the cache, and the neither the compiler nor the processor does that reorders or omits array accesses.
Assume the address of the array is a multiple of the cache size ( so array[0] is at the beginning of a cache block that maps to set index 0 in the cache).
Question 1 (1 points): (see above) With a 2KB direct-mapped cache with 16B cache blocks, how many cache misses will occur?
Question 2 (1 points): (see above) With a 128B 4-way set associative cache with 16B cache blocks and an LRU replacement policy, how many cache misses will occur?
Suppose a superscalar out-of-order processor is executing
addq %r9, %10
addq %r11, %rcx
mrmovq 0(%rcx), %r8
addq %r12, %rcx
mrmovq 0(%rcx), %r9
subq %r8, %r9
mrmovq 8(%rcx), %r11
Question 3 (1 points): (see above) If the processor had enough functional units
The mrmovq (%rcx), %r9
instructuion could be executed
at the same time as
Select all that apply
Question 4 (1 points): (see above) If the processor uses register renaming where architectural registers are mapped
to physical registers, then most likely ______.
Select all that apply
Consider the following C function:
int size; /* global variable, set elsewhere */
void example(int *A, int *B) {
for (int j = 0; j < size; ++j) {
for (int i = 0; i < size; ++i) { /* (1) */
B[i] += A[i] * A[j];
}
}
}
Question 1 (1 points): (see above) Consider a version of the C function with the loops swapped:
void example_swapped_loops(int *A, int *B) {
for (int i = 0; i < size; ++i) {
for (int j = 0; j < size; ++j) {
B[i] += A[i] * A[j];
}
}
}
Which of the following expressions are sufficient for the example_swapped_loops
code to have an equivalent effect to the original code above?
(Assume comparisons with pointers compare the pointer addresses.)
Select all that apply
Question 2 (1 points): (see above) Because of concerns about aliasing, a compiler often will not keep A[j]
in
register between iterations of the for loop labelled (1). What would be an effective
way for a programmer to prevent this?
Select all that apply
Question 3 (1 points): If performed excessively, function inlining increases _____.
Select all that apply
Question 1 (1 points): When performing a context switch, from process A to process B, ____________ saves the
value of registers like %rax
in __________.
Question 2 (1 points): Which of the following are true about running an exception handler?
Select all that apply
Suppose a system uses virtual memory where:
Question 3 (1 points): (see above) How large are physical page numbers in bits? Write your answer as a base-10 number.
Question 4 (1 points): (see above) What is the virtual page number for the virtual address 0x12345
? Write your answer as a hexadecimal number.
Question 5 (1 points): (see above) Suppose the page table entry for the virtual address 0x10007
is valid and contains the physical
page number 4
. What is the physical address that corresponds to virtual address 0x10007
? Write your
answer as a hexadecimal number.
Question 6 (1 points): (see above) If the page table base address is physical (byte) address 0x1000
then what is the physical (byte) address of
the page table entry for virtual page number 3? Write your answer as a hexadecimal number.
Consider a system with:
Question 1 (1 points): (see above) If the page table base pointer contains the physical (byte) address
0x100000
, then what is the physical (byte) address of
the first-level page table entry for virtual address 0x10000
?
Write your answer as a hexadecimal number.
Question 2 (1 points): (see above) If the page table base pointer contains the physical (byte) address
0x100000
, then what is the physical (byte) address of
the first-level page table entry for virtual address 0x80010000
?
Write your answer as a hexadecimal number.A
Question 3 (1 points): (see above) If the base address for the second-level page table corresponding
to virtual adress 0x10000
is (physical byte address) 0x40000
, then
what is the physical (byte) address of the second-level page
table entry for virtual address 0x80010000
?
Write your answer as a hexadecimal number.
Question 4 (1 points): (see above) On this system, accessing the virtual address 0x80341
accesses the
same TLB set as accessing the virtual address _______.
Select all that apply
Question 5 (1 points): Which of the following is/are true about the process of swapping a page out of DRAM and into disk:
Select all that apply
Question 6 (1 points): Consider TLB cache in a 4-level paging machine. Which values are stored in the TLB?
Select all that apply
Question 1 (1 points): Suppose a system uses 8192-byte pages.
Assuming the L1 cache uses tags which store parts of physical addresses,
with which of the following L1 cache configurations could it overlap TLB accesses and cache accesses?
Select all that apply
Consider a system with a 16-entry, 2-way TLB with an LRU replacement policy and 4096-byte pages. A program runs on this system with an initially empty TLB and accesses one byte at the following virtual addresses in this order:
0x01023
0x05023
0x0103F
0xF9000
0xF8FF8
0xE9FF8
0x01023
(Assume the pages corresponding to all virtual addresses are valid in the page table.)
Question 2 (1 points): (see above) Which of the following accesses are TLB misses?
Select all that apply
Question 3 (1 points): (see above) After these accesses complete, how many page table entries will the TLB store?
Question 4 (1 points): Increasing the size of pages will _____________.
Select all that apply