Quiz Viewer

Question 1: The textbook discusses how compilers cannot optimize this code

void twiddle1(long *xp, long *yp) { 
    *xp += *yp;
    *xp += *yp;
}

into this code

void twiddle2(long *xp, long *yp) { 
    *xp += 2 * (*yp);
}

Which of the following changes to would allow the optimization?
Select all that apply

making yp a long ** instead of a long *
making yp a long instead of a long *
making xp a long instead of a long * and returning a long instead of a void
making xp a long ** instead of a long *
using int instead of long throughout
making both xp and yp longs instead of long *s and returning a long instead of a void
making both xp and yp long **s instead of long *s

Question 2: If the runtime of a particular program given input size n is f(n), then the "cycles per element" number the textbook refers to is

Question 3: Consider the code

int rfind(const char *s, char c) {
    int ans = -1;
    for(int i = 0; i < strlen(s); i += 1) {
        if (s[i] == c) ans = i;
    }
    return ans;
}

If we pull strlen(s) out of the loop, like so

    int N = strlen(s);
    for(int i = 0; i < N; i += 1) {

how will the runtime of the code change?

it will be basically the same it was before
it will be faster by an additive constant (e.g., 1 microsecond faster or the like)
it will be faster by a multiplicative factor (e.g., 2× faster or the like)
it will be slower by an additive constant (e.g., 1 microsecond slower or the like)
it will be slower by a multiplicative factor (e.g., 2× slower or the like)
it will change to a faster asymptotic complexity class (changing big-O)
it will change to a slower asymptotic complexity class (changing big-O)

Question 4: Consider the code

struct foo {
    int x,y,z;
}
int sum(foo x) {
    return x.x + x.y + x.z;
}
int megasum(const foo *bits, const int N) {
    int ans = 0;
    for(int i = 0; i < N; i += 1) {
        ans += sum(bits[i]);
    }
    return ans;
}

If we remove calls to sum, like so

        ans += bits[i].x + bits[i].y + bits[i].z;

how will the runtime of the code change?

it will be basically the same it was before
it will be faster by an additive constant (e.g., 1 microsecond faster or the like)
it will be faster by a multiplicative factor (e.g., 2× faster or the like)
it will be slower by an additive constant (e.g., 1 microsecond slower or the like)
it will be slower by a multiplicative factor (e.g., 2× slower or the like)
it will change to a faster asymptotic complexity class (changing big-O)
it will change to a slower asymptotic complexity class (changing big-O)

Question 5: Which of the following are sufficient indications to know that your code has unneeded memory references?
Select all that apply

it reads every value of a large array, then reads them all again
it writes every value of a large array, then writes them all again
it reads the same address in every pass of a loop
it writes the same address in every pass of a loop

Question 6: Consider loop 1:

for(int i = 0; i < N; i += 1) {
    baz(a[i]);
}

and loop 2:

for(int i = 0; i < N; i += 2) {
    baz(a[i]);
    baz(a[i+1]);
}

Which of the following are true?
Select all that apply

loop 1 computes fewer comparisons than loop 2
loop 1 computes more comparisons than loop 2
loop 1 executes fewer conditional jumps than loop 2
loop 1 executes more conditional jumps than loop 2

Question 7: Figure 5.12 suggests integer multiplication has latency 3, issue 1. Assuming x, y, z, and w are integers, how many more cycles will it take to run (x + y) + z + w than to run x + y + (z + w)? Answer as a normal (possibly negative) base-10 integer.

Answer:

Question 8: In general if I unroll a loop more I expect it to perform faster, but this is not always true in practice. Suppose I find that unrolling 50× is not faster than unrolling 15×. Which of the following hardware changes would be most likely to cause the more-unrolled version be faster?

a bigger register file
a bigger L1 cache
a faster L1 cache
using power-of-two size unrolling (e.g., 64 and 16 instead of 50 and 15)
better branch prediction