Unless we explicitly mention the compiler, assume the compiler performs few optimizations (it uses registers to store local variables, but not much else).
Question 1: The textbook discusses how compilers cannot optimize this code
void twiddle1(long *xp, long *yp) {
*xp += *yp;
*xp += *yp;
}
into this code
void twiddle2(long *xp, long *yp) {
*xp += 2 * (*yp);
}
Which of the following changes to would allow the optimization?
Select all that apply
Question 2: If the runtime of a particular program given input size n is f(n), then the "cycles per element" number the textbook refers to is
Question 3: Consider the code
int rfind(const char *s, char c) {
int ans = -1;
for(int i = 0; i < strlen(s); i += 1) {
if (s[i] == c) ans = i;
}
return ans;
}
If we pull strlen(s)
out of the loop, like so
int N = strlen(s);
for(int i = 0; i < N; i += 1) {
how will the runtime of the code change?
Question 4: Consider the code
struct foo {
int x,y,z;
}
int sum(foo x) {
return x.x + x.y + x.z;
}
int megasum(const foo *bits, const int N) {
int ans = 0;
for(int i = 0; i < N; i += 1) {
ans += sum(bits[i]);
}
return ans;
}
If we remove calls to sum
, like so
ans += bits[i].x + bits[i].y + bits[i].z;
how will the runtime of the code change?
Question 5: Which of the following are sufficient indications to know that your code has unneeded memory references?
Select all that apply
Question 6: Consider loop 1:
for(int i = 0; i < N; i += 1) {
baz(a[i]);
}
and loop 2:
for(int i = 0; i < N; i += 2) {
baz(a[i]);
baz(a[i+1]);
}
Which of the following are true?
Select all that apply
Question 7: Figure 5.12 suggests integer multiplication has latency 3, issue 1.
Assuming x
, y
, z
, and w
are integers,
how many more cycles will it take to run (x + y) + z + w
than to run x + y + (z + w)
?
Answer as a normal (possibly negative) base-10 integer.
Question 8: In general if I unroll a loop more I expect it to perform faster, but this is not always true in practice. Suppose I find that unrolling 50× is not faster than unrolling 15×. Which of the following hardware changes would be most likely to cause the more-unrolled version be faster?