Problem Set 3: Subtyping and Proof-Carrying Code
Out: 21 March 2000 Due: Tuesday, 11 April in class
Problem set answers may be handwritten, but only if your handwriting is neat enough for us to read.
Warning: This problem set is believed to be substantially harder than Problem Set 2. You are encouraged to start thinking about these problems early (everything relevant to this problem set has already been covered in class) and, after you have tried them on your own, to collaborate with your classmates. (Remember to list everyone you collaborated with.)
Note: This problem set contains a large amount of background material. We recommend that you skim through the whole problem set before trying to answer any of the questions.
One of the (many) reasons Java programs run slowly is the overhead of run-time checking. A smart compiler can eliminate much of the unnecessary run-time checking, but this is only useful if it can also construct a proof that convinces an untrusting JavaVM (which doesn't see the source code) that it is safe to execute the program without the run-time checks. The goal of this problem set is to use axiomatic semantics and proof-carrying code techniques to remove run-time checking from a Java program.
Consider the following Java class:
    public class Scrunch {
        public static String Scrunch (String a[]) {
            Object [] ar = new Object [100];
            for (int i = 0; i < a.length; i++) {
                ar[i] = a[i];
            }
            String s = "";
            for (int i = 0; i < a.length; i++) {
                s = s.concat ((String) ar[i]);
            }
            return s;
        }
    }

For simplicity, we use a code example that would not appear in a (reasonable) Java program. In a real Java program, this would make more sense if ar were a Vector (which, because of the lack of parameterized types in Java, must be a Vector of Objects) instead of an Object [].
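To make the run-time checks concrete, the following minimal sketch wraps the same method body in a small test harness. The class name ScrunchChecksDemo and the helper boundsCheckFires are hypothetical names introduced only for illustration; the comments mark where the JVM's bounds check and cast check would fire.

```java
// Sketch: observe the run-time checks that guard the Scrunch method.
// ScrunchChecksDemo and boundsCheckFires are hypothetical names.
class ScrunchChecksDemo {
    // Same body as the Scrunch method in the problem set.
    static String scrunch(String[] a) {
        Object[] ar = new Object[100];
        for (int i = 0; i < a.length; i++) {
            ar[i] = a[i];                  // aaload on a, aastore on ar: bounds checked
        }
        String s = "";
        for (int i = 0; i < a.length; i++) {
            s = s.concat((String) ar[i]);  // aaload bounds check, then checkcast
        }
        return s;
    }

    // Reports whether the bounds check fires for an input array of length n.
    static boolean boundsCheckFires(int n) {
        String[] a = new String[n];
        java.util.Arrays.fill(a, "x");
        try {
            scrunch(a);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;                   // ar has only 100 slots
        }
    }

    public static void main(String[] args) {
        System.out.println(scrunch(new String[] {"a", "b", "c"})); // prints abc
        System.out.println(boundsCheckFires(100));                 // prints false
        System.out.println(boundsCheckFires(101));                 // prints true
    }
}
```

An input longer than 100 elements is exactly the case that a static proof must rule out before the bounds checks can be dropped.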
A Java compiler (Sun's JDK) produces the following byte codes for the Scrunch method (you can see this for yourself by running javap -c -verbose class):
Method java.lang.String Scrunch(java.lang.String[]) | |
0 bipush 100 | Push the constant 100 on the stack |
2 anewarray class <Class java.lang.Object> | Pop the top of the stack, and construct a new array of element type java.lang.Object of that size |
5 astore_1 | Store the top of the stack (the array we just created) in local 1 (corresponds to ar) |
6 iconst_0 | Push the constant 0 |
7 istore_2 | Store it in local 2 (corresponds to i) |
8 goto 20 | Jump to instruction numbered 20 |
11 aload_1 | Push local 1 (ar) |
12 iload_2 | Push local 2 (i) |
13 aload_0 | Push parameter (a) |
14 iload_2 | Push local 2 (i) |
The top of stack is now: [i, a, i, ar, ...] | |
15 aaload | Pop i (top), a (next) from stack; push a[i] |
aaload performs run-time bounds checking | |
16 aastore | Pop a[i] (top), i (next), ar (next) from stack; store a[i] in ar[i] |
17 iinc 2 1 | Increment the integer in local 2 by one (i++) |
20 iload_2 | Push the integer in local 2 (i) |
21 aload_0 | Push the object in local 0 (the parameter, a) |
22 arraylength | Replace stack top with length of array |
23 if_icmplt 11 | Pop x (top), y (next) from stack; if y < x jump to 11 (beginning of loop) |
26 ldc <String ""> | Push the constant String "" |
28 astore_3 | Store top of stack in local 3 (s) |
29 iconst_0 | Push constant 0 |
30 istore 4 | Store it in local 4 (i in second loop) |
32 goto 50 | |
35 aload_3 | Push local 3 (s) |
36 aload_1 | Push local 1 (ar) |
37 iload 4 | Push local 4 (i) |
39 aaload | Pop i (top), ar (next) from stack; push ar[i] |
40 checkcast <Class java.lang.String> | If the runtime type of the top of the stack (ar[i]) is not a subtype of java.lang.String, issue a run-time type error; otherwise, continue knowing its type is a subtype of java.lang.String |
43 invokevirtual <Method java.lang.String concat(java.lang.String)> | Invoke the method concat on the object just below the top of the stack (s), passing the top of the stack (the cast ar[i]) as the argument. Note that if the receiver is an instance of a subtype of java.lang.String that overrides the concat method, the method in the subtype is called. |
46 astore_3 | Store the result (which is put on top of the stack) in local 3 (s) |
47 iinc 4 1 | Increment local 4 (i) by 1 |
50 iload 4 | Push local 4 (i) |
52 aload_0 | Push local 0 (a) |
53 arraylength | Replace top of stack with its length (a.length) |
54 if_icmplt 35 | if i < a.length goto 35 (continue loop) |
57 aload_3 | Push s on stack |
58 areturn | Return to caller; result is on top of stack. |
To write the VCGen, we use a shorthand that uses arguments to represent the stack slots. You may assume the byte code verifier ensures the top stack slots match the argument types. You may assume no object is NULL.
The form of your VCGen should be:
    VCGen (PC) =
        if Inst[PC] = safe_aaload <i: int> <a: array>
            <predicate for safe_aaload> /\ VCGen (PC + 1)
        ...
        else % all non-safe instructions
            VCGen (PC + 1)

(Don't worry about falling off the end or multiple-word instructions.)

Complete VCGen with rules for the instructions in JVML+safe.
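The shape of this recursion can be sketched in Java as follows. This is only an illustration under stated assumptions: the class VCGenSketch, the Inst type, and the choice to return "true" past the end of the code are all hypothetical, and the only rule filled in is the safe_aaload predicate that this problem set states for instruction 18. It handles straight-line code only (no branches, check, or invariant instructions).

```java
// Sketch of the VCGen recursion; all class and method names are hypothetical.
class VCGenSketch {
    // One instruction: an opcode plus symbolic names for its stack-slot arguments.
    static final class Inst {
        final String opcode;
        final String[] args;

        Inst(String opcode, String... args) {
            this.opcode = opcode;
            this.args = args;
        }
    }

    // Builds the verification condition for the code starting at pc, as a string.
    static String vcgen(Inst[] code, int pc) {
        if (pc >= code.length) {
            return "true";   // assumption: nothing left to verify past the end
        }
        Inst inst = code[pc];
        String rest = vcgen(code, pc + 1);
        if (inst.opcode.equals("safe_aaload")) {
            String i = inst.args[0];
            String a = inst.args[1];
            // Predicate from the problem set: the index is within the array bounds.
            return i + " >= 0 /\\ " + i + " < " + a + ".length /\\ " + rest;
        }
        // Rules for the other safe instructions are deliberately omitted here.
        return rest;         // all non-safe instructions
    }

    public static void main(String[] args) {
        Inst[] code = {
            new Inst("iload_2"),
            new Inst("safe_aaload", "i", "a"),
            new Inst("iinc")
        };
        System.out.println(vcgen(code, 0)); // prints i >= 0 /\ i < a.length /\ true
    }
}
```

Representing predicates as strings keeps the sketch short; a real verifier would build a formula data structure that a proof checker can consume.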
Tortilla Systems' certifying optimizing compiler generates the following code for Scrunch:
    Method java.lang.String Scrunch(java.lang.String[])
     0 bipush 100
     2 anewarray class
     5 astore_1
     6 iconst_0
     7 istore_2
     8 check aload_0.length <= 100
     9 invariant aload_0.length <= 100 /\ iload_2 >= 0 /\ aload_1.length = 100
    10 iload_2
    11 aload_0
    12 arraylength
    13 if_icmpge 22
    14 aload_1
    15 iload_2
    16 aload_0
    17 iload_2
    18 safe_aaload
    19 safe_aastore
    20 iinc 2 1
    21 goto 9
    22 ldc
    23 astore_3
    24 iconst_0
    25 istore 4
    26 invariant ???
    27 iload 4
    28 aload_0
    29 arraylength
    30 if_icmpge 40
    31 aload_3
    32 aload_1
    33 iload 4
    34 safe_aaload
    35 checkcast
    36 invokevirtual
    37 astore_3
    38 iinc 4 1
    39 goto 26
    40 aload_3
    41 areturn

This code differs from the code produced by Sun's JDK in four ways:
For simplicity, we can view the loop between instructions 9 and 21 as:
    while i < a.length do
        ar[i] := a[i]   % safe array loads and stores
        i := i + 1
    end
To prove the aaload in instruction 18 is safe, we need to show:
    VCGen (18 safe_aaload <i> <a>) == i >= 0 /\ i < a.length /\ VCGen (19)

(Hint: you should check that your answer to question 1 would produce this predicate.)
We assume (for now) VCGen (19) is true, and show i >= 0 /\ i < a.length.
Since the invariant was provided by the untrustworthy code supplier, we cannot assume it is correct. Instead, we must prove the invariant holds. Then, we use the invariant to prove VCGen(18).
The axiomatic semantics partial correctness rule for while (since it's a safety proof, we don't care about showing termination) is:
    P => Inv,
    Inv { Pred } => Inv,
    Inv /\ Pred { Statement } Inv,
    (Inv /\ ~Pred) => Q
    ___________________________________
    P { while Pred do Statement end } Q

Inv is given by instruction 9:

    9 invariant aload_0.length <= 100 /\ iload_2 >= 0 /\ aload_1.length = 100

We can rewrite this as:
a.length <= 100 /\ i >= 0 /\ ar.length = 100
P can be any predicate that we can prove from the code before the loop. The check clause gives a.length <= 100, instructions 0-5 give ar.length = 100 and instructions 6-7 give i = 0. This is argued informally, but could be shown using axiomatic semantics rules for assignment along with a specification of anewarray. This gives,
P == a.length <= 100 /\ ar.length = 100 /\ i = 0
Q is what we need to be true after the loop. Since we won't know this until doing the second loop, we can start with the weakest possible post-condition, Q = true. In question 4, you will find a stronger post-condition is needed, and change Q.
We prove each antecedent clause in turn:
1. P => Inv

       a.length <= 100 /\ ar.length = 100 /\ i = 0
           => a.length <= 100 /\ i >= 0 /\ ar.length = 100

   This is true since i = 0 => i >= 0 and all the other clauses match exactly.

2. Inv { Pred } => Inv

   Trivially true, since Pred = i < a.length is side-effect free.

3. Inv /\ Pred { Statement } Inv

   We need to show:

       (a.length <= 100 /\ i_0 >= 0 /\ ar.length = 100) /\ (i_0 < a.length) /\ i_0 = i
       {
           ar[i] := a[i]
           i := i + 1
       }
       a.length <= 100 /\ i >= 0 /\ ar.length = 100

   We push the second assignment using the axiomatic semantics assignment rule:

       (a.length <= 100 /\ i_0 >= 0 /\ ar.length = 100) /\ (i_0 < a.length) /\ i_0 = i
       {
           ar[i] := a[i]
       }
       a.length <= 100 /\ (i_0 + 1) >= 0 /\ ar.length = 100

   The first assignment does not change the length of either array or the value of i, so we need to show:

       a.length <= 100 /\ i_0 >= 0 /\ ar.length = 100 /\ i < a.length /\ i_0 = i
           ==> a.length <= 100 /\ (i_0 + 1) >= 0 /\ ar.length = 100

   This holds, since if i >= 0 we know i + 1 must also be >= 0.

4. (Inv /\ ~Pred) => Q

   Since Q is true, this always holds.

With the invariant established, Inv /\ Pred gives VCGen (18):

    a.length <= 100 /\ i >= 0 /\ ar.length = 100 /\ (i < a.length)
        ==> i >= 0 /\ i < a.length /\ VCGen (19)

This is trivially true, given our assumption that VCGen (19) is true.
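The proof obligations for the first loop can also be sanity-checked by running the loop and evaluating the invariant from instruction 9 at each step. The sketch below is a dynamic test on concrete inputs, not a proof, and the names FirstLoopInvariantCheck, inv, and invariantHolds are hypothetical.

```java
// Sketch: test (not prove) the first loop's invariant on concrete inputs.
// All names here are hypothetical.
class FirstLoopInvariantCheck {
    // Inv from instruction 9: a.length <= 100 /\ i >= 0 /\ ar.length = 100
    static boolean inv(String[] a, Object[] ar, int i) {
        return a.length <= 100 && i >= 0 && ar.length == 100;
    }

    // Runs the first loop; returns false if Inv or the safe_aaload predicate ever fails.
    static boolean invariantHolds(String[] a) {
        if (a.length > 100) {
            // Mirrors the check clause at instruction 8 rejecting the input.
            throw new IllegalArgumentException("check a.length <= 100 failed");
        }
        Object[] ar = new Object[100];
        int i = 0;
        if (!inv(a, ar, i)) return false;       // P => Inv
        while (i < a.length) {
            // Inv /\ Pred must imply the safe_aaload predicate for a[i].
            if (!(i >= 0 && i < a.length)) return false;
            ar[i] = a[i];
            i = i + 1;
            if (!inv(a, ar, i)) return false;   // the loop body preserves Inv
        }
        return true;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 100; n++) {
            if (!invariantHolds(new String[n])) {
                throw new AssertionError("invariant failed for length " + n);
            }
        }
        System.out.println("invariant held for all lengths 0..100");
    }
}
```

A test like this can catch a wrong invariant quickly, but only the axiomatic proof covers every possible input.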
a. (.05) Predicate.
Show the verification predicate your VCGen generates for instruction 19: safe_aastore <a[i]> <i> <ar>.
b. (.05) Proof.
Show the proof that VCGen (19) is satisfied
(assuming VCGen (20) is true). You may use everything that was used
in the proof of VCGen (18) above.
    while i < a.length do
        s := s.concat ((String) ar[i])   % safe load
        i := i + 1
    end

You should construct your arguments at the same level of detail as the proof for the first loop above.
b. (.15) Invariant.
Write out a loop invariant (missing from
instruction 26) that will be sufficient to prove your verification
predicate for instruction 34.
c. (.20) Proof.
Use the axiomatic semantics rule for
while to prove VCGen (34) is true (assuming VCGen (35)). Your
proof should follow the structure of the proof of VCGen (18) --- you
should prove the invariant holds first, and then use the invariant to
prove VCGen (34).
4. (.20/.50 with Challenge) Safe Cast
The next-generation Tortilla Systems virtual machine adds an
instruction safe_cast <type>. Unlike
checkcast which does (expensive) run-time checking to ensure
the run-time type satisfies the cast constraint, safe_cast
implies the type constraint can be verified statically. Our goal is
to replace the checkcast in instruction 35 with
    35 safe_cast <Class java.lang.String>
a. (.05) VCGen
b. (.05)
Show VCGen (35), the verification predicate generated for the safe_cast version of instruction 35.
c. (.10)
Prove VCGen (35) holds for the second loop. You will need to
strengthen the invariant, and assume a stronger pre-condition on entry
to the loop.
d. (.30) (Challenge)
Prove that the pre-condition you used for the second loop is true.
You will need to strengthen the invariant for the first loop.
5. (.20) Subtyping
Being oxygen-deprived at the top of the Eiffel Tower, Mertrude Bryer
suggests adding the following typing judgments to Java:
              S <= T                  (<= means is a subtype of)
    ____________________ [monotonic-arrays]
    array[S] <= array[T]

    P_1 <= Q_1, ..., P_n <= Q_n, S <= T
    _________________________________________ [monotonic-procedures]
    proc (P_1, ..., P_n) returns (S) <= proc (Q_1, ... , Q_n) returns (T)
Show that an attacker could exploit these rules by passing an argument to Scrunch that leads to a type safety violation. This means the program passes the Java type checker but contains a type error that is not detected at run time.
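As background for thinking about the array rule, standard Java already treats String[] as a subtype of Object[] and relies on a run-time check on every array store to stay sound. The minimal sketch below shows that check firing; the class name CovariantArrayDemo and the method storeCheckFires are hypothetical, and the sketch says nothing about what a virtual machine with safe instructions would do.

```java
// Sketch: Java's covariant arrays and the run-time store check that protects them.
// CovariantArrayDemo and storeCheckFires are hypothetical names.
class CovariantArrayDemo {
    // Returns true if storing a non-String through an Object[] view is rejected at run time.
    static boolean storeCheckFires() {
        Object[] objs = new String[1];       // accepted statically: String[] <= Object[]
        try {
            objs[0] = Integer.valueOf(42);   // type checks, but the array really holds Strings
            return false;
        } catch (ArrayStoreException e) {
            return true;                     // the store is checked at run time
        }
    }

    public static void main(String[] args) {
        System.out.println(storeCheckFires()); // prints true
    }
}
```

The static type checker accepts the store, so soundness here depends entirely on the run-time check.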
University of Virginia CS 655: Programming Languages |
cs655-staff@cs.virginia.edu Last modified: Mon Feb 26 12:48:14 2001 |