10 Compiler 4: Adding Control Flow (Original)
10.1 Assignment Summary
The goals of this assignment are to (1) practice exposing low-level features from the target language to the compiler, (2) introduce designing abstractions to tame low-level operations, (3) demonstrate the problems that control flow causes with program analysis, (4) introduce linking, and (5) practice thinking about programming at different levels of abstraction. We’ll expose labels and jmp instructions from x64, provide abstractions for using them in the source language, compile the latter abstraction to the former implementation, and practice reasoning about our new source language.
This assignment is due Friday Febuary 14, 2020 at 11:59pm; you have 2 weeks this time.
By Friday Febuary 7, 2020 at 11:59pm, you should have tests for all exercises, should have finished implementing 1/3–1/2 of the exercises. We will provide interim feedback by running any tests you’ve pushed to your repository against the reference solution at this time.
Assignment Checklist
You should find a new repository in your https://github.students.cs.ubc.ca account named a4_<team ids> with a code skeleton and a support library. You should complete the assignment in that git repository and push it to the GitHub Students instance.
You should first copy your solution to Compiler 3: Register Allocation (Corrected) into the starter code provided.
If we patch the code skeleton or library after the release, the most recent version will be available here at a4-skeleton.rkt and a4-compiler-lib.rkt, and the functional graph library a4-graph-lib.rkt. The name of the skeleton is a4-skeleton.rkt to avoid accidentally overwriting your files, but your file in the Git repository should be named a4.rkt.
Per Organizing Racket Code, you are not required to keep all of your code in a single file, but a4.rkt must export the at least the names exported by a4-skeleton.rkt, and you must follow the instructions on organizing your code from Organizing Racket Code.
; Exercise 20 (define (values-lang-fib n) ...) ; Exercise 21 ; When I compared the two implementations I found ...
For non-obvious test cases, add a comment explaining which language feature or edge case you are testing. For trivial tests, such as (check-exn exn:fail? (thunk (check-values-lang '()))), this is not necessary and will only make your code less readable.
Redesign and extend the implementation of generate-x64, a compiler from Paren-x64 v3 to x64.
Design and implement link-paren-x64, a linker for the Paren-x64 v3 (equivalently, a compiler from Paren-x64 v3 to Paren-x64-rt).
Redesign and extend the implementation of interp-paren-x64, the interpreter for Paren-x64 v3
Redesign and extend the implementation of patch-instructions, a compiler from Paren-asm v2 to Paren-x64 v3.
Design and implement flatten-program, a compiler from Block-asm-lang to Paren-asm v2.
Design and implement expose-basic-blocks, a compiler from Block-nested-lang to Block-asm-lang.
Redesign and extend the implementation of replace-locations, a compiler from Block-assigned-lang to Block-nested-lang.
Design and implement discard-call-live, a compiler from Block-jump-live-lang to Block-assigned-lang.
Redesign and extend the implementation of assign-registers, a compiler from Conflict-lang v2 to Block-jump-live-lang.
Redesign and extend the implementation of conflict-analysis, a compiler from Undead-block-lang to Conflict-lang v2.
Redesign and extend the implementation of undead-analysis, a compiler from Block-locals-lang to Undead-block-lang.
Redesign and extend the implementation of uncover-locals, a compiler from Block-lang to Block-locals-lang.
Redesign and extend the implementation of select-instructions, a compiler from Values-unique-lang v2 to Block-lang.
Redesign and extend the implementation of uniquify, a compiler from Values-lang v2 to Values-unique-lang v2.
10.2 Language Diagram
10.3 Related Reading
You can find additional reading material on register allocation and undead analysis at the resources below. These comes from other courses that use a compiler that is similar to, but different from, your compiler. Their intermediate languages are different, and their compiler pipeline is different. There is no one universal truth of how these passes work for all compilers in all situations, so you will need to generalize from these lessons and apply them to your compiler.
R. Kent Dybvig’s notes on register allocation kents-notes-on-register-alloc.html. This language allows mixing abstract locations and real locations in the same language, unlike your intermediate languages.
These notes are courtesy of R. Kent Dybvig, author of the Chez Scheme compiler https://github.com/cisco/ChezScheme.
Essentials of Compilation, Chapters 3.2, 3.3, and 3.4 book.pdf. These chapters include notes on calling conventions and function call/return, which your compiler should not support yet.
This is a freely available book written by the group at Indiana University: https://github.com/IUCompilerCourse/Essentials-of-Compilation
10.4 Preface: What’s wrong with Values-lang
The language Values-lang has a significant limitation: we can only expression simple, straight-line arithmetic computations. We’ll never be able to write any interesting programs!
In this assignment, we will expose a machine feature, control flow instructions in the form of labels and jmp instructions, and systematically abstract these. This requires changes to nearly every pass and every intermediate language.
We’ll proceed bottom-up, but first let’s take a look at our goal. We want to extend Values-lang with control-flow features: a form of if expression, top-level function definitions, and non-returning function calls (tail calls).
p | ::= | (module b ... e) | ||
b | ::= | (define x (lambda (x ...) e)) | ||
e | ::= | tail | ||
| | (let ([x n]) e) | |||
| | (if (cmp v v) e e) | |||
n | ::= | v | ||
| | (binop v v) | |||
tail | ::= | n | ||
| | (apply x v ...) | |||
x | ::= | name | ||
v | ::= | int64 | ||
| | x | |||
binop | ::= | * | ||
| | + | |||
cmp | ::= | neq? | ||
| | eq? | |||
| | < | |||
| | <= | |||
| | > | |||
| | >= |
Note that an if expression, (if (cmp v v) e e), cannot branch on an arbitrary expression. It is restricted so that a comparison between two values must appear in the predicate position. This is a necessary artifact inherited from x64. Even by the end of this assignment, we will not yet have enough support from the low-level language to add values that represent the result of a comparison, i.e., we won’t have booleans.
Note also that a function can only be called as the very last expression. This is because we will not have introduced enough to implement returning from a function, so all function calls are restricted to only jump forward.
Despite the limitations of Values-lang v2, the language is finally expressive enough to write interesting programs. We’ll therefore implement a type checking tool to help us program in Values-lang v2, and spend some time writing programs to understand its expressiveness and its limitations. If you prefer, you can jump to the last few exercises and complete those first.
10.5 Exposing Control-Flow Primitives
We’ll start by designing a new Paren-x64 v3 to expose the additional features of x64 necessary to implement control flow abstractions. This extends the previous Paren-x64 v2 with comparison operations, labels, and conditional and unconditional jump operations.
p | ::= | (begin s ...) | ||
s | ::= | (set! loc triv) | ||
| | (set! reg loc) | |||
| | (set! reg_1 (binop reg_1 int32)) | |||
| | (set! reg_1 (binop reg_1 loc)) | |||
| | (define label s) | |||
| | (jump trg) | |||
| | (compare reg opand) | |||
| | (jump-if cmp label) | |||
trg | ::= | reg | ||
| | label | |||
triv | ::= | trg | ||
| | int64 | |||
opand | ::= | int64 | ||
| | reg | |||
loc | ::= | reg | ||
| | addr | |||
reg | ::= | rsp | ||
| | rbp | |||
| | rax | |||
| | rbx | |||
| | rcx | |||
| | rdx | |||
| | rsi | |||
| | rdi | |||
| | r8 | |||
| | r9 | |||
| | r10 | |||
| | r11 | |||
| | r12 | |||
| | r13 | |||
| | r14 | |||
| | r15 | |||
addr | ::= | (rbp - dispoffset) | ||
| | (rbp + dispoffset) | |||
binop | ::= | * | ||
| | + | |||
cmp | ::= | neq? | ||
| | eq? | |||
| | < | |||
| | <= | |||
| | > | |||
| | >= |
(define (label? s) (and (symbol? s) (regexp-match-exact? #rx"L\\..+\\.[0-9]+" (symbol->string s))))
L1: |
L2: |
mov rax, 42 |
In Paren-x64 v3, we model labels with the (define label s) instruction, which defines a label label at the instruction s in the instruction sequence. This corresponds to the x64 string label:\n s. Note that they can be nested, allowing the same behavior as chaining labels in x64. For convenience, we assume all labels are symbols of the form L.<name>.<number>, and are unique.
The new comparison instruction (compare reg opand) corresponds to the x64 instruction cmp reg, opand. This instruction compares reg to opand and sets some flags in the machine describing their relation, such as whether reg is less than opand, or whether they are equal. The flags are used by the next condition jump instruction.
The conditional jump instructions in x64, in the same order as the cmp, are: jne trg, je trg, jl trg, jle trg, jg trg, and jge trg. Each corresponds to "jump to trg if the comparison flag is set to ___". For example, the instruction je trg jumps to trg if the comparison flag "equal" is set.
In Paren-x64 v3, we abstract the various conditional jump instructions into a single instruction with multiple flags. The instruction je l corresponds to (jump-if eq? l). jl l jumps to l if comparison flag "less than" is set, and corresponds to (jump-if < l). The rest of the instructions follow this pattern.
Exercise 1: Redesign and extend the implementation of generate-x64 to support control-flow primitives. The source language is Paren-x64 v3 and the target language is x64.
Labels and jumps are a small change to the language syntactically, but have a large effect on the semantics. You’ll notice this, for example, when writing the interpreter for a language with labels and jumps compared to a language without them.
We can no longer write the Paren-x64 v3 interpreter in one simple loop
over the instructions. Instead, we need some way to resolve labels. That way,
when running the interpreter, we can easily jump to any expression at any
time—
We’re going to write a simple linker for Paren-x64 v3 to give you a rough idea of how the operating system’s linker works. We’ll use a low-level linking implementation that is similar to the operating systems linker: we first resolve all labels to their address in memory (in our case, their index in the instruction sequence) and then implement jumps by simply setting a program counter to the instruction’s address.
To do this, we design a new language Paren-x64-rt, which represents the run-time language used by the interpreter after linking.
p | ::= | (begin s ...) | ||
s | ::= | (set! loc triv) | ||
| | (set! reg loc) | |||
| | (set! reg_1 (binop reg_1 int32)) | |||
| | (set! reg_1 (binop reg_1 loc)) | |||
| | (jump trg) | |||
| | (compare reg opand) | |||
| | (jump-if cmp pc-addr) | |||
trg | ::= | reg | ||
| | pc-addr | |||
triv | ::= | trg | ||
| | int64 | |||
opand | ::= | int64 | ||
| | reg | |||
loc | ::= | reg | ||
| | addr | |||
reg | ::= | rsp | ||
| | rbp | |||
| | rax | |||
| | rbx | |||
| | rcx | |||
| | rdx | |||
| | rsi | |||
| | rdi | |||
| | r8 | |||
| | r9 | |||
| | r10 | |||
| | r11 | |||
| | r12 | |||
| | r13 | |||
| | r14 | |||
| | r15 | |||
addr | ::= | (rbp - dispoffset) | ||
| | (rbp + dispoffset) | |||
binop | ::= | * | ||
| | + | |||
cmp | ::= | neq? | ||
| | eq? | |||
| | < | |||
| | <= | |||
| | > | |||
| | >= |
(define pc-addr? natural-number/c)
We remove the instruction (define label s), which the linker will remove, and turn all label values into pc-addr, a representation of an address recognized by the interpreter. In our case, since programs are represented as lists of instructions, a pc-addr is a natural number representing the position of the instruction in the list.
Exercise 2: Design and implement link-paren-x64, which resolves all labels into the address of the instruction in the instruction sequence. The source language is Paren-x64 v3 and the target language is Paren-x64-rt.
Exercise 3: Redesign and extend the implementation of interp-paren-x64 to support control-flow primitives. The source language is Paren-x64 v3 and the output is a Racket integer.
It should use link-paren-x64 to resolve all labels, and should use a program counter to loop over the instruction sequence instead of folding over the sequence. This means some inner helper function accepts Paren-x64-rt programs.
You may want to refer to the code skeleton in a4-skeleton.rkt.
Next, we redesign Paren-asm v2, exposing the new instructions while abstracting away from the machine constraints about which instructions work on which physical locations. Now jumps can target arbitrary locations, and compare can compare arbitrary locations. We can also move labels into locations.
p | ::= | (begin s ...) | ||
s | ::= | (set! loc triv) | ||
| | (set! loc (binop loc opand)) | |||
| | (define label s) | |||
| | (halt opand) | |||
| | (jump trg) | |||
| | (compare loc opand) | |||
| | (jump-if cmp label) | |||
binop | ::= | * | ||
| | + | |||
triv | ::= | opand | ||
| | label | |||
opand | ::= | int64 | ||
| | loc | |||
trg | ::= | label | ||
| | loc | |||
loc | ::= | reg | ||
| | addr | |||
reg | ::= | rsp | ||
| | rbp | |||
| | rbx | |||
| | rcx | |||
| | rdx | |||
| | rsi | |||
| | rdi | |||
| | r8 | |||
| | r9 | |||
| | r10 | |||
| | r11 | |||
| | r12 | |||
| | r13 | |||
| | r14 | |||
| | r15 | |||
addr | ::= | (rbp - dispoffset) | ||
| | (rbp + dispoffset) | |||
cmp | ::= | neq? | ||
| | eq? | |||
| | < | |||
| | <= | |||
| | > | |||
| | >= |
While we’re redesigning, we also lift some restrictions to allow our compiler to "not be stupid". We generalize Paren-asm v2 to allow literal values in some operands. We allow jumps to take either a location or a label directly, and allow binops and halt to take either a value or a location. Relaxing this restriction makes Paren-asm v2 less uniform, but makes it easier to work with, and allows us to produce better code.
Forcing ourselves to "be stupid" is a common hazard when designing abstraction
boundaries. Previously, in our attempt to abstract away from annoying details
of x64 and make a more uniform language, we actually forced
our compiler to produce worse code. We required all operations in
Paren-asm to work uniformly over locations only, even
though x64 supports working over values for many instructions. This
required all compiler passes that target Paren-asm to introduce extra
instructions. While we could "optimize" those instructions away, it’s often
better to "not be stupid"—
Exercise 4: Redesign and extend the implementation of patch-instructions to implement the new instructions jump and compare instructions that don’t exist in x64 by instruction sequences that are valid in x64. The source language is Paren-asm v2 and the target is Paren-x64 v3.
It will be tricky to implement the instruction (compare addr addr) with only one temporary register. You can do it in four Paren-x64 v3 instructions.
You should continue to use rax as a temporary register.
10.6 Blocks: Abstracting labeled instructions
In Paren-asm v2, control can jump to any instruction at any time. This makes giving semantics to programs, such as writing an interpreter or an analysis, very difficult. At any label, we must make assumptions about the state of the machine, since the state is not affected only by the sequence of instructions that came before the label in the instruction sequence, but potentially arbitrary instructions that were executed prior to a jumping to the label.
To simplify reasoning about programs, we can organizing code into labeled blocks, assuming that control flow only enters the beginning of a block and exits the end of a block. This block abstraction simplifies reasoning about control flow. We have more structure on which to hang assumption, and can make more assumptions about code when writing analyses. We want to introduce this abstraction before we have to extend the register allocator, which currently does the most analysis in our compiler.
We design Block-asm-lang, a block-structured abstract assembly language in which sequences of statements are organized into blocks, and code can jump between blocks. Labels are no longer instructions that can happen anywhere; instead, each block is labeled. Jumps cannot appear just anywhere; instead, they happen only at the end of a block.
p | ::= | (module b ... b) | ||
b | ::= | (define label info tail) | ||
info | ::= | (any ...) | ||
tail | ::= | (begin s ... tail) | ||
| | (jump trg) | |||
| | (if (cmp loc opand) (jump trg) (jump trg)) | |||
| | (halt opand) | |||
s | ::= | (set! loc triv) | ||
| | (set! loc (binop loc opand)) | |||
binop | ::= | * | ||
| | + | |||
triv | ::= | opand | ||
| | label | |||
opand | ::= | int64 | ||
| | loc | |||
trg | ::= | label | ||
| | loc | |||
loc | ::= | reg | ||
| | addr | |||
reg | ::= | rsp | ||
| | rbp | |||
| | rbx | |||
| | rcx | |||
| | rdx | |||
| | rsi | |||
| | rdi | |||
| | r8 | |||
| | r9 | |||
| | r10 | |||
| | r11 | |||
| | r12 | |||
| | r13 | |||
| | r14 | |||
| | r15 | |||
addr | ::= | (rbp - dispoffset) | ||
| | (rbp + dispoffset) | |||
cmp | ::= | neq? | ||
| | eq? | |||
| | < | |||
| | <= | |||
| | > | |||
| | >= |
In Block-asm-lang, a program is a non-empty sequence of labeled blocks. We consider the first block in the sequence to be the start of the program. A tail represents a self-contained block of statements. Jumps can only appear at the end of blocks, and jumps only enter the beginning of blocks. The halt instruction should only appear at the end of the final block; it cannot stop control flow, but only indicates that if the program has ended, the opand is the final value.
Note that now there is an info field for each block. Many analyses that previously worked over the entire program must now work over each block.
Exercise 5: Design and implement flatten-program to flatten all blocks into straight-line code. The source language is Block-asm-lang and the target language is Paren-asm v2.
Challenge 1: In Paren-asm v2, it’s unnecessary to have a jump when the target of the jump is the next instruction. Design and implement an optimization pass, called inline-jump, that eliminates these unnecessary jumps. The source language is Paren-asm v2 and target is Paren-asm v2
Challenge 2: Branches are often thought of as expensive, so real compilers often perform trace scheduling, an optimization that rearranges blocks and takes advantage of fall-through between blocks to reduce the number of jumps.
Implement a function trace-schedule to perform this optimization. The source language is Block-asm-lang and the target language is Block-asm-lang.
This optimization works in conjunction with the previous optimization.
Before we start modifying the register allocator, we will make a simplifying assumption: we should allow nested tails. This means nested if statements, and begins in the branches of an if statement.
It is not obvious why we would want to make this change, and this is one place where our back-to-front design fails us. In truth, compiler design, like most software design, is an interative process. We must look ahead a few steps in the compiler to see the problem.
If we were following our nose and propagating our new features up to the next abstraction layer, the next step might be to simply start on the register allocator, extending it with support for blocks. As soon as we do that, we realize that analyzing arbitrary jump instructions is kind of complicated. However, analyzing if is somewhat less complicated, as the structure of the control flow is more constrained. By allowing tails to be nested, we can allow more ifs and fewer apparent jumps in the source language of the analyses. The fewer jumps, the better job the analysis can do, and the better code it can produce. So we want to minimizes jumps in the language that we analyze.
We introduce Block-nested-lang, which allows nesting some sequences of expressions that would otherwise require additional blocks and jumps. In particular, if and begin can be nested. This means we could have an if statement of the form (if (cmp loc loc) (begin s ... (jump loc)) (begin s ... (jump loc))). Compiling this require introducing a new fresh label. The nesting structure allows all earlier compiler passes to ignore jumps that are quite easy to eliminate. In general, jumps are difficult to deal with (as we will see), so avoiding them for as long as possible is generally a good design choice.
p | ::= | (module b ... b) | ||
b | ::= | (define label info tail) | ||
info | ::= | (any ...) | ||
tail | ::= | (begin s ... tail) | ||
| | (jump trg) | |||
| | (if (cmp loc opand) tail tail) | |||
| | (halt opand) | |||
s | ::= | (set! loc triv) | ||
| | (set! loc (binop loc opand)) | |||
binop | ::= | * | ||
| | + | |||
triv | ::= | opand | ||
| | label | |||
opand | ::= | int64 | ||
| | loc | |||
trg | ::= | label | ||
| | loc | |||
loc | ::= | reg | ||
| | addr | |||
reg | ::= | rsp | ||
| | rbp | |||
| | rbx | |||
| | rcx | |||
| | rdx | |||
| | rsi | |||
| | rdi | |||
| | r8 | |||
| | r9 | |||
| | r10 | |||
| | r11 | |||
| | r12 | |||
| | r13 | |||
| | r14 | |||
| | r15 | |||
addr | ::= | (rbp - dispoffset) | ||
| | (rbp + dispoffset) | |||
cmp | ::= | neq? | ||
| | eq? | |||
| | < | |||
| | <= | |||
| | > | |||
| | >= |
Exercise 6: Design and implement expose-basic-blocks, which creates new blocks of any nested tails and replaces those nested tails with jumps. The source language is Block-nested-lang and the target language is Block-asm-lang.
You may want to use fresh-label from a4-compiler-lib.rkt.
Next we need to extend the register allocator and analyses to support blocks. Each block will have its own assignment of variables, and conflicts and undead sets will be computed per block.
We design Block-assigned-lang below, an extension of Loc-assigned-lang with blocks.
p | ::= | (module b ... b) | ||
b | ::= | (define label info tail) | ||
info | ::= | ((assignment ((aloc rloc) ...)) any ...) | ||
tail | ::= | (begin s ... tail) | ||
| | (jump trg) | |||
| | (if (cmp aloc opand) tail tail) | |||
| | (halt opand) | |||
s | ::= | (set! aloc triv) | ||
| | (set! aloc (binop aloc opand)) | |||
binop | ::= | * | ||
| | + | |||
triv | ::= | opand | ||
| | label | |||
opand | ::= | int64 | ||
| | aloc | |||
trg | ::= | label | ||
| | aloc | |||
rloc | ::= | reg | ||
| | addr | |||
reg | ::= | rsp | ||
| | rbp | |||
| | rbx | |||
| | rcx | |||
| | rdx | |||
| | rsi | |||
| | rdi | |||
| | r8 | |||
| | r9 | |||
| | r10 | |||
| | r11 | |||
| | r12 | |||
| | r13 | |||
| | r14 | |||
| | r15 | |||
addr | ::= | (rbp - dispoffset) | ||
| | (rbp + dispoffset) | |||
cmp | ::= | neq? | ||
| | eq? | |||
| | < | |||
| | <= | |||
| | > | |||
| | >= |
(define (aloc? s) (and (symbol? s) (not (register? s)) (not (label? s)) (regexp-match-exact? #rx".+\\.[0-9]+" (symbol->string s))))
Note that we modify the definition of aloc? to ensure alocs and labels are distinct.
Exercise 7: Redesign and extend the implementation of replace-locations to support blocks and the control-flow primitives. The source language is Block-assigned-lang and the target language is Block-nested-lang.
Before we can extend our analyses, we need to assume that the program has some additional structure: that each jump tells us which locations are live across a jump. Without this, conflict analysis and undead analysis will have some trouble analyzing a jump statement.
Below, we design Block-jump-live-lang with a modified jump instruction to include all locations that the target of the jump might use. You can think of these as the "arguments" to the block. However, their only purpose here it to communicate to the analysis which locations should be considered undead.
Notice that if we did not allow nested tails, the branches of each if expression would require "arguments". While we get to assume the arguments are given to us for now, later we must produce these arguments. Trying to produce arguments for every if expression would add needless complexity, slow down our analyses, and make our compiler work harder to optimize jumps.
There are many design choices in a compiler like this. If we introduce an abstraction too late, our compiler may suffer. However, as we saw in the design of Paren-asm vs Paren-asm v2, premature abstraction boundaries can cause us to suffer as well.
p | ::= | (module b ... b) | ||
b | ::= | (define label info tail) | ||
info | ::= | ((assignment ((aloc rloc) ...)) any ...) | ||
tail | ::= | (begin s ... tail) | ||
| | (jump trg aloc ...) | |||
| | (if (cmp aloc opand) tail tail) | |||
| | (halt opand) | |||
s | ::= | (set! aloc triv) | ||
| | (set! aloc (binop aloc opand)) | |||
binop | ::= | * | ||
| | + | |||
triv | ::= | opand | ||
| | label | |||
opand | ::= | int64 | ||
| | aloc | |||
trg | ::= | label | ||
| | aloc | |||
rloc | ::= | reg | ||
| | addr | |||
reg | ::= | rsp | ||
| | rbp | |||
| | rbx | |||
| | rcx | |||
| | rdx | |||
| | rsi | |||
| | rdi | |||
| | r8 | |||
| | r9 | |||
| | r10 | |||
| | r11 | |||
| | r12 | |||
| | r13 | |||
| | r14 | |||
| | r15 | |||
addr | ::= | (rbp - dispoffset) | ||
| | (rbp + dispoffset) | |||
cmp | ::= | neq? | ||
| | eq? | |||
| | < | |||
| | <= | |||
| | > | |||
| | >= |
Exercise 8: Design and implement the function discard-call-live, which throws away the set of locations live across a jump, i.e., translates each (jump trg aloc ...) into (jump trg), discarding the undeadness annotations. The source language is Block-jump-live-lang and the target is Block-assigned-lang.
Next, we must extend the register allocator to support blocks. As before, the register allocator gets to assume it has information about locals and conflicts. We extend that idea to blocks: each block is now annotated with its locals and its conflicts. Conceptually, the allocator should now run the same algorithm as before, but on each block independently. However, to different blocks may refer to the same abstract location. Conceptually, an abstract location is just the name of a unique location, so all references to it must be to the same physical location. This means the allocator should always assign the same abstract location the same physical location.
We design Conflict-lang v2 below with blocks and control-flow primitives.
p | ::= | (module b ... b) | |||||
b | ::= | (define label info tail) | |||||
info | ::= |
| |||||
tail | ::= | (begin s ... tail) | |||||
| | (jump trg aloc ...) | ||||||
| | (if (cmp aloc opand) tail tail) | ||||||
| | (halt opand) | ||||||
s | ::= | (set! aloc triv) | |||||
| | (set! aloc (binop aloc opand)) | ||||||
binop | ::= | * | |||||
| | + | ||||||
triv | ::= | opand | |||||
| | label | ||||||
opand | ::= | int64 | |||||
| | aloc | ||||||
trg | ::= | label | |||||
| | aloc | ||||||
cmp | ::= | neq? | |||||
| | eq? | ||||||
| | < | ||||||
| | <= | ||||||
| | > | ||||||
| | >= |
Exercise 9: Redesign and extend the implementation of assign-registers to assign registers for each block. The source language is Conflict-lang v2 and the target language is Block-jump-live-lang.
Recall that you may not use the register rax, since it is used to patch instructions, or rbp, since it contains the frame pointer.
We must also extend each analysis to support blocks.
First we extend conflict analysis. Below we define Undead-block-lang.
p | ::= | (module b ... b) | |||||
b | ::= | (define label info tail) | |||||
info | ::= |
| |||||
tail | ::= | (begin s ... tail) | |||||
| | (jump trg aloc ...) | ||||||
| | (if (cmp aloc opand) tail tail) | ||||||
| | (halt opand) | ||||||
s | ::= | (set! aloc triv) | |||||
| | (set! aloc (binop aloc opand)) | ||||||
binop | ::= | * | |||||
| | + | ||||||
triv | ::= | opand | |||||
| | label | ||||||
opand | ::= | int64 | |||||
| | aloc | ||||||
trg | ::= | label | |||||
| | aloc | ||||||
cmp | ::= | neq? | |||||
| | eq? | ||||||
| | < | ||||||
| | <= | ||||||
| | > | ||||||
| | >= |
Exercise 10: Redesign and extend the implementation of conflict-analysis to decorate each block with a conflict graph. The source language is Undead-block-lang and the target language is Conflict-lang v2.
Finally, we extend the undead analysis. We design Block-locals-lang below.
p | ::= | (module b ... b) | ||
b | ::= | (define label info tail) | ||
info | ::= | ((locals (aloc ...)) any ...) | ||
tail | ::= | (begin s ... tail) | ||
| | (jump trg aloc ...) | |||
| | (if (cmp aloc opand) tail tail) | |||
| | (halt opand) | |||
s | ::= | (set! aloc triv) | ||
| | (set! aloc (binop aloc opand)) | |||
binop | ::= | * | ||
| | + | |||
triv | ::= | opand | ||
| | label | |||
opand | ::= | int64 | ||
| | aloc | |||
trg | ::= | label | ||
| | aloc | |||
cmp | ::= | neq? | ||
| | eq? | |||
| | < | |||
| | <= | |||
| | > | |||
| | >= |
Exercise 11: Redesign and extend the implementation of undead-analysis to collect undead-in sets for each block. The source language is Block-locals-lang and the target is Undead-block-lang. Only the info field of each block of the output program is modified, with the addition of the the undead-in set.
Finally, we extend Loc-lang to Block-lang, a block-structured location-oriented language.
p | ::= | (module b ... b) | ||
b | ::= | (define label tail) | ||
tail | ::= | (begin s ... tail) | ||
| | (jump trg aloc ...) | |||
| | (if (cmp aloc opand) tail tail) | |||
| | (halt opand) | |||
s | ::= | (set! aloc triv) | ||
| | (set! aloc (binop aloc opand)) | |||
binop | ::= | * | ||
| | + | |||
triv | ::= | opand | ||
| | label | |||
opand | ::= | int64 | ||
| | aloc | |||
trg | ::= | label | ||
| | aloc | |||
cmp | ::= | neq? | ||
| | eq? | |||
| | < | |||
| | <= | |||
| | > | |||
| | >= |
Exercise 12: Redesign and extend the implementation of uncover-locals to add a list of all variables used in each block to the info field of the block. The source language is Block-lang and the target language is Block-locals-lang.
10.7 Abstracting Low-level Control Flow
We’ve successfully exposed low-level branching and jump operations from x64 up through Block-lang. Now we need to design value-oriented versions of if and jump.
In Values-lang v2, we extend expressions with an if expression that takes a comparison between two values. We have to restrict the predicate position because we have no explicit representation of the value that a comparison operation produces, i.e., we don’t have booleans. All the intermediate languages simply use the comparison instruction, which sets x64 flags that we do not yet have access to. The lower level languages we’ve design essentially treat compare-and-branch as an atomic statement.
We also add an apply expression that can call a declared function with some arguments. We must restrict calls to be in tail position, i.e., as the final expression in a block, since we have no way to return from a function call yet.
First we modify Values-unique-lang v2 to include the if and apply expressions.
Notice that we also modify a program syntax to be a list of function definitions followed by an expression. The final expression is where code begins executing.
p | ::= | (module b ... e) | ||
b | ::= | (define label (lambda (aloc ...) e)) | ||
e | ::= | tail | ||
| | (let ([aloc n]) e) | |||
| | (if (cmp v v) e e) | |||
n | ::= | v | ||
| | (binop v v) | |||
tail | ::= | n | ||
| | (apply x v ...) | |||
x | ::= | label | ||
| | aloc | |||
v | ::= | int64 | ||
| | x | |||
binop | ::= | * | ||
| | + | |||
cmp | ::= | neq? | ||
| | eq? | |||
| | < | |||
| | <= | |||
| | > | |||
| | >= |
Now that we have more than one kind of value (we have int64 and label), we begin to run into the problem of ill-typed operations. For example, our run-time system assumes that we halt with an integer, but the syntax of Values-unique-lang v2 allows the final expression to be a label. Similarly, a binop will only succeed if it is given two integers, but not if we try to add an integer and a label.
We have two options to make our language make sense: we can add dynamic type checking, or we can assume the problem never happens because the constraints of a higher language and its compilation strategy avoid this scenario. As we’re still dealing with quite a low-level language, we choose the latter option.
Similarly, we assume that labels are never used as arguments to binops, cmps. They can be used as arguments to apply, though.
Note that you won’t be able to write a checker to enforce this property on all programs, so you must simply assume it.
Exercise 13: Redesign and extend the implementation of select-instructions to translate value-oriented operations into location-oriented instruction. The source language is Values-unique-lang v2 and the target language is Block-lang.
Finally, we discharge the last assumption: that we know which names refer to which abstract locations, and thus introduce the last abstraction. The new source language, Values-lang v2, is defined below.
p | ::= | (module b ... e) | ||
b | ::= | (define x (lambda (x ...) e)) | ||
e | ::= | tail | ||
| | (let ([x n]) e) | |||
| | (if (cmp v v) e e) | |||
n | ::= | v | ||
| | (binop v v) | |||
tail | ::= | n | ||
| | (apply x v ...) | |||
x | ::= | name | ||
v | ::= | int64 | ||
| | x | |||
binop | ::= | * | ||
| | + | |||
cmp | ::= | neq? | ||
| | eq? | |||
| | < | |||
| | <= | |||
| | > | |||
| | >= |
We continue to assume the result of any block or program cannot be a label, and that labels are not used as argument to binop or cmp.
You may find this implementation of alpha equivalence for Values-lang v2 helpful: a4-alpha-equal.rkt. The code is annotated to demonstrate idiomatic Racket style, and explain feature you may be unfamiliar with.
Exercise 14: Redesign and extend the implementation of uniquify. The source language is Values-lang v2 and the target language is Values-unique-lang v2.
10.8 Thinking like a compiler
We now, finally, have a source language in which we can write interesting programs. It is not like most programming languages that you’re likely to have written programs in. It has limitations designed into it because of what we know how to implement, and implement efficiently. This will always be the case, in everything language you ever write, and so it is useful to learn to think in a new layer of abstraction.
Exercise 15: Redesign and implement the function check-values-lang, which checks that its input is a valid Values-lang v2 program, or raises a helpful error if not. In this version, you must check that undefined variables are never used; it is now an error for a programmer to write a free variable in Values-lang v2.
Exercise 16: Design and implement a type checker for Values-lang v2 called type-check-values-lang. The input is a syntactically valid Values-lang v2 program, and the output is a Values-lang v2 program. However, it should perform that following checks and potentially raise an error:
If apply is called on a known function, then there are exactly as many arguments as declared by the function. Otherwise, raise a type error. A known function is a name that definitely refers to a function.
If apply is called on a name that refers to a number, then raise a type error.
If a binop is called with a known function as either argument, then raise a type error.
If a cmp is called with a known function as either argument, then raise a type error.
Exercise 17: You cannot implement a type checker for Values-lang v2 that can catch all type errors possible in the language. Briefly explain why not, and give three examples of type errors still possible in Values-lang v2. Explain what will happen when each of these examples is executed. Think through the program; don’t just try to compile and run them. Suggest how Values-lang v2 could be modified so it would be possible to catch each of your examples. Can you imagine any benefits for your compiler if you could guarantee that these examples were impossible?
Exercise 18: Design and implement the function interp-values-lang, an interpreter for Values-lang v2. It should receive a Values-lang v2 program as input and return a Racket number?. The interpreter may call type-check-values-lang on its input.
Exercise 19: Design and implement the function values-lang-fact, a Racket function that should take a Racket natural number as input and return a Values-lang v2 program. The result should be a program that calls the factorial function, implemented in Values-lang v2, on the number provided.
While factorial should be implemented in Values-lang v2, since Values-lang v2 cannot take input from the user, we simulate input by generating a new Values-lang v2 program with the input provided.
Make sure you use the function to test your interpreter against your compiler.
> (interp-values-lang (values-lang-fact 1)) 1
> (interp-values-lang (values-lang-fact 5)) 120
> (interp-values-lang (values-lang-fact 6)) 720
Exercise 20: Design and implement the function values-lang-fib, a Racket function that should take a Racket natural number as input and return a Values-lang v2 program. The result should be a program that calls the Fibonacci function, implemented in Values-lang v2, on the number provided.
While the Fibonacci function should be implemented in Values-lang v2, since Values-lang v2 cannot take input from the user, we simulate input by generating a new Values-lang v2 program with the input provided.
Make sure you use the function to test your interpreter against and your compiler.
Exercise 21: Last week, you were asked to compared two versions of your compiler: one with assign-homes against one with assign-homes-opt. Now, you can express programs with loops and indirection. Try the experiment again. What do you find?
10.9 Hints
To compile halt to Paren-x64, it may help if you generate a labeled instruction that essentially does nothing. This way, you can insert a label at the end of your program. Otherwise, you’ll need to think carefully about how to arrange and generate blocks.
The register allocator must ensure that abstract locations shared across multiple blocks are assigned the same physical location. This is a small but subtle change to the algorithm.
Essentially, we foldr the previous algorithm over the list of blocks.
We apply the earlier algorithm to produce an assignment for each block. That algorithm returns an assignment. That assignment should be passed to the next block in the list as the current assignment. Before running the core algorithm, you should "pre-allocate" all names in the current assignment by removing them from the locals list and conflict graph. This prevents them from being re-allocated by the core algorithm. The core algorithm should be modified to return the current assignment in the base case.
The undead analysis is greatly simplified if you think hard about the data structure used to represent undead sets. In the previous assignment, undead sets were a list of sets of names. This works because programs were essentially lists of instructions, so the mapping was one-to-one. Now, programs are not lists, but trees.
Here’s an example of applying undead-analysis to a program with interesting features.
The reference compiler generates both undead-in and undead-out sets. Due to some confusion in the description of A3 vs the examples from A3, your compiler may use either. (Either can be used in conflict-analysis.)
Examples:> (require racket/pretty) > (pretty-print-current-style-table (pretty-print-extend-style-table (pretty-print-current-style-table) '(module) '(begin))) > (pretty-display (undead-analysis '(module (define L.main.1 ((locals (x.1 x.2 x.3 x.4))) (begin (set! x.1 0) (set! x.2 0) (begin (set! x.3 0) (set! x.4 0) (if (neq? x.3 x.4) (jump L.exit.1 x.1) (halt x.2))))) (define L.exit.1 ((locals (x.1))) (halt x.1))))) (module
(define L.main.1
((locals (x.1 x.2 x.3 x.4))
(undead-in
(() (x.1) ((x.2 x.1) (x.3 x.1 x.2) ((x.2 x.1 x.4 x.3) (x.1) (x.2)))))
(undead-out
((x.1) (x.1 x.2) ((x.2 x.1 x.3) (x.3 x.4 x.1 x.2) ((x.2 x.1) (x.1) ()))))
(undead
(() (x.1) ((x.2 x.1) (x.3 x.1 x.2) ((x.2 x.1 x.4 x.3) (x.1) (x.2))))))
(begin
(set! x.1 0)
(set! x.2 0)
(begin
(set! x.3 0)
(set! x.4 0)
(if (neq? x.3 x.4) (jump L.exit.1 x.1) (halt x.2)))))
(define L.exit.1
((locals (x.1)) (undead-in (x.1)) (undead-out ()) (undead (x.1)))
(halt x.1)))
For select-instructions, you need to translate an apply in to a jump. Given the abstractions of the lower languages, we can’t really "call a function". We can only set some locations and jump. So we can’t implement a call unless we know what abstract locations to set before jumping. This means we can only jump to a known function. We’ll need calling conventions before we can handle indirect jumps, i.e., jumps to labels that are not statically known.
This restricts the language further than the syntactic restrictions.
Given the abstractions of the lower languages, also don’t know how to implement "returning" from a function call, so we can’t have anything left to do after a function call. This restriction is already encoded in the syntax.Ron notes that if you really wanted to support indirect function calls without calling conventions, and you have a whole program, you could set all possible abstract locations for all labels that could be the target of the jump. This would be very slow, and result in lots of unreadable code, but might work.