15 Compiler 8: Structured Data Types

7.5.0.17

15 Compiler 8: Structured Data Types

15.1 Assignment Summary

The goal of this assignment is to introduce memory allocation and structured data. We will add cons pairs, vectors, and procedures. We will extend out tagged object representation to distinguished these data types dynamically.

This assignment is due Friday, March 20, 2020 at 11:59pm.

You can use the reference solution here, but if you hammer it it might go down and not come back up. https://www.students.cs.ubc.ca/~cs-411/2019w2/a8-interrogator.cgi

A file with tests is available here: a8-tests.rkt.

Assignment Checklist

You should find a new repository in your https://github.students.cs.ubc.ca account named a8_<team ids> with a code skeleton and a support library. You should complete the assignment in that git repository and push it to the GitHub Students instance.

You should first merge your solution to Compiler 7: Immediate Data Types with the starter code provided. The new starter code has the correct provides and includes a submodule to help you run your compiler on the command line if you want. The name of the skeleton is a8-skeleton.rkt to avoid accidentally overwriting your files, but your file in the Git repository should be named a8.rkt.

15.2 Language Diagram

15.3 Preface: What’s wrong with Exprs-Lang v7

Exprs-lang v7 gained proper data types and algebraic expressions, which is a huge step forward in expressivity and high-level reasoning. However, it still does not allow us to express structured data. This is greatly limits our expressivity. Real languages require structured data—such as strings, vectors, and linked lists—to express interesting programs over data larger than a single word. Functional languages use procedures, a data structure, to provide functions as first-class values.

To express data larger than a single word, we need support from the low-level languages to get access to locations larger than a single word in size. All our locations so far, registers and frame locations, are only a single word in size. We need access to heap pointers, memory locations whose size can be arbitrary.

Our strategy is to add three forms that specify-representation will use to create new data structures for its surface language. These forms are:

(alloc e) allocates a number of bytes specified by e and returns a pointer to the base address of those bytes.
(mref e_base e_index) dereferences the pointer at e_base with the offset specified by e_index. Thinking in terms of pointer arithmetic, this dereferences e_base + e_index. The value of e_base + e_index should always be word-aligned, i.e., a multiple of 8, and point to a an initialized heap allocated value.
(mset! e_base e_index e) stores the value of e in the address e_base + e_index, i.e., in the address given by pointer at e_base with the offset specified by e_index. The value of e_base + e_index should always be word-aligned, i.e., a multiple of 8.

To implement these new memory operations, or mops, we need to expose additional features from x64.

15.4 Exposing Heap Pointers in the Back-end

15.4.1 generate-x64

We start by exposing a generalized addr form in Paren-x64 v8 below, which will allow us to access arbitrary memory locations.

p	::=	(begin s ...)

s	::=	(set! loc triv)
	\|	(set! reg loc)
	\|	(set! reg_1 (binop reg_1 int32))
	\|	(set! reg_1 (binop reg_1 loc))
	\|	(define label s)
	\|	(jump trg)
	\|	(compare reg opand)
	\|	(jump-if cmp label)

trg	::=	reg
	\|	label

triv	::=	trg
	\|	int64

opand	::=	int64
	\|	reg

loc	::=	reg
	\|	addr

reg	::=	_...

addr	::=	(fbp + dispoffset)
	\|	(reg + int32)
	\|	(reg + reg)

binop	::=	*
	\|	+
	\|	-
	\|	bitwise-and
	\|	bitwise-ior
	\|	bitwise-xor
	\|	arithmetic-shift-right

cmp	::=	neq?
	\|	eq?
	\|	<
	\|	<=
	\|	>
	\|	>=

The language contains a new addr representing the x64 index-mode operand (reg + reg). This supports accessing a memory location by the index stored in another register. For example, in x64, we represent loading the nth element of an array into r10 using mov r10 [r11 + r12], where the base of the array is stored at r11 and the value of n is stored in r12.

The index-mode operand is not restricted to use a particular register, unlike the displacement-mode operand from Paren-x64 v7. We will use this feature to store pointers to structured data, and the register allocator will move those pointers into whichever registers it chooses.

We also allow a generalized form of displacement-mode operand. We can access the value of pointed to by base register reg at offset int32. by (reg + int32). This allows optimizing heap accesses when a constant offset is known, which is often the case for some data structures. The index is not restricted to be a multiple of 8, but it should be the case in our compiler that the value of the base plus the value of the offset is a multiple of 8.

All languages with direct access to registers, including Paren-x64 v8, are now parameterized by a new register, current-heap-allocation-pointer-register (abbreviated hbp). The run-time system initializes this register to point to the base of the heap. Allocation is implemented by copying the current value of this pointer, and incrementing it by the number of bytes we wish to allocate. The pointer must only be incremented by word-size multiples of bytes. Any other access to this register is now undefined behavior, similar to accesses to fbp that do not obey the stack of frames discipline.

Design digression:
A real language implementation would expose access to the mmap system call for allocation, and implement a strategy (such as garbage collection) to deallocate memory that is no longer used. Garbage collection is tricky to implement and requires too much time for this course, so we use a different strategy. Our implementation of allocation is trivial, and does not support de-allocation. We rely on the operating system to clean up memory after our process exits.

Exercise 1: Redesign and extend the implementation of generate-x64 to support index-mode operands. The source language is Paren-x64 v8, and output is a sequence of x64 instructions represented as a string.
This should be a one-line change if your compiler is well designed.

15.4.2 implement-mops

Like when we implemented fvars to support working with frame locations, we implement primitive memory operations, mops (pronounced em ops), to simplify working with heap addresses. We should do this before patch-instructions to avoid complicating the already complex logic for rewriting set! instructions.

Below, we define Paren-x64-mops v8.

p	::=	(begin s ...)

s	::=	(set! loc triv)
	\|	(set! reg loc)
	\|	(set! reg_1 (binop reg_1 int32))
	\|	(set! reg_1 (binop reg_1 loc))
	\|	(set! reg (mref reg index))
	\|	(mset! reg index triv)
	\|	(define label s)
	\|	(jump trg)
	\|	(compare reg opand)
	\|	(jump-if cmp label)

trg	::=	reg
	\|	label

triv	::=	trg
	\|	int64

opand	::=	int64
	\|	reg

loc	::=	reg
	\|	addr

index	::=	int32
	\|	reg

addr	::=	(fbp + dispoffset)
	\|	(reg + reg)

reg	::=	_...

binop	::=	_...

cmp	::=	_...

We add two new instructions that directly map to operations on heap addresses, either as index- or displacement-mode operands.

We could encode an addr using an mref, but the rest of our compiler already knows about addrs, and it represents a semantically different concept, so we leave it alone..

Exercise 2: Design and implement implement-mops. The source language is Paren-x64-mops v8 and the target is Paren-x64 v8. This should be a very simple compiler.

15.4.3 patch-instructions

Next we design Paren-asm v8. Below, we give a definition. However, your design may differ slightly since you’ve been responsible for the design of Paren-asm v6 and Paren-asm v7.

p	::=	(begin s ...)

s	::=	(set! loc triv)
	\|	(set! loc (binop loc opand))
	\|	(set! loc (mref loc index))
	\|	(mset! loc index triv)
	\|	(define label s)
	\|	(jump trg)
	\|	(compare loc opand)
	\|	(jump-if cmp label)

triv	::=	opand
	\|	label

opand	::=	int64
	\|	loc

trg	::=	label
	\|	loc

loc	::=	reg
	\|	addr

addr	::=	(fbp + dispoffset)

index	::=	int32
	\|	loc

binop	::=	_...

reg	::=	_...

cmp	::=	_...

By introducing mops, we implicitly restrict how heap addresses appear in the language and simplify the job of patch-instructions. The mops implicitly restrict heap addresses to being part of a move instruction, so we do not have to patch binary operation instructions despite apparently adding a new form of physical location. By making them separate forms, we only need to patch the new instructions, and leave old code untouched.

Exercise 3: Redesign and extend the implementation of patch-instructions. The source language is Paren-asm v8 and the target language is Paren-x64-mops v8.
You only need to add two cases, and you can write them very systematically, but they will take care to get right and cover all combinations.

15.4.4 Exposing mops up the pipeline

The new mops require minor changes to most of the pipeline up to Block-lang v8, where we will use them to implement data structures.

Exercise 4: Redesign and extend the implementation of
flatten-program, should require no changes
expose-basic-blocks, should require no changes
implement-fvars, should require minor changes to support mops. Note that we assume the fbp is not modified by mops.
replace-locations, should require minor changes to support mops.
discard-call-live, should require no changes.
assign-frame-variables, should require no changes.
assign-registers, should require no changes.
assign-frames, should require no changes.
pre-assign-frame-variables, should require no changes.
conflict-analysis, should require minor changes. Note that mops do not assign any registers or frame variables.
undead-analysis, should require minor changes. Note that mops do not assign any registers or frame variables.
uncover-locals, should require minor changes.

15.4.5 expose-allocation-pointer

Before we introduce structured data, we implement the alloc instruction to allow programs to allocate a bunch of bytes and not worry about the details of the allocation pointer. We want to do this after the passes that analyze physical locations, since then we do not have to update those passes to know that alloc introduces a reference to a register. However, we want to do this before we abstract away from all machine details so we do not need to expose registers beyond Block-lang v8.

We choose to insert this pass between uncover-locals and select-instructions.

Below, we design Block-lang v8, the source language for this pass.

p	::=	(module b ... b)

b	::=	(define label info tail)

info	::=	((new-frames ((aloc ...) ...)))

tail	::=	(begin s ... tail)
	\|	(jump trg loc ...)
	\|	(if (cmp loc opand) tail tail)

s	::=	(return-point label tail)
	\|	(set! loc triv)
	\|	(set! loc (binop loc opand))
	\|	(mset! loc index triv)
	\|	(set! loc (mref loc index))
	\|	(set! loc (alloc index))

triv	::=	opand
	\|	label

opand	::=	int64
	\|	loc

trg	::=	label
	\|	loc

loc	::=	rloc
	\|	aloc

rloc	::=	reg
	\|	fvar

index	::=	int32
	\|	loc

reg	::=	_...

binop	::=	_...

cmp	::=	_...

TODO: Should not reuse index as the operand for alloc; it is semantically a different concept, which happens to have the same representation.

We expose the earlier mops, and add a new one, (set! loc (alloc index)). This is the low-level form of our allocation operation, which we will abstract into an expression in Exprs-lang v8. In (set! loc (alloc index)), the index is restricted to be an int32 if it is an integer literal, for the same reasons as the restriction on binops. It must also be a multiple of a word size. This requirement is partly from the operating system’s mmap (which will usually ignore us if we violate the restriction and give us page-aligned memory anyway), but mostly to ensure every pointer we get has #b000 as its final three bits so we can tag all pointers.

We design the target language, Block-hbp-lang v8, below. This language removes the alloc index form and is the highest-level language parameterized by current-heap-base-pointer-register. This language, and all languages between it and x64, assumes all accesses to hbp obey the restrictions described in Paren-x64 v8.

p	::=	(module b ... b)

b	::=	(define label info tail)

info	::=	((new-frames ((aloc ...) ...)))

tail	::=	(begin s ... tail)
	\|	(jump trg loc ...)
	\|	(if (cmp loc opand) tail tail)

s	::=	(return-point label tail)
	\|	(set! loc triv)
	\|	(set! loc (binop loc opand))
	\|	(mset! loc index triv)
	\|	(set! loc (mref loc index))
	\|	(set! loc (alloc index))

triv	::=	opand
	\|	label

opand	::=	int64
	\|	loc

trg	::=	label
	\|	loc

loc	::=	rloc
	\|	aloc

rloc	::=	reg
	\|	fvar

index	::=	int32
	\|	loc

reg	::=	_...

binop	::=	_...

cmp	::=	_...

Intuitively, we will transform each `(set! ,loc (alloc ,index)) into

`(begin
(set! ,loc ,hbp)
(set! ,hbp (+ ,hbp ,index)))

Exercise 5: Design and implement the function expose-allocation-pointer. The source language is Block-lang v8 and the target language is Block-hbp-lang v8.

15.4.6 select-instructions

Below we design Impure-Values-bits-lang v8 with support for mops.

p	::=	(module b ... e)

b	::=	(define label (lambda (aloc ...) e))

c	::=	(let ([aloc n]) c)
	\|	(begin c ...)
	\|	(mset! v v v)

e	::=	n
	\|	(let ([aloc n]) e)
	\|	(if (cmp v v) e e)
	\|	(begin c ... e)

n	::=	v
	\|	(binop v v)
	\|	(apply v v ...)
	\|	(alloc v)
	\|	(mref v v)

v	::=	int64
	\|	label
	\|	aloc

binop	::=	*
	\|	+
	\|	-
	\|	bitwise-and
	\|	bitwise-ior
	\|	bitwise-xor
	\|	arithmetic-shift-right

cmp	::=	neq?
	\|	eq?
	\|	<
	\|	<=
	\|	>
	\|	>=

The language is an extension of Values-bits-lang v7. We add value forms of mref and alloc to the n nonterminal. We require these forms are used in a well-typed way. This is a simple extension.

However, supporting mset! requires that we introduce a notion of effectful, or impure, computation into the language. Previously, all expressions in the Values-lang languages were pure—they evaluated and produced the same value regardless of in which order expressions were evaluated. We could freely reorder expressions, as long as we respected scope. Now, however, mset! modifies memory during its execution. It not safe to reorder expressions after an mset!. Furthermore, mset! does not return a useful value.

To deal with this, we introduce a contextual distinction in the language. We add the nonterminal c to represent an impure computation. A c represents an expression that does not have a value, and is executed only for its effect. We can use c in certain expression contexts using begin. If we’re already in an impure context, that is, in a c, then we can freely nest other cs.

This contextual distinction is similar to the one we introduce to distinguish tail calls from non-tail calls.

Despite the conceptually complex change to the language, the transformation is straightforward. Previously, select-instructions essentially implemented the pure Values-bits-lang v7 using the impure Block-lang v7. To transform the effectful computations, we essentially splice them into the effectful computations select-instructions was already producing. The new mops n are already valid operands in the target language, so nothing special is needed to deal with them.

Exercise 6: Redesign and extend the implementation of the function select-instructions. The source language is Impure-Values-bits-lang v8 and the target language is Block-lang v8.

select-instructions is getting to be a big pass. It now does several things—flattens let into set!, implements calling conventions, and selects instructions to implement binary operations on values. If you haven’t already, you might want to redesign it as several smaller passes.

15.4.7 a-normalize

Next we design Impure-Exprs-bits-lang v8 with support for mops.

p	::=	(module b ... e)

b	::=	(define label (lambda (aloc ...) e))

c	::=	(begin c ...)
	\|	(mset! e e e)

e	::=	v
	\|	(let ([aloc e]) e)
	\|	(if (cmp e e) e e)
	\|	(begin c ... e)
	\|	(binop e e)
	\|	(apply e e ...)
	\|	(alloc e)
	\|	(mref e e)

v	::=	int64
	\|	label
	\|	aloc

binop	::=	*
	\|	+
	\|	-
	\|	bitwise-and
	\|	bitwise-ior
	\|	bitwise-xor
	\|	arithmetic-shift-right

cmp	::=	neq?
	\|	eq?
	\|	<
	\|	<=
	\|	>
	\|	>=

While we remove the distinction between expressions and values, we maintain the contextual distinction between impure computations and pure expressions.

Exercise 7: Redesign and extend the implementation of the function a-normalize. The source language is Impure-Exprs-bits-lang v8 and the target language is Impure-Values-bits-lang v8.
Be careful not to reorder the effectful operations.

15.4.8 specify-representation

Now we have all the abstractions necessary to implement structured data.

We design a new Impure-Exprs-data-lang v8.

p	::=	(module b ... e)

b	::=	(define label (lambda (aloc ...) e))

c	::=	(begin c ...)
	\|	(primop e ...)

e	::=	v
	\|	(primop e ...)
	\|	(apply e e ...)
	\|	(let ([aloc e]) e)
	\|	(if e e e)
	\|	(begin c ... e)

v	::=	fixnum
	\|	aloc
	\|	label
	\|	#t
	\|	#f
	\|	()
	\|	(void)
	\|	(error uint8)
	\|	ascii-char-literal

primop	::=	unsafe-fx*
	\|	unsafe-fx+
	\|	unsafe-fx-
	\|	eq?
	\|	unsafe-fx<
	\|	unsafe-fx<=
	\|	unsafe-fx>
	\|	unsafe-fx>=
	\|	fixnum?
	\|	boolean?
	\|	empty?
	\|	void?
	\|	ascii-char?
	\|	error?
	\|	not
	\|	pair?
	\|	procedure?
	\|	vector?
	\|	cons
	\|	unsafe-car
	\|	unsafe-cdr
	\|	unsafe-make-vector
	\|	unsafe-vector-length
	\|	unsafe-vector-set!
	\|	unsafe-vector-ref
	\|	make-procedure
	\|	unsafe-procedure-arity
	\|	unsafe-procedure-label

We add new primops for each of our new data types. Since the number of primitive operations is growing, we simplify the syntax to only give primops, rather than distinguishing unops, binops, and so on, so we can easily group like primops with like.

We also add impure computations, since vectors are a mutable data structure. Only effectful primops, those ending with !, are allowed in impure computations.

We add three new heap allocated data types, described below:

Pairs are constructed using (cons e_1 e_2). The predicate pair? should return #t when passed any value constructed this way, and #f for any other value—(eq? (pair? (cons e_1 e_2)) #t). (unsafe-car e) returns the value of the first element of the pair, and (unsafe-cdr e) returns the value of the second element. That is, (eq? (unsafe-car (cons e_1 e_2)) e_1) and (eq? (unsafe-cdr (cons e_1 e_2)) e_2).
Vectors are essentially arrays that know their length. They are constructed using (unsafe-make-vector e); the constructor takes the length of the vector as the argument. The predicate vector? should return #t for any value constructed this way, and #f for any other value—(eq? (vector? (unsafe-make-vector e)) #t). (unsafe-vector-ref e_1 e_2) returns the value at index e_2 in the vector e_1. (unsafe-vector-set! e_1 e_2 e_3) mutates the index e_2 in the vector e_1, setting its value to the value of e_3.
(unsafe-vector-set! e_1 e_2 e_3) is only allowed in impure computation context.
A procedure is a data structure representing a value that can be called as a function. Essentially, it is a wrapper around labels so we can check applications. Starting in this language, application must use a procedure instead of referencing a label directly. We construct a procedure using (make-procedure e_1 e_2), where e_1 must evaluate to a label and e_2 is the number of expected arguments. The predicate procedure? should return #t for any value constructed this way, and #f for any other value—(eq? (procedure? (make-procedure e_1 e_2)) #t). We extract the label of a procedure with (unsafe-procedure-label e_1), where e_1 is a procedure. We get the arity of a procedure with (unsafe-procedure-arity e_1), where e_1 is a procedure.

As we’re adding new data types, we need new tags. Here is our updated list of tags: Here is the set of tags we will use in this assignment, given in base 2.

#b000, fixnums, fixed-sized integers
#b001, pairs
#b010, procedures
#b011, vectors
#b100, unused
#b101, unused
#b110, non-fixnum immediates (booleans, etc)
#b111, unused

We follow the same pattern as Compiler 7: Immediate Data Types to implement the new predicates.

To implement constructors, we need to compile to alloc. cons allocates two words of space, storing its first argument in the first word and the second element in the second word, producing `(alloc ,(current-pair-size)). The ptr we get back needs to be tagged, so we produce `(bitwise-ior (alloc ,(current-pair-size)) ,(current-pair-tag)). make-procedure also allocates two words, one for the label and one for the argument arity. unsafe-make-vector allocates one word for the length, and then one word for every element of the vector. That is, it should allocate n+1 words for a vector of length n.

After allocating, we initialize the data structures using mset!. For example, we would transform `(cons ,e_1 ,e_2).

`(let ([x.1 (bitwise-xor (alloc 16) 1)])
   (begin
     (mset! (bitwise-xor x.1 ,(current-pair-tag)) 0 ,e_1)
     (mset! (bitwise-xor x.1 ,(current-pair-tag)) 8 ,e_2)
     x.1))

Since the length of the vector is dynamically determined, we do not initialize each field when implementing its constructor. Instead, we expose an unsafe constructor for vectors, and leave it to a safer language to initialize the vector.

We can optimize memory operations to avoid masking the pointer by taking advantage of pointer arithmetic. For example, (bitwise-xor (alloc 16) #b001) is the same as (+ (alloc 16) 1). We can therefore adjust the index by -1 to access the base of the pointer, instead of masking the pointer. Performing this optimization for pairs, we would instead transform `(cons ,e_1 ,e_2) into

`(let ([x.1 (+ (alloc 16) 1)])
   (begin
     (mset! x.1 -1 ,e_1)
     (mset! x.1 7 ,e_2)
     x.1))

The same optimization holds for vectors and procedures, with different constants.

Exercise 8: Redesign and extend the implementation of the function specify-representation. The source language is Impure-Exprs-data-lang v8 and the target language is Impure-Exprs-bits-lang v8.
Remember to not use constant values and instead use parameters. Some values may change.
Remember that the input language uses fixnum ptrs for all inputs, but the output uses bytes for mops.

15.4.9 implement-safe-apply

Last week, the last piece of undefined behavior in our language was in procedure calls. We were not checking that procedures were correctly applied to the number of expected arguments. Lower down, when procedure calls are implemented in terms of register and frame variables, this could result in dereferencing uninitialized locations. We can use the procedure data type to eliminate this undefined behavior.

This week, this pass is optional, as current events mean I haven’t been able to write up the design in time. Lucky you.

Challenge 1: Design and implement implement-safe-apply. You should design the source language, and the target language is Impure-Exprs-data-lang v8.

15.4.10 implement-safe-primops

All the accessors for the new data types can result in undefined behavior if used on the wrong ptr. Similarly, vector reference can access undefined values if the vector is constructed but never initialized.

Below we define Impure-Exprs-safe-lang v8.

p	::=	(module b ... e)

b	::=	(define label (lambda (aloc ...) e))

e	::=	v
	\|	(apply e e ...)
	\|	(let ([aloc e]) e)
	\|	(if e e e)

v	::=	fixnum
	\|	label
	\|	prim-f
	\|	aloc
	\|	#t
	\|	#f
	\|	()
	\|	(void)
	\|	(error uint8)
	\|	ascii-char-literal

prim-f	::=	*
	\|	+
	\|	-
	\|	eq?
	\|	<
	\|	<=
	\|	>
	\|	>=
	\|	fixnum?
	\|	boolean?
	\|	empty?
	\|	void?
	\|	ascii-char?
	\|	error?
	\|	not
	\|	pair?
	\|	procedure?
	\|	vector?
	\|	cons
	\|	car
	\|	cdr
	\|	make-vector
	\|	vector-length
	\|	vector-set!
	\|	vector-ref
	\|	procedure-arity

As in the last assignment, we wrap all accessors to perform dynamic tag checking before using the unsafe operations. We also wrap unsafe-make-vector to initialize all elements to 0.

We remove make-procedure and procedure-label, which are used internally. The surface programmer will only be able to define safe procedures using lambda. However, we do allow the user to dynamically test whether a value is a procedure and how many arguments it takes.

Note that in this language, we’ve removed begin. The user must manually call impure functions and bind the result. The result of an effectful function could be void, or an error. It would be unwise, although technically safe, to simple discard errors.

Writing a compiler for the following specification language may simplify this task:

; Symbol x Symbol x (List-of Parameter-Types)
; The first symbol is the name of a function in the source language.
; The second is either the name of a primop or a label in the target language implementing the
; behaviour safely, assuming well-typed parameters.
; The third is list of predicates, one for each argument to the source
; function, to check the parameters with. `any?` is specially recognized to
; not be checked.
(define prim-f-specs
  `((* unsafe-fx* (fixnum? fixnum?))
    (+ unsafe-fx+ (fixnum? fixnum?))
    (- unsafe-fx- (fixnum? fixnum?))
    (< unsafe-fx< (fixnum? fixnum?))
    (<= unsafe-fx<= (fixnum? fixnum?))
    (> unsafe-fx> (fixnum? fixnum?))
    (>= unsafe-fx>= (fixnum? fixnum?))

    (make-vector ,make-init-vector-label (fixnum?))
    (vector-length unsafe-vector-length (vector?))
    (vector-set! ,unsafe-vector-set!-label (vector? fixnum? any?))
    (vector-ref unsafe-vector-ref (vector? fixnum?))

    (car unsafe-car (pair?))
    (cdr unsafe-cdr (pair?))

    (procedure-arity unsafe-procedure-arity (procedure?))

    ,@(map (lambda (x) `(,x ,x (any?)))
           '(fixnum? boolean? empty? void? ascii-char? error? pair? procedure?
                     vector? not))
    ,@(map (lambda (x) `(,x ,x (any? any?)))
           '(cons eq?))))

All impure computations, those that end in !, should only return (void) or an error.

Exercise 9: Redesign and extend the implementation of implement-safe-primops. The source language is Impure-Exprs-safe-lang v8 and the target language is Impure-Exprs-data-lang v8, or your designed target language if you completed the challenge problem.
Now that there are now more combinations of primops and arguments than are possible to encode in a single number. You’ll need to give up some precision in error messages.

15.4.11 uniquify

Finally, we define the source language. Below we define Exprs-lang v8.

p	::=	(module b ... e)

b	::=	(define x (lambda (x ...) e))

e	::=	v
	\|	(apply e e ...)
	\|	(let ([x e]) e)
	\|	(if e e e)

v	::=	fixnum
	\|	x
	\|	prim-f
	\|	x
	\|	#t
	\|	#f
	\|	()
	\|	(void)
	\|	(error uint8)
	\|	ascii-char-literal

prim-f	::=	*
	\|	+
	\|	-
	\|	eq?
	\|	<
	\|	<=
	\|	>
	\|	>=
	\|	fixnum?
	\|	boolean?
	\|	empty?
	\|	void?
	\|	ascii-char?
	\|	error?
	\|	not
	\|	pair?
	\|	procedure?
	\|	vector?
	\|	cons
	\|	car
	\|	cdr
	\|	make-vector
	\|	vector-length
	\|	vector-set!
	\|	vector-ref
	\|	procedure-arity

Exercise 10: Redesign and extend the implementation of uniquify. The source language is Exprs-lang v8 and the target language Impure-Exprs-safe v8.

← prev up next →

1	CPSC 411 Syllabus
2	Course Calendar
3	The Compiler Design Recipe
4	Organizing Racket Code
5	Compiler 0: x64 to elf64
6	Compiler 1: Paren-x64 to x64
7	Compiler 2: Values-lang to x64
8	Compiler 3: Register Allocation (Original)
9	Compiler 3: Register Allocation (Corrected)
10	Compiler 4: Adding Control Flow (Original)
11	Compiler 4: Adding Control Flow (Corrected)
12	Assignment 5: The CPSC411 Compiler Recovery and Reinvestment Assignment of 2020.
13	Compiler 6: Calling Conventions
14	Compiler 7: Immediate Data Types
15	Compiler 8: Structured Data Types
16	Compiler 9: First-class Functions
17	Compiler 10: Recursive Data and Syntactic Sugar
18	Assignment 11: The CPSC411 Compiler Encore

15.1	Assignment Summary
15.2	Language Diagram
15.3	Preface: What’s wrong with Exprs-Lang v7
15.4	Exposing Heap Pointers in the Back-end