## Description

## 1 Introduction

In this project, you will implement a language called MiniML, in OCaml. This will allow you to work

with a larger code base, make use of many of the concepts that you have seen in class throughout

the semester, and gain a deeper understanding on key concepts such as free variables, substitution,

evaluation, type checking and type inference.

MiniML is similar to OCaml itself and very similar to

the language that is discussed in the lectures and lecture notes; it is however also more powerful

than the minimal language discussed in the lecture notes. In particular, it includes n-ary tuples

(pairs written as (e1, . . . , en)), a more general let-expression, functions, function application, and

recursion.

## Goal of the project

In this project, you will extend existing implementation with simple code analysis tools, an interpreter (which evaluates an expression), a type checker, and eventually, a type inference engine.

## Overview

The general pipeline we will use in this project is as follows:

1. Frontend (From External Language to Internal Language): In the frontend, we read and parse

a program written in an external language and produce the corresponding internal program

(i.e. the program in the internal language). We already have implemented this frontend,

which means you can conveniently write programs in the external language.

2. Core : The core of this project consists of code analyzer, an evaluator, a type checker, and

a type inference engine. All these programs take a program in the internal language as an

input. Hence, you will be writing programs that analyze and evaluate expressions in the

internal language. The frontend translates MiniML programs into their corresponding internal

representation and thus allows you to write MiniML programs in a clean, readable form.

We will explain this pipeline in more detail in the next section.

1

## 2 Internal and External Language

Our MiniML internal language can be described as an extension of the language that is given in the

lecture notes:

types τ :=int | bool | τ1 -> τ2 | τ1 ∗ . . . ∗ τn

expressions e := . . . | fn x : τ => e | e1 e2 | rec f : τ => e | (e1, . . . , e2) | let ds in e end

declaration d :=val x = e | val (x1, . . . , xn) = e | name x = e

declarations ds :=· | d ds

It is worth noting that this grammar describes an internal language where issues regarding

precedence order, brackets, etc. have been already resolved. A programmer actually writes programs in the external language, i.e. the language that the parser accepts. In the project, we provide

a frontend which reads and parses programs written in the external language and produces the

corresponding internal program (i.e. the program in the internal language).

For example, we write

1 let fun fact ( x : int ) : int =

2 if x = 0 then

3 1

4 else

5 x * fact ( x – 1)

6 in

7 fact 5

8 end;

which corresponds to

let val fact = rec fact : int -> int => fn x : int => if x = 0 then 1 else x ∗ fact(x − 1) in fact 5 end

Having such a frontend makes it easier to write programs as input.

In general, a valid MiniML program is just an expression followed by a semicolon. All MiniML

programs eventually compute to values (if they terminate). Our external language supports various

operators. They include arithmetic operators like addition (+), subtraction (-), multiplication (*),

division (/) and negation (~), arithmetic comparisons (<, >, <=, >=), equality comparisons (=, !=) and

logical operators like logic and (&&) and logic or (||). All operators have familiar precedence levels.

Notice that equality comparison operators only apply to integers.

##
3 With A Little Help From Your Friendly Teaching Assistants . . .

3.1 Parsing

We provide a number of helper functions in order to help you focus on the essential part of the

project.

Following is the signature of a parsing module which we provide:

1 module P : sig

2 val parse : string -> ( string , exp ) either

3 end

You can invoke this function by

2

1 P . parse ” some string ”

The function returns Right e if the string is a valid MiniML program, or Left s if it is not and s is

the corresponding (not-so-specific) error message. For example:

1 P . parse “let fun fact (x : int) : int =

2 if x = 0 then

3 1

4 else

5 x * fact (x – 1)

6 in

7 fact 5

8 end;”;;

9 (* this program returns

10 – Right ( Let

11 ([ Val

12 (Rec

13 (” fact ” , TArrow (TInt , TInt ),

14 Fn

15 (“x” , Some TInt ,

16 If

17 ( Primop (Equals ,

18 [Var “x”; Int 0]) ,

19 Int 1 ,

20 Primop (Times ,

21 [Var “x”;

22 Apply (Var ” fact ” ,

23 Primop (Minus ,

24 [Var “x”; Int 1]))])))),

25 ” fact “)] ,

26 Apply (Var ” fact ” , Int 5)))

27 *)

In a negative case:

1 P . parse “1” (* should end with semicolon *) ;;

2 – : Left ” Expected SEMICOLON token ”

## 3.2 Pretty Printing

Clearly, abstract syntax is not very easy to read. Sometimes you want to quickly verify if an abstract

syntax tree is as expected. In that case, you can use a printing helper which is provided by the

following module:

1 module Print : sig

2 val exp_to_string : exp -> string

3 end

which converts an exp to a readable form:

1 let Right e = P . parse “let fun fact (x : int ) : int =

2 if x = 0 then

3 1

4 else

5 x * fact (x – 1)

6 in

7 fact 5

8 end;” in

3

9 Print . exp_to_string e ;;

10 (* this program returns

11 “let fun fact : int -> int x = if x = 0 then 1 else x * fact (x – 1) in fact 5 end

”

12 *)

## 3.3 Testing

We have two methods to help you test your code, unit tests and hidden tests.

We implement some helper functions to unit test your code. For a requested function foo, you

can write your own unit tests in foo_tests and run your tests by evaluating your code and executing

run_foo_tests (). For example, for Q1, you can write some unit tests in free_vars_tests and run

run_free_var_tests () to run all the tests.

You are encouraged to test your code frequently!

Hidden tests can be run by clicking the “Grade” button. The report will simply tell you either

that you pass all the tests or the number of tests you fail. The tests are here to catch some common

errors and do not meant to be exhaustive. You should write more unit tests to make sure your code

is correct!

At last, please remember to back up your code! You will be working on this project for days

and weeks. You don’t want to restart everything from scratch because LearnOCaml erases your

work!

## 3.4 Other Helpers

We provide other helper definitions in the prelude buffer above the main buffer. It includes some

useful helper functions as well as some examples of valid MiniML programs. Make sure to take a

quick look before getting started!

## 4 Grading

We provide the testing mechanisms in order to help you focus on the technical problems themselves.

This project is graded offline. That is, your code might pass all the hidden tests on LearnOCaml,

but that does not imply a guaranteed 100%; bugs might still get caught by our offline tests and our

manual code review. For this reason, please try your best to test your own code!

You can organize the given code freely, including defining your own top-level definitions, adding

or removing rec, etc, as long as you implement the target functions correctly with the exact given

types.

##
5 Questions

5.1 Q0 (0 point): Trying out Parser Frontend

Get yourself familiar with the external syntax of MiniML and the usage of the parsing function

P.parse. Please use parse_tests : (string * (string, exp) either) list to familiarize yourself. Each

entry is a MiniML program and the expected output of P.parse. Hit Grade button and see if you

have got the tests right.

4

## 5.2 Q1 (10 points): Free Variables

Implement the function free_vars : exp -> name list which when given an expression computes

the free variables occurring in the expression. This function is key to implementing substitution

properly.

Your implementation should follow closely the definition of free variables given in the notes

and extend it to tuples, let expressions, functions and recursion. The order of the returned names

does not matter.

## 5.3 Q2 (15 points): Unused Variables

OCaml also tests whether a given variable is used or not; this is a useful little tool which we

would also like to support for MinML. Implement a function unused_vars: exp -> name list which

given an expression checks for each variable introduced by a binding construct (i.e. a function, let

expression, etc. ) whether the variable the construct introduces is in fact used in the body of the

expression.

For example:

let val x = 3 in 4 end; x is unused

let val x = true in let val y = 4 in x + 5 end end; y is unused

let val x = 3 in let val x = 4 in x + x end end; x (the first occurrence) is unused

Similarly, given the following program, x and test is unused:

1 let fun test ( x : int ) : int = 3 in 4 end ;

Implement the function unused_vars: exp -> name list. It traverses an expression and for every

construct which introduces some bound variables (i.e. functions, let-expression) we will check

whether these variables occur free in the body; if they do, then indeed these variables are used; if

they do not occur free in the body, then the variables are unused. You might want to use free_vars

that you just implemented. The order of names does not matter.

## 5.4 Q3 (15 points): Substitution

Finish the implementation of function subst:(exp * name) -> exp -> exp for substitution to handle

let-name, pairs, let-pair, functions, function application, and recursion.

The first argument to the function subst contains a tuple of an expression e’ and a variable name

x denoting the substitution [e’/x]. The second argument is an expression e to which we apply the

substitution. In other words,

subst (e’,x) e = [e’/x]e

i.e. the function subst will replace any free occurrence of the variable x in the expression e’ by the

expression e. Note that subst is capture-avoiding. Use fresh_var to obtain a fresh variable name if

necessary.

You can then for example test your substitution function as follows:

1 # subst ( Int 5 , “x”) ( If ( Bool ( true ) , Var “x”, Var “y”) ) ;

2 val it = If ( Bool true , Int 5 , Var “y”) : exp

5

## 5.5 Q4 (30 points): Big Step Evaluation

In this question, we will implement an interpreter based on big-step evaluation. Your task is to

implement the missing cases of the function eval for big-step evaluation. In particular, add the

cases for functions, function application, recursion, and handling generalized let-expressions, i.e.

those cases which currently raise the exception NotImplemented. The evaluation rules are fully listed

in Figure 1.

If you encounter a situation not covered by these rules, then you should raise (Stuck “message”)

(with something more descriptive than “message”). See the other cases for examples of this.

The evaluator is not environment-based, so there is a direct translation from the above rules to

code. Use subst that you just implemented.

As is often the case, not very much code is required.

## 5.6 Q5 (25 points): Type Checking

In this question, we are inferring the type of an expression. We use the type annotations on recursion and functions, to guarantee that we can always uniquely infer a unique type of an expression.

It simplifies the type inference problem. For now, you can ignore type variables.

To describe the type inference we will use the following judgement:

Γ ` e ⇒ T Infer type T for expression e in context Γ

The inference rules are given in Figure 2. Your implementation of the function infer: context

-> exp -> typ should follow exactly the algorithm described by the typing rules. It takes as input a

context describing the typing assumptions and an expression. It returns the type of the expression.

For this question, you can assume that all functions are annotated with their types. We will handle

those that are not in the next question.

If an expression is not well-typed, then we can raise a TypeError exception by calling the type_fail

function.

## 5.7 Q6 (30 points): Unification

Lastly, we want to enable full type inference; as a consequence, we can omit the type annotations

on functions and recursion. Key to type inference is unification. Your task is to implement the

function unify: typ -> typ -> unit, which checks whether two types are unifiable, i.e. if we can

make them syntactically equal.

You should follow the description of unification in the class notes. Type variables are modeled

via references. An uninstantiated type variable is modeled as a pointer to a cell with content None.

In other words to create a new type variable we can simply use a function

1 let fresh_tvar () = TVar ( ref None )

Once we know what a type variable should be instantiated with we simply assign it the correct

type. For example, if we have a type variable TVar x, then x has type (typ option) ref and we can

replace every occurrence of TVar x by the type Int using assignment x := Some TInt.

This will destructively update the type variable x and directly propagate the instantiation for it.

No extra implementation of a substitution function is necessary to propagate instantiations.

Implement the function unify : typ -> typ -> unit. If two types are unifiable, they will be denoting the same type after unification succeeds. If unification fails, raise a TypeError exception by

6

e ⇓ v expression e evaluates to value v

e ⇓ v

let · in e end ⇓ v (B-NO-DECS)

e1 ⇓ v1 [v1/x](let decs in e end) ⇓ v

let val x1 = e1 decs in e end ⇓ v (B-LET-VAL)

[e1/x](let decs in e end) ⇓ v

let name x1 = e1 decs in e end ⇓ v (B-LETN)

e1 ⇓ (v1, . . . , vn) [v1/x1, . . . , vn/xn](let decs in e end) ⇓ v

let val (x1, . . . , xn) = e1 decs in e end ⇓ v (B-LET-TUPLE)

for all i ei ⇓ vi

(e1, . . . , en) ⇓ (v1, . . . , vm) (B-TUPLE)

[rec f : τ => e/f]e ⇓ v

rec f : τ => e ⇓ v (B-REC)

e1 ⇓ fn x => e e2 ⇓ v2 [v2/x]e ⇓ v

e1 e2 ⇓ v (B-APP)

fn x => e ⇓ fn x => e (B-FN)

e1 ⇓ true e2 ⇓ v

if e1 then e2 else e3 ⇓ v (B-IFTRUE)

e1 ⇓ false e3 ⇓ v

if e1 then e2 else e3 ⇓ v (B-IFFALSE)

e ⇓ v

e : τ ⇓ v (B-ANNO)

n is a number

n ⇓ n

b ∈ {true, false}

b ⇓ b

e1 ⇓ true e2 ⇓ v

e1 && e2 ⇓ v

e1 ⇓ false

e1 && e2 ⇓ false

e1 ⇓ false e2 ⇓ v

e1 || e2 ⇓ v

e1 ⇓ true

e1 || e2 ⇓ true

e1 ⇓ v1 e2 ⇓ v2 op is a binary operator

e1 op e2 ⇓ v1 op v2 B-OP

e ⇓ v

∼ e ⇓ −v

Figure 1: rules for big-step evaluation

7

Inference Rules

Γ ` e ⇒ bool Γ ` e1 ⇒ τ Γ ` e2 ⇒ τ

Γ ` if e then e1 else e2 ⇒ τ (T-IF)

Γ, x : τ ` e ⇒ τ

0

Γ ` (fn x : τ => e) ⇒ τ -> τ

0

(T-FN)

Γ ` e1 ⇒ τ1 · · · Γ ` en ⇒ τn

Γ ` (e1, . . . , en) ⇒ (τ1 ∗ · · · ∗ τn) (T-TUPLE)

Γ, f : τ ` e ⇒ τ

Γ ` (rec f : τ => e) ⇒ τ (T-REC)

Γ ` e1 ⇒ τ -> τ

0

Γ ` e2 ⇒ τ

Γ ` e1 e2 ⇒ τ

0

(T-APP)

Γ ` decs ⇒ Γ

0

Γ, Γ 0 ` e ⇒ τ

Γ ` let decs in e end ⇒ τ (T-LET)

Γ ` e ⇒ τ

Γ ` (e : τ) ⇒ τ (T-ANNO)

n is a number

Γ ` n ⇒ int

b ∈ {true, false}

Γ ` b ⇒ bool

Γ ` ei ⇒ int op is +, -, *, /

Γ ` e1 op e2 ⇒ int

Γ ` ei ⇒ int op is <, <=, >=, >=, =, !=

Γ ` e1 op e2 ⇒ bool

Γ ` ei ⇒ bool op is &&, ||

Γ ` e1 op e2 ⇒ bool

Γ ` e ⇒ int

Γ `∼ e ⇒ int

Rules for declarations

Γ ` dec1 ⇒ Γ1 Γ, Γ1 ` decs ⇒ Γ2

Γ ` dec1 decs ⇒ Γ1, Γ2 (T-DECS)

Γ ` e ⇒ τ

Γ ` (val x = e) ⇒ (x : τ) (T-BY-VAL)

Γ ` e ⇒ τ

Γ ` (name x = e) ⇒ (x : τ) (T-BY-NAME)

Γ ` e ⇒ (τ1 ∗ · · · ∗ τn)

Γ ` (val (x1, . . . , xn) = e) ⇒ (x1 : τ1), . . . ,(xn : τn) (T-BY-VAL-TUPLE)

Figure 2: Type inference rules

8

calling the type_fail function. Follow the algorithm described in the notes to unify two types, and

fill in the implementation for unify.

### 5.8 Q7 (10 points) EXTRA CREDIT: Type Inference

Modify your function infer that it supports type inference. This is quite easy: if the type annotation

in recursion and functions is None and we don’t know the type of a variable x we want to add to the

context, we simply generate a fresh type variable tv and add to the context the fact that x has type

tv; instead of checking for equality, you call unification.

9 Final Project COMP 302