Roopsha Samanta

Assignment 7: Hindley-Milner Type Inference

Due Saturday, April 22 at 11:59PM.

In this assignment you will implement Hindley-Milner type inference, which represents the current "best practice" for flexible static typing. The assignment has two purposes:

To help you develop a deep understanding of type inference
To help you continue to build your ML programming skills

Setup

To get the code,

  git clone linux.cs.tufts.edu:/comp/105/book-code

The code you need is in bare/nml/ml.sml.

Individual Problems

Working on your own, please solve Exercises 1 and 2 on page 522 of Ramsey. These exercises explore some implications of type inference.

Pair Problems

Working with a partner, please solve Exercises 18, 19 and 20 on pages 525-526 of Ramsey, and exercises S and T below.

Problem Details

18. Implementing a constraint solver. Do Exercise 18 on page 525 of Ramsey. This exercise is probably the most difficult part of the assignment. Before proceeding with type inference, make sure your solver produces the correct result on our test cases and on your test cases.

19. Implementing type inference. Do Exercise 19 on page 526 of Ramsey.

20. Adding primitives. Do Exercise 20 on page 526 of Ramsey.

S. Test cases for the solver. Write three test cases for the constraint solver. At least two of these test cases should be constraints that have no solution. Assuming that we provide a function constraintTest : con -> answer, write your test cases as three successive calls to constraintTest. Do not define constraintTest yourself.

Here is a sample set of test cases:

    val _ = constraintTest (TYVAR "a" ~ TYVAR "b")
    val _ = constraintTest (CONAPP (TYCON "list", [TYVAR "a"]) ~ TYCON "int")
    val _ = constraintTest (TYCON "bool" ~ TYCON "int")

Naturally, you will supply your own test cases, different from these.
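
If the infix notation in these tests is unfamiliar, the following sketch shows how such constraints are built and taken apart. It assumes that the constraint representation resembles the declarations below (type equality written ~, conjunction written /\, and a TRIVIAL constraint); check the actual definitions in ml.sml rather than relying on this sketch. The helpers listtype and equalities are defined here only for illustration.

```sml
(* Assumed representation; the real declarations live in ml.sml *)
type name = string
datatype ty  = TYVAR of name | TYCON of name | CONAPP of ty * ty list
datatype con = ~ of ty * ty | /\ of con * con | TRIVIAL
infix 4 ~     (* equality binds tighter than conjunction *)
infix 3 /\

(* Illustrative helper: the type of lists whose elements have type tau *)
fun listtype tau = CONAPP (TYCON "list", [tau])

(* A constraint with a solution: map a to (list b) *)
val solvable = TYVAR "a" ~ listtype (TYVAR "b")

(* A constraint with no solution: a would have to contain itself *)
val unsolvable = TYVAR "a" ~ listtype (TYVAR "a")

(* Parsed as: solvable /\ (TYCON "int" ~ TYCON "int") *)
val conjoined = solvable /\ TYCON "int" ~ TYCON "int"

(* Illustrative helper: flatten a constraint into its simple equalities *)
fun equalities (c1 /\ c2) = equalities c1 @ equalities c2
  | equalities (t1 ~ t2)  = [(t1, t2)]
  | equalities TRIVIAL    = []
```

Because function application binds tighter than any infix operator, and ~ is declared with higher precedence than /\, a conjunction of equalities can usually be written without extra parentheses.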

T. Test cases for type inference. Write three test cases for type inference. At least two of these test cases should be for terms that fail to type check. Each test case should be a definition written in nML. Here is a sample set of test cases:

    (val weird (lambda (x y z) (cons x y z)))
    (+ 1 #t)
    (lambda (x) (cons x x))

Naturally, you will supply your own test cases, different from these.

Hints, guidelines, and testing code

This is one assignment where it pays to run a lot of tests, of both good and bad definitions. The most effective test of your algorithm is not that it properly assigns types to correct terms, but that it rejects ill-typed terms. To help you with the solver, once you have implemented solve, the following code redefines solve into a version that checks itself for sanity (i.e., idempotence). It is a good idea to check that the substitution returned by your solver is idempotent before using it in your type inferencer.

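    (* isStandard pairs is true if each variable bound in pairs differs from
       every other bound variable and is not free in any other pair's type *)
    fun isStandard pairs =
        let (* a' is not the variable a, and a' is not free in tau *)
            fun distinct a' (a, tau) = a <> a' andalso not (member a' (freetyvars tau))
            (* check each pair against the pairs before it and after it *)
            fun good (prev', (a, tau)::next) =
                  List.all (distinct a) prev' andalso List.all (distinct a) next
                  andalso good ((a, tau)::prev', next)
              | good (_, []) = true
        in  good ([], pairs)
        end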

    val solve =
        fn c => let val theta = solve c
                in  if isStandard theta then theta
                    else raise BugInTypeInference "non-standard substitution"
                end
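
To see what idempotence means here, consider the following sketch. A substitution is idempotent when applying it twice gives the same result as applying it once, which is the case when no variable it replaces also appears in the types it substitutes. The sketch is self-contained and uses its own small definitions: applySubst and idempotentOn are names invented for illustration, not functions from ml.sml, and the association-list representation is an assumption.

```sml
type name = string
datatype ty = TYVAR of name | TYCON of name | CONAPP of ty * ty list

(* A substitution as a list of (variable, type) pairs *)
type subst = (name * ty) list

(* Illustrative: replace each bound variable in tau, in one pass *)
fun applySubst (theta : subst) tau =
  let fun lookup a =
        case List.find (fn (a', _) => a' = a) theta
          of SOME (_, tau') => tau'
           | NONE => TYVAR a
      fun go (TYVAR a) = lookup a
        | go (TYCON c) = TYCON c
        | go (CONAPP (t, ts)) = CONAPP (go t, map go ts)
  in  go tau
  end

(* Does applying theta twice to tau equal applying it once? *)
fun idempotentOn theta tau =
  applySubst theta (applySubst theta tau) = applySubst theta tau

fun listtype tau = CONAPP (TYCON "list", [tau])

(* b is not replaced by this substitution, so one pass is enough *)
val good = [("a", listtype (TYVAR "b"))]

(* b is both replaced and mentioned in a's replacement *)
val bad = [("a", listtype (TYVAR "b")), ("b", TYCON "int")]

val goodResult = idempotentOn good (TYVAR "a")
val badResult  = idempotentOn bad  (TYVAR "a")
```

With the second substitution, one pass turns a into (list b) and a second pass turns that into (list int), so the two results differ. A substitution of that shape is the kind the self-checking solve is meant to catch.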

In writing the type-inference code, you should refer to the typing rules of nano-ML. With your solver in place, the type inference should be straightforward, with two exceptions: let and letrec. You can emulate the implementations for val and val-rec, but you must split the constraint into local and global portions. The splitting is covered in detail in Section 7.5.2 of the book.

Testing

The course interpreter is located in /homes/cs456/bin/nml. If your interpreter can process the initial basis and infer correct types, you are doing OK.

The real test of your interpreter is that it should reject incorrect definitions. You should prepare a dozen or so definitions that should not type check, and make sure they don't. For example:

    (val bad (lambda (x) (cons x x)))
    (val bad (lambda (x) (cdr (pair x x))))

Pick your toughest three test cases to submit for Exercise T.

Avoid common mistakes

Here are some common mistakes:

A common mistake is to create too many fresh variables or to fail to constrain your fresh variables.
Another surprisingly common mistake is to include redundant cases in the code for inferring the type of a list literal. As is almost always true of functions that consume lists, it's sufficient to write one case for NIL and one case for PAIR.
It's a common mistake to define a new exception and not handle it. If you define any new exceptions, make sure they are handled. It's not acceptable for your interpreter to crash with an unhandled exception just because some nano-ML code didn't type-check.
It's a common mistake to omit the initial basis for testing and then to forget to include an initial basis in the interpreter you submit.
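
As an illustration of the point about exceptions, here is one possible way to keep a newly defined exception from escaping: convert it, at the boundary of the code that raises it, into an exception that the top level already handles. This is a sketch under assumptions. The exception Unsolvable and the functions toySolve, solveOrTypeError, and topLevel are invented for the example, and TypeError stands in for whatever exception your interpreter already reports to the user.

```sml
(* Hypothetical new exception introduced while writing a solver *)
exception Unsolvable of string

(* Stand-in for an exception the interpreter already handles;
   assumed here, so check what ml.sml actually uses *)
exception TypeError of string

(* A toy "solver" that fails on any input marked as not ok *)
fun toySolve (label, ok) =
  if ok then label ^ " solved" else raise Unsolvable label

(* Convert the new exception at the boundary so it never escapes *)
fun solveOrTypeError input =
  toySolve input
  handle Unsolvable label => raise TypeError ("cannot satisfy " ^ label)

(* A toy top level: a TypeError becomes a message, not a crash *)
fun topLevel input =
  solveOrTypeError input
  handle TypeError msg => "type error: " ^ msg

val okMessage  = topLevel ("a ~ b", true)
val badMessage = topLevel ("a ~ list a", false)
```

An alternative is to raise the already-handled exception directly and define no new exception at all, which leaves nothing to forget.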

There are also some common assumptions which are mistaken:

It is a mistake to assume that an element of a literal list always has a monomorphic type.
It is a mistake to assume that begin is never empty.

What to submit

Individual Problems

You should submit two files:

README, telling us with whom you collaborated, how long you worked, what parts you finished, and so on.
a file meaning.nml containing your code for Exercises 1 and 2.

When you are ready, use the following command to submit your work:

turnin -c cs456 -p ml-inf README meaning.nml

Pair Problems

Submit these files:

README, telling us with whom you collaborated, how long you worked, what parts you finished, and so on.
stest.nml, containing your answer to Exercise S
ttest.nml, containing your answer to Exercise T
ml.sml, containing a complete interpreter for nano-ML which includes your answers to Exercises 18, 19 and 20.

Use the command:

turnin -c cs456 -p ml-inf README stest.nml ttest.nml ml.sml

Note: We must be able to compile your solution in Moscow ML by typing, e.g.,

mosmlc ml.sml

If there are errors or warnings in this step, your work will earn No Credit for functional correctness.

How your work will be evaluated

We will focus most of our evaluation on your constraint solving and type inference.

Form

Exemplary:
• The code has no offside violations.
• Or, the code has just a couple of minor offside violations.
• Indentation is consistent everywhere.
• The submission has no bracket faults.
• The submission has a few minor bracket faults.
• Or, the submission has no bracketed names, but a few bracketed conditions or other faults.

Satisfactory:
• The code has several offside violations, but course staff can follow what's going on without difficulty.
• In one or two places, code is not indented in the same way as structurally similar code elsewhere.
• The submission has some redundant parentheses around function applications that are under infix operators (not checked by the bracketing tool).
• Or, the submission contains a handful of bracketing faults.
• Or, the submission contains more than a handful of bracketing faults, but just a few bracketed names or conditions.

Must improve:
• Offside violations make it hard for course staff to follow the code.
• The code is not indented consistently.
• The submission contains more than a handful of parenthesized names as in `(x)`.
• The submission contains more than a handful of parenthesized `if` conditions.

Names

Exemplary:
• Type variables have names beginning with `a`; types have names beginning with `t` or `tau`; constraints have names beginning with `c`; substitutions have names beginning with `theta`; lists of things have names that begin conventionally and end in `s`.

Satisfactory:
• Types, type variables, constraints, and substitutions mostly respect conventions, but there are some names like `x` or `l` that aren't part of the typical convention.

Must improve:
• Some names misuse standard conventions; for example, in some places, a type variable might have a name beginning with `t`, leading a careless reader to confuse it with a type.

Structure

Exemplary:
• The nine cases of simple type equality are handled by these five patterns: `TYVAR`/any, any/`TYVAR`, `CONAPP`/`CONAPP`, `TYCON`/`TYCON`, other.
• The code for solving α ∼ τ has exactly three cases.
• The constraint solver is implemented using an appropriate set of helper functions, each of which has a good name and a clear contract.
• Type inference for list literals has no redundant case analysis.
• Type inference for expressions has no redundant case analysis.
• In the code for type inference, course staff see how each part of the code is necessary to implement the algorithm correctly.
• Wherever appropriate, submission uses `map`, `filter`, `foldr`, and `exists`, either from `List` or from `ListPair`.

Satisfactory:
• The nine cases are handled by nine patterns: one for each pair of value constructors for `ty`.
• The code for α ∼ τ has more than three cases, but the nontrivial cases all look different.
• The constraint solver is implemented using too many helper functions, but each one has a good name and a clear contract.
• The constraint solver is implemented using too few helper functions, and the course staff has some trouble understanding the solver.
• Type inference for list literals has one redundant case analysis.
• Type inference for expressions has one redundant case analysis.
• In some parts of the code for type inference, course staff see some code that they believe is more complex than is required by the typing rules.
• Submission sometimes uses a fold where `map`, `filter`, or `exists` could be used.

Must improve:
• The case analysis for a simple type equality does not have either of the two structures described under Exemplary and Satisfactory.
• The code for α ∼ τ has more than three cases, and different nontrivial cases share duplicate or near-duplicate code.
• Course staff cannot identify the role of helper functions; course staff can't identify contracts and can't infer contracts from names.
• Type inference for list literals has more than one redundant case analysis.
• Type inference for expressions has more than one redundant case analysis.
• Course staff believe that the code is significantly more complex than what is required to implement the typing rules.
• Submission includes one or more recursive functions that could have been written without recursion by using `map`, `filter`, `List.exists`, or a `ListPair` function.

CS45600: Programming Languages
