Chapter 9 Type Checking/Inference

Where is this going: remove syntactic distinction between boolean expressions and normal integer expressions. Remove clunkiness around unit expressions.

Syntactically everything will be an expression and the type system will distinguish expressions based on their types.

9.1 Types

9.1.1 About Types

What is a type?

  • A set of values: e.g. true, false
  • A set of operations on those values: e.g. !, &&, …

Why do we need types?

  • Help to structure and understand a program
  • Can prevent some kinds of errors
  • Generate more efficient code (e.g. no storage for Unit)

9.1.2 Our Grammar, Typed

<op>    ::= ['*' | '/' | '+' | '-' | '<' | '>' | '=' | '!']+
<type>  ::= <ident>
<bool>  ::= 'true' | 'false'
<atom>  ::= <number> | <bool> | '()'
          | '('<simp>')'
          | <ident>
          | '{'<exp>'}'
<uatom> ::= [<op>]<atom>
<simp>  ::= <uatom>[<op><uatom>]*
          | 'if' '('<simp>')' <simp> ['else' <simp>]
          | 'while' '(' <simp> ')' <simp>
          | <ident> '=' <simp>
<exp>   ::= <simp>[';'<exp>]
          | 'val' <ident> '=' <simp>';'<exp>
          | 'var' <ident> '=' <simp>';'<exp>

9.1.3 Type Checking

The type checking step will be part of the semantic analyzer.

The key point to understand is that types represent an abstract value, and inference rules define the set of operations on these values.

Therefore, the implementation is going to be very similar to eval, trans, or analyze.

9.2 Formal Type System

9.2.1 Inference Rules

Before we look at the implementation, let’s first describe our type system in formal notation. Just as BNF is a formalism to describe syntax, type systems are typically described through systems of inference rules. Where BNF describes legal derivations of programs from grammer rules, a system of inference rules describes legal derivations of typing judgements:

 Env |- e: T

Means that in the environment ‘Env’, the expression ‘e’ is of type ‘T’

This is a statement that can be proved by building a proof tree out of inference rules that formally specify the type sytem.

Env is the typing environment: it is a mapping between identifier and type.

Env,id:T is the environment that contains all information in ‘Env’ plus the mapping from ‘id’ to ‘T’

Note: Env is often represented with \(\Gamma\).

      conditions
     ------------ [Name of the rule]
      conclusion
  1. Lit: ‘i’ is an Int, ‘b’ is a Boolean
      ------------------ [Int]     --------------------- [Boolean]
       Env |- Lit(i): Int                Env |- Lit(b): Boolean

      ------------------- [Unit]
       Env |- Lit(()): Unit
  1. Unary: op is in [“+”, “-”]
            Env |- e: Int
      ------------------------ [IntUnOp]
       Env |- Unary(op, e): Int
  1. Prim:
  • op is in [“+”, “-”, “*“,”/“]
  • bop is in [“==”, “!=”, “<=”, “>=”, “<”, “>”]
       Env |- e1: Int         Env |- e2: Int
      --------------------------------------- [IntOp]
            Env |- Prim(op, e1, e2): Int
       Env |- e1: Int           Env |- e2: Int
      ----------------------------------------- [BoolOp]
          Env |- Prim(bop, e1, e2): Boolean
  1. Immutable variables
       Env |- e1: T1           Env,x:T1 |- e2: T2
      -------------------------------------------- [Let]
                Env |- Let(x, e1, e2): T2



           Env(x) = T
      ------------------- [Ref]
       Env |- Ref(x) : T

9.2.2 Example

Prove that the following program is of type Boolean

val x = 3; x == 4








        Let(x, Int, Lit(3), Prim("==", Ref(x), Lit(4)))

Prove that the following program is of type Boolean

val x = 3; x == 4








     |- Let(x, Int, Lit(3), Prim("==", Ref(x), Lit(4))): Boolean

Prove that the following program is of type Boolean

val x = 3; x == 4







  -----------------------------------------------------------------[Let]
     |- Let(x, Int, Lit(3), Prim("==", Ref(x), Lit(4))): Boolean

Prove that the following program is of type Boolean

val x = 3; x == 4






   |- Lit(3): Int     x:Int |- Prim("==", Ref(x), Lit(4)): Boolean
  -----------------------------------------------------------------[Let]
     |- Let(x, Int, Lit(3), Prim("==", Ref(x), Lit(4))): Boolean

Prove that the following program is of type Boolean

val x = 3; x == 4





  ------------------[Int]
   |- Lit(3): Int     x:Int |- Prim("==", Ref(x), Lit(4)): Boolean
  -----------------------------------------------------------------[Let]
     |- Let(x, Int, Lit(3), Prim("==", Ref(x), Lit(4))): Boolean

Prove that the following program is of type Boolean

val x = 3; x == 4





               --------------------------------------------------[BoolOp]
   |- Lit(3): Int     x:Int |- Prim("==", Ref(x), Lit(4)): Boolean
  -----------------------------------------------------------------[Let]
     |- Let(x, Int, Lit(3), Prim("==", Ref(x), Lit(4))): Boolean

Prove that the following program is of type Boolean

val x = 3; x == 4




                x:Int |- Ref(x): Int          x:Int |- Lit(4) : Int
               --------------------------------------------------[BoolOp]
   |- Lit(3): Int     x:Int |- Prim("==", Ref(x), Lit(4)): Boolean
  -----------------------------------------------------------------[Let]
     |- Let(x, Int, Lit(3), Prim("==", Ref(x), Lit(4))): Boolean

Prove that the following program is of type Boolean

val x = 3; x == 4


                  (x: Int)(x) = Int
               ----------------------[Ref]
                x:Int |- Ref(x): Int          x:Int |- Lit(4) : Int
               --------------------------------------------------[BoolOp]
   |- Lit(3): Int     x:Int |- Prim("==", Ref(x), Lit(4)): Boolean
  -----------------------------------------------------------------[Let]
     |- Let(x, Int, Lit(3), Prim("==", Ref(x), Lit(4))): Boolean

Prove that the following program is of type Boolean

val x = 3; x == 4


                  (x: Int)(x) = Int
               ----------------------[Ref]-----------------------[Int]
                x:Int |- Ref(x): Int          x:Int |- Lit(4) : Int
               --------------------------------------------------[BoolOp]
   |- Lit(3): Int     x:Int |- Prim("==", Ref(x), Lit(4)): Boolean
  -----------------------------------------------------------------[Let]
     |- Let(x, Int, Lit(3), Prim("==", Ref(x), Lit(4))): Boolean

There are no more obligations to prove!! That means our initial statement was true.

9.2.3 Inference Rules (cont’d)

  1. if:
     Env |- c1: Boolean         Env |- e1: T         Env |- e2: T
    -------------------------------------------------------------- [If]
                  Env |- If(c1, e1, e2): T
  1. Mutable variables
       Env |- e1: T1          Env,x:T1 |- e2: T2
      ------------------------------------------- [VarDec]
         Env |- VarDec(x, T1, e1, e2): T2

        Env(x) = T1         Env |- e1: T1
      ------------------------------------ [VarAssign]
         Env |- VarAssign(x, e1): T1
  1. while
     Env |- c1: Boolean         Env |- e1: Unit
    -------------------------------------------- [While]
              Env |- While(c1, e1): Unit

9.3 Type Checking and Type Inference

The type checking/inference step will be part of the semantic analyzer.

The key point to understand is that types represent an abstract value, and inference rules define the set of operations on these values.

Therefore, the implementation is going to be very similar to eval or analyze.

9.3.1 Syntax of Types

abstract class Type
case class BaseType(tp: String) extends Type
val IntType = BaseType("Int")
val BoolType = BaseType("Boolean")
val UnitType = BaseType("Unit")

9.3.2 A First Typer

The type checker has to resolve the type of each node:

def typeInfer(exp: Exp)(env: Map[String,Type]): Type = exp {
  case Lit(x: Int) =>
    IntType
  case Prim("+", List(a,b)) =>
    if (typeInfer(a) != IntType) error("expected Int!")
    if (typeInfer(b) != IntType) error("expected Int!")
    IntType
  case Let(x, rhs, body) =>
    val t1 = typeInfer(rhs)
    typeInfer(body)(env + (x -> t1))
  case If(c, thenp, elsep) =>
    if (typeInfer(c) != BooleanType) error("expected Boolean!")
    val t1 = typeInfer(thenp)
    val t2 = typeInfer(elsep)
    if (t1 != t2) error("if/else result types don't match!")
    t1
}

The function typeInfer is very similar to an interpreter, but it operates not over concrete values but over abstract values. Instead of computing 3+4 = 7, it computes Int+Int = Int. And instead of computing 4+true = error it computes Int+Boolean = error.

9.3.3 Separating Checking and Inference

How can we improve the code? There is clearly a lot of duplication that can be factored out. Let’s introduce another function, typeCheck, to take care of checking types and producing errors:

def typeCheck(exp: Exp, pt: Type)(env: Env): Unit = {
  val tpe = typeInfer(exp, pt)(env)
  if (tpe != pt) error(s"Expected type $pt but got $tpe", exp.pos)
}

Now we can refactor typeInfer as follows:

def typeInfer(exp: Exp)(env: Map[String,Type]): Type = exp match {
  case Prim("+", List(a,b)) =>
    typeCheck(a, IntType); typeCheck(b, IntType); IntType
  case Let(x, rhs, body) =>
    typeInfer(body)(env + (x -> typeInfer(rhs)))
  case If(c, thenp, elsep) =>
    typeCheck(c, BooleanType)
    val t1 = typeInfer(thenp); typeCheck(elsep,t1); t1
}

9.3.4 Type-Annotating Trees

What we have so far gets the job done of checking that a program doesn’t have any type errors, but we immediately discard the computed type information.

However, type information is often useful for later phases, e.g. for the code generator to decide on how much space needs to be reserved for a given value.

We can keep track of this by adding a field in the Exp class

trait Typed {
  var tpe: Type = UnknownType
  def withType(pt: Type): this.type = { tpe = pt; this }
}
abstract class Exp extends Typed

and modifying typeInfer to set this for each Exp visited.

def typeInfer(exp: Exp)(env: Map[String,Type]): Type = {
  val tpe = exp match { ... }
  exp.withTpe(tpe); tpe
}

This way, the computed type information is available for later phases to pick up.

9.3.5 Type-Driven Transformation

It is often the case that type mismatches can be fixed by inserting certain conversions. For example, an expression e of any type can be converted to Unit by wrapping it in a Let

val dummy = e; ()

In order to do this, we need to change typeInfer to return a new Exp (with the correct type assigned) instead of only a type.

def typeInfer(exp: Exp)(env: Map[String,Type]): Exp = ...

def typeCheck(exp: Exp, pt: Type)(env: Env): Exp = {
  val exp1 = typeInfer(exp, pt)(env)
  if (exp1.tpe == pt) exp1 else {
    if (pt == UnitType) {
      val dummy = freshDummyVar()
      Let(dummy, exp1, Lit(())).withType(UnitType)
    } else error(s"Expected type $pt but got $tpe", exp.pos)
  }
}

The body of typeInfer changes as follows:

def typeInfer(exp: Exp)(env: Map[String,Type]): Exp = exp match {
  case Prim("+", List(a,b)) =>
    Prim("+", List(typeCheck(a, IntType), typeCheck(b, IntType))).withType(IntType)
  case Let(x, rhs, body) =>
    val rhs1 = typeInfer(rhs)
    val body1 = typeInfer(body)(env + (x -> rhs1.tpe))
    Let(x, rhs1, body1).withType(body1.tpe)
  case If(c, thenp, elsep) =>
    val c1 = typeCheck(c, BooleanType)
    val elsep1 = typeInfer(elsep);
    val thenp1 = typeCheck(thenp, elsep.tpe);
    If(c1, thenp1, elsep1).withType(elsep1.tpe)
}

The last case (for if/else) demands some explanation.

Do now: why don’t we infer the type of the then branch and then check the else part against it?

This has to do with optional else. The parser will desugar an if without else by inserting a Unit literal in the else branch. For an if with missing else, converting a Unit literal to whatever other type the then branch has would be doomed to fail. But the opposite is always possible: we really want the then branch force-converted to Unit.

9.3.6 The Unknown Type as Sentinel

Note: this becomes relevant really only once we have functions or structured data

We need to define what “T1 conforms to T2” means.

T1 conforms to T2 if:

  • T1 == T2, or
  • T2 is UnknownType

Code update:

// Infer now also takes expected type as suggestion
def typeInfer(exp: Tree, pt: Type): Tree

// Check if 'tp' is well-formed. For now that means that 'tp'
// is not unknown
def typeWellFormed(tp: Type): Type

// Check if 'tp' conforms to 'pt' and return the more precise Type
// The returned type should also be well-formed
def typeConforms(tp: Type, tp: Type): Type

def typeCheck(exp: Tree, pt: Type): Tree = {
  // First infer
  val nexp = typeInfer(exp, pt)(env)
  val rtp = typeConforms(nexp.tp, pt)(env)
  nexp.withType(rtp)
}

9.4 Wrap-Up

9.4.1 Interpretation With Types

Let’s update our definitional interpreter to include types.

Previously we had

type Val = Int

Now we need a more generic definition. Either

type Val = Any

or something like

abstract class Val
case class IntVal(x: Int) extends Val
case class BoolVal(x: Boolean) extends Val
case object UnitVal extends Val

if we want to be more specific.

9.4.2 New Error Cases

Either way, the implementation will now contain type casts that may fail at runtime:

def eval(exp)(env: Env): Val = exp match {
  case Lit(i: Int) => IntVal(i)
  case Prim(op, l, r) =>
    evalPrim(op)(eval(l)(env), eval(r)(env))
  // ...
}
def evalPrim(op: String)(l: Val, r: Val) = (op, l, r) match {
  case ("+" , IntVal(x), IntVal(y)) => IntVal(x + y)
  case ("==", IntVal(x), IntVal(y)) => BoolVal(x == y)
  // ...
  // runtime error if "+" used with Int and Bool
}

These potential error cases didn’t exist before.

9.4.3 Type Soundness

We would like to make sure that our type checker actually prevents these newly introduced error cases. We could formally prove this based on the typing judgements, if we want to go all the way but at least we should convince ourselves informally by inspecting each case, such as:

def eval(exp: String)(env: Map[String,Type]) = exp match {
  case Prim("+" , List(IntVal(x), IntVal(y))) => IntVal(x + y)
  ...
}

def typeInfer(exp: Exp)(env: Map[String,Type]): Exp = exp match {
  case Prim("+", List(a,b)) =>
    Prim("+", List(typeCheck(a, IntType), typeCheck(b, IntType))).withType(IntType)
    ...
}

Here, it is easy to see that all “well-typed programs don’t go wrong”, i.e., that all error cases in the interpreter map to error cases in the type checker.

Exercise: does the converse also hold? In this case? In general?

Of course for more complicated type systems and languages things aren’t quite so easy (see work on formalizing Scala’s type system in the DOT calculus).

9.4.4 Compilation With Types

Assembly code does not have types. We need to make an “implementation” decision on how to represent the new types.

  • Boolean: 0 or 1, but we still use the full register
  • Unit: There are no operations using Unit, we don’t need to store it

Implementation of the operators:

val x = 1 == 4;

We could use jumps: one branch sets 0, the other sets 1.

but X86 offers us a shortcut:

set<op> %al          // set %al if flags validate <op>
                     // like jump, there are: sete, setne, setl, etc.
movbq %al, %rax      // transform the byte into the full register

We also have to modify our compilation for the If statements.

def trans(exp: Tree, sp: Int)(env: Env) = exp match {
  case If(cond, tBranch, eBranch) =>
    trans(cond, sp)(env)  // now sp will contain 0 or 1
    transJumpIfTrue(sp)("if")
    // ...

What code would transJumpIfTrue generates?

cmp $0, ${regs(sp)}
jne $label
test ${regs(sp)}, ${regs(sp)} # used by gcc
jnz $label

‘test S, T’ sets the flags accordingly to S & T. So if ‘sp’ contains 1: 1 & 1 != 0 so we jump (jnz). If ‘sp’ contains 0: 0 & 0 == 0 so we don’t jump.

9.5 Where Are We?

We looked into the formalization of type checking in our language. We discussed the implementation of the type checker.