representative research projects

Deductive Synthesis of Correct-by-Construction Programs

The goal of program synthesis is to automatically produce the implementation of a program from a high-level specification its behavior. Fiat is a synthesis framework for the Coq proof assistant that combines foundational ideas from programming language and formal methods to build an automated refinement system from first principles capable produces correcqt-by-construction sofware with a minimal trusted code base. Starting from high-level specifications in Coq's rich logic, Fiat helps users interactively construct an executable implementation. Every step in the process is justified by a machine-checked refinement proof which, when composed together, certifies that the derived implementation meets the original specification. The framework provides libraries of domain-specific specification languages and reusable refinement tactics for guiding the search for an implementation, leveraging Coq's powerful Ltac metaprogramming language to achieve a high degree of automation. Each refinement step is certified by Coq, allowing users to freely incorporate their own implementation strategies into a derivation without compromising the guarantees of its correctness.

Representative publications:

Synthesis of Communication Code from Binary Formats

The code responsible for encoding and parsing external data is a vital component of any software that communicates with the outside world: failure to produce or correctly interpret encoded data that conform to a standard format can result in lost or corrupted data, while bugs in the decoder code that takes untrusted external data as input represents a key attack surface for malicious actors. Writing code that conforms to a high-level format description by hand is both tedious and error-prone. As one example of the seriousness of this risk, a bug was recently discovered in the Linux kernel that allowed remote attackers to execute arbitrary code by sending carefully crafted UDP traffic. In this project seeks to eliminate both the tedium and the potential for error in building binary encoders and decoders. We are developing a framework, called Narcissus, that allows users to express a format as a binary relation that precisely captures all the valid binary encodings of an arbitrary datatype instance. From this specification, Narcissus synthesizes a decoder that is guaranteed to both be the inverse of this relation and to detect malformed encodings by failing on any inputs not included in this relation. Narcissus produces a proof trail certifying the correctness of the synthesized decoder. Using Narcissus, we have synthesized replacement decoders for all packet formats used in the standard network stack of the MirageOS unikernel.

Progress Report:

Modular and Extensible Programming Language Metatheory

Static program analyses are an important tool for ensuring the absence of run-time errors. A faulty analysis is ofentimes worse than none at all, providing users with a false sense of security. This makes it particularly important that static analyses are sound, only validating programs that are actually error-free. Designers typically establish soundness via a mathematical meta-theory proof connecting the analysis to a formal description of a language's semantics. To ensure the correctness of these intricate proofs, researchers commonly mechanize them in computerized proof assistants. These mechanized proofs can be both tedious and labor-intensive, but this effort results in a foundational proof of the correctness of the analysis, which is automatically conferred onto any analyzed program. These mechanized proof developments are artifacts that, much like source code, can be adapted and extended with new features, enabling the development of variant languages. Unfortunately, the standard approach to language extension is essentially to copy the existing formalization, modify that copy, and then patch up the proofs. The effects of an extension can be woven throughout the meta-theory proofs of a language, making it difficult to both identify and apply the meta-theory changes required by a new feature. This project attempts to solve these challenges by developing sounds and modular approaches to extend programming languages with new features, allowing programming language researchers to better leverage existing mechanized formalizations of languages, enabling innovation without compromising soundness.

Representative publications: