A Translation of OCaml GADTs into Coq

Pedro Abreu, Guillaume Claret, and Benjamin Delaware,

abstract

Proof assistants based on dependent types are powerful tools for building certified software. In order to verify programs written in a different language, however, a representation of those programs in the proof assistant is required. When that language is sufficiently similar to that of the proof assistant, one solution is to use a shallow embedding to directly encode source programs as programs in the proof assistant. One challenge with this approach is ensuring that any semantic gaps between the two languages are accounted for. In this paper, we present GSet, a mixed embedding that bridges the gap between OCaml GADTs and inductive datatypes in Coq. This embedding retains the rich typing information of GADTs while also allowing pattern matching with impossible branches to be translated without additional axioms. We formalize this with GADTml, a minimal calculus that captures GADTs in OCaml, and gCIC, an impredicative variant of the Calculus of Inductive Constructions. Furthermore, we present the translation algorithm between GADTml and gCIC, together with a proof of the soundness of this translation. We have integrated this technique into Coq-of-OCaml, a tool for automatically translating OCaml programs into Coq. Finally, we demonstrate the feasibility of our approach by using our enhanced version of Coq-of-OCaml to translate a portion of the Tezos code base into Coq.