abstract.html

Home

Abstracts of Selected Papers

2023

1. Covering All the Bases: Type-Based Verification of Test Input Generators

2. Morpheus: Automated Safety Verification of Data-Dependent Parser Combinator Programs

2022

1. Specification-Guided Component-Based Synthesis from Effectful Libraries

2. Defending Observation Attacks in Deep Reinforcement Learning via Detection and Denoising

3. DistSPECTRL: Distributing Specifications in Multi-Agent Reinforcement Learning Systems

3. Model-free Neural Lyapunov Control for Safe Robot Navigation

2021

1. Data-Driven Abductive Inference of Library Specifications

2. Repairing Serializability Bugs in Distributed Database Programs via Automated Schema Refactoring

2020

1. Semantics, Specification, and Bounded Verification of Concurrent Libraries in Replicated Systems

2. ART : Abstraction Refinement-Guided Training for Provably Correct Neural Networks

2019

1. Mergeable Replicated Data Types

2. CLOTHO: Directed Test Generation for Weakly-Consistent Database Systems

3. An Inductive Synthesis Framework for Verifiable Reinforcement Learning

4. Automated Parameterized Verification of CRDTs

2018

1. Safe Replication through Bounded Concurrency Verification

2. Alone Together: Compositional Reasoning and Inference for Weak Isolation

3. A Data-Driven CHC Solver

4. Automated Detection of Serializability Under Weak Consistency

2017

1. Verifying a Concurrent Garbage Collector using a Rely-Guarantee Methodology

2016

1. Automatically Learning Shape Specifications

2. Representation without Taxation: A Uniform, Low-Overhead and High-Level Interface to Eventually Consistent Key-Value Stores

3. Verifying Custom Synchronisation Constructs Using Higher-Order Separation Logics

2015

1. Learning Refinement Types

2. Cooking the Books: Formalizing JMM Implementation Recipes

3. Poling: SMT-Aided Linearizability Proofs

4. Declatative Programming over Eventually Consistent Data Stores

5. Synthesizing Racy Tests

6. Dependent Array Type Inference from Tests

2014

1. MultiMLton: A Multi-Core Aware Runtime for Standard ML

2. A Relational Framework for Higher-Order Shape Analysis

3. Atomicity Refinement for Verified Compilation

4. Rx-CML: A Prescription for Safely Relaxing Concurrency

2013

1. Flexible Access Control for Javascript

2. Plan B: A Buffered Memory Model for Java

3. Compositional and Lightweight Dependent Type Inference for ML

4. Proof-Directed Parallelization Synthesis by Separation Logic

2012

1. Eliminating Read Barriers through Procrastination and Cleanliness

2. Resource-Sensitive Synchronization Inference Using Abduction

2011

1. Accentuating the Positive: Atomicity Inference and Enforcement Using Correct Executions

2. Isolating Determinism in Multi-Threaded Programs

future (e)

3. Composable Asynchronous Events

composable

asynchronous events

4. Relaxed Memory Concurrency and Verified Compilation

5. Modular Reasoning for Deterministic Parallelism

2010

1. Analyzing Concurrency Bugs Using Dual Slicing

2. Analyzing Multicore Dumps to Facilitate Concurrency Bug Reproduction

3. Analyzing Lightweight Checkpointing for Concurrent ML

Transient faults that arise in large-scale software systems can often be repaired by re-executing the code in which they occur. Ascribing a meaningful semantics for safe re-execution in multithreaded code is not obvious, however. For a thread to re-execute correctly a region of code, it must ensure that all other threads that have witnessed its unwanted effects within that region are also reverted to a meaningful earlier state. If not done properly, data inconsistencies and other undesirable behavior may result. However, automatically determining what constitutes a consistent global checkpoint is not straightforward since thread interactions are a dynamic property of the program. In this paper, we present a safe and efficient checkpointing mechanism for Concurrent ML (CML) that can be used to recover from transient faults. We introduce a new linguistic abstraction called stabilizers that permits the specification of per-thread monitors and the restoration of globally consistent checkpoints. Safe global states are computed through lightweight monitoring of communication events among threads (e.g. message-passing operations or updates to shared variables). We present a formal characterization of its design, and provide a detailed description of its implementation within MLton, a whole-program optimizing compiler for Standard ML. Our experimental results on microbenchmarks as well as several realistic, multithreaded, server-style CML applications, including a web server and a windowing toolkit, show that the overheads to use stabilizers are small, and lead us to conclude that they are a viable mechanism for defining safe checkpoints in concurrent functional programs.

2009

1. Partial Memoization of Concurrency and Communication

Memoization is a well-known optimization technique used to eliminate redundant calls for pure functions. If a call to a function f with argument v yields result r, a subsequent call to f with v can be immediately reduced to r without the need to re-evaluate f's body.

Understanding memoization in the presence of concurrency and communication is significantly more challenging. For example, if f communicates with other threads, it is not sufficient to simply record its input/output behavior; we must also track inter-thread dependencies induced by these communication actions. Subsequent calls to f can be elided only if we can identify an interleaving of actions from these call-sites that lead to states in which these dependencies are satisfied. Similar issues arise if f spawns additional threads.

In this paper, we consider the memoization problem for a higher-order concurrent language whose threads may communicate through synchronous message-based communication. To avoid the need to perform unbounded state space search that may be necessary to determine if all communication dependencies manifest in an earlier call can be satisfied in a later one, we introduce a weaker notion of memoization called partial memoization that gives implementations the freedom to avoid performing some part, if not all, of a previously memoized call.

To validate the effectiveness of our ideas, we consider the benefits of memoization for reducing the overhead of recomputation for streaming, server-based, and transactional applications executed on a multi-core machine. We show that on a variety of workloads, memoization can lead to substantial performance improvements without incurring high memory costs.

2. Exceptionally Safe Futures

A future is a well-known programming construct used to introduce concurrency to sequential programs. Computations annotated as futures are executed asynchronously and run concurrently with their continuations. Typically, futures are not transparent annotations: a program with futures need not produce the same result as the sequential program from which it was derived. Safe futures guarantee a future-annotated program produce the same result as its sequential counterpart. Ensuring safety is especially challenging in the presence of constructs such as exceptions that permit the expression of non-local control-flow. For example, a future may raise an exception whose handler is in its continuation. To ensure safety, we must guarantee the continuation does not discard this handler regardless of the continuation's own internal control-flow (e.g. exceptions it raises or futures it spawns). In this paper, we present a formulation of safe futures for a higher-order functional language with first-class exceptions. Safety can be guaranteed dynamically by stalling the execution of a continuation that has an exception handler potentially required by its future until the future completes. To enable greater concurrency, we develop a static analysis and instrumentation and formalize the runtime behavior for instrumented programs that allows execution to discard handlers precisely when it is safe to do so.

3. Semantics-Aware Trace Analysis

As computer systems continue to become more powerful and complex, so do programs. High-level abstractions introduced to deal with complexity in large programs, while simplifying human reasoning, can often obfuscate salient program properties gleaned from automated source-level analysis through subtle (often non-local) interactions. Consequently, understanding the effects of program changes and whether these changes violate intended protocols become difficult to infer. Refactorings, and feature additions, modifications, or removals can introduce hard-to-catch bugs that often go undetected until many revisions later.

To address these issues, this paper presents a novel dynamic program analysis that builds a semantic view of program executions. These views reflect program abstractions; and aspects; however, views are not simply projections of execution traces, but are linked to each other to capture semantic interactions among abstractions at different levels of granularity in a scalable manner. We describe our approach in the context of Java and demonstrate its utility to improve regression analysis. We first formalize a subset of Java and a grammar for traces generated at program execution. We then introduce several types of views used to analyze regression bugs along with a novel, scalable technique for semantics differencing of traces from different versions of the same program. Bechmark results on large open-source Java programs demonstrate that semantic-aware trace differencing can identify precise and useful details about the underlying cause for a regression even in programs that use reflection, multithreading, or dynamic code generation, features that typically confound other techniques.

4. Alchemist: A Transparent Dependence Distance Profiling Infrastructure

Effectively migrating sequential applications to take advantage of parallelism available on multicore platforms is a well-recognized challenge. This paper addresses important aspects of this issue by proposing a novel profiling technique to automatically detect available concurrency in C programs. The profiler, called Alchemist, operates completely transparently to applications, and identifies constructs at various levels of granularity (e.g., loops, procedures, and conditional statements) as candidates for asynchronous execution. Various dependences including read-after-write (RAW), write-after-read (WAR), and write-after-write (WAW), are detected between a construct and its continuation, the execution following the completion of the construct. The time-ordered {\em distance} between program points forming a dependence gives a measure of the effectiveness of parallelizing that construct, as well as identifying the transformations necessary to facilitate such parallelization. Using the notion of post-dominance, our profiling algorithm builds an execution index tree at run-time. This tree is used to differentiate among multiple instances of the same static construct, and leads to improved accuracy in the computed profile, useful to better identify constructs that are amenable to parallelization. Performance results indicate that the profiles generated by Alchemist pinpoint strong candidates for parallelization, and can help significantly ease the burden of application migration to multicore environments.

5. Speculative N-Way Barriers

Speculative execution is an important technique that has historically been used to extract concurrency from sequential programs. While techniques to support speculation work well when computations perform relatively simple actions (e.g., reads and writes to known locations), understanding speculation for multi-threaded programs in which threads communicate through shared references is significantly more challenging, and is the focus of this paper.

We use as our reference point a simple higher-order concurrent language extended with an n-way barrier and a fork/join execution model. Our technique permits the expression guarded by the barrier to speculatively proceed before the barrier has been satisfied (i.e., before all threads that synchronize on that barrier have done so) and to have participating threads that would normally block on the barrier to speculatively proceed as well. Our solution formulates safety properties under which speculation is correct in a fork/join model, and uses traces to validate these properties modularly on a per-thread and per-synchronization basis.

2008

1. Flattening Tuples in an Intermediate SSA Representation

For functional programs, unboxing aggregate data structures such as tuples removes memory indirections and frees dead components of the decoupled structures. To explore the consequences of such optimizations in a whole-program compiler, this paper presents a tuple flattening transformation and a framework that allows the formal study and comparison of different flattening schemes.

We present our transformation over functional SSA, a simply-typed, monomorphic language and show that the transformation is type-safe. The flattening algorithm defined by our transformation has been incorporated into MLton, a whole-program, optimizing compiler for SML. Experimental results indicate that aggressive tuple flattening can lead to substantial improvements in runtime performance, a reduction in code size, and a decrease in total allocation without a significant increase in compilation time.

2. A Uniform Transactional Execution Environment for Java

Transactional memory (TM) has recently emerged as an effective tool for extracting fine-grain parallelism from declarative critical sections. In order to make STM systems practical, significant effort has been made to integrate transactions into existing programming languages. Unfortunately, existing approaches fail to provide a simple implementation that permits lock-based and transaction-based abstractions to coexist seamlessly. Because of the fundamental semantic differences between locks and transactions, legacy applications or libraries written using locks can not be transparently used within atomic regions. To address these shortcomings, we implement a uniform transactional execution environment for Java programs in which transactions can be integrated with more traditional concurrency control constructs. Programmers can run arbitrary programs that utilize traditionalmutual-exclusion-based programming techniques, execute new programs written with explicit transactional constructs, and freely combine abstractions that use both coding styles.

3. Protocol Inference Using Static Path Profiles

Specifcation inference tools typically mine commonalities among states at relevant program points. For example, to infer the invariants that must hold at all calls to a procedure p requires examining the state abstractions found at all call-sites to p. Unfortunately, existing approaches to building these abstractions require being able to explore all paths (either static or dynamic) to all of p’s call-sites to derive speciﬁcations with any measure of conﬁdence. Because programs that have complex control-ﬂow structure may induce a large number of paths, naive path exploration is impractical.

In this paper, we propose a new speciﬁcation inference technique that allows us to eﬃciently explore statically all paths to a program point. Our approach builds static path proﬁles, proﬁle information constructed by a static analysis that accumulates predicates valid along diﬀerent paths to a program point. To make our technique tractable, we employ a summarization scheme to merge predicates at join points based on the frequency with which they occur on diﬀerent paths. For example, predicates present on a majority of static paths to all call-sites of any procedure p forms the pre-condition of p.

We have implemented a tool, Marga, based on static path proﬁling. Qualitative analysis of the speciﬁcations inferred by marga indicates that it is more accurate than existing static mining techniques, can be used to derive useful speciﬁcation even for APIs that occur infrequently (statically) in the program, and is robust against imprecision that may arise from examination of infeasible or infrequently occurring dynamic paths. A comparison of the speciﬁcations generated using marga with a dynamic speciﬁcation inference engine based on Cute, an automatic unit test generation tool, indicates that Marga generates comparably precise speciﬁcations with smaller cost.

4. Quasi-Static Scheduling for Safe Futures.

Migrating sequential programs to effectively utilize next generation multicore architectures is a key challenge facing application developers and implementors. Languages like Java that support complex control- and dataflow abstractions confound classical automatic parallelization techniques. On the other hand, introducing multithreading and concurrency control explicitly into programs can impose a high conceptual burden on the programmer, and may entail a significant rewrite of the original program.

In this paper, we consider a new technique to address this issue. Our approach makes use of futures, a simple annotation that introduces asynchronous concurrency into Java programs, but provides no concurrency control. To ensure concurrent execution does not yield behavior inconsistent with sequential execution (i.e., execution yielded by erasing all futures), we present a new interprocedural summary-based dataflow analysis. The analysis inserts lightweight barriers that block and resume threads executing futures if a dependency violation may ensue. There are no constraints on how threads execute other than those imposed by these barriers.

Our experimental results indicate futures can be leveraged to transparently ensure safety and profitably exploit parallelism; in contrast to earlier efforts, our technique is completely portable, and requires no modifications to the underlying JVM.

2007