NJPLS at University of Pennsylvania

May 19, 2016

This instance of NJPLS will take place at the University of Pennsylvania.

All are welcome and there is no registration fee. However, if you are planning to attend, you must register in advance by filling out this form.

Travel instructions may be found here.

Agenda

8:30am   Breakfast
9:00am
10:15am  Coffee
11:00am
12:15pm  Lunch
1:35pm   Business Meeting
1:45pm
3:00pm   Coffee
3:45pm

This meeting was organized by Stephanie Weirich.

Abstracts

Probabilistic NetKAT

Nate Foster

This talk will present a new language for network programming based on a probabilistic semantics. We extend the NetKAT language with new primitives for expressing probabilistic behaviors and enrich the semantics from one based on deterministic functions to one based on measurable functions on sets of packet histories. We establish fundamental properties of the semantics, prove that it is a conservative extension of the deterministic semantics, show that it satisfies a number of natural equations, and develop a notion of approximation. We present case studies that show how the language can be used to model a diverse collection of scenarios drawn from real-world networks. This is joint work with Dexter Kozen, Kostas Mamouras, Mark Reitblatt, and Alexandra Silva.
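As a rough sketch of what such a primitive could look like (hypothetical names and a finite-support simplification of our own, not ProbNetKAT's actual syntax or semantics), a policy can be modeled as a function from a set of packet histories to a distribution over sets of packet histories, with probabilistic choice mixing two policies:

    import qualified Data.Map.Strict as M

    type Packet  = String            -- stand-in for a real packet type
    type History = [Packet]          -- a packet history

    -- A finitely supported distribution: outcomes mapped to probabilities.
    type Dist a = M.Map a Rational

    -- In the probabilistic semantics, a policy takes a set of histories
    -- (modeled as a list here) to a distribution over sets of histories.
    type Policy = [History] -> Dist [History]

    -- Probabilistic choice: run p with probability r, q with probability 1 - r.
    choiceAt :: Rational -> Policy -> Policy -> Policy
    choiceAt r p q hs =
      M.unionWith (+) (M.map (* r) (p hs)) (M.map (* (1 - r)) (q hs))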

A program logic for union bounds

Justin Hsu

We propose aHL, a probabilistic Hoare logic based on the union bound, a basic fact from probability theory. While this principle is simple, it is an extremely common tool in the analysis of randomized algorithms. From a formal verification perspective, the union bound allows compositional reasoning over the possible ways an algorithm may go wrong. The union bound is also a flexible abstraction for expressing complex facts about probabilities: even though our target programs and properties are probabilistic, the assertions in our logic are standard first-order formulas and do not involve probabilities or distributions.
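For reference, the union bound is the elementary fact that for any events A1, ..., An (not necessarily independent),

    Pr[A1 ∪ ... ∪ An] ≤ Pr[A1] + ... + Pr[An],

which is exactly what lets the logic bound a program's total failure probability by summing the failure probabilities of the individual ways it can go wrong.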

Our logic can also prove accuracy properties for interactive programs, where the program must produce intermediate outputs as soon as pieces of the input arrive, rather than accessing the entire input at once. This setting also enables adaptivity, where later inputs may depend on earlier intermediate outputs.

Draft paper

Domain specific type error diagnosis

Jurriaan Hage

Verifying programs with Complex data structures using Coq

Ken Roe and Scott Smith

If one had a tool that could verify that a program correctly maintains data structure invariants, it would be very useful for finding bugs. For example, the Heartbleed bug from a couple of years ago can be traced to an invariant violation. The OpenSSL library assumed that packet size information stored in two different places was consistent. By sending packets that broke this invariant, an attacker was able to steal critical data. Had a tool existed to verify these invariants, this bug would have been caught before the software was released.

The research presented in this abstract aims at creating a tool for first documenting data structure invariants and second verifying them. We have developed a separation-logic-based language using the Coq theorem prover. This language is sufficient to document most useful invariants. We are working on the verification of a simplified version of the DPLL algorithm to demonstrate the utility of the invariants. The code for this algorithm is around 200 lines of C; the invariant describing the relationships between all the data structures is around 100 lines of Coq. This invariant describes simple relationships, such as the one between an array storing assignments for boolean variables and a linked list storing the same assignments as pairs of variable number and value. It also describes more complex relationships, such as those maintained by the two-watched-variable algorithm used to quickly identify unit propagations in DPLL.
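To make the flavor of such an invariant concrete, here is a small Haskell sketch of our own (the authors state invariants like this in their separation-logic language inside Coq, not as executable checks): an array of assignments and a linked list of (variable, value) pairs must describe the same partial assignment.

    data Assignment = Unassigned | Assigned Bool
      deriving (Eq, Show)

    -- The array view: the assignment of variable i is stored at index i.
    type AssignArray = [Assignment]

    -- The list view: only the assigned variables, as (variable, value) pairs.
    type AssignList = [(Int, Bool)]

    -- The consistency invariant relating the two representations.
    consistent :: AssignArray -> AssignList -> Bool
    consistent arr lst =
      all inRange lst
        && and [ lookup i lst == toMaybe a | (i, a) <- zip [0 ..] arr ]
      where
        inRange (i, _)       = i >= 0 && i < length arr
        toMaybe Unassigned   = Nothing
        toMaybe (Assigned b) = Just b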

One of the keys to making the proof tractable is being able to break it into smaller pieces. To do this, we needed to add some constructs to our separation logic framework. These constructs make it easier to represent intermediate states where, for example, "all invariants hold except that one variable has been assigned a new value."

Any Coq user who has attempted a non-trivial proof has found that the process is extremely tedious. After analyzing his own workflow in developing proofs, the author identified a number of areas in which the proof development process could be improved. One key finding is that developing a large proof (with many lemmas) often requires many iterations of revisions to the statement of the proof: developing the proof script often reveals errors in the statement, and changing the statement then requires the proof to be replayed, which is very tedious. As part of this research, we introduce a new IDE, CoqPIE, that has all the functionality of Proof General or CoqIDE plus many new features to address these workflow issues. For example, the IDE provides tools to automatically replay and update proof scripts.

Constructive Galois Connections

David Darais

Galois connections are a foundational tool for structuring abstraction in semantics and their use lies at the heart of the theory of abstract interpretation. Yet, mechanization of Galois connections remains limited to restricted modes of use, preventing their general application in mechanized metatheory and certified programming.

This paper presents constructive Galois connections, a variant of Galois connections that is effective both on paper and in proof assistants; is complete with respect to a large subset of classical Galois connections; and enables more general reasoning principles, including the "calculational" style advocated by Cousot.

To design constructive Galois connections, we identify a restricted mode of use of classical ones which is both general and amenable to mechanization in dependently-typed functional programming languages. Crucial to our metatheory is the addition of monadic structure to Galois connections to control a "specification effect". Effectful calculations may reason classically, while pure calculations have extractable computational content. Explicitly moving between the worlds of specification and implementation is enabled by our metatheory.
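As a rough illustration of the shape of a constructive Galois connection (a toy parity example of our own, in Haskell rather than a proof assistant), the abstraction side is an ordinary executable function, while the concretization side is a specification-level predicate:

    data Parity = Even | Odd deriving (Eq, Show)

    -- Extraction: each concrete integer has a best abstraction.
    -- This is plain executable code, i.e. it has computational content.
    eta :: Integer -> Parity
    eta n = if even n then Even else Odd

    -- Interpretation: each abstract element denotes a predicate on integers.
    -- In a proof assistant this lives in the specification world.
    mu :: Parity -> Integer -> Bool
    mu Even = even
    mu Odd  = odd

    -- Soundness, the law tying the two together: mu (eta n) n holds for all n.
    soundAt :: Integer -> Bool
    soundAt n = mu (eta n) n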

To validate our approach, we provide two case studies in mechanizing existing proofs from the literature: one uses calculational abstract interpretation to design a static analyzer, the other forms a semantic basis for gradual typing. Both mechanized proofs closely follow their original paper-and-pencil counterparts, employ reasoning principles not captured by previous mechanization approaches, support the extraction of verified algorithms, and are novel.

Accurate Rigorous Simulation Should be Possible for Good Designs

Walid Taha, joint work with Adam Duracz and Ferenc Bartha

The development of Cyber-Physical Systems benefits from better methods and tools to support the simulation and verification of hybrid (continuous/discrete) models. Acumen is an open source testbed for exploring the design space of what rigorous-but-practical next-generation tools can deliver to developers. Central to Acumen is the notion of rigorous simulation. Like verification tools, rigorous simulation is intended to provide guarantees about the behavior of the system. Like traditional simulation tools, it is intended to be intuitive, practical, and scalable. Whether these two goals can be achieved simultaneously is an important, long-term challenge.

This paper proposes a design principle that can play an important role in meeting this challenge. The principle addresses the criticism that accumulating numerical errors is a serious impediment to practical rigorous simulation. It is inspired by a twofold insight: one relating to the nature of systems engineered in the real world, and the other relating to how numerical errors in the simulation of a model can be recast as errors in the state or parameters of the model in the simulation. We present a suite of small, concrete benchmarks that can be used to assess the extent to which a rigorous simulator upholds the proposed principle. We also report on which benchmarks Acumen's current rigorous simulator already succeeds on and which remain challenging. Based on work presented at SNR 2016.

Calculational Design of Information Flow Monitors

Mounir Assaf, joint work with David Naumann

Fine-grained information flow monitoring can in principle address a wide range of security and privacy goals, for example in web applications. But it is very difficult to achieve sound monitoring with acceptable runtime cost and sufficient precision to avoid impractical restrictions on programs and policies. We present a systematic technique for the design of monitors that are correct by construction, and it encompasses policies with downgrading. The technique is based on abstract interpretation, a standard basis for static analysis of programs. This should allow integration of a wide range of analysis techniques, enabling more sophisticated engineering of monitors to address the challenges of precision and of scaling to widely used programming languages.

Multiparty Session Types and their Applications

Nobuko Yoshida, Imperial

We give a summary of our recent research on multiparty session types for verifying distributed, parallel and concurrent programs, and of our collaborations with industry partners. We first outline multiparty session types and then explain how we started collaborating with industry to develop a protocol description language called Scribble. We then discuss recent developments in Scribble, the runtime session monitoring framework used in the Ocean Observatories Initiative, and network protocol verification. We also demonstrate how our multiparty session synthesis theory is applied to the Zero Deviation Life Cycle project with Cognizant, and to static deadlock analysis for Google's Go language.

On properties of B-terms

Keisuke Nakano

B-terms are built from the B combinator alone, defined by B = \f.\g.\x. f (g x), which is well known as a function composition operator. In this talk, I introduce an interesting property of B-terms, namely whether repeated right applications of a B-term eventually cycle or not. The decision problem for this property is investigated through a canonical representation of B-terms and a sound and complete equational axiomatization. I also present several open problems related to the property.
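As a small illustration (our own Haskell sketch, not code from the talk; the term representation and the reading of "right application" are our assumptions), the combinator and the sequence in question can be written down directly:

    -- The B combinator itself is ordinary function composition.
    bComb :: (b -> c) -> (a -> b) -> a -> c
    bComb f g x = f (g x)

    -- Untyped B-terms as syntax trees: either B itself or an application.
    data BTerm = B | App BTerm BTerm
      deriving (Eq, Show)

    -- Repeated right application: t, t t, (t t) t, ((t t) t) t, ...
    rightApplications :: BTerm -> [BTerm]
    rightApplications t = iterate (`App` t) t

The talk's question is whether this sequence, considered up to the equational theory of B, is eventually periodic; the plain syntax trees above only grow, which is why the canonical representation and equational axiomatization mentioned in the abstract are needed.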

Making Big-Data Programming Easy

Martin Hirzel

Recent trends, from social media to algorithmic trading to the internet of things, have made available a deluge of data. This data comes at both high volume and high velocity, making it difficult to handle efficiently. Those who quickly extract insights from it gain an edge. Unfortunately, with existing systems and languages, it is hard to write efficient big-data applications. The challenge is bridging the gap between a high-level programming experience on the one hand and low-level incremental and parallel algorithms on the other hand. This talk describes three programming models we built at IBM. First, the SPL language allows library writers to implement streaming operators as compiler extensions. Second, ActiveSheets offers a spreadsheet interface for programming streaming operators. And third, META offers a unified rule-based programming model both for online event processing and for batch analytics in the same system. This talk describes our research innovations as well as productization experiences.

Atomicity Checker for Task Parallel Programs

Adarsh Yoga

Task-based programming models (e.g., Cilk, Intel TBB, X10, Java Fork-Join tasks) simplify multicore programming in contrast to programming directly with threads. In a task-based model, the programmer specifies parallel tasks and the runtime maps these tasks to hardware threads. The runtime automatically balances the load using work stealing and provides performance portability. However, interference between parallel tasks can result in concurrency errors.

We propose a dynamic analysis technique to detect atomicity violations in task parallel programs that could occur in different schedules for a given input, without performing interleaving exploration. Our technique leverages the series-parallel dynamic execution structure of a task parallel program to identify parallel accesses. It also maintains access history metadata with each shared memory location to identify parallel accesses that can cause atomicity violations in different schedules. To streamline metadata management, the access history metadata is split into global metadata shared by all tasks and local metadata specific to each task. The global metadata tracks a fixed number of access histories for each shared memory location, capturing all access patterns necessary for an atomicity violation. Our prototype tool for Intel Threading Building Blocks (TBB) detects atomicity violations that can potentially occur in different interleavings for a given input, with performance overheads similar to the Velodrome atomicity checker for thread-based programs.

Automatically Finding & Fixing Parallel Performance Bugs

Joe Devietti

Multicore architectures continue to pervade every part of our computing infrastructure, from servers to phones and smart watches. While these parallel architectures bring established performance and energy efficiency gains compared to single-core designs, parallel code written for these architectures can suffer from subtle performance bugs that are difficult to understand and repair with current tools.

We'll discuss two systems that leverage hardware-software co-design to tackle false sharing performance bugs, in the context of both unmanaged languages like C/C++ and managed languages like Java. These systems use hardware performance counters for efficient bug detection, and novel runtime systems to repair bugs online without the need for programmer intervention. Along the way, we'll discuss some subtle memory consistency model issues brought to light by these optimizations.