New Jersey Programming Languages and Systems Seminar

May 9, 2025

Jane Street, 250 Vesey St., New York, NY

The New Jersey Programming Languages and Systems Seminar (NJPLS) brings researchers in the greater New Jersey area together for a day of informal talks and discussions. We accept talks on completed research and work in progress, dealing with topics of interest to the programming languages and systems communities. See the talk proposal form for more information.

When: Friday, May 9, 2025
Where: Jane Street · 250 Vesey St, New York, NY 10281
What: Research talks and discussions. Breakfast, lunch, coffee, and snacks provided.

Registration is free thanks to Jane Street's sponsorship!

Talk Proposal Deadline: April 11, 2025
Registration Deadline: May 2, 2025

Any questions or concerns should be directed to Richard Eisenberg (reisenberg@janestreet.com).


Program

9:00 AM Breakfast and Sign-in
9:55 AM Welcome
10:00 AM Package Managers à la Carte Shiwei Weng
Johns Hopkins University
Abstract

Package managers are a cornerstone of modern programming and system software ecosystems. Despite their ubiquity, they are often informally specified and implemented in an ad hoc manner, leading to frequent misunderstandings and misuse by developers, administrators, and end users alike.

In this talk, we present our ongoing work on modeling package managers as distributed key-value stores equipped with version logics. We examine real-world package managers through this lens, demonstrating how our model captures their core behaviors.
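
As a deliberately tiny OCaml sketch of this view (ours, not the authors' model): the store maps package names to their published versions, and resolution picks, for a requested package, the newest version satisfying a caller-supplied constraint.

    module M = Map.Make (String)

    type version = int list            (* e.g. [1; 2; 0] for 1.2.0 *)
    type store = version list M.t      (* package name -> published versions *)

    (* Pick the newest published version of [name] accepted by [ok]. *)
    let resolve (store : store) (name : string) (ok : version -> bool)
        : version option =
      match M.find_opt name store with
      | None -> None
      | Some vs ->
          vs
          |> List.sort (fun a b -> compare b a)   (* newest first *)
          |> List.find_opt ok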

Building on this analysis, we introduce a proof-of-concept framework that enables the modular derivation of package managers with desired features. Our findings offer a principled approach to understanding, designing, and building more reliable and flexible package management systems.

For NJPLS, we will demonstrate the potential of our tool to bring package managers (and dependency hell) to languages that no one realized might need package managers — but do.

10:20 AM βη-reduction with surjective pairing is confluent for strongly β-normalizing terms Yiyun Liu
University of Pennsylvania
Abstract

It is a well-known result that in the untyped λ-calculus, the βη-reduction relation extended with the surjective pairing axiom fails to satisfy confluence. In this talk, I show that the confluence proof is salvageable for a set of untyped, strongly β-normalizing (SN) terms where η-postponement can still be proved.
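
For reference, the rules in question, written in the usual notation (π₁ and π₂ are the pair projections; SP is the surjective-pairing axiom):

    (λx. M) N       →β   M[N/x]
    λx. M x         →η   M              (x not free in M)
    ⟨π₁ M, π₂ M⟩    →SP  M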

In the context of a dependent type theory with a cumulative universe hierarchy, I show how the confluence result lets us model judgmental equality, including the product and function η-laws, by untyped βη-equivalence, enabling us to prove the correctness of Coquand's algorithm syntactically from the assumption that all well-typed terms are strongly β-normalizing. As a fun corollary, we also show that dependent type theory with type-annotated conversion is equivalent to dependent type theory with untyped βη-conversion restricted to β-SN terms.

Finally, I'll discuss how the SN premise for η-postponement can be relaxed to a weaker coinductive statement, potentially allowing us to derive partial correctness of algorithmic conversion for systems that are not normalizing.

10:40 AM Logical Relations for Formally Verified Authenticated Data Structures Simon Gregersen
New York University
Abstract

Authenticated data structures allow untrusted third parties to carry out operations which produce proofs that can be used to verify an operation's output. Such data structures are challenging to develop and implement correctly. This paper gives a formal proof of security and correctness for a library that generates authenticated versions of data structures automatically. The proof is based on a new relational separation logic for reasoning about programs that use collision-resistant cryptographic hash functions. This logic provides a basis for constructing two semantic models of a type system, which are used to justify how the library makes use of type abstraction to enforce security and correctness. Using these models, we also prove the correctness of several optimizations to the library and then show how optimized, hand-written implementations of authenticated data structures can be soundly linked with automatically generated code. All of the results in this paper have been mechanized in the Rocq proof assistant using the Iris framework.
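
As a generic illustration of the idea (not the paper's library): in a Merkle-style tree, every node caches the hash of its children, so an untrusted party can ship the hashes along a lookup path and a client holding only the trusted root hash can recheck them. A minimal OCaml sketch using the standard library's Digest module:

    type tree =
      | Leaf of string
      | Node of string * tree * tree    (* cached hash of the two subtrees *)

    let hash s = Digest.to_hex (Digest.string s)

    let root_hash = function
      | Leaf v -> hash v
      | Node (h, _, _) -> h

    (* Smart constructor: a node's hash commits to both subtrees, so
       tampering anywhere below changes the root hash. *)
    let node l r = Node (hash (root_hash l ^ root_hash r), l, r)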

11:00 AM Coffee Break
11:30 AM Revisiting the D Programming Language Mike Shah
Yale University
Abstract

The D programming language (dlang) is a systems programming language created by Walter Bright and released in 2001. D has continued to evolve over its nearly 25-year history, and in this talk I will re-introduce the language to the community, as someone who has decided to do significant teaching and application development in D. The talk will answer why I made this decision and which features make dlang interesting. I will give a rundown of D's language design decisions, and the audience takeaway will be to think about language design for modern systems languages and about which features I foresee (in D and related languages) as 'must haves' for productivity, speed, and safety in the future.

11:50 AM Flow-Analysis-Based Closure Optimization Byron Zhong
University of Chicago
Abstract

One of the key implementation issues for higher-order functional languages is managing the representation of first-class function values. The standard approach to this problem is closure conversion, which is a compiler transformation that introduces an explicit data structure to represent the environment of a function value. Constructing the closure for a nested function usually involves copying data from the enclosing function's closure. To avoid this copying, one might use linked representations, but that choice makes variable access slow and can introduce space leaks. Shao and Appel developed an effective closure converter for the SML/NJ system that reduces copying and is safe-for-space. Their approach has a weakness in that it is built on a purely syntactic analysis that cannot identify higher-order flows.
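
To illustrate the transformation itself (a schematic OCaml example, not SML/NJ's actual representation): closure conversion replaces a nested function by closed code paired with an explicit record of its free variables.

    (* Source: the inner function captures [x] implicitly. *)
    let make_adder x = fun y -> x + y

    (* After closure conversion, schematically: closed code plus an
       explicit environment, packaged together. *)
    type ('env, 'a, 'b) closure = { code : 'env -> 'a -> 'b; env : 'env }

    let make_adder' x = { code = (fun env y -> env + y); env = x }
    let apply c arg = c.code c.env arg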

In this talk, I will present a new safe-for-space closure converter for SML/NJ that uses a higher-order flow analysis to inform closure-representation decisions. I will show an efficient implementation of 0-CFA and describe how we can use this information to improve two aspects of closure optimization: sharing heap-allocated tuples of variables and spreading variables into argument registers. The new closure converter produces better code on average than the Shao and Appel converter, with significant performance gains for some benchmarks.

Joint work with my advisor John Reppy (University of Chicago).

12:10 PM Vellvm: Low Level Memory Versus Optimizations Calvin Beck
University of Pennsylvania
Abstract

We ask so much of LLVM. We expect to be able to write efficient programs with low-level operations and direct access to memory, but we also expect LLVM to provide us with a bounty of optimizations to make our low-level code even more efficient! Unfortunately, these desires are often incompatible and lead to vastly more complicated language semantics than you might expect. This talk will explore these complications and the interplay between low-level operations and optimizations. We'll talk about how memory may not be the simple key-value store you'd expect when you want your compiler to be able to perform optimizations, and how we're resigned to a world with undefined behaviour if we want our compilers to do what little magic they can do! These semantic edge cases are unignorable in the context of the verified optimization passes we would like to create in the Vellvm project, as Rocq does not let us lie.
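
As a toy version of the point about memory (ours, not Vellvm's semantics): once a load from an unwritten address yields an undefined value rather than a concrete one, memory is no longer the simple key-value store of integers you might expect, and the compiler gains freedom to simplify code involving such loads.

    module Mem = Map.Make (Int)

    type value = Int of int | Undef

    let store mem addr v = Mem.add addr v mem

    let load mem addr =
      match Mem.find_opt addr mem with
      | Some v -> v
      | None -> Undef    (* uninitialized: any value may be observed *)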

12:30 PM Lunch
1:50 PM Typing Laziness (Co)-Effectively Daniel Sainati
University of Pennsylvania
Abstract

In non-strict languages like Haskell, a variable can be used either strictly (its value is definitely scrutinized during execution) or lazily (it might not be inspected on all execution paths). Optimizing compilers like GHC use strictness analysis to detect these different kinds of usage and pre-evaluate strictly used arguments to functions, often with significant performance impact. However, strictness analysis is internal to the compiler and can be difficult for programmers to predict and reason about.
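
To make the distinction concrete (an OCaml illustration with explicit thunks; the talk's setting is Haskell, where bindings are implicitly lazy):

    (* Strict: [x] is forced on every execution path, so a caller could
       safely pre-evaluate the argument. *)
    let strict_use (x : int Lazy.t) = Lazy.force x + 1

    (* Lazy: [x] is forced on only one branch; pre-evaluating it could
       turn a terminating program into a diverging one. *)
    let lazy_use b (x : int Lazy.t) = if b then Lazy.force x else 0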

In this talk, we outline how coeffects can be used to reformulate a variant of strictness analysis as a user-facing type system. We explain why controlling strictness with a type system is of value to programmers, and why the obvious approaches to this problem do not work. Along the way we examine what effects and coeffects are, how they differ, and the ways in which our new type system blurs the line between them.

2:10 PM PDL: a new language to program generative AI applications Louis Mandel
IBM Research
Abstract

The rapid growth of prompt and agent programming has led to a proliferation of frameworks such as AutoGen, CrewAI, and Bee. While these frameworks help developers get started with generative AI, they introduce an abstraction layer that obscures the underlying prompts. This makes it difficult to identify and fix errors and to optimize model accuracy, since prompt engineering is central to building generative AI applications. To overcome this limitation, we introduce PDL, a novel declarative programming language that makes interacting with Large Language Models (LLMs) easy but leaves full control to the developers. Our runtime automatically parallelizes LLM calls, uses types for structured decoding, and, through a source-to-source transformation, enables a graphical user interface for intuitive program understanding and debugging. By empowering developers with full control and visibility, PDL unlocks new possibilities for building accurate, efficient, and reliable generative AI applications. In this talk, we will present both the design and implementation of the language.

2:30 PM Verification of Higher-order Flow-directed Program Transformations Benjamin Quiring
University of Maryland
Abstract

Compilers typically perform optimizations such as dead-parameter elimination, uncurrying, and arity-raising only on known functions: functions that are only applied, never passed as an argument to a higher-order function or put inside a data structure. To do better, compilers can perform flow analysis to obtain semantic information about how the program executes, and then use this information to perform more optimizations. In particular, many of these classical optimizations can be made "flow-directed," working on higher-order functions whose behavior isn't known. In this talk, I'll demonstrate why this approach is so broadly applicable, in a way that aids the verification of these optimizations.

Mathematically, classical optimizations can typically be decomposed into a sequence of transformations, starting with the worker-wrapper transformation. This decomposition aids in verification, since each transformation can be verified independently and is often used for multiple optimizations. To obtain a flow-directed optimization, just a single step in the decomposition needs to be swapped out, replaced by one that uses the results of the flow analysis. This step is independent of the specific optimization, and so any transformation that uses worker-wrapper can be made flow-directed. The independence also means that we can verify the step once, reusing it to verify all flow-directed passes.
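
A schematic worker/wrapper split for dead-parameter elimination, as we understand the recipe (not the paper's formal development):

    (* Original: the second parameter is dead. *)
    let f x _unused = x + 1

    (* Worker: the optimized body, with the dead parameter removed. *)
    let f_worker x = x + 1

    (* Wrapper: preserves the original interface and forwards to the
       worker; inlining it at call sites erases the dead argument. *)
    let f_wrapper x _unused = f_worker x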

2:50 PM Giving types to jq Alperen Keles
University of Maryland
Abstract

jq is a popular tool, equipped with a domain-specific language, for processing JSON. The jq language is a dynamically typed, lazily evaluated functional programming language. A notable shortcoming of jq is its error messages, which are cryptic and hard to decode, especially for novice jq users. We propose tjq, a static type system for jq that requires no type annotations, and improves the quality of the error messages provided to users. This talk will focus on the design of the tjq type system, as well as the random testing infrastructure designed for its validation with respect to jq semantics.
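
As a small illustration of the problem (ours, not tjq itself): jq filters are essentially functions on JSON values, and indexing a number with a field name fails only at run time; a type system that distinguishes objects from numbers can reject such a program before it runs. A minimal OCaml sketch:

    type json =
      | Null | Num of float | Str of string
      | Arr of json list | Obj of (string * json) list

    (* The filter [.name]: well-defined on objects, a run-time error on
       anything else; tjq aims to rule out such errors statically. *)
    let field name = function
      | Obj kvs -> (try List.assoc name kvs with Not_found -> Null)
      | _ -> failwith (Printf.sprintf "Cannot index non-object with %S" name)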

3:10 PM Coffee Break
3:40 PM Indistinguishability Reasoning for Interaction Trees Justine Frank
University of Maryland
Abstract

Interaction trees are an increasingly popular way to define and reason about the semantics of non-terminating, effectful, first-order programming languages using layered monadic interpreters. They have a rich equational theory for reasoning up to strong or weak bisimulation. However, for some applications, like crash-recovery systems, bisimulation is too strong; the desired property is not that two computations behave the same, but that they are indistinguishable: they will never be observed behaving differently.
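
A rough OCaml rendering of the datatype (in Rocq, interaction trees are coinductive and the event type is indexed by its answer type; here laziness stands in for coinduction and the indexing is flattened):

    type ('e, 'a, 'r) itree =
      | Ret of 'r                                 (* computation finished *)
      | Tau of ('e, 'a, 'r) itree Lazy.t          (* one silent step *)
      | Vis of 'e * ('a -> ('e, 'a, 'r) itree)    (* event, then continue *)

    (* A computation that takes silent steps forever: eutt treats only
       finite runs of taus as invisible, whereas tutt may remove
       infinitely many from either side. *)
    let rec spin : ('e, 'a, 'r) itree = Tau (lazy spin)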

To enable reasoning about indistinguishability, we develop tolerance-up-to-taus (tutt), an extension of equivalence-up-to-taus (eutt) that allows removing an infinite number of taus from either side. The tutt relation thus occupies a middle ground between termination-sensitive and termination-insensitive reasoning, permitting reasoning about computations that do not terminate yet continuously perform observable effects. A key challenge of working with tutt is that, unlike eutt, the relation is not transitive; however, we show that it is compatible with existing ITree "up-to" reasoning, allowing rewriting by bisimulation equivalences while proving tutt, and that it supports reasoning up to bind. Finally, we showcase the usefulness of tutt by demonstrating a proof of indistinguishability of crash-recovering executions of a program.

4:00 PM What compiler writers should know about providing run-time support for debugging, debuggers, and decompilers Rocky Bernstein
gnu.org
Abstract

In current compiler books, not much is written about

  • the information that is needed to provide good run-time error information, or
  • the information that a debugger needs to debug the compiled program.
University courses on compilers don't cover these areas, either.

I will outline what information compilers should provide to facilitate the use of debuggers and good run-time error reporting. If there is interest, I can also briefly describe how debuggers and decompilers work. For decompilers, I emphasize the relation between compilation and decompilation, and the dichotomy between general-purpose and special-purpose approaches to decompilation.

4:20 PM Nullability Type Inference via Machine Learning Kazi Amanul Islam Siddiqui
New Jersey Institute of Technology
Abstract

Null-pointer dereferences are a common source of software bugs in Java. Pluggable type systems such as NullAway provide a static, compile-time safety net against these errors. However, these systems require annotating legacy codebases with qualifiers (e.g., @Nullable), a process that is both time-consuming and error-prone. While existing type inference engines can reduce some of the manual effort, they typically rely on specialized constraint-solving and require deep integration with each checker, making them cumbersome to apply and reuse. Our work addresses this challenge by developing a machine-learning-based approach to infer pluggable type qualifiers, focusing on Java nullability as a feasibility study.

4:40 PM A Rose Tree is Blooming (Proof Pearl) Joomy Korkut
Bloomberg L.P.
Abstract

Game trees are a fundamental mathematical abstraction for analyzing games, often implemented as rose trees in functional programming. We can construct rose trees by starting from an initial state and iteratively applying the function that provides the states one move away, an approach known as an anamorphism (or colloquially, an unfold).
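
In OCaml-like notation (our sketch; the talk's development is in the Rocq Prover), the type and its anamorphism look like this, where next returns the label of the current state together with the states one move away:

    type 'a rose = Node of 'a * 'a rose list

    (* Anamorphism: grow a game tree from a seed. In OCaml this is
       partial: it terminates exactly when the game is finite, the
       tension between the inductive and coinductive versions below. *)
    let rec unfold (next : 's -> 'a * 's list) (seed : 's) : 'a rose =
      let label, moves = next seed in
      Node (label, List.map (unfold next) moves)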

In ML or Haskell, finite and infinite rose trees share the same type, but proof assistants typically distinguish them: finite trees are inductive, infinite ones coinductive. With the inductive approach, defining the unfold function is tricky, but we obtain a finite tree that is easy to compute with and easy to reason about. With the coinductive approach, defining the unfold function is easy, but we obtain a potentially infinite tree, which is harder to reason about since we must resort to proof techniques like bisimulation. In this talk, we compare the inductive and coinductive approaches to game trees, implement unfold functions for both, and prove the soundness and completeness of both unfold functions (with respect to the game) in the Rocq Prover.

5:00 PM Closing Remarks