John Li

I'm a PhD student of Amal Ahmed and Steven Holtzen at Northeastern University. I'm generally interested in programming language semantics, especially for languages with probabilistic features. Before starting my PhD, I built a verified-optimization-pass generator as part of the CertiCoq project. When I'm not thinking about research, I like learning math and playing bullet chess, violin, and ping pong.

Papers

Categorical Semantics of Probabilistic Symbolic Execution PLDI 2026
John M. Li, Jack Czenszak, Steven Holtzen.

Abstract DOI Local copy

Symbolic execution has emerged as a powerful technique for scaling exact probabilistic inference to languages with more expressive features. But this expressivity comes at a price: probabilistic programming languages based on symbolic execution are difficult to debug, optimize, and prove correct due to the many intricacies inherent to high-performance symbolic execution strategies. We aim to make it easier to work with probabilistic symbolic executors by developing symbolic sets, a new semantic domain that cleanly captures the notion of computation underlying symbolic execution. Just as a symbolic executor replaces ordinary execution with a lifted semantics, symbolic set theory replaces ordinary set theory with a lifted mathematics: the category of symbolic sets is a Grothendieck topos, which allows type theory to be used as a metalanguage for working with symbolic sets and functions. We prove a metatheorem that shows how a large class of definitional interpreters written in the internal language of symbolic sets are automatically correct for their ordinary set-theoretic interpretations. Using this metatheorem, we give the first full correctness argument for a symbolic probabilistic language with higher-order functions, type-directed state merging, pattern matching, and structural recursion.

From Linearity to Borrowing OOPSLA 2025
Andrew Wagner, Olek Gierczak, Brianna Marshall, John M. Li, Amal Ahmed.

Abstract DOI Local copy OOPSLA 2025 Distinguished Paper Award

Linear type systems are powerful because they can statically ensure the correct management of resources like memory, but they can also be cumbersome to work with, since even benign uses of a resource require that it be explicitly threaded through during computation. Borrowing, as popularized by Rust, reduces this burden by allowing one to temporarily disable certain resource permissions (e.g., deallocation or mutation) in exchange for enabling certain structural permissions (e.g., weakening or contraction). In particular, this mechanism spares the borrower of a resource from having to explicitly return it to the lender but nevertheless ensures that the lender eventually reclaims ownership of the resource.

In this paper, we elucidate the semantics of borrowing by starting with a standard linear type system for ensuring safe manual memory management in an untyped lambda calculus and gradually augmenting it with immutable borrows, lexical lifetimes, reborrowing, and finally mutable borrows. We prove semantic type soundness for our Borrow Calculus (BoCa) using Borrow Logic (BoLo), a novel domain-specific separation logic for borrowing. We establish the soundness of this logic using a semantic model that additionally guarantees that our calculus is terminating and free of memory leaks. We also show that our Borrow Logic is robust enough to establish the semantic safety of some syntactically ill-typed programs that temporarily break but reestablish invariants.

Multi-Language Probabilistic Programming OOPSLA 2025
Sam Stites, John M. Li, and Steven Holtzen.

Abstract arXiv DOI Local copy

There are many different probabilistic programming languages that are specialized to specific kinds of probabilistic programs. From a usability and scalability perspective, this is undesirable: today, probabilistic programmers are forced up-front to decide which language they want to use and cannot mix-and-match different languages for handling heterogeneous programs. To rectify this, we seek a foundation for sound interoperability for probabilistic programming languages: just as today's Python programmers can resort to low-level C programming for performance, we argue that probabilistic programmers should be able to freely mix different languages for meeting the demands of heterogeneous probabilistic programming environments. As a first step towards this goal, we introduce MultiPPL, a probabilistic multi-language that enables programmers to interoperate between two different probabilistic programming languages: one that leverages a high-performance exact discrete inference strategy, and one that uses approximate importance sampling. We give a syntax and semantics for MultiPPL, prove soundness of its inference algorithm, and provide empirical evidence that it enables programmers to perform inference on complex heterogeneous probabilistic programs and flexibly exploits the strengths and weaknesses of two languages simultaneously.

Roulette: A Language for Expressive, Exact, and Efficient Discrete Probabilistic Programming PLDI 2025
Cameron Moy, Jack Czenszak, John M. Li, Brianna Marshall, and Steven Holtzen.

Abstract DOI Local copy

Exact probabilistic inference is a requirement for many applications of probabilistic programming languages (PPLs) such as in high-consequence settings or verification. However, designing and implementing a PPL with scalable high-performance exact inference is difficult: exact inference engines, much like SAT solvers, are intricate low-level programs that are hard to implement. Due to this implementation challenge, PPLs that support scalable exact inference are restrictive and lack many features of general-purpose languages.

This paper presents Roulette, the first discrete probabilistic programming language that combines high-performance exact inference with general-purpose language features. Roulette supports a significant subset of Racket, including data structures, first-class functions, surely-terminating recursion, mutable state, modules, and macros, along with probabilistic features such as finitely supported discrete random variables, conditioning, and top-level inference. The key insight is that there is a close connection between exact probabilistic inference and the symbolic evaluation strategy of Rosette. Building on this connection, Roulette generalizes and extends the Rosette solver-aided programming system to reason about probabilistic rather than symbolic quantities. We prove Roulette sound by generalizing a proof of correctness for Rosette to handle probabilities, and demonstrate its scalability and expressivity on a number of examples.

A Nominal Approach to Probabilistic Separation Logic LICS 2024
John M. Li, Jon Aytac, Philip Johnson-Freyd, Amal Ahmed, and Steven Holtzen.

Abstract Slides arXiv DOI Local copy

Currently, there is a gap between the tools used by probability theorists and those used in formal reasoning about probabilistic programs. On the one hand, a probability theorist decomposes probabilistic state along the simple and natural product of probability spaces. On the other hand, recently developed probabilistic separation logics decompose state via relatively unfamiliar measure-theoretic constructions for computing unions of sigma-algebras and probability measures. We bridge the gap between these two perspectives by showing that these two methods of decomposition are equivalent up to a suitable equivalence of categories. Our main result is a probabilistic analog of the classic equivalence between the category of nominal sets and the Schanuel topos. Through this equivalence, we validate design decisions in prior work on probabilistic separation logic and create new connections to nominal-set-like models of probability.

Lilac: A Modal Separation Logic for Conditional Probability PLDI 2023
John M. Li, Amal Ahmed, and Steven Holtzen.

Abstract Slides arXiv DOI Local copy

We present Lilac, a separation logic for reasoning about probabilistic programs where separating conjunction captures probabilistic independence. Inspired by an analogy with mutable state where sampling corresponds to dynamic allocation, we show how probability spaces over a fixed, ambient sample space appear to be the natural analogue of heap fragments, and present a new combining operation on them such that probability spaces behave like heaps and measurability of random variables behaves like ownership. This combining operation forms the basis for our model of separation, and produces a logic with many pleasant properties. In particular, Lilac has a frame rule identical to the ordinary one, and naturally accommodates advanced features like continuous random variables and reasoning about quantitative properties of programs. Then we propose a new modality based on disintegration theory for reasoning about conditional probability. We show how the resulting modal logic validates examples from prior work, and give a formal verification of an intricate weighted sampling algorithm whose correctness depends crucially on conditional independence structure.

Notes: Since publication, Jialu Bao has pointed out that the interpretation of almost-sure equality of random variables does not validate the expected substitution rule, due to a subtle point about negligibility. This issue is corrected in the local copy above and on arXiv; see Appendix B.5 for details. We did not attempt to fully mechanize Lilac, but we do have a mechanization of our main result that probability spaces form a Kripke resource monoid (Theorem 2.4 in the paper).

Deriving Efficient Program Transformations from Rewrite Rules ICFP 2021
John M. Li and Andrew W. Appel.

Abstract Slides DOI Local copy

An efficient optimizing compiler can perform many cascading rewrites in a single pass, using auxiliary data structures such as variable binding maps, delayed substitutions, and occurrence counts. Such optimizers often perform transformations according to relatively simple rewrite rules, but the subtle interactions between the data structures needed for efficiency make them tricky to write and trickier to prove correct. We present a system for semi-automatically deriving both an efficient program transformation and its correctness proof from a list of rewrite rules and specifications of the auxiliary data structures it requires. Dependent types ensure that the holes left behind by our system (for the user to fill in) are filled in correctly, allowing the user low-level control over the implementation without having to worry about getting it wrong. We implemented our system in Coq (though it could be implemented in other logics as well), and used it to write optimization passes that perform uncurrying, inlining, dead code elimination, and static evaluation of case expressions and record projections. The generated implementations are sometimes faster, and at most 40% slower, than hand-written counterparts on a small set of benchmarks; in some cases, they require significantly less code to write and prove correct.

Compositional Optimizations for CertiCoq ICFP 2021
Zoe Paraskevopoulou, John M. Li, and Andrew W. Appel.

Abstract DOI Local copy

Compositional compiler verification is a difficult problem that focuses on separate compilation of program components with possibly different verified compilers. Logical relations are widely used in proving correctness of program transformations in higher-order languages; however, they do not scale to compositional verification of multi-pass compilers due to their lack of transitivity. The only known technique to apply to compositional verification of multi-pass compilers for higher-order languages is parametric inter-language simulations (PILS), which is however significantly more complicated than traditional proof techniques for compiler correctness. In this paper, we present a novel verification framework for lightweight compositional compiler correctness. We demonstrate that by imposing the additional restriction that program components are compiled by pipelines that go through the same sequence of intermediate representations, logical relation proofs can be transitively composed in order to derive an end-to-end compositional specification for multi-pass compiler pipelines. Unlike traditional logical-relation frameworks, our framework supports divergence preservation, even when transformations reduce the number of program steps. We achieve this by parameterizing our logical relations with a pair of relational invariants.

We apply this technique to verify a multi-pass, optimizing middle-end pipeline for CertiCoq, a compiler from Gallina (Coq's specification language) to C. The pipeline optimizes and closure-converts an untyped functional intermediate language (ANF or CPS) to a subset of that language without nested functions, which can be easily code-generated to low-level languages. Notably, our pipeline performs more complex closure-allocation optimizations than the state of the art in verified compilation. Using our novel verification framework, we prove an end-to-end theorem for our pipeline that covers both termination and divergence and applies to whole-program and separate compilation, even when different modules are compiled with different optimizations. Our results are mechanized in the Coq proof assistant.

Extended abstracts

Substructural Weakest Preconditions in Fibrations HOPE @ ICFP 2025
John M. Li, Ryan Doenges, Pedro H. Azevedo de Amorim.
Extended Abstract Slides

Towards Symbolic Execution for Probability and Non-determinism LAFI @ POPL 2025
Jack Czenszak, John M. Li, Steven Holtzen.
Extended Abstract Poster

Towards a Categorical Model of the Lilac Separation Logic LAFI @ POPL 2024
John M. Li, Jon Aytac, Philip Johnson-Freyd, Amal Ahmed, and Steven Holtzen.
Extended Abstract Slides

New Foundations for Probabilistic Separation Logic LAFI @ POPL 2023
John M. Li, Amal Ahmed, and Steven Holtzen.
Extended Abstract Slides Poster

Service

LICS 2026: External Review

CPP 2026: External Review

LICS 2025: External Review

POPL 2025: External Review

LICS 2024: External Review