Expression Problem

The Expression Problem1 is a new name for an old problem. The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety

  • Phillip Wadler

One can also think of this as a matter of expressing cross-cutting concerns without conflating implementation details.2 That is, expressing the set of possible manifestations of a certain type shouldn’t be necessarily tied to expressing the set of possible operations on a type.

To rephrase:

“Can we add both rows and columns to this matrix without changing any existing code or sacrificing type safety?”

operation 1operation 2
case 1impl c1+o1impl c1+o2
case 2impl c2+o1impl c2+o2

The key requirements here are:

  1. You have a datatype defined by cases.

    eg (for Haskell)

    data Type = Type -- "row"
    a :: Type -> _ -- "column"
    

    or (for Java)

    public class Type[[[T]]] { // "row"
      public T a(); // "column"
    }
    
  2. You want to add new functionality to the program in both directions

    • Adding new cases to the datatype. This is trivial with Java (with inheritance), but not possible naively in Haskell.
    • Adding new functions which work with the datatype. Conversely, trivial in Haskell, but not in Java.
  3. Crucially, the two constraints are that the extension can happen

    • without modifying existing code
    • without compromising static type safety

Interestingly, Java and Haskell both have ways of solving the problem, but they’re much clunkier than the normal way of dealing with things. So in a sense, if one wants to optimize for the least changing of legacy code when changing or extending it, it’s very helpful to think of how the codebase is expected to change.