Previously: Modeling Identity Types.

On first viewing, the identity type seems odd. Does it make sense to replace the traditional yes/no equality predicate with an elaborate type of equality proofs? In fact the father of modern type theory, Per Martin-Löf, had his doubts, and initially tried to reflect all identity proofs into more basic judgmental, or definitional, equalities. This turned out to be too restrictive for many applications.

Still, some modern theorem provers, like Lean, impose the uniqueness of identity proofs (UIP) condition as an axiom. This so-called axiom K states that reflexivity:

\text{refl} (a) : \text{Id}_A (a, a)

is the only possible proof of equality: two things are equal if they are the same thing; otherwise they’re not. This works for most common data types.
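In Lean, UIP does not even need to be postulated: equality proofs live in Prop, which Lean treats as definitionally proof-irrelevant, so any two proofs of the same equality agree by computation. A minimal sketch in Lean 4 syntax:

```lean
-- Any two proofs of the same equality are themselves equal.
-- `rfl` typechecks because `a = b` lives in `Prop`, which is
-- definitionally proof-irrelevant in Lean.
example {A : Type} {a b : A} (p q : a = b) : p = q := rfl
```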

Universes

There is, however, a much more interesting use case for non-trivial identity types. This happens when we apply equality to types themselves.

Since type theory is all about types, the question naturally arises: what is the type of a type? In Haskell, we call it a kind and denote it Type (or *, in legacy code).

In type theory we say that types form a universe. In Agda, the lowest such universe, U_0, is called Type l-zero (level zero), often abbreviated to Type. Of course, this raises the next question: what is the type of Type? Well, Type is a member of the next universe U_1, and so on, ad infinitum.
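As an aside, GHC Haskell exposes the kind Type from Data.Kind. Unlike Agda's infinite tower of levels, GHC collapses the hierarchy: in GHC, Type :: Type. A small kind-annotated sketch (the names Pair and first are mine, just for illustration):

```haskell
{-# LANGUAGE KindSignatures #-}

import Data.Kind (Type)

-- Both parameters of Pair are drawn from the universe Type.
-- (In GHC, Type :: Type, so there is no tower of universes
-- as there is in Agda.)
data Pair (a :: Type) (b :: Type) = MkPair a b

first :: Pair a b -> a
first (MkPair x _) = x
```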

But if Type is a type then, like all types, it must be equipped with its own identity type. In other words, we should be able to compare types for equality. The most obvious choice would be to say that two types are equal if they are isomorphic. We define an isomorphism as a pair of functions that are inverses of each other. This makes perfect sense for sets, but we don’t want to treat types as sets. Types have more structure: every type comes equipped with its own identity type.

Equivalences

We’ve seen before that to model identity types we may look at homotopy theory. We model types as topological spaces and equality as continuous paths between points. The notion of isomorphism is too strong for comparing such spaces. What we need is an equivalence, which is an isomorphism modulo equality.

Indeed, we don’t need the round trip to land us exactly where we started — it’s enough that we land at a point that is equal to the original point. Here, equal means: there is a path that connects these two points. And once we have loosened the definition of what an inverse is, there is no reason to insist that the right inverse be the same as the left inverse. We can as well use two different functions, g and h.

Thus a function f \colon A \to B is called an equivalence iff:

  • there exists a function g \colon B \to A such that f \circ g \sim id_B, that is, \forall b \colon B .\; f (g\, b) =_B b
  • there exists a function h \colon B \to A such that h \circ f \sim id_A, that is, \forall a \colon A .\; h (f\, a) =_A a.

If such functions exist, we say that A is equivalent to B, A \simeq B.
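Haskell cannot carry identity-type proofs at runtime, but we can sketch the shape of an equivalence and, for finite types, test the two round-trip conditions pointwise. The names Equiv, isEquivOn, and notEquiv below are mine, not standard library functions:

```haskell
-- An equivalence f : A -> B with a right inverse g and a
-- (possibly different) left inverse h.
data Equiv a b = Equiv
  { to    :: a -> b  -- f
  , fromR :: b -> a  -- g, with f . g ~ id_B
  , fromL :: b -> a  -- h, with h . f ~ id_A
  }

-- For finite types, check both round-trip conditions pointwise.
isEquivOn :: (Eq a, Eq b) => [a] -> [b] -> Equiv a b -> Bool
isEquivOn as bs e =
  all (\b -> to e (fromR e b) == b) bs &&
  all (\a -> fromL e (to e a) == a) as

-- 'not' is a self-equivalence of Bool distinct from 'id'.
notEquiv :: Equiv Bool Bool
notEquiv = Equiv not not not
```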

Univalence

Equivalence is the only relation between types that behaves exactly like equality. Can we use it to define equality of types? This is what Vladimir Voevodsky postulated as the axiom of univalence.

For any A and B in a universe U, we postulate that:

(A =_U B) \simeq (A \simeq B)

in other words, equality is equivalent to equivalence.

Equivalences are much easier to prove than equalities. However, without univalence, they don’t satisfy the rule of substitution of equals for equals. This rule, formulated by Leibniz, reads: “Two terms are the same if one can be substituted for the other without altering the truth of any statement.”

In Martin-Löf type theory it’s impossible to define a predicate that would distinguish between equivalent types. This is why extending type theory with the axiom of univalence makes perfect sense.

Mathematicians often apply the substitution rule to terms that are isomorphic rather than equal, essentially leaving the proof of the validity of the substitution as an “exercise for the reader.” Univalence legitimizes this practice and leads to substantial simplification of many mathematical proofs.
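Leibniz’s principle can itself be written down as a type: a equals b precisely when a can be substituted for b in every context. The name Leibniz is conventional in the literature, though this sketch is not a particular library’s type:

```haskell
{-# LANGUAGE RankNTypes #-}

-- Two types are Leibniz-equal if one can be substituted for the
-- other in any context f without altering anything.
newtype Leibniz a b = Leibniz { subst :: forall f. f a -> f b }

-- Reflexivity: a type is substitutable for itself.
reflL :: Leibniz a a
reflL = Leibniz id
```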

However, just like with isomorphisms, there may be several equivalences between two types. In fact, a type can be equivalent to itself in more than one way. For instance, not is a non-trivial equivalence for Bool — it’s clearly different from id. By univalence, such different equivalences introduce different equalities. Uniqueness of identity proofs is thus incompatible with univalence.

Higher Inductive Types

If identity is a type like any other, then we may use it to define types that contain their own proofs of identity. Such types are called higher inductive types. The simplest example is the circle, which you may think of as a singleton \text{base} plus a proof \text{loop} that \text{base} is equal to itself. In Agda (in its cubical mode, which supports higher inductive types), this translates to:

data S¹ : Type where
  base : S¹
  loop : base ≡ base

Here, base ≡ base denotes the type \text{Id}_{S^1} (\text{base}, \text{base}).

This definition is somewhat similar to, say, the definition of a boolean type:

data Bool : Type where
  true  : Bool
  false : Bool

except that, in the definition of the circle, the type of the second constructor depends on the value of the first. This definition explicitly introduces an element of the identity type, \text{loop}, that is not \text{refl}. It is therefore incompatible with axiom K.

Higher inductive types can be used to prove theorems in homotopy theory. Such proofs would normally involve topology, but in HoTT they are just statements about types. They can be formalized using theorem provers like Agda.

I’m grateful to Thorsten Altenkirch for valuable comments on the draft of this post.

Previously: Identity Types.

Let me first explain why the naive categorical model of dependent types doesn’t work very well for identity types. The problem is that, in such a model, any arrow can be considered a fibration, and therefore interpreted as a dependent type. In our definition of path induction, with an arrow d that makes the outer square commute, there are no additional constraints on \rho:

So we are free to substitute \text{refl} \colon A \to \text{Id}_A for \rho, and use the identity for d:

We are guaranteed that there is a unique J that makes the two triangles commute. But these commutation conditions tell us that J is the inverse of \text{refl} and, consequently, \text{Id}_A is isomorphic to A. It means that there are no paths in \text{Id}_A other than those given by reflexivity.

This is not satisfactory from the point of view of homotopy type theory, since the existence of non-trivial paths is necessary to make sense of things like higher inductive types (HITs) or univalence. If we want to model dependent types as fibrations, we cannot allow \text{refl} to be a fibration. Thus we need a more restrictive model in which not all mappings are fibrations.

The solution is to apply the lessons of topology, in particular homotopy theory. Recall that maps between topological spaces can be classified as fibrations, cofibrations, and homotopy equivalences. These three classes usually overlap.

A fibration that is also an equivalence is called a trivial fibration or an acyclic fibration. Similarly, we have trivial (acyclic) cofibrations. We’ve also seen that, roughly speaking, cofibrations generalize injections. So, if we want a non-trivial model of identity types, it makes sense to model \text{refl} as a cofibration. We want different values a \colon A to be mapped to different “paths” \text{refl}(a) \colon \text{Id}_A, and we’d also like to have some non-trivial paths left over to play with.

We know what a path is in a topological space A: it’s a continuous mapping from a unit interval, \gamma \colon I \to A. In a cartesian category with weak equivalences, we can use an abstract definition of the object representing the space of all paths in A. A path space object P_A is an object that factorizes the diagonal morphism \Delta = \pi \circ s, such that s is a weak equivalence.
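In symbols, the path space object factorizes the diagonal as:

A \xrightarrow{s} P_A \xrightarrow{\pi} A \times A, \qquad \Delta = \pi \circ s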

In a category that supports a factorization system, we can tighten this definition by requiring that \pi be a fibration, defining a good path object; and that s also be a cofibration (and therefore a trivial cofibration), defining a very good path object.

The intuition behind this diagram is that \pi produces the start and the end of a path, and s sends a point to a constant path.

Compare this with the diagram that illustrates the introduction rule for the identity type:

We want \text{refl} to be a trivial cofibration, that is, a cofibration that is also a weak homotopy equivalence. Weak equivalence lets us shrink all paths. Imagine a path as a rubber band. We fix one end of the path and let the other end slide all the way back until the path becomes constant. (In the compact-open topology of P_A, this works also for closed loops.)

Thus the existence of very good path objects takes care of the introduction rule for identity types.

The elimination rule is harder to wrap our minds around. How is it possible that any mapping that is defined on trivial paths, which are given by \text{refl} over the diagonal of A \times A, can be uniquely extended to all paths?

We might take a hint from complex analysis, where we can uniquely define the analytic continuation of a real function.

In topology, our continuations are restricted by continuity. When you are lifting a path in A, you might not have much choice as to which value of type D(a) to pick next. In fact, there might be only one value available to you — the “closest one” in the direction you’re going. To choose any other value, you’d have to “jump,” which would break continuity.

You can visualize building J(a, b, p) by growing it from its initial value J(a, a, \text{refl}(a)) = d(a). You gradually extend it above the path p until you reach the point directly above b. (In fact, this is the rough idea behind cubical type theory.)

A more abstract way of putting it is that identity types are path spaces in some factorization system, and dependent types over them are fibrations that satisfy the lifting property. Any path from a to b in \text{Id}_A has a unique lifting J in D that starts at d(a).

This is exactly the property illustrated by the lifting diagram, in which paths are elements of the identity type \text{Id}_A:

This time \text{refl} is not a fibration.

The general framework in which to build models of homotopy type theory is called a Quillen model category; it was originally used to model homotopies of topological spaces. It can be described as a weak factorization system in which any morphism can be written as a trivial cofibration followed by a fibration. It must satisfy the lifting property for any square in which i is a cofibration, p is a fibration, and one of them is a weak equivalence:

The full definition of a model category has more moving parts, but this is the gist of it.

We model dependent types as fibrations in a model category. We construct path objects as factorizations of the diagonal, and this lets us define non-trivial identity types.

One of the consequences of the fact that there could be multiple identity paths between two points is that we can, in turn, compare these paths for equality. There is, for instance, an identity type of identity types \text{Id}_{\text{Id}_A(a, b)}. Such iterated types correspond to higher homotopies: paths between paths, and so on, ad infinitum. The structure that arises is called a weak infinity groupoid. It’s a category in which every morphism has an inverse (just follow the same path backwards), and the category laws (identity and associativity) are satisfied up to higher morphisms (higher homotopies).

Previously: Models of (Dependent) Type Theory.

There is a deep connection between mathematics and programming. Computer programs deal with such mathematical objects as numbers, vectors, monoids, functors, algebras, and many more. We can implement such structures in most programming languages. For instance, here’s the definition of natural numbers in Haskell:

data Nat where
  Z :: Nat           -- zero
  S :: Nat -> Nat    -- successor

There is a problem, though, with encoding the laws that they are supposed to obey. These laws are expressed using equalities. But not all equalities are created equal. Take, for instance, this definition of addition:

plus :: Nat -> Nat -> Nat
plus Z n = n
plus (S n) m = S (plus n m)

The first clause tells us that 0 + n is equal to n. This is called definitional equality, since it’s just part of the definition. However, if we reverse the order of the addends and try to prove n + 0 = n, there is no obvious shortcut. We have to prove it the hard way! This second type of equality is called propositional.
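The two flavors of equality can be reproduced at Haskell’s type level: with a type-level Plus, the proof of 0 + n = n is immediate (Refl typechecks by computation), while n + 0 = n requires induction on n. A sketch, assuming GHC’s DataKinds and the standard :~: equality witness (the names SNat, zeroPlus, and plusZero are mine):

```haskell
{-# LANGUAGE DataKinds, GADTs, TypeFamilies, TypeOperators #-}

import Data.Type.Equality ((:~:) (..))

data Nat = Z | S Nat

-- Runtime witnesses of type-level naturals.
data SNat n where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

type family Plus (n :: Nat) (m :: Nat) :: Nat where
  Plus 'Z     m = m
  Plus ('S n) m = 'S (Plus n m)

-- Definitional: Plus 'Z n reduces to n, so Refl typechecks directly.
zeroPlus :: SNat n -> Plus 'Z n :~: n
zeroPlus _ = Refl

-- Propositional: we must prove it the hard way, by induction on n.
plusZero :: SNat n -> Plus n 'Z :~: n
plusZero SZ     = Refl
plusZero (SS n) = case plusZero n of Refl -> Refl
```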

In traditional mathematics equality is treated as a relation. Two things are either equal or not. In type theory, on the other hand, equality is a type. It’s a type of proofs of equality. When you postulate that two things are equal, you specify the type of proofs that you are willing to accept.

Equality is a dependent type, because it depends on the values that are being compared. For a pair of values of the type A, the identity (or equality) type is denoted by \text{Id}_A:

\frac{a, b \colon A}{\text{Id}_A (a, b) \colon \text{Type}}

(In this notation, assumptions are listed above the horizontal line.)

You’ll also see \text{Id}_A(a, b) written as a =_A b, or even a = b, if the type A is obvious from the context.

When two values are not equal, this type is uninhabited. Otherwise it’s inhabited by proofs, or witnesses, of equality. This is consistent with the propositions as types interpretation of type theory.

For the identity type to be useful, we have to have a way to populate it with values — proofs of equality. We do that by providing an introduction rule:

\frac{a \colon A}{\text{refl}(a) \colon \text{Id}_A(a, a)}

The constructor \text{refl} generates a witness to the obvious fact that, for every a, a = a. Here, \text{refl} is the abbreviation of reflexivity.

At first sight it doesn’t seem like we have gained anything by turning equality into a type. Indeed, if \text{refl} were the only inhabitant of \text{Id}_A, we’d just buy ourselves a lot of headaches for nothing. The trouble is that, if there is a type \text{Id}_A(a, b), then there’s also a type \text{Id}_{\text{Id}_A(a,b)} (type of equality of proofs of equality), and so on, ad infinitum. This used to be a bane of Martin-Löf type theory, but it became a bounty for Homotopy Type Theory. So let’s imagine that the identity type may have non-trivial inhabitants. We’ll call these inhabitants paths. The trivial paths generated by \text{refl} are degenerate do-nothing loops.

But first, let’s consider the identity elimination rule: the mapping out of the identity type. What data do we need to provide in order to define a function from a path p \colon \text{Id}_A(x, y) to some type D? In full generality, D should be a dependent type family parameterized by the path p as well as its endpoints x and y:

x, y \colon A, \; p \colon \text{Id}_A (x, y) \vdash D(x, y, p) \colon \text{Type}

Our goal is to define a D-valued mapping J that works for any path in A:

x, y \colon A, \; p \colon \text{Id}_A(x, y) \vdash J(x, y, p) \colon D(x, y, p)

The trick is to realize that there is only one family of constructors of the identity type, \text{refl}(a), so it’s enough to specify how J is supposed to act on it. We do this by providing a dependent function d (a dependent function is a function whose return type depends on the value of its argument):

a \colon A \vdash d(a) \colon D(a, a, \text{refl}(a))

This is very similar to constructing a mapping out of a Boolean by specifying its action on the two constructors, \text{True} and \text{False}. Except that here we have a whole family of constructors \text{refl}(a) parameterized by a.

This is enough information to uniquely determine the action of J on any path, provided it agrees with d along \text{refl}:

J(a, a, \text{refl}(a)) = d(a)

This procedure for defining mappings out of identity types is called path induction.
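Haskell’s propositional equality a :~: b is a degenerate identity type whose only constructor is Refl, so pattern matching on Refl plays the role of path induction in this setting. Transport, symmetry, and transitivity all fall out of it; a sketch (symm and trans are my names, chosen to avoid clashes):

```haskell
{-# LANGUAGE GADTs, TypeOperators #-}

import Data.Type.Equality ((:~:) (..))

-- Transport: substitute equals for equals in any context f.
transport :: a :~: b -> f a -> f b
transport Refl x = x

-- Following a path backwards.
symm :: a :~: b -> b :~: a
symm Refl = Refl

-- Path composition.
trans :: a :~: b -> b :~: c -> a :~: c
trans Refl Refl = Refl
```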

Categorical Model

We’ll build a categorical interpretation of identity types making as few assumptions as possible. We’ll model \text{Id}_A as a fibration:

\text{Id}_A \xrightarrow{\pi} A \times A

The introduction rule asserts the existence of the arrow:

A \xrightarrow{\text{refl}} \text{Id}_A

For any given a \colon A, this arrow selects a diagonal element of the identity type, \text{Id}_A(a, a). When projected down to A \times A using \pi, it lands on the diagonal of A \times A. In a cartesian category, the diagonal is defined by the arrow \Delta \colon A \to A \times A, so we have:

\Delta = \pi \circ \text{refl}

In other words, there is a factorization of the diagonal arrow.

The elimination rule starts with defining a dependent type D, which can be modeled as a fibration \rho:

D \xrightarrow{\rho} \text{Id}_A \xrightarrow{\pi} A \times A

We provide a dependent function d, which is interpreted as an arrow d \colon A \to D that maps a to an element d(a) of D(a, a, \text{refl}(a)). So, when projected back to \text{Id}_A using \rho, it produces the same element as \text{refl}(a):

\rho \circ d = \text{refl}

Given these assumptions, there exists a mapping:

x, y \colon A, p \colon \text{Id}_A(x, y) \vdash J(x, y, p) \colon D(x, y, p)

which is interpreted as a section \text{Id}_A \to D. The section condition is:

\rho \circ J = id_{\text{Id}_A}

This mapping is unique if we assume that it agrees with d when restricted to \text{refl}:

J \circ \text{refl} = d

Consequently, we can illustrate the path induction rule using a single commuting diagram:

If the outer square commutes then there is a unique diagonal arrow that makes the two triangles commute. The similarity to the lifting property of homotopy theory is not accidental, as we’ll see next.

Next: Modeling Identity Types.