This is part of the book Category Theory for Programmers. The previous instalment was Category: The Essence of Composition. See the Table of Contents.
The category of types and functions plays an important role in programming, so let’s talk about what types are and why we need them.
Who Needs Types?
There seems to be some controversy about the advantages of static vs. dynamic and strong vs. weak typing. Let me illustrate these choices with a thought experiment. Imagine millions of monkeys at computer keyboards happily hitting random keys, producing programs, compiling, and running them.
With machine language, any combination of bytes produced by monkeys would be accepted and run. But with higher level languages, we do appreciate the fact that a compiler is able to detect lexical and grammatical errors. Lots of monkeys will go without bananas, but the remaining programs will have a better chance of being useful. Type checking provides yet another barrier against nonsensical programs. Moreover, whereas in a dynamically typed language, type mismatches would be discovered at runtime, in strongly typed statically checked languages type mismatches are discovered at compile time, eliminating lots of incorrect programs before they have a chance to run.
So the question is, do we want to make monkeys happy, or do we want to produce correct programs?
The usual goal in the typing monkeys thought experiment is the production of the complete works of Shakespeare. Having a spell checker and a grammar checker in the loop would drastically increase the odds. The analog of a type checker would go even further by making sure that, once Romeo is declared a human being, he doesn’t sprout leaves or trap photons in his powerful gravitational field.
Types Are About Composability
Category theory is about composing arrows. But not any two arrows can be composed. The target object of one arrow must be the same as the source source object of the next arrow. In programming we pass the results on one function to another. The program will not work if the target function is not able to correctly interpret the data produced by the source function. The two ends must fit for the composition to work. The stronger the type system of the language, the better this match can be described and mechanically verified.
The only serious argument I hear against strong static type checking is that it might eliminate some programs that are semantically correct. In practice, this happens extremely rarely and, in any case, every language provides some kind of a backdoor to bypass the type system when that’s really necessary. Even Haskell has unsafeCoerce
. But such devices should by used judiciously. Franz Kafka’s character, Gregor Samsa, breaks the type system when he metamorphoses into a giant bug, and we all know how it ends.
Another argument I hear a lot is that dealing with types imposes too much burden on the programmer. I could sympathize with this sentiment after having to write a few declarations of iterators in C++ myself, except that there is a technology called type inference that lets the compiler deduce most of the types from the context in which they are used. In C++, you can now declare a variable auto
and let the compiler figure out its type.
In Haskell, except on rare occasions, type annotations are purely optional. Programmers tend to use them anyway, because they can tell a lot about the semantics of code, and they make compilation errors easier to understand. It’s a common practice in Haskell to start a project by designing the types. Later, type annotations drive the implementation and become compilerenforced comments.
Strong static typing is often used as an excuse for not testing the code. You may sometimes hear Haskell programmers saying, “If it compiles, it must be correct.” Of course, there is no guarantee that a typecorrect program is correct in the sense of producing the right ouput. The result of this cavalier attitude is that in several studies Haskell didn’t come as strongly ahead of the pack in code quality as one would expect. It seems that, in the commercial setting, the pressure to fix bugs is applied only up to a certain quality level, which has everything to do with the economics of software development and the tolerance of the end user, and very little to do with the programming language or methodology. A better criterion would be to measure how many projects fall behind schedule or are delivered with drastically reduced functionality.
As for the argument that unit testing can replace strong typing, consider the common refactoring practice in strongly typed languages: changing the type of an argument of a particular function. In a strongly typed language, it’s enough to modify the declaration of that function and then fix all the build breaks. In a weakly typed language, the fact that a function now expects different data cannot be propagated to call sites. Unit testing may catch some of the mismatches, but testing is almost always a probabilistic rather than a deterministic process. Testing is a poor substitute for proof.
What Are Types?
The simplest intuition for types is that they are sets of values. The type Bool
(remember, concrete types start with a capital letter in Haskell) is a twoelement set of True
and False
. Type Char
is a set of all Unicode characters like 'a'
or 'ą'
.
Sets can be finite or infinite. The type of String
, which is a synonym for a list of Char
, is an example of an infinite set.
When we declare x
to be an Integer
:
x :: Integer
we are saying that it’s an element of the set of integers. Integer
in Haskell is an infinite set, and it can be used to do arbitrary precision arithmetic. There is also a finiteset Int
that corresponds to machine type, just like the C++ int
.
There are some subtleties that make this identification of types and sets tricky. There are problems with polymorphic functions that involve circular definitions, and with the fact that you can’t have a set of all sets; but as I promised, I won’t be a stickler for math. The great thing is that there is a category of sets, which is called Set, and we’ll just work with it. In Set, objects are sets and morphisms (arrows) are functions.
Set is a very special category, because we can actually peek inside its objects and get a lot of intuitions from doing that. For instance, we know that an empty set has no elements. We know that there are special oneelement sets. We know that functions map elements of one set to elements of another set. They can map two elements to one, but not one element to two. We know that an identity function maps each element of a set to itself, and so on. The plan is to gradually forget all this information and instead express all those notions in purely categorical terms, that is in terms of objects and arrows.
In the ideal world we would just say that Haskell types are sets and Haskell functions are mathematical functions between sets. There is just one little problem: A mathematical function does not execute any code — it just knows the answer. A Haskell function has to calculate the answer. It’s not a problem if the answer can be obtained in a finite number of steps — however big that number might be. But there are some calculations that involve recursion, and those might never terminate. We can’t just ban nonterminating functions from Haskell because distinguishing between terminating and nonterminating functions is undecidable — the famous halting problem. That’s why computer scientists came up with a brilliant idea, or a major hack, depending on your point of view, to extend every type by one more special value called the bottom and denoted by __
, or Unicode ⊥. This “value” corresponds to a nonterminating computation. So a function declared as:
f :: Bool > Bool
may return True
, False
, or __
; the latter meaning that it would never terminate.
Interestingly, once you accept the bottom as part of the type system, it is convenient to treat every runtime error as a bottom, and even allow functions to return the bottom explicitly. The latter is usually done using the expression undefined
, as in:
f :: Bool > Bool f x = undefined
This definition type checks because undefined
evaluates to bottom, which is a member of any type, including Bool
. You can even write:
f :: Bool > Bool f = undefined
(without the x
) because the bottom is also a member of the type Bool>Bool
.
Functions that may return bottom are called partial, as opposed to total functions, which return valid results for every possible argument.
Because of the bottom, you’ll see the category of Haskell types and functions referred to as Hask rather than Set. From the theoretical point of view, this is the source of neverending complications, so at this point I will use my butcher’s knife and terminate this line of reasoning. From the pragmatic point of view, it’s okay to ignore nonterminating functions and bottoms, and treat Hask as bona fide Set (see Bibliography at the end).
Why Do We Need a Mathematical Model?
As a programmer you are intimately familiar with the syntax and grammar of your programming language. These aspects of the language are usually described using formal notation at the very beginning of the language spec. But the meaning, or semantics, of the language is much harder to describe; it takes many more pages, is rarely formal enough, and almost never complete. Hence the never ending discussions among language lawyers, and a whole cottage industry of books dedicated to the exegesis of the finer points of language standards.
There are formal tools for describing the semantics of a language but, because of their complexity, they are mostly used with simplified academic languages, not reallife programming behemoths. One such tool called operational semantics describes the mechanics of program execution. It defines a formalized idealized interpreter. The semantics of industrial languages, such as C++, is usually described using informal operational reasoning, often in terms of an “abstract machine.”
The problem is that it’s very hard to prove things about programs using operational semantics. To show a property of a program you essentially have to “run it” through the idealized interpreter.
It doesn’t matter that programmers never perform formal proofs of correctness. We always “think” that we write correct programs. Nobody sits at the keyboard saying, “Oh, I’ll just throw a few lines of code and see what happens.” We think that the code we write will perform certain actions that will produce desired results. We are usually quite surprised when it doesn’t. That means we do reason about programs we write, and we usually do it by running an interpreter in our heads. It’s just really hard to keep track of all the variables. Computers are good at running programs — humans are not! If we were, we wouldn’t need computers.
But there is an alternative. It’s called denotational semantics and it’s based on math. In denotational semantics every programing construct is given its mathematical interpretation. With that, if you want to prove a property of a program, you just prove a mathematical theorem. You might think that theorem proving is hard, but the fact is that we humans have been building up mathematical methods for thousands of years, so there is a wealth of accumulated knowledge to tap into. Also, as compared to the kind of theorems that professional mathematicians prove, the problems that we encounter in programming are usually quite simple, if not trivial.
Consider the definition of a factorial function in Haskell, which is a language quite amenable to denotational semantics:
fact n = product [1..n]
The expression [1..n]
is a list of integers from 1 to n. The function product
multiplies all elements of a list. That’s just like a definition of factorial taken from a math text. Compare this with C:
int fact(int n) { int i; int result = 1; for (i = 2; i <= n; ++i) result *= i; return result; }
Need I say more?
Okay, I’ll be the first to admit that this was a cheap shot! A factorial function has an obvious mathematical denotation. An astute reader might ask: What’s the mathematical model for reading a character from the keyboard or sending a packet across the network? For the longest time that would have been an awkward question leading to a rather convoluted explanation. It seemed like denotational semantics wasn’t the best fit for a considerable number of important tasks that were essential for writing useful programs, and which could be easily tackled by operational semantics. The breakthrough came from category theory. Eugenio Moggi discovered that computational effect can be mapped to monads. This turned out to be an important observation that not only gave denotational semantics a new lease on life and made pure functional programs more usable, but also shed new light on traditional programming. I’ll talk about monads later, when we develop more categorical tools.
One of the important advantages of having a mathematical model for programming is that it’s possible to perform formal proofs of correctness of software. This might not seem so important when you’re writing consumer software, but there are areas of programming where the price of failure may be exorbitant, or where human life is at stake. But even when writing web applications for the health system, you may appreciate the thought that functions and algorithms from the Haskell standard library come with proofs of correctness.
Pure and Dirty Functions
The things we call functions in C++ or any other imperative language, are not the same things mathematicians call functions. A mathematical function is just a mapping of values to values.
We can implement a mathematical function in a programming language: Such a function, given an input value will calculate the output value. A function to produce a square of a number will probably multiply the input value by itself. It will do it every time it’s called, and it’s guaranteed to produce the same output every time it’s called with the same input. The square of a number doesn’t change with the phases of the Moon.
Also, calculating the square of a number should not have a side effect of dispensing a tasty treat for your dog. A “function” that does that cannot be easily modelled as a mathematical function.
In programming languages, functions that always produce the same result given the same input and have no side effects are called pure functions. In a pure functional language like Haskell all functions are pure. Because of that, it’s easier to give these languages denotational semantics and model them using category theory. As for other languages, it’s always possible to restrict yourself to a pure subset, or reason about side effects separately. Later we’ll see how monads let us model all kinds of effects using only pure functions. So we really don’t lose anything by restricting ourselves to mathematical functions.
Examples of Types
Once you realize that types are sets, you can think of some rather exotic types. For instance, what’s the type corresponding to an empty set? No, it’s not C++ void
, although this type is called Void
in Haskell. It’s a type that’s not inhabited by any values. You can define a function that takes Void
, but you can never call it. To call it, you would have to provide a value of the type Void
, and there just aren’t any. As for what this function can return, there are no restrictions whatsoever. It can return any type (although it never will, because it can’t be called). In other words it’s a function that’s polymorphic in the return type. Haskellers have a name for it:
absurd :: Void > a
(Remember, a
is a type variable that can stand for any type.) The name is not coincidental. There is deeper interpretation of types and functions in terms of logic called the CurryHoward isomorphism. The type Void
represents falsity, and the type of the function absurd
corresponds to the statement that from falsity follows anything, as in the Latin adage “ex falso sequitur quodlibet.”
Next is the type that corresponds to a singleton set. It’s a type that has only one possible value. This value just “is.” You might not immediately recognise it as such, but that is the C++ void
. Think of functions from and to this type. A function from void
can always be called. If it’s a pure function, it will always return the same result. Here’s an example of such a function:
int f44() { return 44; }
You might think of this function as taking “nothing”, but as we’ve just seen, a function that takes “nothing” can never be called because there is no value representing “nothing.” So what does this function take? Conceptually, it takes a dummy value of which there is only one instance ever, so we don’t have to mention it explicitly. In Haskell, however, there is a symbol for this value: an empty pair of parentheses, ()
. So, by a funny coincidence (or is it a coincidence?), the call to a function of void looks the same in C++ and in Haskell. Also, because of the Haskell’s love of terseness, the same symbol ()
is used for the type, the constructor, and the only value corresponding to a singleton set. So here’s this function in Haskell:
f44 :: () > Integer f44 () = 44
The first line declares that f44
takes the type ()
, pronounced “unit,” into the type Integer
. The second line defines f44
by pattern matching the only constructor for unit, namely ()
, and producing the number 44. You call this function by providing the unit value ()
:
f44 ()
Notice that every function of unit is equivalent to picking a single element from the target type (here, picking the Integer
44). In fact you could think of f44
as a different representation for the number 44. This is an example of how we can replace explicit mention of elements of a set by talking about functions (arrows) instead. Functions from unit to any type A are in onetoone correspondence with the elements of that set A.
What about functions with the void
return type, or, in Haskell, with the unit return type? In C++ such functions are used for side effects, but we know that these are not real functions in the mathematical sense of the word. A pure function that returns unit does nothing: it discards its argument.
Mathematically, a function from a set A to a singleton set maps every element of A to the single element of that singleton set. For every A there is exactly one such function. Here’s this function for Integer
:
fInt :: Integer > () fInt x = ()
You give it any integer, and it gives you back a unit. In the spirit of terseness, Haskell lets you use the wildcard pattern, the underscore, for an argument that is discarded. This way you don’t have to invent a name for it. So the above can be rewritten as:
fInt :: Integer > () fInt _ = ()
Notice that the implementation of this function not only doesn’t depend on the value passed to it, but it doesn’t even depend on the type of the argument.
Functions that can be implemented with the same formula for any type are called parametrically polymorphic. You can implement a whole family of such functions with one equation using a type parameter instead of a concrete type. What should we call a polymorphic function from any type to unit type? Of course we’ll call it unit
:
unit :: a > () unit _ = ()
In C++ you would write this function as:
template<class T> void unit(T) {}
Next in the typology of types is a twoelement set. In C++ it’s called bool
and in Haskell, predictably, Bool
. The difference is that in C++ bool
is a builtin type, whereas in Haskell it can be defined as follows:
data Bool = True  False
(The way to read this definition is that Bool
is either True
or False
.) In principle, one should also be able to define a Boolean type in C++ as an enumeration:
enum bool { true, false };
but C++ enum
is secretly an integer. The C++11 “enum class
” could have been used instead, but then you would have to qualify its values with the class name, as in bool::true
and bool::false
, not to mention having to include the appropriate header in every file that uses it.
Pure functions from Bool
just pick two values from the target type, one corresponding to True
and another to False
.
Functions to Bool
are called predicates. For instance, the Haskell library Data.Char
is full of predicates like isAlpha
or isDigit
. In C++ there is a similar library that defines, among others,
isalpha
and isdigit
, but these return an int
rather than a Boolean. The actual predicates are defined in std::ctype
and have the form ctype::is(alpha, c)
, ctype::is(digit, c)
, etc.
Challenges
 Define a higherorder function (or a function object)
memoize
in your favorite language. This function takes a pure functionf
as an argument and returns a function that behaves almost exactly likef
, except that it only calls the original function once for every argument, stores the result internally, and subsequently returns this stored result every time it’s called with the same argument. You can tell the memoized function from the original by watching its performance. For instance, try to memoize a function that takes a long time to evaluate. You’ll have to wait for the result the first time you call it, but on subsequent calls, with the same argument, you should get the result immediately.  Try to memoize a function from your standard library that you normally use to produce random numbers. Does it work?
 Most random number generators can be initialized with a seed. Implement a function that takes a seed, calls the random number generator with that seed, and returns the result. Memoize that function. Does it work?
 Which of these C++ functions are pure? Try to memoize them and observe what happens when you call them multiple times: memoized and not.
 The factorial function from the example in the text.

std::getchar()

bool f() { std::cout << "Hello!" << std::endl; return true; }

int f(int x) { static int y = 0; y += x; return y; }
 How many different functions are there from
Bool
toBool
? Can you implement them all?  Draw a picture of a category whose only objects are the types
Void
,()
(unit), andBool
; with arrows corresponding to all possible functions between these types. Label the arrows with the names of the functions.
Bibliography
 Nils Anders Danielsson, John Hughes, Patrik Jansson, Jeremy Gibbons, Fast and Loose Reasoning is Morally Correct. This paper provides justification for ignoring bottoms in most contexts.
Next: Categories Great and Small
Follow @BartoszMilewski
November 24, 2014 at 8:11 pm
What is your preferred way to receive(or not) reports of typos and similar minor errors? Many thanks for this series
November 24, 2014 at 10:31 pm
I’m still reading along, but just a thought: do you think it would be helpful to provide references for further reading? For example, to explain Goedel? That way, even if you don’t want excessive “mathiness” to permeate the writing, readers would have an idea of where to look for more. Very enjoyable so far!
November 24, 2014 at 10:43 pm
…Russell, even. See? 😉
November 24, 2014 at 10:44 pm
@William: Twitter works fine. I’ll see it if you mention @BartoszMilewski.
November 25, 2014 at 2:21 am
So a point of precision here: I think you confuse denotational and categorical semantics a bit. I just checked back at the Moggi paper “Notions of Computation and Monads” and indeed he constrasts the approach there, which he describes as categorical semantics, to all three of denotational, logical, and operational accounts. Also, denotational semantics prior to Moggi already could deal with effects, just not as pleasantly — indeed, the discipline (from the Strachey standpoint, not the Scott one) was motivated by trying to understand languages like C, and it was in fact in Strachey’s work that the idea of lvals was first introduced!
For similar reasons, this statement is false: “In a pure functional language like Haskell all functions are pure. Because of that, it’s possible to give these languages denotational semantics and model them using category theory.” The falsehood is in the fact that we can also give imperative languages denotational semantics and model them using category theory (or rather, we model them using denotational semantics with borrowings from category theory — genuine categorical semantics get scary fast and aren’t used for ‘real sized’ languages as far as I know). Here, for example, is a paper on the denotational semantics of ANSI C (as implemented, of course in Haskell :P) http://www.softlab.ntua.gr/~nickie/Papers/papaspyrou2001dsac.pdf.
I think that densem and cat theory (which are not the same — traditional densem is set theoretic) must be motivated not as the only way to do things, but as ways with certain advantages.
Here is a perhaps clearer way to argue things: denotational / categorical semantics can be much easier to reason about than operational ones. however, the link between denotational / categorical semantics and imperative languages is typically harder to establish than the link between imperative languages and their operational semantics. If we accept that we have a better way to talk about what programs “mean” then perhaps we should have languages that allow us to represent things closer to that way in the first place, and we should attempt to internalize their constructs (such as monadic effects) directly into those languages as much as possible.
November 25, 2014 at 6:19 am
I think that the debate around static/dynamic weak/strong typing debate misses the point. Explaining what typed and functions and what their advantages are won’t change a thing.
People use dynamic languages because it is simple and faster to get things running. Why is that? My hypothesis is that is is in big part because of culture. The Haskell world fosters a culture of complexity even for minor tasks.
For instance, watch the situation of command line parameter parsing, needlessly complex libraries, with poor documentation and example who won’t compile… Then compare it to python or ruby or clojure libraries…
Then you have the fact that Haskell is a relatively simple language by itself, but every library developer feels the need to bring in his own set of Language extensions. Hence you end up having a good bunch of Haskell variants to study.
Then you have things like pipes and conduit that do quite similar stuff (and apparently very commonly needed stuff) with dissimilar interfaces. Both are complex and you have to know both because they are pulled in by other libraries…
This complexity culture partially explains for me why people will prefer Ruby, Clojure, Python or even Java to Haskell (and Scala to a lesser extent).
Make simple things simple please…
PS: For now I am a happy Lisper (with typed clojure and its gradual typing). I’m really appreciating ocaml and F# these days and I’m considering pushing them at work. I abandoned the idea of pushing Haskell because I could only see a world of pain for my coworkers.
November 25, 2014 at 11:02 am
@sclv: I’m glad you are keeping me on my toes ;). I have modified the offending sentence replacing “possible” with “easier” (that’s the great thing about blogs: you can improve on them).
As for denotational vs. categorical semantics, this is the kind of hair splitting I’d like to avoid. I think the key is your observation that “traditional densem is set theoretic.” I would argue then that “modern densem is categorical.”
November 25, 2014 at 11:13 am
@johann: I sympathize with your arguments, but what you’re complaining about is the prevailing practice in the Haskell community, which will hopefully improve with time, not the principle of strong typing. I’ve seen people throwing together shell scripts in Haskell in no time, using Haskell’s type inference to avoid any explicit typing. It feels like programming in a dynamic language, except when you make a mistake.
November 25, 2014 at 5:29 pm
unit :: a > ()
unit _ > ()
Second line should be:
unit _ = ()
November 25, 2014 at 5:34 pm
I’m not a fan of the phrases “strongly typed” and “weakly typed” because they are so often used to mean so many different things. I’ve heard and read ALL of the following meanings in casual conversations and academic papers
1) weak = dynamically typed, strong = statically typed.
2) Weak = lots of automatic coercions (e.g. JavaScript and C++), strong = relatively few (e.g. Scheme and Haskell)
3) Weak = undefined behavior/memory unsafe (e.g. C, Fortran, Pascal, and C++), strong = memory safe (most popular languages [ignoring FFIs and such])
4) Weak = relatively limited static type systems (e.g. all dynamically typed languages, but also C and Pascal), Strong = relatively rich type systems (e.g. Haskell and Scala). I guess by that definition Java and C# would be medium strong and Agda would be ultra strong. Or something.
5) Weak = effects aren’t tracked in types in any way, strong = effects are tracked in types
I read through this post and suspect you mean at least #4 but also maybe #5. But it’s not clear.
November 25, 2014 at 6:28 pm
Corrected! Thanks!
November 25, 2014 at 8:05 pm
Thank you for an interesting article and a really interesting blog.
The big problem I have with statically typed languages that I have experience with (mostly C and C++) is that they are inherently bad at handling dynamic data. That is data where the type of the data is not necessarily known beforehand. And I find that a whole lot of problems go into that domain.
How is this handled in Haskell? (While I have tried to learn Haskell I have yet to do anything above toy examples)
November 27, 2014 at 3:16 pm
@Ragnar,
Sorry for the divergence w.r.t the actual contents of the article. But on the topic of consuming data in a statically typed language, this is always problematic. If, as you say your data sources are truly dynamic and not know until runtime. You then have little choice but to parse the data and emit the data as an abstract syntax tree that represents the data, which you can then provide accessors over using a set of carefully chosen combinators or lenses maybe. If however you have a representative sample of the data set, then F# has a language feature called type providers which can help massively here. see,
http://fsharp.github.io/FSharp.Data/
or
http://msdn.microsoft.com/enus/library/hh156509.aspx
for examples, I’m not sure of an equivalent of this in haskell although implementations may well exist.
November 28, 2014 at 8:40 am
[…] Types and Functions […]
November 28, 2014 at 12:07 pm
Type System applies rules to source code that are not covered by language (context free) grammar.
In other words, Type system looks like replacement for extensible and context sensitive grammar that covers user defined types and code.
Static program analysis is about applying (context sensitive) rules to source code, like grammar.
Another attempt of enforcing rules is C# using (resource) statement. Resources are created, used and disposed in ruled order, also like grammar.
We need tools that at compile time can check the rules about how to use resources or order of using library functions / object creatingusingdisposal / assigning or checking nullable types before use / memory allocationusedisposal etc.
November 30, 2014 at 3:01 am
[…] Types and Functions by Bartosz Milewski. […]
December 5, 2014 at 10:20 am
@sclv This is basically an aside, but the idea that denotational semantics means “interpret in a category” has become a pretty common one. For example, Carvhalo in “Execution Time of λTerms via Denotational Semantics and Intersection Types” says “The fundamental idea of denotational semantics is that types should be interpreted as objects of a category C and terms should be interpreted as arrows in C in such a way that if a term t reduces to a term t′, then they are interpreted by the same arrow.” I am sure if I tried I could dig up other people saying essentially the same thing.
I don’t know if I agree with this view: different people mean different things when they say “denotational semantics”–often what is meant is something like “Scott semantics.” In the context of logic, “denotational semantics” usually is differentiated from other kinds of models in that it models proofs and not just provability. That said, there is some strength to the “denotational is categorical view”–it gives a particular research direction for coming up with models that looks something like 1. figure what you need in the category to model your language 2. come up with a category with those features. This is explicitly the research direction taken by people like Lars Birkedal (or at least I have heard him describe his work that way) and seems to be what is going on with people like Samson Abramsky. So, although it might not be a universal perspective, it seems it would be be hard to argue it is a bad one.
Finally, the question of if “traditional denotational semantics” was categorical is a question of what “traditional denotational semantics” means. But if it is supposed to be early Scott, the answer is that although Scott avoided the use of categorical language I think it is pretty clear that he was motivated by categorical ideas. These appear explicitly in papers like “Data Types as Lattices” but I think these ideas were implicit in “Outline of a mathematical theory of computation.” Indeed, Scotts own version of history emphasizes the importance of category theory to the development of the lambda calculus, and describes his main contribution as the theorem that “The category of T0topological spaces with an equivalence relation and continuous functions respecting equivalence is cartesian closed.” source http://youtu.be/7cPtCpyBPNI
Anyways, as I said this is all an aside. How much you emphasize category theory vs other approaches is a matter of taste.
December 7, 2014 at 11:16 pm
@philipjf. My point was not that people don’t use terms loosely to mean many things. It is that the historic account given is misleading, and particularly if one turns to Moggi’s paper we see that he explains he is doing categorical semantics in contrast to denotational semantics!
@bartosz: Glad you took one edit. Appreciate your concerns about hairsplitting. Let me suggest one section that I still find misleading. In particular:
“What’s the mathematical model for reading a character from the keyboard or sending a packet across the network? For the longest time that would have been a conversation stopper. It seemed like denotational semantics wasn’t applicable to a considerable number of important tasks that were essential for writing useful programs, and which could be easily tackled by operational semantics. The breakthrough came from category theory. Eugenio Moggi discovered that computational effect can be mapped to monads. This turned out to be an important observation that not only saved denotational semantics, and made pure functional programs usable, but also shed new light on traditional programming.”
As discussed above, even sans monads, denotational semantics could tackle this stuff. So perhaps the following.
“For the longest time, the answer would have been somewhat convoluted. It seemed like denotational semantics was awkward for a considerable…”
And similarly I would suggest “and made pure functional programs much more usable”.
After all, we had a usable, if awkward IO story for Haskell even prior to monads. Much of Plotkin’s work on effects has been, in a sense, suggesting an alternate history in which Monads where not taken up so fulsomely, and asking what avenues we may have continued to go down instead. Such avenues, of course, end up being also very categorical, but in a somewhat different way.
December 8, 2014 at 4:26 pm
@sclv: All right, is it better now?
December 8, 2014 at 6:53 pm
Much! Appreciate you being responsive on this 🙂
December 8, 2014 at 7:25 pm
[…] Next: Types and Functions. […]
December 10, 2014 at 10:06 am
Well… I actually do sometimes. Usually it’s when the interface I am using is documented poorly, whether “poorly” means “underdocumented” or “overcomplicated”, or even both, as the case with pretty much any “asynchronous” API.
So I just imagine some sort of idealized behaviour, with sane invariants and pre/postconditions, write a toy implementation and then, amazedly, watch how destroyed objects keep sending events (with the sender being a null pointer at best), how EVENT_CONNECTION_DROPPED arrives before EVENT_MORE_DATA_ARRIVED, how some events are never reported/passed to me even though they clearly happen, and all such nonsense. After this I indeed start “just throwing lines of code and see what happens”, to learn crucial things that were left out from the documentation.
Denotational semantics is lovely and exciting, but most people actually prefer axiomatic semantics when trying to reason about correctness: I’d argue that the reasoning along the lines “those variables are guaranteed to have such and such values when the execution hits this point of code, so calling this routine with such and such parameters can’t cause any surprises, and after it returns, the program state will satisfy that and that condition” is not an application of operational semantics, and definitely not of denotational semantics. Surely, picking some particular mathematical objects will provide us with exactly the axioms we need to infer conclusions that are of interest to us but after we have realised that those axioms are the ones we need, we can we work just with them, without the “interpretation” in terms of those objects.
December 10, 2014 at 11:56 am
@Joker_vD: What you are calling axioms are actually theorems and they can be derived from either version of semantics. An example of axiomatic semantics is the description of C++ weak atomics — one of the most unapproachable areas of language definition. As far as I know (and I talked to Hans Boehm about it), there is no operational model for weak atomics, just a set of axioms about visibility of writes.
December 11, 2014 at 3:53 pm
Yes, the axioms/properties/theorems/invariants/whatchamacallit definitely can be derived from operational or denotational semantics. But axiomatic semantics starts with them, without any “interpretation” attached, be it a state machine in case of operational semantics or a domain/category in case of denotational semantics.
If we’re talking about “portable races” aka weak atomics, while I’d agree that there probably cannot be an operational semantics equivalent to the axiomatic semantics, I’d say there are many operational semantics satisfying the axiomatic semantics: x86+RAM and ARM+RAM can be described as state machines, albeit very cumbersome ones (which is troubling because reasoning about state machines such cumbersome is not easier than formulating theorems about properties that you actually want).
So that’s the difference between the axiomatic and denotational approach, at least as I understand it: instead of looking for a bag of mathematical objects that reflects the desired properties of our programming language constructs, reason about those properties directly; don’t search for another model/interpretation of the logic theory that is the meaning of the programming language, work with the logic theory itself. The axiomatic semantics is that theory, the operational semantics is a humanfriendly model of that theory, and the denotational semantics is a mathematicianfriendly model of it.
December 16, 2014 at 5:07 pm
[…] the previous installment of Category Theory for Programmers we talked about the category of types and functions. If you’re new to the series, here’s the link to the initial […]
January 13, 2015 at 1:29 pm
Hi! I’ve been translating your blog posts from this series at the main Russian ITthemed website, and one of the people suggested that you should have written:
rather than
so iflike constructs would work.
Anyway, thanks for your work!
January 13, 2015 at 2:00 pm
@monnoroch: Yes, I’ve seen your translations. Thank you for popularizing my work in Russia.
Implicit conversions between data types,
bool
andint
in this case, are a source of some of the most insidious bugs. A wellwritten C++ program should work independently of howtrue
andfalse
are encoded. That’s why I deliberately started withtrue
. Of course, this was just a hypothetical definition, sincebool
,true
, andfalse
are reserved keywords in C++.February 4, 2015 at 9:10 am
I’m not sure if this was an intentional simplification or not (after all, this (great!) series is about category theory, not the CurryHoward isomorphism), but where you’ve written, “the function absurd is the statement that from falsity follows anything,” I believe it should read, “the function absurd is a proof of the statement that from false follows anything;” it’s the (polymorphic) type of absurd which corresponds to that statement.
February 4, 2015 at 12:03 pm
@sj: I was being vague because this is only true in a vague, handwaving sense, considering the problems with the bottom in Haskell. The type of the function absurd corresponds to a preposition, the implementation would be the proof, and I didn’t provide the implementation, so I couldn’t claim the proof. I looked at Edward Kmett’s clever hack in Data.Void but I don’t think it would make it any clearer.
Okay, I changed the wording to “the type of the function absurd…”
February 4, 2015 at 11:01 pm
Your thought experiment with monkeys has a fatal flow: None of Shakespeares plays were grammatically correct. So your grammar checked monkeys have zero probability of generating one of his plays. (Maybe dependent types would help here?)
April 19, 2015 at 12:54 am
Exercise #6 raises some questions in my beginner mind. There are two functions of the type () > Bool. Are those distinct morphisms? It seems so, because I’m asked to label the morphisms, but my intuition tells me that they’re identical?
April 19, 2015 at 12:38 pm
@Uglemat: These two functions are not identical because they produce different results for the same input.
Your intuition is probably telling you that flipping True and False (the
negate
function) is an isomorphism, which is true. But that doesn’t mean that the two functions are the same.April 19, 2015 at 6:11 pm
@Milewski Actually my thinking was that hom(A, B) could have maximum one element. Because wouldn’t more than one be superfluous? Because what more information is contained in a morphism than the two objects? Should the concrete “behavior” of a morphism be taken as a part of its information? I guess what I’m saying is that I don’t understand what a morphism is. When you do typechecking, you wouldn’t distinguish between two morphisms with the same start and target because both are equivalent in the world of types, where concrete implementation is “hidden”, maybe that’s what caused me to think the stuff above.
April 19, 2015 at 8:09 pm
@Uglemat: There are two points of view. When you’re looking at an abstract category, you’re not supposed to “understand” what a morphism is. You just say there are two morphisms between these two objects, and that’s that. The structure of the category lies in how many morphisms there are between any two objects and how they compose.
But here we are trying to build a category by mimicking the structure of a programming language. So we look at the language, its types and its functions, and build a category that corresponds to it. There is a type called () and a type called Bool. They will correspond to two objects in our category. Then there are two functions from () to Bool. They will give rise to two morphisms between the two objects.
Abstracting means forgetting about the implementation. In the category, we no longer remember why the two morphisms were supposed to be different — they just are.
April 19, 2015 at 9:07 pm
Ok, that makes sense. You’d have to represent a morphism with a 3tuple like (from, to, morphismid) though if homset returns an actual set (if I understand set theory correctly). Since homset doesn’t need to return an actual set (as morphism@wikipedia tells me) maybe that’s a different way of solving it (if “not set” can be taken to mean a list). Thanks for clearing this up!
April 19, 2015 at 9:25 pm
I try to use the word homset to denote a set of morphisms. What Wikipedia says is that some people stretch the notion of homset to mean something more than a set (I try to use homobject in that case). But when they say that homset is not a set, they mean it’s a class — something that can be bigger than a set.
Yes, there are things bigger than sets. For instance, a collection of all sets is not a set. Fortunately, as programmers, we don’t have to worry about such things.
May 7, 2015 at 7:26 pm
Maybe you mentioned it somewhere in your posts and I just overlooked it, but Hask = Set +  implies that category Set consists of countable sets only. Is this simplification made only here or the category theory doesn’t work with uncountable sets either? Does it matter at all, as I get the feeling that many things are irrelevant in this field?
May 7, 2015 at 10:15 pm
Set is a category of all sets, countable and uncountable.
May 10, 2015 at 11:03 pm
My solution to #1: https://ideone.com/pxX5As
Any idea how to get rid of explicit Arg type parameter?
Also, Solution to 6:
Void ———–> ()
Void ———–> Bool
Bool —————> ()
() ——————> Bool
() —————> Bool
Is this right? Should there be two arrows from unit to bool?
May 11, 2015 at 10:16 am
I think there are some template tricks using traits that could eliminate the need for the explicit type parameter Arg.
Yes, there should be two arrows from unit () to bool (but only one from Void). And don’t forget the identity arrows.
May 31, 2015 at 6:25 pm
Regarding challenge #6:
Should there be two morphisms Bool>Void and ()>Void? We can define such functions, but can’t call them — same as “absurd”, right?
And should there be a “not” morphism Bool>Bool? If yes, than it will look like “identity” but won’t satisfy laws of “identity”. And if no, than why? There is a function “not” and all functions should be in category in form of morphism, right?
June 1, 2015 at 4:35 am
May I know what this sentence means? Any example perhaps?
“The only serious argument I hear against strong static type checking is that it might eliminate some programs that are semantically correct.”
I’ve been trying to understand monad and I heard that it’s heavily related to category theory. Really appreciate for the effort you’re putting in explaining it very extensively here.
June 5, 2015 at 6:42 pm
@Iwan: Essentially anywhere you see an explicit cast, the programmer is overriding the type system. The programmer was able to reason that despite the types not matching formally, the code will work correctly. The programmer had more information than the type checker. Even in Haskell you may occasionally see
unsafeCoerce
as a means to bypass the type system, for instance to get better performance. (See this Stack Overflow discussion.)June 5, 2015 at 9:43 pm
@ rpominov: How would you define a function
Bool>Void
? What would it return when called withTrue
?not
is a legitimate morphism that is not identity. So there are indeed two morphismsBool>Bool
. (BTW,not
is missing in Ezo’s solution.)June 5, 2015 at 10:02 pm
I think I understand now. I was confused with what “define” means. I thought it meant to “define” type signature of a function. I just did’t realize that for
Void>Bool
we can define not only signature, but a function body as well, and still won’t be able to call it. While forBool>Void
we can’t define a function body, so there can’t be such morphism.And for
not
I just did’t realize that in a category can be a morphisms fromA
toA
, that isn’t the identity morphisms.Thank you!
June 17, 2015 at 4:24 pm
Solution for the Ex6 seems to be like: https://goo.gl/cE0mEI
Am I right?
But the question arises: what is an identity function for the Void type? In math it is just an empty function from ∅ to ∅. But we can’t declare it in Haskell. So it is a bit weird 🙂
June 17, 2015 at 7:05 pm
@karkunow: You can declare and implement it in Haskell
You’ll never be able to call it. But that’s a different story.
June 17, 2015 at 9:34 pm
@bartosz, thanks, got it!
July 20, 2015 at 10:55 pm
Indeed, Void > Void is just a particular case of parametric f x = x, likewise 0^0 is a particular case of general formula a^b with exception that you cannot compute 0^0. That is, you can define 0^0, as a particular case of a^b but cannot evaluate it. I started to think about it because I have computed the amount of distinct functions that we must have
https://docs.google.com/document/d/1NcZ07pwYLbQif9hYqY8d82dh2FMf6bHXDuccWKfx8/pub
It was more difficult to define Void to anything other than Void or Unit. I could not map Void to Boolean. My consideration suggests that there should be only one such function. But which one?
July 21, 2015 at 11:16 am
This table also illustrates chapter 5. We clearly see that ⊥ suits the terminal object definition since there is always 1 function of type a → ⊥ and 0 functions of type ⊥ → a except ⊥ → ⊥, which has undefined number of functions. We also see that the unit () has always 1 map to it from any type, making it ideal terminal object. Yet, it also maps to every value in your type system, as you say in this chapter, which makes it not that terminal.
July 21, 2015 at 12:08 pm
@Valentin: By convention, 0^0 is assumed to be 1, so you do get correct numbers in all cases.
I wouldn’t use the symbol ⊥ for the initial object, because it has a meaning in Haskell as a nonterminating computation. The presence of nontermination complicates things a lot, and I want to avoid it.
The fact that () maps to any other object (except Void) doesn’t interfere with it being terminal.
As for the function from Void to Bool. Formally, in Haskell, you could define two such functions:
but you can easily convince yourself that it’s the same function.
I didn’t get into the definition of what it means for functions (in the sense of mappings between sets) to be equal. So here it is: Two functions are equal if, for all arguments, they return the same values. It’s called “extensional” equality. For all x, f x = g x. In the case of Void, the set of arguments is empty, so vtob and vtob’ are equal “in vacuum.”
Or, if you like proof by contradiction, try to show that vtob is different from vtob’. You would have to find an argument for which they produce different values. Good luck finding such argument!
July 21, 2015 at 2:02 pm
“Vacuously equal” is brilliant. Indeed, it seems to be a generalization of vacuous truth, ∀x∈A:Q(x) where Q is not limited to bivalue predicate. I am sure that everybody will agree that this is utterly breathtaking. But I am still would like to know why 0^0=1 is “a correct number”. We should have all 1’s in the first row of the table and all zeroes in the first column. There is a certain conflict at their intersection, in the first corner cell, if it must be set to 0 or 1. It is surprising that category theory argues for correct number 0^0 = 1. It paradoxically says that can produce Void given a Void, despite we cannot produce anything else, given a Void with the argument “nobody can give us a void”.
December 28, 2015 at 5:47 pm
I kind of feel like thinking of types as sets only is somehow incomplete. Isn’t it also necessary to equip your sets with composition laws to completely describe them? For example, in the boolean type example, it is clear that it is a different type from say, the equivalence classes of integers modulo 2. Since both sets are of cardinality 2, what differenciates them is their composition laws.
December 29, 2015 at 5:23 pm
In the previous section I explained that composition is the essence of every category. In the category of sets, this would be the composition of functions between these sets.
Having said that, sets of the same cardinality are isomoprhic, which means they can be substituted for each other. If you have an algoritm implemented in terms of Booleans, you can write an equivalent algorithm in terms of integers modulo 2. The only difference between isomorphic types is that some of them might lead to more efficient implementations, which is something outside of type theory.
December 30, 2015 at 6:24 am
Thanks for your reply,
Ok, I understand now. While reading your articles, I keep forgetting we are talking about sets, not groups or other kind of structures equiped with composition laws. I made the same error in the next section: I didn’t understand why it was saying that “adders” where morphisms because clearly they where not mapping identity at the left to identity at the right. But I realise now that there is no identity element in sets and morphisms between sets are functions so it make sense after all.
It is unnatural for me to think in these terms because I am not used to it. I keep viewing this through the prism of my prior group algebra exposure.
January 3, 2016 at 11:39 am
Thanks for blog posts. Its been very enjoyable so far. I’ve seen many memoize functions for single parameter functions on integers. Here is my attempt at memoizing variadic functions in ECMAScript® 2015.
const memoize = (f, ...leftArgs) => {
const undef = {}; // f() might be undefined, so ref type used for check later
let unit = undef;
const memo = new Map();
return (...args) => {
if (args.length === 0) {
if (unit === undef) {
unit = f(...leftArgs);
}
return unit;
}
const [arg, ...rightArgs] = args;
if (! memo.has(arg)) {
memo.set(arg, memoize(f, ...leftArgs, arg));
}
return memo.get(arg)(...rightArgs);
};
};
January 5, 2016 at 11:02 am
i feel like only () > () is possible to implement, everything else is either impossible or absurd?
January 5, 2016 at 1:28 pm
What about two possible functions ()>Bool, for instance?
April 23, 2016 at 4:06 am
I would like to mention that Enzo’s solution is not complete. Some morphisms of type Bool > Bool are missing.
April 23, 2016 at 11:33 am
Who is Enzo? And what solution are you talking about?
June 21, 2016 at 11:06 am
Regarding comment #51:
It’s obvious that ∀v∈Void => vtob v = True ≠ False = vtob’ v. So vtob = vtob’ can only be satisfied if there are no v’s (otherwise, we claim vtob v = True = False = vtob’ v, which is fallacious). Since v∈Void and Void is empty, the condition that there are no v’s is satisfied. Therefore, it is indeed true that vtob = vtob’.
Is my understanding of this correct?
June 22, 2016 at 1:00 am
This article in wikipedia may help.
Also, if you’re not afraid of proofs by contradiction, think about how you would negate a statement with a universal quantifier:
You’d have to find a counterexample, that is,
If X is an empty set, nothing exists in it. So the negation of the original statement is false. (Of course, proof by contradiction is shunned in constructivist logic, but that’s a different story.)
July 4, 2016 at 6:41 am
const 1 (undefined :: Void)
July 4, 2016 at 7:02 am
Hmm. Infinitely many functions of type
Integer > ()
can be built using undefined. ConsiderJuly 16, 2016 at 6:58 am
@bartosz, You say that “… in several studies Haskell didn’t come as strongly ahead of the pack in code quality as one would expect.”
Which studies are you referring to? If it’s not too much trouble, could you provide the references?
(Background: I regularly face resistance from imperative programmers to embrace FP techniques. For example, somehow the ‘M’ word often causes strong negative reactions and that seems to be the end of it. I’m searching for solid evidence to support the claim that pure FP languages such as Haskell, accompanied by its underpinning, category theory, are not just academic toys.)
July 16, 2016 at 9:12 am
Here’s some literature:
A Large Scale Study of Programming Languages
and Code Quality in Github,
A Comparative Study of Programming Languages in Rosetta Code.
They contain references to other studies.
A good resource is Why Haskell Matters.
October 17, 2016 at 11:21 am
With regards to
Bool > Void
and() > Void
, wouldn’t following be a proper morphism?bottom = bottom
btov :: Bool > Void
btov _ = bottom
As Hask includes bottom, is it correct to say that this function is a morphism?
October 17, 2016 at 12:27 pm
It’s a tricky question. That’s why I keep hedging my statements about Haskell by saying “ignoring bottoms.” Some people, like Andrei Bauer or Robert Harper, even doubt whether Hask is a real category.
October 21, 2016 at 10:31 am
In comment 40 you said:
“Yes, there should be two arrows from unit () to bool (but only one from Void)”
There are two functions from Void to Bool, right?
1.Void > True
2.Void > False
October 21, 2016 at 11:17 am
Void is a type, True and False are values. What you wrote is neither a type signature nor a function definition. But I understand what you mean. You mean this:
But despite appearances, these two formulas give you the same function. For any value of the argument
x
, they produce the same result. Remember,x
must be an element of the empty set!November 6, 2016 at 1:46 am
An implementation of memoize in F#:
For the naive fib function you’ll get:
First run took: 00:00:02.4870968
Second run took: 00:00:00.0000013
For the optimized memfib the result is:
First run took: 00:00:00.0006868
Second run took: 00:00:00.0000010
December 10, 2016 at 1:22 pm
Hi Bartosz, thanks for the great series of articles. I’ve just started and I am amazed by how simple things appear after your explanation.
I am just curious about point 5 from the challenges.
What I came up with is 4 distinct functions of Bool > Bool signature.
Of course I can have a function like this:
But that seems the same as id function. Or I can go with another one like
But that seems the same as false.
So based on the above my intuition tells my that there are 4 distinct Bool > Bool functions. Am I right ?
December 10, 2016 at 1:33 pm
@Łukasz: Later you’ll see that these functions can be represented as
Bool^{Bool}
, or 2^{2}, which indeed gives 4.