Type System

Archived Posts from this Category

January 9, 2011

Monads for the Curious Programmer, Part 1

Posted by Bartosz Milewski under C++, Category Theory, Functional Programming, Haskell, Monads, Programming, Type System
[43] Comments

The Monad is like a bellows:
it is empty yet infinitely capable.
The more you use it, the more it produces;
the more you talk about it, the less you understand.

–Monad Te Ching

I don’t know if I’m exaggerating but it seems like every programmer who gets monads posts a tutorial about them. (And each post begins with: There’s already a lot of monad tutorials on the Internet, but…) The reason is that getting monads it’s like a spiritual experience that you want to share with others.

When facing a monad, people often behave like the three blind men describing an elephant. You’ll see monads described as containers and monads described as actions. Some people see them as a cover-up for side effects, others as examples of endofunctors in Category Theory.

Monads are hard to describe because they don’t correspond to anything in our everyday experience. Compare this with Objects in Object-Oriented programming. Even an infant knows what an object is (something you can put in your mouth). What do you do with a monad?

But first, let me answer the pertinent question:

Why Bother?

Monads enable pure functional programmers to implement mutation, state, I/O, and a plethora of other things that are not functions. Well, you might say, they brought it on themselves. They tied their hands behind their backs and now they’re bragging that they can type with their toes. Why should we pay attention?

The thing is, all those non-functional things that we are so used to doing in imperative programming are also sources of a lot of troubles. Take side effects for instance. Smart programmers (read: the ones who burnt their fingers too many times) try to minimize the use of global and static variables for fear of side effects. That’s doable if you know what you’re doing. But the real game changer is multithreading. Controlling the sharing of state between threads is not just good programming practice– it’s a survival skill. Extreme programming models are in use that eliminate sharing altogether, like Erlang’s full isolation of processes and its restriction of message passing to values.

Monads stake the ground between total anarchy of imperative languages and the rigid dictatorship of Erlang-like isolationism. They don’t prohibit sharing or side effects but let you control them. And, since the control is exercised through the type system, a program that uses monads can be checked for correctness by the compiler. Considering how hard it it to test for data races in imperative programs, I think it’s worth investing some time to learn monads.

There is also a completely different motivation: metaprogramming. The template language used for metaprogramming in C++ is a pure functional language (see my blog post, What does Haskell have to do with C++?). If monads are so important in functional programming, they must also pop up in C++ metaprogramming. And indeed they do. I hope to discuss this topic in a future post.

So what’s a monad?

A Categorical Answer

If you don’t know anything about category theory, don’t get intimidated. This is really simple stuff and it will clarify a lot of things, not to mention earning you some bragging rights. My main goal is to share some intuitions from mathematics that will build foundations for a deeper understanding of monads in programming. In this installment I will explain categories, functors, and endofunctors, leading up to monads. I will give examples taken both from everyday life and from programming. I will really get into monads and their practical applications in the next installment, so be patient.

Functors

A functor, F, is a map from one category to another: it maps objects into objects and morphisms into morphisms. But it can’t do it in a haphazard way because that would destroy the very structures that we are after. So we must impose some “obvious” (mathematicians love that word) constraints.

First of all, if you have a morphism between two objects in the first category then it better be mapped into a morphism between the corresponding objects in the second category. Fig 2 explains this diagrammatically. Object A is mapped into F(A), object B into F(B). A morphism f from A to B is mapped into a morphism F(f) from F(A) to F(B). Mathematicians say that such a diagram must commute, that is the result must be the same whether you go from A to F(A) and then apply F(f), or first apply f and then go from B to F(B).

Fig 2. Diagram showing the action of a functor F on objects A and B and a morphism f. The bottom part lives in F's domain (source) category, the top part in its codomain (the target).

Moreover, such mapping should preserve the composition property of morphisms. So if morphism h is a composition of f and g, then F(h) must be a composition of F(f) and F(g). And, of course, the functor must map identity morphisms into identity morphisms.

To get a feel for how constrained functors are by these conditions, consider how you could map the category in Fig 1 into itself (such a functor just rearranges things inside one category). There are two trivial mappings that collapse both objects into one (either A or B), and turn all morphisms into identity. Then there is the identity functor that maps both objects into themselves and all morphisms into themselves. Finally, there is just one “interesting” functor that maps A into B and B into A with f and g switching roles. Now imagine a similar category but with the g arrow removed (yes, it’s still a category). Suddenly there is no functor other than the collapsing ones between Fig 1 and that new category. That’s because the two categories have completely different structure.

Let me now jump into more familiar territory. Since we are mostly interested in one category, Hask, let me define a functor that maps that category into itself (such functors are called endofunctors). An object in Hask is a type, so our functor must map types into types. The way to look at it is that a functor in Hask constructs one type from another– it’s a type constructor. Don’t get confused by the name: a type constructor creates a new type in your program, but that type has already existed in Hask.

A classical example is the list type constructor. Given any type it constructs a list of that type. Type Integer is mapped into list of integers or, in Haskell notation, [Integer]. Notice that this is not a map defined on integer values, like 1, 2, or 3. It also doesn’t add a new type to Hask— the type [Integer] is already there. It just maps one type into another. For C++ programmers: think of mapping type T into a container of T; for instance, std::vector<T>.

Mapping the types is the easy part, what about functions? We have to find a way to take a particular function and map it into a function on lists. That’s also easy: apply the function to each element of the list in turn. There is a (higher level) function in Haskel that does it. It’s called map and it takes a function and a list and returns a new list (or, because of currying, you may say that it takes a function and returns a function acting on lists). In C++ there is a corresponding template function called std::transform (well, it takes two iterators and a function object, but the idea is the same).

Mathematicians often use diagrams to illustrate the properties of morphisms and functors (see Fig 2). The arrows for morphisms are usually horizontal, while the arrows for functors are vertical (going up). That’s why the mapping of morphisms under a functor is often called lifting. You can take a function operating on integers and “lift it” (using a functor) to a function operating on lists of integers, and so on.

The list functor obviously preserves function composition and identity (I’ll leave it as an easy but instructive exercise for the reader).

And now for another moment of Zen. What’s the second most important property of programming? Reusability! Look what we have just done: We took all the functions we’ve implemented so far and lifted them to the level of lists. We’ve got functions operating on lists essentially for free (well, we’ve got a small but important subset of those functions). And the same trick may be applied to all kinds of containers, arrays, trees, queues, unique_ptrs and more.

It’s all beautiful, but you don’t really need category theory to apply functions to lists. Still it’s always good to see patterns in programming, and this one is definitely a keeper. The real revolution starts with monads. And, guess what, the list functor is actually a monad. You just need a few more ingredients.

What’s the intuition behind the statement that mappings expose the structure of the system? Consider the schematic of the London underground in Fig 3. It’s just a bunch of circles and lines. It’s only relevant because there is a mapping between the city of London and this schematic. The circles correspond to tube stations and the lines to train connections. Most importantly, if trains run between two stations, the corresponding circles in the diagram are connected by lines and vice versa: these are the constraints that the mapping preserves. The schematic shows a certain structure that exists in London (mostly hidden underground) which is made apparent by the mapping.

Fig 3. The schematic map of London underground system.

Interestingly, what I’m doing here is also mapping: London and the underground map correspond to two categories. Trains stations/circles are objects and train connections/lines are morphism. How’s that for an example?

Endofunctors

Mathematicians love mappings that preserve “obvious” constraints. As I explained, such mappings abstract inner structures away from the details of implementation. But you can also learn a lot about structure by studying non-trivial mappings into itself. Functors that map a category into itself are called endofunctors (like endo-scopes they let you look inside things). If functors expose similarities, endofunctors expose self-similarities. Take one look at the fractal fern, Fig 4, and you’ll understand how powerful self-similarity can be.

Fig 4. This fractal fern was generated using just four endomorphisms.

With a little bit of imagination you can see the list functor exposing fern-like structures inside Hask (Fig 5). Chars fan out into lists of Chars, which then fan out into lists of lists of Chars, and so on, ad infinitum. Horizontal structures described by functions from Char to Bool are reflected at higher and higher levels as functions on lists, lists of lists, etc.

Fig 5. The action of the list type constructor reveals fractal-like structure inside Hask. The functor lifts things up, the functions act horizontally.

A C++ template that takes a type parameter could be considered a type constructor. How likely is it that it also defines a functor (loosely speaking– C++ is not as mathematized as Haskell)? You have to ask yourself: Is the type parameter constrained in any way? It’s often hard to say, because type constraints are implicit in the body of a template and are tested only during instantiation. For instance, the type parameter for a std::vector must be copyable. That eliminates, for instance, classes that have private or deleted (in C++0x) copy constructors. This is not a problem though, because copyable types form a subcategory (I’m speaking really loosely now). The important thing is that a vector of copyable is itself copyable, so the “endo-” part of the endomorphism holds. In general you want to be able to feed the type created by the type constructor back to the type constructor, as in std::vector<std::vector<Foo>>. And, of course, you have to be able to lift functions in a generic way too, as in std::transform.

Monads

Ooh, Monads!
–Haskell Simpson

It’s time to finally lift the veil. I’ll start with the definition of a monad that builds on the previous sections and is mostly used by mathematicians. There is another one that’s less intuitive but easier to use in programming. I’ll leave that one for later.

A monad is an endofunctor together with two special families of morphisms, both going vertically, one up and one down (for “directions” see Fig 5). The one going up is called unit and the one going down is called join.

Now we are juggling a lot of mappings so let’s slow down to build some intuition. Remember, a functor maps objects: in our case, types, which are sets of values. The functor doesn’t see what’s inside the objects; morphisms, in general, do. In our case, a morphism is a function that maps values of one type into values of another type. Our functors, which are defined by type constructors, usually map poorer types into richer types; in the sense that type Bool is a set that contains just two elements, True and False, but type [Bool] contains infinitely many lists of True and False.

Unit takes a value from the poorer type, then picks one value from the richer type, and pronounces the two roughly equivalent. Such a rough equivalent of True from the Bool object is the one-element list [True] from the [Bool] object. Similarly, unit would map False into [False]. It would also map integer 5 into [5] and so on.

Unit can be thought of as immersing values from a lower level into the higher level in the most natural way possible. By the way, in programming we call a family of functions defined for any type a polymorphic function. In C++, we would express unit as a template, like this:

template<class T>
std::vector<T> unit(T value) {
    std::vector<T> vec;
    vec.push_back(value);
    return vec;
}

To explain join, imagine the functor acting twice. For instance, from a given type T the list functor will first construct the type [T] (list of T), and then [[T]] (list of list of T). Join removes one layer of “listiness” by joining the sub-lists. Plainly speaking, it just concatenates the inner lists. Given, for instance, [[a, b], [c], [d, e]], it produces [a, b, c, d, e]. It’s a many-to-one mapping from the richer type to the poorer type and the type-parameterized family of joins also forms a polymorphic function (a template, in C++).

There are a few monadic axioms that define the properties of unit and join (for instance that unit and join cancel each other), but I’m not going to elaborate on them. The important part is that the existence of unit and join imposes new constraints on the endofunctor and thus exposes even more structure.

Mathematicians look at join as the grandfather of all multiplication with unit being its neutral element. It’s heaven for mathematicians because multiplication leads to algebraic structures and indeed monads are great for constructing algebras and finding their hidden properties.

Unlike mathematicians, we programmers are not that interested in algebraic structures. So there must be something else that makes monads such a hit. As I mentioned in the beginning, in programming we often face problems that don’t naturally translate into the functional paradigm. There are some types of computations that are best expressed in imperative style. It doesn’t mean they can’t be translated into functions, it’s just that the translation is somewhat awkward and tedious. Monads provide an elegant tool to do this translation. Monads made possible the absorption and assimilation of imperative programming into functional programming, so much so that some people claim (tongue in cheek?) that Haskell is the best imperative language. And like all things functional monads are bound to turn around and find their place in imperative programming. But that’s material for my next blog post.

Bibliography

The Catsters. An excellent series of videos about category theory.
Mike Vanier, Yet Another Monad Tutorial. Explains Haskel monads in great detail.
Dan Pipone, Neighborhood of Infinity. Interesting blog with a lot of insights into category theory and Haskell.
Eugenio Moggi, Notions of Computation and Monads. This is a hard core research paper that started the whole monad movement in functional languages.
Philip Wadler, Monads for Functional Programming. The classic paper introducing monads into Haskell.

January 5, 2011

Using F#: Sequences and Pipelines

Posted by Bartosz Milewski under F#, Functional Programming, ML, Programming, Type System
[15] Comments

Learning a new programming paradigm is like learning a foreign language. You learn the new vocabulary, the grammar, a few idioms, but you still formulate your thoughts in your native tongue. It takes years of immersion before you start thinking in a foreign language. In programming, the litmus test comes when you’re approaching a new problem. Will you formulate your solution in terms of the old or the new paradigm? Will you see procedures, objects, or functions?

I remember proposing an object oriented approach to the design of a content index back in my Microsoft years. The reaction was: It’s a great paradigm, but it’s not applicable to this particular problem. There are no “objects” in the content index. Indeed, you can’t find Employees and Payrolls, Students and Courses, DisplayableObjects and LightRays in the content index. But after a short brain storm we discovered such exotic objects as a Resource Manager or a Master Merge. We ended up with a fine piece of OO engineering that is still part of the Windows shell after all those years.

There’s a similar, if not larger, shift in thinking when you learn functional programming, especially if you come from an OO background. Initially you can’t help but see everything through the perspective of mutable data structures and loops. There are no obvious “functions” in your favorite problem. Functions are these weird stateless things–they always produce the same results when called with the same arguments. They are good in mathematics, but not in real-life programming.

And yet, there are people who write complete applications using functional (or at least hybrid) languages. One such example is the game The Path of Go created by Microsoft Research for the Xbox. It’s not a spectacular game as far as UI goes, but it plays some mean Go and it’s written in F#.

F# is a mostly functional language (based on ML) with support for object-oriented programming and access to the rich .NET libraries. In this respect it’s similar to Scala. Roughly: F# is to .NET what Scala is to JVM. Both languages were designed by excellent teams. F# was designed by Microsoft Research in Cambridge, England. The chief architect of F# is Don Syme. Scala is the brainchild of Martin Odersky.

I decided to learn F# and immerse myself in the new paradigm. So I had to pick a random problem, not one with obvious functional implementation, and start from scratch, design and all. In practice, I had to do a lot of experimenting in order to familiarize myself with the language and the library. Experimenting, by the way, was made relatively easy by the inclusion of an F# interpreter in Visual Studio 2010.

To learn F#, I used only online documentation which, as I found out, is not that good. There are also fewer online discussions about F# than, say, about Haskell. The two websites I used most are:

The Problem

Without further ado, let me describe the challenge:

Write a program that finds duplicate files on disk.

In particular, I was interested in finding duplicate image files, but for testing I used text files. The program should therefore concentrate on files with particular extensions. It should also be able to skip directories that I’m not interested in. The duplicates (or triplicates, etc.) don’t have to have the same names but have to have identical extensions and contents.

The Design

Not surprisingly, functional programming requires a major adjustment. It’s one thing to read somebody else’s code and admire the tricks, but a completely different thing to be designing and writing functional code from scratch. But once you get the hang of it, it actually becomes easy and quite natural.

The most important thing you notice when using functional languages is that types are mostly unobtrusive due to type inference, but type checking is very strong. Essentially, if you manage to compile your program, it usually runs correctly. You spend much less time debugging (which I found rather difficult in F#) and much more time figuring out why the types don’t match. Of course, you have to learn a whole new language of error messages.

So it’s definitely a steep learning curve but once you’re over the hump you start reaping the benefits.

The naive approach to solving my problem would be to list all files on disk (recursively, starting with the root directory) and compare each with each. That would scale very poorly, O(N²), so we need to do some pruning.

Let’s first group files by extension and size. There will be a lot of singleton groups– containing only one file with a particular combination of extension and size. Let’s eliminate them from consideration. After that we’ll be dealing with much smaller groups of files so, within those groups, we can do full-blown byte-by-byte comparisons. Strict comparisons will potentially split those groups into even smaller groups. Again, we should eliminate the resulting singletons. Finally, we should be able to print the resulting lists of lists of identical files.

For an imperative programmer the first impulse would be to use a lot of looping; e.g., for each file retrieve its extension and size, etc. An object-oriented programmer would use vectors, hash tables, and looping over iterators.

How would a functional programmer approach the subject? Iteration is out of the question. Recursion and immutable lists are in. Quite often functions operating on lists can be expressed as list comprehensions. But there’s an even better tool called sequences in F#. They’re sort of like iterators, but with some very nice compositional properties. Sequences can be composed using pipelining. So let me express the above design as a pipeline.

The Pipeline

This is the refinement of the original design that takes into account data structures: sequences, lists, and tuples in various combinations.

The source for the pipeline is a sequence of file paths coming from a recursive enumerator.
The first operation is to group the files that share the same key: in our case the key will be a tuple (file extension, file size).
The next operation is to filter out groups of length one, the singletons.
Since the grouping injected keys into our stream, we need to strip them now and convert groups to lists.
Now we group byte-wise equal files within each group.
Then we remove singletons within those subgroups,
Flatten lists of lists, and
Print the results.

There are only two stages that deal with technical details of data structures: the stripping of the keys and the flattening of the lists. Everything else follows from high-level design.

Here’s the pipeline in its full functional glory. The |> symbol is used to forward the results of one stage to the next.

enumFilesRec 
  (filterOutPaths ["c:\\Windows";"c:\\ProgramData";"c:\\Program Files"])
  (filterExt [".jpg"; ".gif"])
  "c:\\Multimedia" 
|> Seq.groupBy (fun pth->(Path.GetExtension pth, (FileInfo pth).Length))
|> Seq.filter (fun (_, s) -> (Seq.length s) > 1)
|> Seq.map (fun (_, sq) -> [for path in sq -> path]) 
|> Seq.map groupEqualFiles
|> Seq.map filterOutSingletons
|> Seq.collect Seq.ofList
|> Seq.iter (fun lst -> printfn "%A" lst)

I will go through it line by line shortly.

I realize that this is a handful and if you have no previous experience with functional programming you are likely to feel overwhelmed at some point. The important thing is to observe how the original design translates almost one-to-one into implementation. Notice also the points of customization–they are almost universally plugs for user-defined functions. For instance, you customize Seq.map, Seq.filter, or Seq.collect by passing functions, often lambdas, as their arguments. Also, look how the function enumFilesRec is used. I decided to make its first two arguments functions even though my first impulse was to directly pass lists of directories to be skipped and extensions to be accepted. This way my design will work even if I later decide to filter files by, say, time of creation or size.

The Stages

Here’s the line by line reading of the pipeline code. My suggestion is to read as far as your patience permits and then skip to conclusions.

I’m calling my function enumFilesRec with three arguments:
```
enumFilesRec 
  (filterOutPaths ["c:\\Windows";"c:\\ProgramData";"c:\\Program Files"])
  (filterExt [".jpg"; ".gif"])
  "c:\\Multimedia"
```
1. A directory filter: a function (predicate) that returns true for all directories except the ones listed as arguments to filterOutPaths. It’s worth mentioning that filterOutPaths is a function that returns another function — the predicate expected by enumFilesRec.
2. A file filter: a function that returns true only for listed extensions. Again, filterExt is a function that takes a list and returns a predicate.
3. The top directory: the root of the listing.
enumFilesRec returns a sequence of paths. Since the sequence is only evaluated on demand, the call to this function returns almost immediately.
The next stage of the pipeline:
```
|> Seq.groupBy (fun p->(Path.GetExtension p, (FileInfo p).Length))
```
applies Seq.gropuBy to the incoming sequence of paths. Seq.groupBy takes one argument– a function that takes a path and generates a key:
```
fun path -> (Path.GetExtension path, (FileInfo path).Length)
```
The key is the tuple consisting of file extension and file length:
```
(Path.GetExtension path, (FileInfo path).Length)
```
F# notation for anonymous functions (lambdas) is of the form:
```
fun x -> expr
```
The function Seq.gropuBy groups all elements of the sequence into subgroups that share the same key. The result is a sequence of sequences (the groups). Of course, to perform this step the whole input sequence must be scanned. That forces the actual listing of directories on disk, which takes the bulk of the run time.
The next stage performs Seq.filter on the sequence:
```
|> Seq.filter (fun (_, s) -> (Seq.length s) > 1)
```
Seq.filter takes a predicate– here defined by a lambda– and applies it to all elements of the sequence; passing through only those that satisfy the predicate. This is the predicate:
```
fun (_, s) -> (Seq.length s) > 1
```
Notice that the previous step produced a sequence whose elements were tuples of (key, subsequence) with the subsequences sharing the same key. The lambda pattern-matches these tuples, (_, s), ignoring the key and testing the length of the subsequence against one. That eliminates singleton groups.
We can now get rid of the keys and convert the subsequences into plain lists that will be needed for further processing.
```
|> Seq.map (fun (_, sq) -> [for path in sq -> path])
```
I use the workhorse of sequences, Seq.map, that applies a function to every element of the sequence. Remember that the element is still a tuple (key, subsequence). The lambda ignores the key and returns a list:
```
fun (_, sq) -> [for path in sq -> path]
```
The expression:
```
[for path in sq -> path]
```
enumerates the paths in the sequence sq and uses them to initialize a list (the brackets denote a list in F#). In functional programming such constructs are known as list comprehensions. The expression for path in sq -> path is called a generator.
The next stage looks deceptively simple:
```
|> Seq.map groupEqualFiles
```
It applies a function, groupEqualsFiles to each list in the sequence. The interesting work happens in that function, which I will analyze shortly. Suffice it to say that it produces a list of sublists of identical files. Some of the sublists may be singletons.

It might be a little hard to keep track of all those sequences, subsequences, and sublists. A good development environment will show you all the types while you’re developing the program. You may also sketch simple examples:
```
seq[ [[a; a]; [b]]; [[c; c; c]; [d; d]; [e]] ]
```
This one shows a sequence of lists of lists of identical elements analogous to the output of the last stage.
Next I apply another function, filterOutSingletons to each list of sublists.
```
|> Seq.map filterOutSingletons
```
I end up with a sequence of lists of sublists of length greater than one containing identical files. The sequence above would be transformed to:
```
seq[ [[a; a]]; [[c; c; c]; [d; d]] ]
```
In the next step I flatten this hierarchy using Seq.collect.
```
|> Seq.collect Seq.ofList
```
Seq.collect takes a function that turns each element of the original sequence into a sequence and concatenates all those sequences into one. Like this:
```
seq[ [a; a]; [c; c; c]; [d; d] ]
```
Remember that the element of our sequence is a list of sublists. We can easily convert such a list to a sequence by applying Seq.ofList to it. It creates a sequence of sublists, and Seq.collect will concatenate all such sequences. I end up with a big sequence of lists. Those lists contain identical files. Voila!
The final step is to print those lists.
```
|> Seq.iter (fun lst -> printfn "%A" lst)
```
I apply Seq.iter, which takes a void function (a function returning unit, in the F# parlance):
```
fun lst -> printfn "%A" lst
```
(which is really not a function because it has a side effect of printing its argument–a list). Seq.iter is just like Seq.map, but it consumes its input sequence producing nothing (except for side effects). Unlike Haskell, F# doesn’t track I/O side effects in the type system.

Details

For those who are really curious, I can go on filling in the details–the implementations of various functions used in the pipeline. Those functions use a variety of functional features of F# such as lists, recursion, pattern matching, etc. This is the bread and butter of functional programming.

Let me start with the function that enumerates files in a directory tree. The idea is to first list the files in the current directory and pass them through the file filter; then list the subdirectories, filter them through the directory filter, and recurse into each subdirectory. Since this function is the first stage of the pipeline, it should produce a sequence.

let rec enumFilesRec dirFilter fileFilter dir =
   seq {
      yield! 
         enumFilesSafe dir
         |> Seq.filter fileFilter
      yield!
         enumDirsSafe dir
         |> Seq.filter dirFilter
         |> Seq.map (fun sub -> enumFilesRec dirFilter fileFilter sub)
         |> Seq.collect id
   }

Monads Anyone?

I don’t want to scare anyone but F# sequences are monads. People usually have strong feelings about monads, some love them, some hate them. But don’t get intimidated by monads. The theory behind them is hard, but the usage is pretty simple.

To create a sequence you use the seq { ... } block sprinkled with yield and yield! statements. When such a sequence is enumerated (you can do it with a loop for instance: for elem in sqnc), each yield returns an element and suspends the execution of the seq block until the next call. The next iteration resumes right after the last yield. In our case we are building a new sequence from existing ones. To dive into another sequence inside the seq block we use yield! (yield bang). This is what the above code does: It first dives into file enumeration (a sequence returned by enumFilesSafe) and then into the enumeration of files in subdirectories.

enumFilesSafe is a function that calls the system’s Directory.EnumerateFiles API (part of the .NET library). I had to encapsulate it into my own function in order to catch (and ignore) the UnauthorizedAccessExceptions. Notice the use of pipelining to filter the paths.

After the sequence of files paths is exhausted, we enter the second yield!. This one starts by enumerating subdirectories. Subdirectory paths are pipelined through the directory filter. Now we have to do something for each subdirectory– that’s the clue to use Seq.map. The mapping function:

fun sub -> enumFilesRec dirFilter fileFilter sub

simply calls enumFilesRec recursively, passing it the filters and the name of the subdirectory. Notice that enumFilseRec returns another sequence, so we end up with a sequence of sequences corresponding to individual subdirectories. To flatten this hierarchy I use Seq.collect. Notice that I pass it the identity function, id, which just returns its argument: The elements of my sequence are sequences and I don’t have to do anything to them.

The second function I’d like to discuss is groupEqualFiles. It gets a list of file paths and splits it into sublists containing byte-wise identical files. This problem can be decomposed in the following way: Let’s pick the first file and split the rest into two groups: the ones equal to that file and the ones not equal. Then do the same with the non-equal group. The “do the same” part hints at recursion. Here’s the code:

let groupEqualFiles paths =
    let rec groupEqualFilesRec soFar lst =
        match lst with 
        | [] -> soFar
        | (file::tail) ->
            let (eq, rest) = groupFilesEqualTo file tail
            if rest.IsEmpty then eq::soFar
            else groupEqualFilesRec (eq::soFar) rest
    groupEqualFilesRec [] paths

A recursive solution often involves defining a recursive function and then calling it with some initial arguments. That’s the case here as well. The recursive function groupEqualFilesRec takes the accumulator, the soFar list, and a list of files to group.

let rec groupEqualFilesRec soFar lst =
    match lst with 
    | [] -> soFar
    | (file::tail) ->
        let (eq, rest) = groupFilesEqualTo file tail
        if rest.IsEmpty then eq::soFar
        else groupEqualFilesRec (eq::soFar) rest

The new trick here is pattern matching. A list can be empty and match the pattern [], or it can be split into the head and tail using the pattern (file::tail). In the first case I return the soFar list and terminate recursion. Otherwise I call another function groupFilesEqualTo with the head and the tail of the list. This auxiliary function returns a tuple of lists: the equal group and the rest. Symbolically, when called with a and [b; a; c; b; d; a] it produces:

([a; a; a], [b; c; b; d])

The tuple is immediately pattern matched to (eq, rest) in:

let (eq, rest) = groupFilesEqualTo file tail

if the rest is empty, I prepend the eq list to the accumulator, soFar. Otherwise I recursively call groupEqualFilesRec with the augmented accumulator and the rest.

The function groupEqualFiles simply calls the recursive groupEqualFilesRec with an empty accumulator and the initial list. The result is a list of lists.

For completeness, here’s the implementation of the recursive function groupFilesEqualTo

let rec groupFilesEqualTo file files =
   match files with 
   | [] -> ([file], [])
   | (hd::tail) -> 
       let (eqs, rest) = (groupFilesEqualTo file tail)
       if eqFiles file hd then (hd::eqs, rest) else (eqs, hd::rest)

Again, this function pattern-matches the list. If it’s empty, it returns a tuple consisting of the singleton list containing the file in question and the empty rest. Otherwise it calls itself recursively to group the tail. The result is pattern-matched into (eqs, rest). Now the comparison is made between the original file and the head of the original list (we could have done it before making the recursive call, but this way the code is more compact). If they match then the head is prepended to the list of equal files, otherwise it lands in the rest.

Did I mention there would be a test at the end? By now you should be able to analyze the implementation of filterOutSingletons:

let rec filterOutSingletons lstOfLst =
    match lstOfLst with
    | [] -> []
    | (h::t) -> 
        let t1 = filterOutSingletons t
        if (List.length h) > 1 then h::t1 else t1

Conclusions

I am not arguing that we should all switch to functional programming. For one thing, despite great progress, performance is still a problem in many areas. Immutable data structures are great, especially in concurrent programming, but can at times be awkward and inefficient. However, I strongly believe that any programmer worth his or her salt should be fluent with the functional paradigm.

The heart of programming is composition and reuse. In object-oriented programming you compose and reuse objects. In functional programming you do the same with functions. There are myriads of ways you can compose functions, the simplest being pipelining, passing functions as arguments, and returning functions from functions. There are lambdas, closures, continuations, comprehensions and, yes, monads. These are powerful tools in the hands of a skilled programmer. Not every problem fits neatly within the functional paradigm, but neither do all problems fit the OO paradigm. What’s important is having choices.

The code is available on GitHub.

November 29, 2010

Understanding C++ Concepts through Haskell Type Classes

Posted by Bartosz Milewski under C++, Functional Programming, Haskell, Programming, Type System
Leave a Comment

I gave this presentation on Nov 17 at the Northwest C++ Users Group. Here are the links to the two-part video:

There was an unresolved discussion about one of the examples–the one about template specialization conflicting with modular type checking. It turns out the example was correct (that is an error would occur).

template<class T> struct vector {
    vector(int); // has constructor
};

//-- This one typechecks
template<LessThanComparable T>
void f(T x) { vector<T> v(100); ... }

//-- Specialization of vector for int
template<> struct vector<int> {
    // missing constructor!
};

int main() { f(4); } // error

The issue was whether the instantiation of f in main would see the specialization of the vector template for int even though the specialization occurs after the definition of f. Yes, it would, because vector<T> is a dependent name inside f. Dependent names are resolved in the context of template instantiation–here in the context of main, where the specialization is visible. Only non-dependent names are resolved in the context of template definition.

June 24, 2010

C++ Concepts: a Postmortem

Posted by Bartosz Milewski under C++, Haskell, Programming, Type System
[9] Comments

Some of the brightest C++ minds had been working on concepts for years, only to vote them out of the C++0x spec–unanimously, not long after they’ve been voted in. In my naivete I decided to write a short blog post explaining why in my opinion it’s still worth pursuing concepts into the next revision of the Standard. I showed a draft to my friends and was overwhelmed by criticism. The topic turned out to be so controversial that I was forced not only to defend every single point but to discuss alternative approaches and their pluses. That’s why this post grew so large.

What Are Concepts?

In a nutshell, concepts are a type system for types. Below I have tabled corresponding features and examples of non-generic programming with types vs. generic programming with concepts. The concept Iterator specifies the “type” of the type parameter, T, which is used in the definition of the template function, Count.

Regular Programming	Generic Programming
functions, data structures	template functions, parametrized types
Example: `int strlen(char const * s)`	`template<Iterator T> void Count(T first, T last)`
parameter: `s`	type parameter: `T`
type: `char const *`	concept: `Iterator`

Just to give you a feel for the syntax, here's an example of a concept--a simplified definition of Iterator:

concept Iterator<typename T>
    : EqualityComparable<T>
{ 
    typename value_type;
    T& operator++(T&);
    ...
}

Iterator illustrates both the simplicity and the complexity of concepts. Let me concentrate on the simple aspects first and leave the discussion of complexities for later.

Iterator refines (inherits from) another concept, EqualityComparable (anything that can be compared using '=='). It declares an associated type, value_type; and an associated function, operator++. Associated types and functions are the bread and butter of concepts.

You might have seen something like an "associated type" implemented as a typedef, value_type, in STL iterators. Of course, such typedefs are only a convention. You may learn about such conventions by analyzing somebody else's code or by reading documentation (it so happens that STL is well documented, but that's definitely not true of all libraries), but they are not part of the language.

You use concepts when you define template functions or parametrized data structures. For instance, here's a toy function template, Count:

template<Iterator T> int Count(T first, T last) {
    int count = 0;
    while (!(first == last)) { 
        ++first;
        ++count;
    }
    return count;
}

As you can see, this function uses two properties of the concept Iterator: you can compare two iterators for equality and you can increment an iterator.

Ignoring the fine points for now, this seems to be pretty straightforward. Now, let's see what problems were concepts designed to solve.

Who Ordered Concepts?

Besides being an elegant abstraction over the type system, concepts offer some of the same advantages that types do:

Better error reporting
Better documentation
Overloading

Error Reporting

Why do templates produce such horrible error messages? The main reason is that errors are detected too late.

It's the same phenomenon you have in weakly typed languages, except that there error detection is postponed even further--till runtime. In weakly typed languages an error is discovered when, for instance, a certain operation cannot be performed on a certain variable. This might happen deep into the calling tree, often inside a library you're not familiar with. You essentially need to examine a stack trace to figure out where the root cause of the problem is and whether it's in your code or inside a library.

Conversely, in statically typed languages, most errors are detected during type-checking. The fact that a certain operation cannot be performed on a certain variable is encoded in the type of that variable. If the error is indeed in your code, it will be detected at the call site and will be much more informative.

In current C++, template errors are detected when a certain type argument does not support a certain operation (often expressed as the inability to convert one type to another or a failure of type deduction). It often happens deep into the instantiation tree. You need an equivalent of the stack trace to figure out where the root cause of the error is. It's called an "instantiation stack" and is dumped by the compiler upon a template error. It often spans several pages and contains unfamiliar names and implementation details of some library code.

I suspect that the main reason a lot of C++ programmers still shun template programming is because they were burnt by such error messages. I know that I take a coffee break before approaching a template error message.

Self Documentation

The only thing better than early error detection is not making the error in the first place. When you're calling a function, you usually look at its signature first. The signature tells you what types are expected and what type is returned. You don't have to study the body of the function and the bodies of functions that it calls (and so on). You don't have to rely on comments or off-line documentation. Signatures are the kind of documentation that is checked by the compiler. Even more importantly, at some point the compiler also checks the body of the function you're calling to see if its arguments are used in accordance with their declared types.

Contrast it with template programming. If the "signature" of a template is of the form template<class T>, you have no information about T. What kind of T's are acceptable can only be divined by studying the template's body and the bodies of templates it is instantiating. (And, of course, the compiler can do nothing to check this "signature" against the template body.)

The Case of STL

Even if you're not defining templates in your own code, you probably use them through the Standard Library--STL in particular. STL is a library of algorithms and containers. Alexander Stepanov's stroke of genius was to separate the two. Using STL you may apply one of the 112 algorithms to 12 standard containers. It's easy to calculate that there are 1344 possible combinations (not counting those algorithms that operate on two or three containers at a time).

Of course not all combinations of algorithms and containers make sense. For instance, you can't apply std::sort to a std::list. The reason is simple: sort (which is based on quicksort) requires random access to the container; a list can only provide sequential access. And this is the crux of the matter: this important restriction on access cannot be expressed in C++. Granted, the code will not compile because at some point sort will try to perform some operation that the list iterator doesn't support. You'll get a page-long error message. By my last count a leading C++ compiler emits 73 cryptic lines, many of them stretching to 350 characters, and nary a mention of random access being required by sort.

Template error messages would be even worse if not for a system of hacks (I mean, template tricks) and concept-like conventions in the STL. For instance, a dummy field iterator_tag corresponds to a concept name and iterator traits look a lot like concept maps. With a number of new features in C++0x (especially decltype that make SFINAE constructs such as the enable_if more powerful), even more "tricks" became possible. If some of this sounds like Gibberish to you, you're in good company. That's the mess that the concept proposal was trying to fix. Concepts do introduce a lot of complexity into the language but they reduce and organize the complexity of programs.

It's worth mentioning that Alex Stepanov collaborated in the development of the concept description language, Tecton, and participated in the C++ concept effort, so he's been acutely aware of the problems with STL.

Competing Options

The idea of limiting types that can be passed to generic algorithms or data structures has been around for some time. The C++ concept proposal is based on the Haskell's notion of type classes. Several other languages opted for simpler constrained templates instead.

Constrained Templates

One way to look at concepts is that they constrain the kind of types that may be passed to a template function or data type. One can drop concepts and instead impose type constraints directly in the definition of a template. For instance, rather than defining the Iterator concept, we could have tried to impose analogous type constraints in the definition of Count (C++ doesn't have special syntax for it but, of course, there are template tricks for everything).

There are various ways of expressing type constraints, so let's look at a few examples. In Java, everything is an object, and objects are classified by classes and interfaces. Java reuses the same class/interface mechanism for constraining template parameters. A generic SortedList, for instance, is parametrized by the type of its elements, T. The constraint imposed on type T is that it implements the interface Comparable.

class SortedList<T extends Comparable<T>>

This approach breaks down for built-in types since, for instance, int cannot be derived from Comparable. (Java's workaround is to use boxed types, like Integer, which are magically derived from Comparable.)

C# has similar approach, with some restrictions and some extensions. For instance, C# lets you specify that a given type is default constructible, as in this example:

class WidgetFactory<T> where T : Widget, new()

Here type T must be derived from Widget and it must have a default constructor.

In the D programming language, template constraints may be expressed by arbitrary compile-time functions as well as use-patterns, which are the topic of the next section.

These approaches are constrained to structural (as opposed to semantic) matching of types. A type is accepted by a constrained template if the operations supported by it match those specified by the constraints name-by-name. If one library names the same operations differently, they won't match (see how Concept Maps deal with this problem).

Use Patterns vs. Signatures

Use patterns formed the basis on the original Texas proposal for C++ Concepts. The idea was to show to the compiler representative examples of what you were planning to do with type arguments. These use patterns look almost like straightforward C++. For instance, an Iterator concept might be defined as follows:

concept Iterator<typename Iter, typename Value> {
    Var<Iter> i;
    Iter j = i; // copyable
    bool b = (i != j); // (non)-equality comparable
    ++i; // supports operator++
    Var<const Value> v;
    *i = v; // connects Value type with Iter type
    ...
}

Notice the subtlety with defining dummy variables i and v. You can't just write:

Iter i;

The Type Equality Problem

It's hard to express strict equality of types with use patterns. You might think that the following pattern would work:

void f(T t, U u);
Var<T> x;
Var<U> y;
f(y, x);

In reality it only insures that the types are mutually implicit-convertible. In particular, it will match short with long (yes, you can implicitly convert a long to a short in C++).

Who cares?, you might ask. Unfortunately strict type equality might be required for template type inference. The following code will not compile because the compiler will look for f(short, long), even though the instantiation of f(long, long) would work with implicit type conversions.

template<class T>
void f(T x, T y);
short i;
long j;
f(i, j);

In the concept language, there is a special concept, SameType, for this purpose.

because that would imply that Iter has a default constructor-- which is not required. Hence special notation Var<T> for introducing dummy variables. In general, the use-pattern language was supposed to be a subset of C++ with some additional syntax on the side. To my knowledge this subset has never been formally defined. Also, there are some useful concepts that are hard or impossible to define in use-pattern language (see the sidebox).

Then there is the problem of concept-checking the body of a template before it is instantiated-- the holy grail of conceptology. It involves some nontrivial analysis and comparison of two parsing trees (the pattern's, and the template body's). As far as I know no formal algorithm has been presented.

Because of the difficulties in formalizing the use-pattern approach, the C++ committee decided to focus on defining concepts using signatures.

We've seen such a signature in the Iterator concept: T&operator++(T&).

How are we to interpret a signature within the context of a concept? Naively, we'd require that there should exist a free function called operator++. But what if the type T has a method operator++ instead? Do we have to list it too (as T::operator++())? A uniform call syntax was proposed that would unify functions and method calls. Our operator++ would match a function or a method (this feature didn't make it into the final concept proposal.)

Also, an operator signature must also match the built-in operator--this way we'd allow an arbitrary pointer to match the Iterator concept.

These are some of the complications I mentioned at the beginning of this post. There are also complications related to symbol lookup (it has to take into account the Argument-Dependent-Lookup hack) and overload resolution, which are notoriously difficult in C++.

Who Ordered Concept Maps?

So far we've talked about concepts and about templates that use concepts. Now let's focus on what happens when we instantiate a template with concrete types. How does the compiler check that those types fulfill the constraints of the concepts? One possibility is the "duck-typing" approach--if the type in question defines the same associated types and functions (and possibly some other constraints I haven't talked about), it is a match. This is called structural (same set of functions) and nominal (by-name) matching.

Structural matching is simple and straightforward but it might lead to "false positives." Here's the archetypal example: Syntactically, there is no difference between a ForwardIterator and an InputIterator, yet semantically those two are not equivalent. Some algorithms that work fine onForwardIterators will fail on InputIterators (the difference is that you can't go twice through an InputIterator).

Semantic matching, on the other hand, requires the client to explicitly match types with concepts (only the clients knows the meaning of a particular concept). This is done through concept maps (in Haskell they are called "instances"). With this approach, the mapping between Iterator and a pointer to T would require the following statement:

template<T>
concept_map Iterator<T *> {
    typedef T value_type;
};

Notice that we don't have to provide the mapping for operator++; it's automatically mapped into the built-in ++. In general, if the associated functions have the same names as the functions defined for the particular type (like ++ in this case), the concept map may be empty. It's mostly because of the need to create those empty concept maps that the concept proposal was rejected.

Interestingly, both groups that worked on concepts-- the Texas group with Bjarne Stroustrup and the Indiana group with Doug Gregor-- supported concept maps. The difference was that the Texas group wanted concepts to be matched automatically, unless marked as explicit; whereas the Indiana group required explicit concept maps, unless the concept was marked auto.

On a more general level, concept maps enable cross-matching between different libraries and naming conventions. Semantically, a concept from one library may match a type from another library, but the corresponding functions and types may have different names. For instance, a StackOfInt concept from one library may declare an associated function called push. A vector from the Standard Library, conceptually, implements this interface, but uses a different name for it. A concept map would provide the appropriate mapping:

concept_map StackOfInt<std::vector<int>> {
    void push(std::vector<int> & v, int x) {
        v.push_back(x);
    }
}

Conclusion

In my opinion, concepts would add a lot to C++. They would replace conventions and hacks in the STL with a coherent set of language rules. They would provide verifiable documentation and drastically simplify template error messages.

Concept maps have an important role to play as the glue between concepts and types. With concept maps you'll be able to add, post hoc, a translation layer between disparate libraries. The back-and-forth between explicit and auto concepts has a philosophical aspect: Do we prefer precision or recall in concept matching? It also has a practical aspect: Is writing empty concept maps an abomination? More experience is needed to resolve this issue.

The major complaint I've heard about the concept proposal was that it was too complex. Indeed, it took 30 pages to explain it in the Draft Standard. However, adding a new layer of abstraction to an existing language, especially one with such baggage as C++, cannot be trivial. The spec had to take into account all possible interaction with other language features, such as name resolution, overload resolution, ADL, various syntactic idiosyncrasies, and so on. But programmers rarely read language specs. I'm sure that, if there was a book "Effective Concepts" or some such, programmers would quickly absorb the rules and start using concepts.

The current hope is that concepts will become part of the 1x version of C++. Doug Gregor is working on a new version of the C++ compiler, which is likely to become testing grounds for concepts. He also publishes a blog with Dave Abrahams (see for instance this post and the follow up).

On the other hand, the concept effort has definitely lost its momentum after it was voted out of C++0x. Corporate support for further research is dwindling. The hope is in the educated programmers saying no to hacks and demanding the introduction of concepts in future versions of the C++ Standard.

Acknowledgments

I'm grateful to the "Seattle Gang", in particular (alphabetically) Andrei Alexandrescu, Walter Bright, Dave Held, Eric Niebler, and Brad Roberts, for reading and commenting on several drafts of this post. I'd like to thank Jeremy Siek for interesting discussions and the inspiration for this post.

Bibliography

Gabriel Dos Reis and Bjarne Stroustrup, Specifying C++ Concepts, the original Texas proposal with use patterns.
Joint paper by the Indiana and Texas groups, Concepts: Linguistic Support for Generic Programming in C++
Google Video of Doug Gregor talking about concepts.
2009 Draft C++ Standard with concepts still in. See section 14.10.
Bjarne Stroustrup, The concept killer article.
Jeremy Siek, The C++0x Concept Effort presentation slides.
Posts about concepts on C++Next by Dave Abrahams and Doug Gregor.
Dave Abrahams, To Auto or Not?.
Examples of concept-like hacks in C++0x.

February 8, 2010

Real-Life Threads

Posted by Bartosz Milewski under C++, Concurrency, Multithreading, Programming, Type System
Leave a Comment

What’s the use of theory if you can’t apply it in practice? I’ve been blogging about concurrency, starting with the intricacies of multicore memory models, all the way through to experimental thread-safe type systems and obscure languages. In real life, however, I do most of my development in C++, which offers precious little support for multithreading. And yet I find that my coding is very strongly influenced by theoretical work.

C++ doesn’t support the type system based on ownership, which is a way to write provably race-free code; in fact it doesn’t have a notion of immutability, and very limited support for uniqueness. But that doesn’t stop me from considering ownership properties in my design and implementation. A lot of things become much clearer and less error-prone with better understanding of high-level paradigms.

I own a tiny software company, Reliable Software, which makes a distributed version control system, Code Co-op, written entirely in C++. Recently I had to increase the amount of concurrency in one of its components. I reviewed existing code, written some years ago, and realized how unnecessarily complex it was. No, it didn’t have data races or deadlocks, but it wasn’t clear how it would fare if more concurrency were added. I decided to rewrite it using the new understanding.

As it turned out, I was able to make it much more robust and maintainable. I found many examples of patterns based on ownership. I was able to better separate shared state from the non-shared, immutable from mutable, values from references. I was able to put synchronization exactly where it was needed and remove it from where it wasn’t. I’ve found reasonable trade-offs between message passing and data sharing.

It was a pretty amazing trip. I described it in my other blog, which deals more with actual programming experience. I though it might be a nice break from all the theory you normally find in this blog.

November 30, 2009

Unique Objects

Posted by Bartosz Milewski under Concurrency, Java, Multithreading, Programming, Scala, Type System
[6] Comments

I’ve blogged before about the C++ unique_ptr not being unique and how true uniqueness can be implemented in an ownership-based type system. But I’ve been just scratching the surface.

The new push toward uniqueness is largely motivated by the demands of multithreaded programming. Unique objects are alias free and, in particular, cannot be accessed from more than one thread at a time. Because of that, they never require locking. They can also be safely passed between threads without the need for deep copying. In other words, they are a perfect vehicle for safe and efficient message passing. But there’s a rub…

The problem is this: How do you create and modify unique objects that have internal pointers. A classic example is a doubly linked list. Consider this Java code:

public class Node {
    public Node _next;
    public Node _prev;
}
public class LinkedList {
    private Node _head;
    public void insert(Node n) {
        n._next = _head;
        if (_head != null)
            _head._prev = n;
        _head = n;
    }
}

Suppose that you have a unique instance of an empty LinkedList and you want to insert a new link into it without compromising its uniqueness.

The first danger is that there might be external aliases to the node you are inserting–the node is not unique, it is shared. In that case, after the node is absorbed:

_head = n;

_head would be pointing to an alias-contaminated object. The list would “catch” aliases and that would break the uniqueness property.

The remedy is to require that the inserted node be unique too, and the ownership of it be transferred from the caller to the insert method. (Notice however that, in the process of being inserted, the node loses its uniqueness, since there are potentially two aliases pointing to it from inside the list–one is _head and the other is _head._prev. Objects inside the list don’t have to be unique–they may be cross-linked.)

The second danger is that the method insert might “leak” aliases. The tricky part is when we let the external node, n, store the reference to our internal _head:

n._next = _head

We know that this is safe here because the node started unique and it will be absorbed into the list, so this alias will become an internal alias, which is okay. But how do we convince the compiler to certify this code as safe and reject code that isn’t? Type system to the rescue!

Types for Uniqueness

There have been several approaches to uniqueness using the type system. To my knowledge, the most compact and comprehensive one was presented by Haller and Odersky in the paper, Capabilities for External Uniqueness, which I will discuss in this post. The authors not only presented the theory but also implemented the prototype of the system as an extension of Scala. Since not many people are fluent in Scala, I’ll translate their examples into pseudo-Java, hopefully not missing much.

Both in Scala and Java one can use annotations to extend the type system. Uniqueness introduces three such annotations, @unique, @transient, and @exposed; and two additional keywords, expose and localize.

-Objects that are @unique

In the first approximation you may think of a @unique object as a leak-proof version of C++ unique_ptr. Such object is guaranteed to be “tracked” by at most one reference–no aliases are allowed. Also no external references are allowed to point to the object’s internals and, conversely, object internals may not reference any external objects. However, and this is a very important point, the insides of the @unique object may freely alias each other. Such a closed cross-linked mess is called a cluster.

Consider, for instance, a (non-empty) @unique linked list. It’s cluster consists of cross-linked set of nodes. It’s relatively easy for the compiler to guarantee that no external aliases are created to a @unique list–the tricky part is to allow the manipulation of list internals without breaking its uniqueness (Fig 1 shows our starting point).

Fig 1. The linked list and the node form separate clusters

Look at the definition of insert. Without additional annotations we would be able to call it with a node that is shared between several external aliases. After the node is included in the list, those aliases would be pointing to the internals of the list thus breaking its uniqueness. Because of that, the uniqueness-savvy compiler will flag a call to such un-annotated insert on a @unique list as an error. So how can we annotate insert so that it guarantees the preservation of uniqueness?

-Exposing and Localizing

Here’s the modified definition of insert:

public void insert(@unique Node n) @transient {
    expose (this) { list =>
        var node = localize (n, list);
        node._next = list._head;
        if (list._head != null)
            list._head._prev = node;
        list._head = node;
    }
}

Don’t worry, most of the added code can be inferred by the compiler, but I make it explicit here for the sake of presentation. Let’s go over some of the details.

The node, n passed to insert is declared as @unique. This guarantees that it forms its own cluster and that n is the only reference to it. Also, @unique parameters to a method are consumed by that method. The caller thus loses her reference to it (the compiler invalidates it), as demonstrated in this example:

@unique LinkedList lst = new @unique LinkedList();
@unique Node nd = new @unique Node();
lst.insert(nd);
nd._next; // error: nd has been consumed!

The method itself is annotated as @transient. It means that the this object is @unique, but not consumed by the method. In general, the @transient annotation may be applied to any parameter, not only this. You might be familiar with a different name for transient–borrowed.

Inside insert, the this parameter is explicitly exposed (actually, since the method is @transient, the compiler would expose this implicitly).

expose (this) { list => ... }

The new name for the exposed this is list.

Once a cluster is exposed, some re-linking of its constituents is allowed. The trick is not to allow any re-linking that would lead to the leakage of aliases. And here’s the trick: To guarantee no leakage, the compiler assigns the exposed object a special type–its original type tagged by a unique identifier. This identifier is created for the scope of each expose statement. All members of the exposed cluster are also tagged by the same tag. Since the compiler type-checks every assignment it automatically makes sure that both sides have the same tag.

Now we need one more ingredient–bringing the @unique node into the cluster. This is done by localizing the parameter n to the same cluster as list.

var node = localize (n, list);

The localize statement does two things. It consumes n and it returns a reference to it that is tagged by the same tag as its second parameter. From that point on, node has the same tagged type as all the exposed nodes inside the list, and all assignments type-check.

Fig 2. The list has been exposed: all references are tagged. The node has been localized (given the same tag as the list). Re-linking is now possible without violating the type system.

Note that, in my pseudo-Java, I didn’t specify the type of node returned by localize. That’s because tagged types are never explicitly mentioned in the program. They are the realm of the compiler.

Functional Decomposition

The last example was somewhat trivial in that the code that worked on exposed objects fit neatly into one method. But a viable type system cannot impose restrictions on structuring the code. The basic requirement for any programming language is to allow functional decomposition–delegating work to separate subroutines, which can be re-used in other contexts. That’s why we have to be able to define functions that operate on exposed and/or localized objects.

Here’s an example from Haller/Odersky that uses recursion within the expose statement. append is a method of a singly-linked list:

void append(@unique SinglyLinkedList other) @transient
{
    expose(this) { list =>
        if (list._next == null)
            list._next = other; // localize and consume
        else
            list._next.append(other);
    }
}

In the first branch of the if statement, a @unique parameter, other, is (implicitly) localized and consumed. In the second branch, it is recursively passed to append. Notice an important detail, the subject of append, list._next, is not @unique–it is exposed. Its type has been tagged by a unique tag. But the append method is declared as @transient. It turns out that both unique and exposed arguments may be safely accepted as transient parameters (including the implicit this parameter).

Because of this rule, it’s perfectly safe to forgo the explicit expose inside a transient method. The append method may be thus simplified to:

void append(@unique SinglyLinkedList other) @transient
{
    // 'this' is implicitly exposed
    if (_next == null)
        _next = other; // localize and consume
    else
        _next.append(other);
}

Things get a little more interesting when you try to reuse append inside another method. Consider the implementation of insert:

void insert(@unique SingleLinkedList other) @transient
{
    var locOther = localize(other, this);
    if (other != null) 
    {
        locOther.append(_next)
        _next = locOther;
   }
}

The insert method is transient–it works on unique or exposed lists. It accepts a unique list, other, which is consumed by the localize statement. The this reference is implicitly exposed with the same tag as locOther, so the last statement _next=locOther type-checks. The only thing that doesn’t type-check is the argument to append, which is supposed to be unique, but here it’s exposed instead.

This time there is no safe conversion to help us, so if we want to be able to reuse append, we have to modify its definition. First of all, we’ll mark its parameter as @exposed. An exposed parameter is tagged by the caller. In order for append to work, the this reference must also be tagged by the caller–with the same tag. Otherwise the assignment, _next=other, inside append, wouldn’t type-check. It follows that the append method must also be marked as @exposed (when there is more than one exposed parameter, they all have to be tagged with the same tag).

Here’s the new version of append:

void append(@exposed SinglyLinkedList other) @exposed
{
    if (_next == null)
        _next = other; // both exposed under the same tag
    else
        _next.append(other); // both exposed under the same tag
}

Something interesting happened to append. Since it now operates on exposed objects, it’s the caller’s responsibility to expose and localize unique object (this is exactly what we did in insert). Interestingly, append will now also operate on non-annotated types. You may, for instance, append one non-unique list to another non-unique list and it will type-check! That’s because non-annotated types are equivalent to exposed types with a null tag–they form a global cluster of their own.

This kind of polymorphism (non-annotated/annotated) means that in many cases you don’t have to define separate classes for use with unique objects. What Haller and Odersky found out is that almost all class methods in the Scala’s collection library required only the simplest @exposed annotations without changing their implementation. That’s why they proposed to use the @exposed annotation on whole classes.

Conclusion

Every time I read a paper about Scala I’m impressed. It’s a language that has very solid theoretical foundations and yet is very practical–on a par with Java, whose runtime it uses. I like Scala’s approach towards concurrency, with strong emphasis on safe and flexible message passing. Like functional languages, Scala supports immutable messages. With the addition of uniqueness, it will also support safe mutable messages. Neither kind requires synchronization (outside of that provided by the message queue), or deep copying.

There still is a gap in the Scala’s concurrency model–it’s possible to share objects between threads without any protection. It’s up to the programmer to declare shared methods as synchronized–just like in Java; but there is no overall guarantee of data-race freedom. So far, only ownership systems were able to deliver that guarantee, but I wouldn’t be surprised if Martin Odersky had something else up his sleeve for Scala.

I’d like to thank Philip Haller for reading the draft of this post and providing valuable comments. Philip told me that a new version of the prototype is in works, which will simplify the system further, both for the programmer and the implementer.

September 22, 2009

Ownership Systems against Data Races

Posted by Bartosz Milewski under C++, Concurrency, Programming, Type System
[8] Comments

Here’s the video from my recent talk to the Northwest C++ Users Group (NWCPP) about how to translate the data-race free type system into a system of user-defined annotations in C++. I start with the definition of a data race and discuss various ways to eliminate them. Then I describe the ownership system and give a few examples of annotated programs.

Here are the slides from the presentation. They include extensive notes.

By the way, we are looking for speakers at the NWCPP, not necessarily related to C++. We are thinking of changing the charter to include all programming languages. If you are near Seattle on Oct 21 09 (or any third Wednesday of the month), and you are ready to give a 60 min presentation, please contact me.

July 16, 2009

On Actors and Casting

Posted by Bartosz Milewski under C++, Concurrency, D Programming Language, Erlang, Java, Multithreading, Programming, Scala, Type System
[31] Comments

Is the Actor model just another name for message passing between threads? In other words, can you consider a Java Thread object with a message queue an Actor? Or is there more to the Actor model? Bartosz investigates.

I’ll start with listing various properties that define the Actor Model. I will discuss implementation options in several languages.

Concurrency

Actors are objects that execute concurrently. Well, sort of. Erlang, for instance, is not an object-oriented language, so we can’t really talk about “objects”. An actor in Erlang is represented by a thing called a Process ID (Pid). But that’s nitpicking. The second part of the statement is more interesting. Strictly speaking, an actor may execute concurrently but at times it will not. For instance, in Scala, actor code may be executed by the calling thread.

Caveats aside, it’s convenient to think of actors as objects with a thread inside.

Message Passing

Actors communicate through message passing. Actors don’t communicate using shared memory (or at least pretend not to). The only way data may be passed between actors is through messages.

Erlang has a primitive send operation denoted by the exclamation mark. To send a message Msg to the process (actor) Pid you write:

Pid ! Msg

The message is copied to the address space of the receiver, so there is no sharing.

If you were to imitate this mechanism in Java, you would create a Thread object with a mailbox (a concurrent message queue), with no public methods other than put and get for passing messages. Enforcing copy semantics in Java is impossible so, strictly speaking, mailboxes should only store built-in types. Note that passing a Java Strings is okay, since strings are immutable.

-Typed messages

Here’s the first conundrum: in Java, as in any statically typed language, messages have to be typed. If you want to process more than one type of messages, it’s not enough to have just one mailbox per actor. In Erlang, which is dynamically typed, one canonical mailbox per actor suffices. In Java, mailboxes have to be abstracted from actors. So an actor may have one mailbox for accepting strings, another for integers, etc. You build actors from those smaller blocks.

But having multiple mailboxes creates another problem: How to block, waiting for messages from more than one mailbox at a time without breaking the encapsulation? And when one of the mailboxes fires, how to retrieve the correct type of a message from the appropriate mailbox? I’ll describe a few approaches.

-Pattern matching

Scala, which is also a statically typed language, uses the power of functional programming to to solve the typed messages problem. The receive statement uses pattern matching, which can match different types. It looks like a switch statements whose case labels are patterns. A pattern may specify the type it expects. You may send a string, or an integer, or a more complex data structure to an actor. A single receive statement inside the actor code may match any of those.

receive {
    case s: String => println("string: "+ s)
    case i: Int => println("integer: "+ i)
    case m => println("unknown: "+ m)
}

In Scala the type of a variable is specified after the colon, so s:String declares the variable s of the type String. The last case is a catch-all.

This is a very elegant solution to a difficult problem of marrying object-oriented programming to functional programming–a task at which Scala exceeds.

-Casting

Of course, we always have the option of escaping the type system. A mailbox could be just a queue of Objects. When a message is received, the actor could try casting it to each of the expected types in turn or use reflection to find out the type of the message. Here’s what Martin Odersky, the creator of Scala, has to say about it:

The most direct (some would say: crudest) form of decomposition uses the type-test and type-cast instructions available in Java and many other languages.

In the paper he co-authored with Emir and Williams (Matching Objects With Patterns) he gives the following evaluation of this method:

Evaluation: Type-tests and type-casts require zero overhead for the class hierarchy. The pattern matching itself is very verbose, for both shallow and deep patterns. In particular, every match appears as both a type-test and a subsequent type-cast. The scheme raises also the issue that type-casts are potentially unsafe because they can raise ClassCastExceptions. Type-tests and type-casts completely expose representation. They have mixed characteristics with respect to extensibility. On the one hand, one can add new variants without changing the framework (because there is nothing to be done in the framework itself). On the other hand, one cannot invent new patterns over existing variants that use the same syntax as the type-tests and type-casts.

The best one could do in C++ or D is to write generic code that hides casting from the client. Such generic code could use continuations to process messages after they’ve been cast. A continuation is a function that you pass to another function to be executed after that function completes (strictly speaking, a real continuation never returns, so I’m using this word loosely). The above example could be rewritten in C++ as:

void onString(std::string const & s) {
    cout << "string: " << s << std::endl;
}
void onInt(int i) {
    cout << "integer: " << i << std::endl;
}

receive<std::string, int> (&onString, &onInt);

where receive is a variadic template (available in C++0x). It would do the dynamic casting and call the appropriate function to process the result. The syntax is awkward and less flexible than that of Scala, but it works.

The use of lambdas might make things a bit clearer. Here’s an example in D using lambdas (function literals), courtesy Sean Kelly and Jason House:

receive(
    (string s){ writefln("string: %s", s); },
    (int i){ writefln("integer: %s", i); }
);

Interestingly enough, Scala’s receive is a library function with the pattern matching block playing the role of a continuation. Scala has syntactic sugar to make lambdas look like curly-braced blocks of code. Actually, each case statement is interpreted by Scala as a partial function–a function that is not defined for all values (or types) of arguments. The pattern matching part of case becomes the isDefinedAt method of this partial function object, and the code after that becomes its apply method. Of course, partial functions could also be implemented in C++ or D, but with a lot of superfluous awkwardness–lambda notation doesn’t help when partial functions are involved.

-Isolation

Finally, there is the problem of isolation. A message-passing system must be protected from data sharing. As long as the message is a primitive type and is passed by value (or an immutable type passed by reference), there’s no problem. But when you pass a mutable Object as a message, in reality you are passing a reference (a handle) to it. Suddenly your message is shared and may be accessed by more than one thread at a time. You either need additional synchronization outside of the Actor model or risk data races. Languages that are not strictly functional, including Scala, have to deal with this problem. They usually pass this responsibility, conveniently, to the programmer.

-Kilim

Java is not a good language to implement the Actor model. You can extend Java though, and there is one such extension worth mentioning called Kilim by Sriram Srinivasan and Alan Mycroft from Cambridge, UK. Messages in Kilim are restricted to objects with no internal aliasing, which have move semantics. The pre-processor (weaver) checks the structure of messages and generates appropriate Java code for passing them around. I tried to figure out how Kilim deals with waiting on multiple mailboxes, but there isn’t enough documentation available on the Web. The authors mention using the select statement, but never provide any details or examples.

Correction: Sriram was kind enough to provide an example of the use of select:

int n = Mailbox.select(mb0, mb1, .., timeout);

The return value is the index of the mailbox, or -1 for the timeout. Composability is an important feature of the message passing model.

Dynamic Networks

Everything I described so far is common to CSP (Communicating Sequential Processes) and the Actor model. Here’s what makes actors more general:

Connections between actors are dynamic. Unlike processes in CSP, actors may establish communication channels dynamically. They may pass messages containing references to actors (or mailboxes). They can then send messages to those actors. Here’s a Scala example:

receive {
    case (name: String, actor: Actor) =>
        actor ! lookup(name)
}

The original message is a tuple combining a string and an actor object. The receiver sends the result of lookup(name) to the actor it has just learned about. Thus a new communication channel between the receiver and the unknown actor can be established at runtime. (In Kilim the same is possible by passing mailboxes via messages.)

Actors in D

The D programming language with my proposed race-free type system could dramatically improve the safety of message passing. Race-free type system distinguishes between various types of sharing and enforces synchronization when necessary. For instance, since an Actor would be shared between threads, it would have to be declared shared. All objects inside a shared actor, including the mailbox, would automatically inherit the shared property. A shared message queue inside the mailbox could only store value types, unique types with move semantics, or reference types that are either immutable or are monitors (provide their own synchronization). These are exactly the types of messages that may be safely passed between actors. Notice that this is more than is allowed in Erlang (value types only) or Kilim (unique types only), but doesn’t include “dangerous” types that even Scala accepts (not to mention Java or C++).

I will discuss message queues in the next installment.

June 23, 2009

Multithreading Tutorial: Globals

Posted by Bartosz Milewski under C++, Concurrency, D Programming Language, Java, Multithreading, Programming, Type System
[2] Comments

If it weren’t for the multitude of opportunities to shoot yourself in the foot, multithreaded programming would be easy. I’m going to discuss some of these “opportunities” in relation to global variables. I’ll talk about general issues and discuss the ways compilers can detect them. In particular, I’ll show the protections provided by my proposed extensions to the type system.

Global Variables

There are so many ways the sharing of global variables between threads can go wrong, it’s scary.

Let me start with the simplest example: the declaration of a global object of class Foo (in an unspecified language with Java-like syntax).

Foo TheFoo = new Foo;

In C++ or Java, TheFoo would immediately be visible to all threads, even if Foo provided no synchronization whatsoever (strictly speaking Java doesn’t have global variables, but static data members play the same role).

If the programmer doesn’t do anything to protect shared data, the default immediately exposes her to data races.

The D programming language (version 2.0, also known as D2) makes a better choice–global variables are, by default, thread local. That takes away the danger of accidental sharing. If the programmer wants to share a global variable, she has to declare it as such:

shared Foo TheFoo = new shared Foo;

It’s still up to the designer of the class Foo to provide appropriate synchronization.

Currently, the only multithreaded guarantee for shared objects in D2 is the absence of low-level data races on multiprocessors–and even that, only in the safe subset of D. What are low level data races? Those are the races that break some lock-free algorithms, like the infamous Double-Checked Locking Pattern. If I were to explain this to a Java programmer, I’d say that all data members in a shared object are volatile. This property propagates transitively to all objects the current object has access to.

Still, the following implementation of a shared object in D would most likely be incorrect even with the absence of low-level data races:

class Foo {
    private int[] _arr;
    public void append(int i) {
       _arr ~= i; // array append
    }
}

auto TheFoo = new shared Foo;

The problem is that an array in D has two fields: the length and the pointer to a buffer. In shared Foo, each of them would be updated atomically, but the duo would not. So two threads calling TheFoo.append could interleave their updates in an unpredictable way, possibly leading to loss of data.

My race-free type system goes further–it eliminates all data races, both low- and high-level. The same code would work differently in my scheme. When an object is declared shared, all its methods are automatically synchronized. TheFoo.append would take Foo‘s lock and make the whole append operation atomic. (For the advanced programmer who wants to implement lock-free algorithms my scheme offers a special lockfree qualifier, which I’ll describe shortly.)

Now suppose that you were cautious enough to design your Java/D2 class Foo to be thread safe:

class Foo {
    private int [] _arr;
    public synchronized void append(int i) {
       _arr ~= i; // array append
    }
}

Does it mean your global variable, TheFoo, is safe to use? Not in Java. Consider this:

static Foo TheFoo; // static = global
// Thread 1
TheFoo = new Foo();
// Thread 2
while (TheFoo == null)
    continue;
TheFoo.append(1);

You won’t even know what hit you when your program fails. I will direct the reader to one of my older posts that explains the problems of publication safety on a multiprocessor machine. The bottom line is that, in order to make your program work correctly in Java, you have to declare TheFoo as volatile (or final, if you simply want to prevent such usage). Again, it looks like in Java the defaults are stacked against safe multithreading.

This is not a problem in D2, since shared implies volatile.

In my scheme, the default behavior of shared is different. It works like Java’s final. The code that tries to rebind the shared object (re-assign to the handle) would not compile. This is to prevent accidental lock-free programming. (If you haven’t noticed, the code that waits on the handle of TheFoo to switch from null to non-null is lock-free. The handle is not protected by any lock.) Unlike D2, I don’t want to make lock-free programming “easy,” because it isn’t easy. It’s almost like D2 is endorsing lock-free programming by giving the programmer a false sense of security.

So what do you do if you really want to spin on the handle? You declare your object lockfree.

lockfree Foo TheFoo;

lockfree implies shared (it doesn’t make sense otherwise), but it also makes the handle “volatile”. All accesses to it will be made sequentially consistent (on the x86, it means all stores will compile to xchg).

Note that lockfree is shallow–data members of TheFoo don’t inherit the lockfree qualification. Instead, they inherit the implied shared property of TheFoo.

It’s not only object handles that can be made lockfree. Other atomic data types like integers, Booleans, etc., can also be lockfree. A lockfree struct is also possible–it is treated as a tuple whose all elements are lockfree. There is no atomicity guarantee for the whole struct. Methods can be declared lockfree to turn off default synchronization.

Conclusion

Even the simplest case of sharing a global variable between threads is fraught with danger. My proposal inobtrusively eliminates most common traps. The defaults are carefully chosen to let the beginners avoid the pitfalls of multithreaded programming.

June 15, 2009

Race-free Multithreading : Owner polymorphism

Posted by Bartosz Milewski under Concurrency, D Programming Language, Java, Multithreading, Programming, Type System
[4] Comments

In my last post I talked about the proposal for the ownership scheme for multithreaded programs that provides alias control and eliminates data races. The scheme requires the addition of new type qualifiers to the (hypothetical) language. The standard concern is that new type qualifiers introduce code duplication. The classic example is the duplication of getters required by the introduction of the const modifier:

class Foo {
    private Bar _bar;
public:
    Bar get() {
        return _bar;
    }
    const Bar get() const {
        return _bar;
    }
}

Do ownership annotations lead to the same kind of duplication? Fortunately not. It’s true that, in most cases, two implementations of each public method are needed–with and without synchronization–but this is taken care by the compiler, not by the programmer. Unlike in Java, we don’t need a different class for shared Vector and thread-local ArrayList. In my scheme, when a vector is instantiated as a monitor (shared), the compiler automatically puts in the necessary synchronization code.

Need for generic code

The ownership scheme introduces an element of genericity by letting the programmer specify ownership during the instantiation of a class (just as a template parameter is specified during the instantiation of a template).

I already mentioned that most declarations can be expressed in two ways: one using type qualifiers, another using template notation–the latter exposing the generic nature of ownership annotations. For instance, these two forms are equivalent:

auto foo2 = new shared Foo;
auto foo2 = new Foo<owner::self>;

The template form emphasizes the generic nature of ownership annotations.

With the ownership system in place, regular templates parametrized by types also gain an additional degree of genericity since their parameters now include implicit ownership information. This is best seen in objects that don’t own the objects they hold. Most containers have this property (unless they are restricted to storing value types). For instance, a stack object might be thread-local while its elements are either thread-local or shared. Or the stack might be shared, with shared elements, etc. The source code to implement such stacks may be identical.

The polymorphic scheme and the examples are based on the GRFJ paper I discussed in a past post.

An example

– Stack

A parameterized stack might look like this :

class Stack<T> {
    private Node<T> _top;
public:
    void push(T value) {
        auto newNode = new Node<owner::this, T>;
        newNode.init(:=value, _top);
        _top = newNode;
    }
    T pop() {
        if (_top is null) return null;
        auto value := _top.value();
        _top = _top.next();
        return value;
    }
}

This stack is parametrized by the type T. This time, however, T includes ownership information. In particular T could be shared, unique, immutable or–the default–thread-local. The template picks up whatever qualifier is specified during instantiation, as in:

auto stack = new Stack<unique Foo>;

The _top (the head of the linked list) is of the type Node, which is parametrized by the type T. What is implicit in this declaration is that _top is owned by this–the default assignment of ownership for subobjects. If you want, you can make this more explicit:

private Node<owner::this, T> _top;

Notice that, when constructing a new Node, I had to specify the owner explicitly as this. The default would be thread-local and could leak thread-local aliases in the constructor. It is technically possible that the owner::this could, at least in this case, be inferred by the compiler through simple flow analysis.

Let’s have a closer look at the method push, where some interesting things are happening. First push creates a new node, which is later assigned to _top. The compiler cannot be sure that the Node constructor or the init method don’t leak aliases. That looks bad at first glance because, if an alias to newNode were leaked, that would lead to the leakage of _top as well (after a pop).

And here’s the clincher: Because newNode was declared with the correct owner–the stack itself–it can’t leak an alias that has a different owner. So anybody who tries to access the (correctly typed) leaked alias would have to hold the lock on the stack. Which means that, if the stack is shared, unsynchronized access to any of the nodes and their aliases is impossible. Which means no races on Nodes.

I also used the move operator := to move the values stored on the stack. That will make the stack usable with unique types. (For non-unique types, the move operator turns into regular assignment.)

I can now instantiate various stacks with different combinations of ownerships. The simplest one is:

auto localStack = new Stack<Foo>;

which is thread-local and stores thread-local objects of class Foo. There are no restrictions on Foo.

A more interesting combination is:

auto localStackOfMonitors = new Stack<shared Foo>;

This is a thread-local stack which stores monitor objects (the opposite is illegal though, as I’ll explain in a moment).

There is also a primitive multithreaded message queue:

auto msgQueue = new shared Stack<shared Foo>;

Notice that code that would try to push a thread-local object on the localStackOfMonitors or the msgQueue would not compile. We need the rich type system to be able to express such subtleties.

Other interesting combinations are:

auto stackOfImmutable = new shared Stack<immutable Foo>;
auto stackOfUnique = new shared Stack<unique Foo>;

The latter is possible because I used move operators in the body of Stack.

– Node

Now I’ll show you the fully parameterized definition of Node. I made all ownership annotations explicit for explanatory purposes. Later I’ll argue later that all of them could be elided.

class Node<T> {
private:
    T _value;
    Node<owner::of_this, T> _next;
public:
    void init(T v, Node<owner::of_this, T> next)
    {
        _value := v;
        _next = next;
    }
    T value() {
        return :=_value;
    }
    Node<owner::of_this, T> next() {
        return _next;
    }
}

Notice the declaration of _next: I specified that it must be owned by the owner of the current object, owner::of_this. In our case, the current object is a node and its owner is an instance of the Stack (let’s assume it’s the self-owned msgQueue).

This is the most logical assignment of ownership: all nodes are owned by the same stack object. That means no ownership conversions need be done, for instance, in the implementation of pop. In this assignment:

_top = _top.next();

the owner of _top is msgQueue, and so is the owner of its _next object. The types match exactly. I drew the ownership tree below. The fat arrows point at owners.
Ownership hierarchy using owner::of_this

But that’s not the only possibility. The default–that is _next being owned by the current node–would work too. The corresponding ownership tree is shown below.

Ownership hierarchy using owner::this

The left-hand side of the assignment

_top = _top.next();

is still owned by msgQueue. But the _next object inside the _top is not. It is owned by the _top node itself. These are two different owners so, during the assignment, the compiler has to do an implicit ownership conversion. Such conversion is only safe if both owners belong to the same ownership tree (sometimes called a “region”). Indeed, ownership is only needed for correct locking, and the locks always reside at the top of the tree (msgQueue in this case). So, after all, we don’t need to annotate _next with the ownership qualifier.

The two other annotations can be inferred by the compiler (there are some elements of type inference even in C++0x and D). The argument next to the method init must be either owned by this or be convertible to owner::this because of the assignment

_next = next;

Similarly, the return from the method next is implicitly owned by this (the node). When it’s used in Stack.pop:

_top = _top.next();

the owner conversion is performed.

With ownership inference, the definition of Node simplifies to the following:

class Node<T> {
private:
    T _value;
    Node<T> _next; // by default owned by this
public:
    void init(T v, Node<T> next)
    {
        _value := v;
        _next = next; // inference: owner of next must be this
    }
    T value() {
        return :=_value;
    }
    Node<T> next() {
        return _next; // inference: returned node is owned by this
    }
}

which has no ownership annotations.

Let me stress again a very important point: If init wanted to leak the alias to next, it would have to assign it to a variable of the type Node<owner::this, T>, where this is the current node. The compiler would make sure that such a variable is accessed only by code that locks the root of the ownership tree, msgQueue. This arrangement ensures the absence of races for the nodes of the list.

Another important point is that Node contains _value of type T as a subobject. The compiler will refuse instantiations where Node‘s ownership tree is shared (its root is is self-owned), and T is thread-local. Indeed, such instantiation would lead to races if an alias to _value escaped from Node. Such an alias, being thread-local, would be accessible without locking.

Comment on notation

In general, a template parameter list might contain a mixture of types, type qualifiers, values, (and, in D, aliases). Because of this mixture, I’m using special syntax for ownership qualifiers, owner::x to distinguish them from other kinds of parameters.

As you have seen, a naked ownership qualifier may be specified during instantiation. If it’s the first template argument, it becomes the owner of the object. Class templates don’t specify this parameter, but they have access to it as owner::of_this.

Other uses of qualifier polymorphism

Once qualifier polymorphism is in the language, there is no reason not to allow other qualifiers to take part in polymorphism. For instance, the old problem of having to write separate const versions of accessors can be easily solved:

class Foo {
    private Bar _bar;
    public mut_q Bar get<mutability::mut_q>() mut_q
    {
        return _bar;
    }
}

Here method get is parametrized by the mutability qualifier mut_q. The values it can take are: mutable (the default), const, or immutable. For instance, in

auto immFoo = new immutable Foo;
immutable Bar b = immFoo.get();

the immutable version of get is called. Similarly, in

void f(const Foo foo) {
    const Bar b = foo.get();
}

the const version is called (notice that f may also be called with an immutable object–it will work just fine).

Class methods in Java or D are by default virtual. This is why, in general, non-final class methods cannot be templatized (an infinite number of possible versions of a method would have to be included in the vtable). Type qualifiers are an exception, because there is a finite number of them. It would be okay for the vtable to have three entries for the method get, one for each possible value of the mutability parameter. In this case, however, all three are identical, so the compiler will generate just one entry.

Conclusion

The hard part–explaining the theory and the details of the ownership scheme–is over. I will now switch to a tutorial-style presentation that is much more programmer friendly. You’ll see how simple the scheme really is in practice.

« Previous Page — Next Page »

Type System

Why Bother?

A Categorical Answer

Categories

Functors

Endofunctors

Monads

Bibliography

The Problem

The Design

The Pipeline

The Stages

Details

Monads Anyone?

Conclusions

What Are Concepts?

Who Ordered Concepts?

Error Reporting

Self Documentation

The Case of STL

Competing Options

Constrained Templates

Use Patterns vs. Signatures

The Type Equality Problem

Who Ordered Concept Maps?

Conclusion

Acknowledgments

Bibliography

Types for Uniqueness

-Objects that are @unique

-Exposing and Localizing

Functional Decomposition

Conclusion

Concurrency

Message Passing

-Typed messages

-Pattern matching

-Casting

-Isolation

-Kilim

Dynamic Networks

Actors in D

Global Variables

Conclusion

Need for generic code

An example

– Stack

– Node

Comment on notation

Other uses of qualifier polymorphism

Conclusion

Top Posts

License

Blogroll

Follow Me

Archives