The more things change, the more we need “immutable”

Why const?

With judicious use of const you can catch many bugs and improve code documentation and maintenance. For me the classic example is reading from and writing to files. Look at these two interfaces in stylized C++:

bool Read(File & file, char * buffer, unsigned & size)
bool Write(File & file, char const * buffer, unsigned const & size);

The first one tells me that the buffer may be modified by the function. In fact, the Read function will most likely write characters into it. It is also allowed to modify the parameter size, putting the number of characters read into it. The incorrect call

Read("abc", 3);

will not compile.

The second declaration promises that the buffer will not be modified and neither will size. So the following (correct) call will compile:

Write("abc", 3);

What’s more, when compiling the implementation of Write, the compiler will ensure that the arguments are not modified (unless the programmer subverts the type system using const_cast).

What does const guarantee? Not much!

This optimistic picture is marred by a few gotchas. You’d think that if you call a function taking a const reference, the object you are passing won’t change (assume there’s no use of const_cast or mutable members). Think again!

In a multithreaded program, another thread might change the object concurrently. You may avoid this problem by preventing concurrent access using critical sections.
But even in a single threaded program there is no guarantee the const object won’t change. That’s because there might be a non-const alias to the argument you’re passing. This alias might be accessible to the function either through a global variable, or through another argument. The classic example is copying elements between overlapping ranges. The source for the copy is passed as const, the target as non-const, but if they point at overlapping regions, the source may end up modified.
There is a stronger version of constness that might guarantee immutability. The qualifier is usually called immutable and it’s available in the D programming language and in some dialects of Java. I’ll come back to it later.
On top of those gotchas, even in a single-threaded C++ program with no aliasing of arguments, a const function argument might get mutated. That’s because constness is shallow–it doesn’t protect indirect data members (pointers or references). See next section.

Transitivity or “deep” const

Consider a list that is defined recursively:

class List {
public:
  List * GetNext() const { return _next; }
  void SetNext(List * next) { _next = next; }
private:
  List * _next;
};
void clip(List const * list) {
  List * next = list.GetNext();
  if (next)
    next.SetNext(0);
}

Function clip takes a const list and blatantly modifies it. That’s because the indirect member, _next, of a const List is not const. C++ constness is not transitive! This fact is reflected in GetNext being a const method and returning a non-const pointer to _next. So even if this is const inside GetNext, indirect parts of this are open to modification.

I always found the non-transitive definition of const counter-intuitive, and made sure that in my code constness was always transitive. For instance, I would implement GetNext to return a const pointer:

List const * GetNext() const { return _next; }

and maybe provide another non-const method that returns a non-const pointer. Notice the need for code duplication in this approach.

The depths of immutability

I have an even larger problem with non-transitive immutability (as opposed to constness). I expect a const method of an immutable object to leave the object unchanged. In particular I expect such method to behave like a pure function (not to be confused with pure virtual function). A pure function returns the same value every time it’s called with the same arguments. I expect the length of an immutable string to never change. Every call to size should return the same value. I would even expect the compiler to eliminate redundant calls to size is such a case. Here’s a hypothetical example:

immutable string("Hello!");
doSomething(string.data(), string.size());
doSomethingElse(string.data(), string.size());

A smart compiler could cache the results of data and size from the first set of calls and reuse them when calling doSomethingElse. However, if immutable is not transitive, such an optimization is incorrect. Just imagine a bizarre implementation of string that stores the length indirectly:

class string {
public:
  string(char const * src) {
    _length = new unsigned;
    *_length = strlen(src);
    ...
  }
  unsigned size() const {
    ++*_length;
    return *_length;
  }
private:
  unsigned * _length;
};

The D-language model

Taking all the above arguments into account, we decided to implement both const and immutable as transitive type modifiers in the D programming language. This enabled a lot of interesting compiler optimizations.

Making const and immutable transitive is essential for D’s support for functional-style programming. It’s important that const methods of an immutable object be pure functions. It’s also important to let the compiler optimize functional-style code, which otherwise tends to be less efficient than its imperative-style equivalent.

Functional programming is one of the safest paradigms for concurrency. Pure functions and immutable objects don’t require locking.

The price of transitivity

Having said that, transitivity of const imposes some constraints on the programming style. It enforces the view that anything reachable from an object is part of the object. When you are given a const linked list, all links in it must necessarily be const. That makes logical sense–the links are part of the list.

The picture gets muddier when the object contains references to some external services. Just by calling them “external,” I’m admitting that they are not part of the object.

Can an object that contains a Windows handle be const? Is it enough that it doesn’t modify the handle itself? Or shouldn’t it also refrain from modifying the window or the file behind the handle? Handles are often implemented as opaque pointers so, strictly speaking, constness should propagate through them. But how do I know which APIs don’t mutate the state of Windows and which do? This is an extreme example, but there are many in-between cases.

In an editor you might have an undo stack that stores editing commands. When you call the undo method of the command object, you may pass it a reference to the document object on which it is to operate. In that case undo may be declared as const. But if you store a reference to the document within each command, the undo method can no longer be const–it modifies the document. The classic Command Pattern uses the latter approach.

It hasn’t been decided yet if and how D will support an escape mechanism from const transitivity (and constness in general). Following C++, D could use the mutable type qualifier for that purpose. This would also solve the problem of memoizing and caching inside a const object.

In the next installment, I’ll describe more complexities of immutability and discuss a dialect of Java that supports immutability constraints.

You can vote for this article on reddit

Bill Baxter Says:

January 26, 2009 at 3:16 pm

Nice overview of the need for transitivity and immutable data.

You let an “invariant” slip in there a couple of times where you meant “immutable”.

Bartosz Milewski Says:

January 26, 2009 at 4:38 pm

Thanks! I fixed it.

The One With D » Blog Archive » Bartosz Milewski on const Transitivity Says:

January 26, 2009 at 7:37 pm

[…] latest post is on const transitivity. He first discusses the reality of transitivity in C++, before moving on to the goals for […]

pizer Says:

January 31, 2009 at 10:04 am

It strikes me as a bit of an apple/oranges comparison. What works for one language might not be the best idea for another language.

I’m totally fine with const being “shallow” in C++. IMHO, in most cases (assuming good design) “const” does the right thing. If it doesn’t in some case, you can define (logically) what is part of the object and what is not through member function overloads (see std::vector::operator[]). One difference between C++ and D is that a C++ object can have class members that are really a part of the object. In D you have reference semantics for class objects. No class object is really part of another one. That seems to be the motivation for making const transitive in D.

-P

January 31, 2009 at 10:09 am

Hold on a sec: “const” IS transitive in C++ in the sense that a member of a member of a member is still const. It just isn’t for pointers and references which is — as already mentioned — OK the way it is. Consider:

struct Foo { int x };

const auto_ptr<Foo> pF (new Foo);
pF->x = 42; // Is this supposed to fail? No!

January 31, 2009 at 4:00 pm

This a great insight: D’s (and Java’s) reference semantics make const transitivity more important than it would be in C++. It’s worth mentioning though that in D you can have both semantics. Unlike in C++, D’s structs are a different beast than classes. By default they follow value semantics, but at the cost of polymorphism (no virtual functions).

As always when designing a language there are choices. How important it is to have a const auto_ptr referring to a non-const object? Is this a common case or an exception? We’ve had a lot of design discussions about what we call “head constness” (or “tail mutability”). Personally I’m in favor of having an escape mechanism from transitivity–explicit “mutable” qualifier. Your auto_ptr example would look like this:

const auto_ptr<mutable Foo> pF(new Foo);

February 1, 2009 at 6:32 am

Well, auto_ptr is supposed to behave like a pointer. Of course, in C++ pointer and pointee are two different things. Making an auto_ptr const guarantees that it can’t loose ownership which is good considering the danger of its destructive “copy” c’tor. 😉

As far as I can tell there are only three cases:

(1) A “component” is logically and actually a member of some object. Here, the “const rules” do the right thing. It is “transitive” by induction: All members of a const object are also const.

(2) A “component” is logically a member but actually referenced through a pointer. Example: std::vector and its elements. Here, we need member function overloads. Since exposing data members directly is considered bad style we’re all good.

(3) A “component” is logically AND actually NOT a member. So, we don’t want “transitivity of const”. The “const rules” do the right thing here.

auto_ptr would be part of the 3rd category. I think it is not the only one in this category. Off the top of my head I can mention iterators and “image views”. An image view object represents a view into some other image. It also behaves like a pointer — a pointer that knows more about what it points to (image size, color depth, how to access pixels, etc).

The majority of all class designs probably belong to category 1, followed by category 2. Category 3 seems to cover all the pointer-like objects (including iterators).

February 1, 2009 at 6:41 am

There is no category 1 for component=class object in D. Under the assumption that category 3 represents exceptions I agree it makes sense to render “member” objects also const by default when the “containing” object is const.

Immutable Things, where do they all come from? « Bartosz Milewski’s Programming Cafe Says:

February 5, 2009 at 12:41 pm

[…] An immutable object never changes. You can bet your program on it. As I explained in my previous post, the same is not true for const objects (or readonly objects, in dialects of Java). They may be […]

Brian Says:

March 2, 2009 at 9:39 pm

You really should change these to read:

bool Read(File & file, char * const buffer, unsigned & size)
bool Write(File & file, char const * const buffer, unsigned const & size)

I’ve been burned in the past by doing
something like okay = false instead of *okay = false with passing a bool by pointer. bool * const okay makes the compiler catch this.

Btw great that you understand how to position const!!!!

March 2, 2009 at 10:42 pm

The additional const’s you put there have no meaning for the caller. As you point out, they may sometimes prevent bugs in the implementation, but a library should not export details of the implementation.

(The fact that C++ can automatically convert from a bool to a pointer is disturbing though.)

In your example, instead of a const pointer to bool I would rather use a reference.

Functional Data Structures and Concurrency in C++ | Bartosz Milewski's Programming Cafe Says:

December 10, 2013 at 10:36 am

[…] course, it would be nice if immutability were enforced by the type system, as it is in the D language. In C++ we have to replace the type system with discipline, but still, it helps to know exactly […]

Bartosz Milewski's Programming Cafe