In the previous two installments I tried to explain the origins of rvalue references in C++. If you were left with the impression that each design step was a reaction to some previously unanticipated problem, you’re right. As often happens in C++, the design was reactive rather than proactive.

Passing by value

If you think about it long enough, you realize that the process that led to rvalue references started with the way C++ implemented passing by value. In C, passing by value was simple–a shallow copy of data was made. “Shallow” means that no pointers were followed in the process of copying. C++ decided to provide hooks for the programmer to override this behavior. If you define a copy constructor, it will be used instead of the shallow copy. A user-defined copy constructor may, for instance, follow pointers or references and clone the pointed-to objects, thus implementing “deep copy”. It may also do bookkeeping–count references, etc.

Whenever the compiler needs to copy an object–and that may happen during return by value or pass by value–it inserts the appropriate call to the copy constructor (if one is not provided by the programmer, the default shallow copy is used).
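Here is a minimal sketch of a class with a user-defined copy constructor that performs a deep copy (the class and its members are made up purely for illustration):

  #include <cstring>

  struct MyString {
     char*       data;
     std::size_t len;

     MyString(const char* s) : len(std::strlen(s)) {
        data = new char[len + 1];
        std::memcpy(data, s, len + 1);
     }

     // User-defined copy constructor: clone the pointed-to buffer (deep copy).
     // Note that the shallow member, len, must still be copied by hand.
     MyString(const MyString& other) : len(other.len) {
        data = new char[len + 1];
        std::memcpy(data, other.data, len + 1);
     }

     ~MyString() { delete[] data; }
  };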

This is a very general mechanism, which, however, leads to some awkwardness and inefficiencies. For instance, the writer of a copy constructor must remember to copy all of the shallow data members–something that the default shallow copy does automatically. The inefficiencies show up when copying rvalues.

Return value optimizations

Let’s revisit the example from my previous post:

 auto_ptr<Foo> factory() {
    return auto_ptr<Foo>(new Foo);
 }
 // client code
 auto_ptr<Foo> ap = factory();

Here, a temporary auto_ptr is created inside the function, copied using a copy constructor into the caller’s auto_ptr, and then immediately destroyed.

This is so obviously inefficient that the compiler will often optimize it away by constructing the returned auto_ptr directly in the space allocated by the caller (a pointer to which is passed as a hidden argument to the function). This is called return value optimization (RVO) and it eliminates the calls to the copy constructor and the destructor.
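One way to observe the elision (this is my own test, not part of the original example) is to give the class a noisy copy constructor and watch what gets printed:

  #include <iostream>

  struct Noisy {
     Noisy() { std::cout << "default ctor\n"; }
     Noisy(const Noisy&) { std::cout << "copy ctor\n"; }
     ~Noisy() { std::cout << "dtor\n"; }
  };

  Noisy factory() {
     return Noisy();  // constructed directly in the caller's space when RVO kicks in
  }

  int main() {
     Noisy n = factory();
     // with RVO: "default ctor", then a single "dtor" at the end of main;
     // without it you'd also see "copy ctor" (and extra "dtor"s)
  }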

But what about more general cases, like this one:

 auto_ptr<Foo> factory() {
    auto_ptr<Foo> src;
    ...
    src.reset(new Foo);
    ...
    return src;
 }
 // client code
 auto_ptr<Foo> ap = factory();

Some compilers will attempt to optimize this code as well, using the named return value optimization (NRVO).

The C++ Standard allows such optimizations, which elide the calls to a copy constructor and a destructor, but it doesn’t mandate them. Moreover, NRVO cannot be applied to functions that have multiple return statements, with different objects being returned from each.
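Here is the kind of function I mean (my example; the two return statements hand back two different named objects):

  #include <string>

  std::string pick(bool flag) {
     std::string yes("affirmative");
     std::string no("negative");
     if (flag)
        return yes;  // two different named objects may be returned,
     return no;      // so the compiler can't construct either one
                     // directly in the caller's space: NRVO doesn't apply
  }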

But notice an interesting thing: If (N)RVO were mandated by the language and applicable in every case, we could almost implement auto_ptr without the rvalue reference copy constructor. We wouldn’t need to reach back to the source auto_ptr from the copy constructor to turn off its destruction. The destructor (together with the copy constructor) would be guaranteed to be elided.

In fact, if we don’t insist on using the caller’s space for creating the return value, we could cover the case of multiple return statements. And how would we transfer bits between the two spaces? We’d do a shallow, bitwise copy!

An alternative return-by-value scheme

Here’s the scheme that was proposed for the D programming language (a sketch of the lowering follows the list):

  1. The compiler adds a hidden parameter to every function that returns a non-trivial data structure by value. This parameter is the address of the stack space allocated by the caller for the return value. (This is temporary space for the rvalue returned by the function.) This part is the same as in the C++ implementation.
  2. The function allocates its local variables on the stack, as usual, including the ones that are (or might be–in case of multiple return statements) returned. RVO may be used to eliminate some of these allocations. This is still in common with C++.
  3. When executing the return statement, the value to be returned is bit-copied into the caller’s space.
  4. No copy constructor is called. No destructor is called for the variable that’s been returned.
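
Here is a rough sketch of that lowering, written as plain C-style C++ (this is only my illustration of the idea, not actual compiler output):

  #include <cstring>

  struct S { int a; int b; };

  // What the programmer writes (conceptually):
  //    S factory() { S src; /* ... */ return src; }
  //
  // What the compiler conceptually generates: the caller passes the address
  // of the stack space it reserved for the result (the hidden parameter).
  void factory_lowered(S* ret) {
     S src;                              // ordinary local variable
     src.a = 1;                          // ... the body runs as usual ...
     src.b = 2;
     std::memcpy(ret, &src, sizeof(S));  // return: bit-copy into the caller's space
     // no copy constructor is called, and no destructor runs for src
  }

  // caller's code (conceptually):
  //    S result;                 // space allocated by the caller
  //    factory_lowered(&result);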

Look ma! No copy constructor!

What is amazing about this scheme is that it may be applied to all situations where the source of the copy is an rvalue. This is also true when passing an rvalue to a function, as in this example:

  struct S;
  void f(S s); // by value
  // caller's code
  f(S(1, 2)); // passing an rvalue

Suddenly there is no need whatsoever for an rvalue-taking copy constructor because it would never be called. Since an rvalue-taking copy constructor was the main motivation for introducing rvalue references to C++, in D we can get away without them.

The idea of passing rvalues without calling a copy constructor takes some getting used to. Notice, however, that even in C++ there is no guarantee that the copy constructor will be called (remember RVO?).

But what about non-trivial copy constructors, like in ref_ptr (a reference-counting smart pointer)? Surely it must be called to update the reference count, right? Not when the source is an rvalue! Remember that we are eliding not only the copy constructor (which increments the reference count) but also the destructor of the source (which decrements the reference count). The net result is no change to the reference count.
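To see why, here is a bare-bones sketch of such a reference-counting pointer (a real ref_ptr would also need an assignment operator and more; this is just enough to show the counting):

  template<class T>
  struct ref_ptr {
     T*   ptr;
     int* count;

     explicit ref_ptr(T* p) : ptr(p), count(new int(1)) {}

     ref_ptr(const ref_ptr& other)            // copy constructor: ++count
        : ptr(other.ptr), count(other.count) { ++*count; }

     ~ref_ptr() {                             // destructor: --count
        if (--*count == 0) { delete ptr; delete count; }
     }
  };
  // When copying from an rvalue, D elides both the copy constructor (the ++)
  // and the source's destructor (the --), so the count comes out unchanged.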

This is a very important optimization, especially in multithreaded code, where reference counting incurs expensive synchronization overhead.

Location transparency

There is one gotcha in this scheme–we are assuming that a shallow copy will always result in a valid object. This is not true if the object contains a pointer to itself. A shallow copy will duplicate the pointer; the object itself will move to a new location, but the pointer will still point at the old location. This could be a disaster.
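Here is a contrived example of such an object (mine, for illustration):

  #include <cstring>

  struct SelfRef {
     char  buffer[16];
     char* current;               // points into the object itself

     SelfRef() : current(buffer) {}
  };

  int main() {
     SelfRef a;
     SelfRef b;
     std::memcpy(&b, &a, sizeof(SelfRef));  // shallow, bitwise copy
     // b sits at a new location, but b.current still points into a.buffer --
     // the old location
  }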

For this and several other reasons D assumes location transparency. An object cannot point at itself (or any of its shallow members), because then it couldn’t be transparently moved to a new location.

Unfortunately this requirement cannot be enforced statically. A dynamic check during copying is possible, but it’s not clear that the overhead is always worth paying. This is still an open design area.
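Such a check could, for instance, verify during the copy that no pointer member points back into the source object's own footprint (this is pure speculation on my part, just to show the kind of overhead involved):

  #include <cstddef>
  #include <cstdint>

  // Does p point into the object that occupies [obj, obj + size)?
  inline bool points_into(const void* p, const void* obj, std::size_t size) {
     std::uintptr_t a = reinterpret_cast<std::uintptr_t>(p);
     std::uintptr_t o = reinterpret_cast<std::uintptr_t>(obj);
     return a >= o && a - o < size;
  }
  // A checked copy would assert !points_into(member, &src, sizeof src)
  // for every pointer member -- correct, but not free.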

Notice that, in D, the scope of this problem is narrowed to structs and fixed arrays that contain pointers. There is no direct way to manipulate the location of class objects (they are always accessed indirectly)–except by the garbage collector. A generational garbage collector that moves objects in memory could definitely profit from location transparency.

The binding table

The binding table in D is slightly different than the one in C++:

  • references bind to lvalues (same as C++)
  • const references bind only to lvalues (different)
  • “values” (arguments passed by value) bind to
    • lvalues (same as C++)
    • rvalues (same as C++, except that no copy constructor is called)

Notice that in D we don’t want a const reference to bind to an rvalue. Binding const references to rvalues is an inherently dangerous feature of C++. Consider the following C++ example:

  struct S;
  const S& f(const S& x) {
     return x;
  }
  // caller's code
  const S& y = f(S(1, 2));
  y.access(); // kaboom!

The temporary S(1, 2) is destroyed at the end of the full expression, leaving y a dangling reference. Instead, if you want to have a version of a function that binds to rvalues, you declare its argument as passed by value, as in this D example:

  S f(S x) {
    return x;
  }
  // caller's code
  auto x = f(S(1, 2));
  x.access; // fine, it's a copy

In the worst case, the overhead of the D solution is a bitwise copy of a struct (which can often be optimized away), versus the indirect access of the C++ solution. But considering that it buys us safety, it’s a trade-off well worth making.

I’d like to thank Andrei Alexandrescu for his comments and, in general, for driving the design of pass-by-value in D.
