In my last post I discussed the relationship between references and lvalues/rvalues. It can be summarized using the following binding rules.

  • non-const references bind only to lvalues
  • const references bind to both lvalues and rvalues, but they don’t allow the modification of the source
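
These two rules can be checked with a pair of overloads. The following is only a sketch: bound_as, make_temp and check_binding_rules are made-up names used for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers: each overload reports which kind of reference was bound.
inline std::string bound_as(std::string&)       { return "non-const reference"; }
inline std::string bound_as(const std::string&) { return "const reference"; }

inline std::string make_temp() { return "temporary"; }  // the call yields an rvalue

inline void check_binding_rules() {
    std::string named = "named";
    const std::string frozen = "frozen";

    assert(bound_as(named) == "non-const reference");    // modifiable lvalue
    assert(bound_as(frozen) == "const reference");       // const lvalue
    assert(bound_as(make_temp()) == "const reference");  // rvalue: only const& can bind
}
```

If the const overload is removed, the call with make_temp() no longer compiles, because a non-const reference cannot bind to a temporary.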

It looks like all the bases are covered, except for references that could bind to rvalues and allow their modification. But who would want that? Why modify something that’s being discarded anyway?

The auto_ptr disaster

auto_ptr is an interesting beast. It’s a member of the family of smart pointers–objects that subvert value semantics to provide tighter control over reference (pointer) semantics. The need for such controls is related to resource management–making sure that program resources have well defined lifetimes.

One way to manage a resource is to ensure that it has a single owner throughout its lifetime. auto_ptr is such a designated owner.

On the surface, auto_ptr embodies value semantics. It is allocated on the stack or as a direct member of another data structure. It is passed by value to and from functions.

When auto_ptr is passed by value, the object it contains is not copied. In this respect auto_ptr behaves like a reference. Also, you can access auto_ptr as if it were a pointer to object using the overloaded member-access operator, ->.

Since auto_ptr is supposed to be the only owner of the resource (which is a pointer to object), when an auto_ptr goes out of scope it deletes the object.

Let’s look in more detail at how auto_ptr is returned from a function. Consider this example:

 auto_ptr<Foo> create() {
   return auto_ptr<Foo>(new Foo);
 }
 // caller's code
 auto_ptr<Foo> ap = create();

Notice two things:

  • The original auto_ptr goes out of scope at the end of create(). If we don’t do something special, its destructor will be called and it will destroy the newly created Foo.
  • We are returning an rvalue. The auto_ptr is created on the spot inside create(). (In fact, a local variable named in a return statement is also treated as an rvalue when the return value is constructed.)

There is one way to prevent the Foo object from being deleted–nulling the pointer inside auto_ptr. We need a hook to do that when returning auto_ptr.

The returning of objects by value follows a specific C++ protocol. It involves copy construction and/or assignment. The caller’s instance of auto_ptr, ap, is copy-constructed from the callee’s instance.

The obvious way to prevent the resource from disappearing is to define an auto_ptr copy constructor that nulls the pointer inside its source.

And here’s the catch:

  • If you define the copy constructor to take the source auto_ptr by const reference, you won’t be able to modify it.
  • If you define it to take a non-const reference, it won’t bind to an rvalue.
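
The dilemma can be reproduced with a stripped-down smart pointer. ToyAutoPtr below is a hypothetical stand-in, not the real auto_ptr; it takes the second option (a non-const reference), and the static_assert confirms that such a constructor accepts lvalues but rejects rvalues.

```cpp
#include <cassert>
#include <type_traits>

// ToyAutoPtr is a hypothetical, simplified stand-in for auto_ptr.
template <class T>
class ToyAutoPtr {
public:
    explicit ToyAutoPtr(T* p = nullptr) : ptr_(p) {}
    // "Copy" constructor that steals the pointer. It has to take a
    // non-const reference, otherwise it could not null the source.
    ToyAutoPtr(ToyAutoPtr& src) : ptr_(src.ptr_) { src.ptr_ = nullptr; }
    ToyAutoPtr& operator=(const ToyAutoPtr&) = delete;  // left out of the sketch
    ~ToyAutoPtr() { delete ptr_; }
    T* get() const { return ptr_; }
private:
    T* ptr_;
};

// The stealing constructor binds to lvalues but not to rvalues.
static_assert(std::is_constructible<ToyAutoPtr<int>, ToyAutoPtr<int>&>::value,
              "an lvalue source is accepted");
static_assert(!std::is_constructible<ToyAutoPtr<int>, ToyAutoPtr<int>&&>::value,
              "an rvalue source is rejected");

inline bool copy_nulls_source() {
    ToyAutoPtr<int> src(new int(42));
    ToyAutoPtr<int> dest(src);  // lvalue source: ownership moves to dest
    return src.get() == nullptr && *dest.get() == 42;
}
```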

Finding themselves between a rock and a hard place, C++ designers came up with some ingenious hacks (the infamous auto_ptr_ref object). Unfortunately, some important functionality of auto_ptr had to be sacrificed in the process. In particular, you can’t return auto_ptr<Derived> as auto_ptr<Base>, which makes writing polymorphic factory functions hard.

Rvalue references

What we need is something that can bind to an rvalue and modify it too. And this is exactly what rvalue references do.

Since it was too late to fix auto_ptr, it joined the ranks of deprecated features. Its place was taken by unique_ptr. This new template has a “copy constructor” that takes another unique_ptr by rvalue reference (denoted by double ampersand).

 unique_ptr::unique_ptr(unique_ptr && src)

An rvalue reference binds to rvalues and does not prohibit the modification of its source. (Early drafts of the C++0x standard also let it bind to lvalues; the final C++11 rules do not.)
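
Here is a minimal sketch of such a constructor, assuming a simplified owner class. ToyUniquePtr, create and factory_keeps_object_alive are hypothetical names, and the real unique_ptr is considerably more elaborate.

```cpp
#include <cassert>

// ToyUniquePtr is a hypothetical sketch, not the real std::unique_ptr.
template <class T>
class ToyUniquePtr {
public:
    explicit ToyUniquePtr(T* p = nullptr) : ptr_(p) {}
    // Takes the source by rvalue reference, so it can bind to a
    // temporary and still null the pointer inside it.
    ToyUniquePtr(ToyUniquePtr&& src) : ptr_(src.ptr_) { src.ptr_ = nullptr; }
    ToyUniquePtr(const ToyUniquePtr&) = delete;             // no real copies
    ToyUniquePtr& operator=(const ToyUniquePtr&) = delete;  // left out of the sketch
    ~ToyUniquePtr() { delete ptr_; }
    T* get() const { return ptr_; }
private:
    T* ptr_;
};

inline ToyUniquePtr<int> create() {
    return ToyUniquePtr<int>(new int(7));  // an rvalue; ownership leaves the function
}

inline bool factory_keeps_object_alive() {
    ToyUniquePtr<int> p = create();  // the int survives the return
    return p.get() != nullptr && *p.get() == 7;
}
```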

Move that looks like a copy

Another problem with ownership passing using auto_ptr is that it sometimes looks deceptively like copying. Consider this example:

 auto_ptr<Foo> pSrc(new Foo);
 auto_ptr<Foo> pDest = pSrc; // looks like a copy, but it nulls the source
 pSrc->method(); // undefined behavior (typically a crash): pSrc is now empty

With a little discipline, such errors could be avoided, but the problem gets virtually unmanageable in generic code. In particular, the use of auto_ptr in standard containers and algorithms could easily lead to serious trouble. Some algorithms work with temporary copies of elements, not suspecting that making such a “copy” might destroy the source.

Again, intricate hackery was employed to make sure that a container of auto_ptrs doesn’t compile.

Fine tuning

It seems like in some cases we want to enable implicit move semantics (returning auto_ptr from a function) while disabling it in others (the example above). How can we characterize those two cases? The essential feature is the lifetime of the source. If the source is no longer accessible after the move, we want the move to be implicit. But if the source remains accessible, we want to force the programmer to be more explicit. We want the programmer to say, “Hey, I’m transferring the ownership, so don’t try to access the source any more.”

This difference translates directly into rvalue/lvalue duality. Rvalues are the ones that disappear. Lvalues remain accessible.

But under the early draft rules, the unique_ptr “copy constructor”

 unique_ptr::unique_ptr(unique_ptr && src)

bound equally to rvalues and to lvalues. Fortunately, it’s possible to overload functions on rvalue references vs. regular references. The draft unique_ptr therefore had a second “copy constructor”:

 unique_ptr::unique_ptr(unique_ptr & src)

which binds exclusively to lvalues. When the source is an lvalue, the compiler is forced to pick that overload. When the source is an rvalue, the rvalue reference overload is used. All that remained was to make the lvalue-binding constructor private (and forgo its implementation) to prevent implicit moves of lvalues. The final C++11 standard reaches the same result more directly: an rvalue reference no longer binds to an lvalue, and the copy constructor of unique_ptr is declared deleted.
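
Overloading on the two kinds of reference can be sketched as follows. Widget, picked, make_widget and check_overloads are made-up names; the point is only which overload the compiler selects for each kind of argument.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Widget {};  // hypothetical placeholder type

inline std::string picked(Widget&)  { return "lvalue overload"; }
inline std::string picked(Widget&&) { return "rvalue overload"; }

inline Widget make_widget() { return Widget(); }

inline void check_overloads() {
    Widget w;
    assert(picked(w) == "lvalue overload");              // named object
    assert(picked(make_widget()) == "rvalue overload");  // temporary
    assert(picked(std::move(w)) == "rvalue overload");   // lvalue cast to an rvalue
}
```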

So how do you explicitly move an lvalue unique_ptr when you really need it? You use a special template function, move, like this:

 unique_ptr<Foo> pSrc(new Foo);
 unique_ptr<Foo> pDest = move(pSrc); // explicit: pSrc is empty afterwards

Since the move is explicit, there is no chance of confusing it with a copy. Most algorithms in the Standard Library are being rewritten to use move whenever possible (actually, swap, which uses move). The ones that require objects to be copyable will not compile for unique_ptrs.
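
The same idea with the standard std::unique_ptr, as a small sketch. Base, Derived, make_derived and explicit_move_empties_source are hypothetical names. The factory also shows that returning a pointer to a derived class as a pointer to the base class works with unique_ptr.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Base {
    virtual ~Base() {}
    virtual int id() const { return 0; }
};
struct Derived : Base {
    int id() const override { return 1; }
};

// Polymorphic factory: unique_ptr<Derived> converts to unique_ptr<Base> on return.
inline std::unique_ptr<Base> make_derived() {
    return std::unique_ptr<Derived>(new Derived);
}

inline bool explicit_move_empties_source() {
    std::unique_ptr<Base> src = make_derived();
    // std::unique_ptr<Base> copy = src;          // would not compile: no copying
    std::unique_ptr<Base> dest = std::move(src);  // explicit transfer of ownership
    return src == nullptr && dest->id() == 1;
}
```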

By the way, move doesn’t move anything by itself. All it does is cast its argument to an rvalue.

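 template <class T>
 typename remove_reference<T>::type&& move(T&& t) {
    // t has a name, so it is an lvalue here; an explicit cast is required
    return static_cast<typename remove_reference<T>::type&&>(t);
 }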
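
Because move is only a cast, the actual transfer happens in the move constructor and move assignment of the type involved. A minimal sketch of a swap built this way is shown below; toy_swap and swap_works_for_move_only_types are hypothetical names, and the standard swap is assumed to behave roughly like this since C++11.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// toy_swap is a hypothetical name used for illustration.
template <class T>
void toy_swap(T& a, T& b) {
    T tmp = std::move(a);  // move construction: a gives up its resource
    a = std::move(b);      // move assignment: b gives up its resource
    b = std::move(tmp);    // tmp hands over what a used to own
}

inline bool swap_works_for_move_only_types() {
    std::unique_ptr<int> x(new int(1));
    std::unique_ptr<int> y(new int(2));
    toy_swap(x, y);        // no copies are made, so unique_ptr is fine
    return *x == 2 && *y == 1;
}
```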

In the next installment, I’ll discuss a totally different solution adopted by D.
