Making the C language race-free is quite a challenge. I discussed one approach in my previous post. Here’s another one, based on lightweight annotations and whole-program analysis.

SharC is a sharing checker and program transformer for C. It is described in the paper SharC: Checking Data Sharing Strategies for Multithreaded C.

Sharing modes

The goal of SharC is to detect (and possibly eliminate) data races from legacy C programs. This is an incremental process in which the programmer annotates the program with type qualifiers, runs the checker, listens to its complaints, ads more annotations, etc., until SharC is totally satisfied and produces an instrumented version of the program with additional built-in runtime checks.

Some of the annotations that specify sharing modes are similar to those of Cyclone, but there are some new ones.

  • private: Thread-local
  • readonly: Does not change (immutable). Interestingly, it may be cast to another sharing mode that is writable.
  • locked(lock): Shared under specific lock. The lock must be readonly.
  • racy: Intentionally racy. No compile- or runtime checks.
  • dynamic: May change sharing mode at runtime, at which point it’s checked for readonly or single-thread access.

Note that SharC provides ways to move data between various sharing modes. The key to these operations is a very clever reference counting scheme built into the memory allocator. The paper provides the details of that scheme, but I’ll skip them here.

The overall policy of SharC is to consider almost all sharing (even under locks) illegal unless explicitly declared by the user.

Safe cast

Dynamic types can be cast between sharing modes. Here’s an example:

int readonly * y;
...
x = SCAST(int private *, y);

A pointer to a readonly int (which may be shared) is cast to a pointer to private (thread-only) int. Normally such cast would be unsafe. The int might be referenced by multiple readonly pointers (potentially shared between threads). The clients of those pointers don’t expect the value to ever change and may access it without locking. That’s why SCAST not only nulls the original pointer y, but also checks if the data reference count is zero (using the clever reference counting scheme I mentioned before). If it’s not, it reports an error.

Example

The paper offers a more elaborate example, which shows the use of dynamic data types and checked casting. The example is based on a pattern typical for multimedia processing and games.

A data buffer is processed in stages and passed from thread to thread.

We start with a linked list of stages. Each stage is run by a separate thread. The first stage is fed some data in a buffer (for instance a frame of a movie). The thread responsible for that stage starts its own processing. When it’s done, it hands the buffer over to the next stage, and so on. The buffer is like a car on an assembly line.

When the assembly line is created, each stage is assigned a different processing function through the function pointer fun (see code below).

You may also think of a stage as a one-element synchronous channel, with the buffer being the message.

What is important in this example is that we don’t want to copy a potentially very large buffer. We want to pass it between threads by reference without risking data races.

This is a stage structure with all the necessary SharC annotations.

typedef struct stage(q) {
    struct stage dynamic * q next;
    cond racy * q cv;
    mutex racy * readonly mut;
    char locked(mut) * locked(mut) sdata;
    void (*q fun)(char private * private fdata);
} stage_t;

Don’t be put off by the sheer number of annotations. Most of them are defaults or can be easily inferred by SharC.

This is what the programmer must annotate:

  • The buffer sdata is shared and protected by mut.
  • The function pointed to by fun takes a private (thread-local) buffer, fdata.

Notice that, without casting, fun cannot be called with sdata, because sdata is shared.

What SharC infers is that:

  • Structure stage is parametrized by some sharing policy, q (which stands for “qualifier”). This qualifier will be provided when the variable of type stage is defined. (This is called qualifier polymorphism.)
  • All pointers inside stage, except for mut and sdata inherit the qualifier q.
  • The field next, which is a pointer to the next stage, has two qualifiers. The pointer itself inherits the qualifier q from its parent. The data it points to is marked dynamic, which means that it will be checked during runtime to be either accessed by a single thread or immutable.
  • The pointers cv and fun also inherit the parent qualifier, q. Condition variable cv is declared racy to turn off compiler checks for sharing. cv is shared between threads but does not require locking.
  • The mutex object is also racy. The pointer to it though is readonly (immutable). This is an important detail, since it is unsafe to change the mutex during the lifetime of data it protects (which is sdata in this case).
  • sdata is a pointer to a char buffer. Both the pointer and the buffer it points to are protected by the mutex, mut.
  • Finally, the pointer fun inherits its qualifier from the parent. The function parameter, fdata, is a private pointer to a private buffer.

This is the beginning of the thread function, thrFunc:

void dynamic * private thrFunc(void dynamic * private d) {
    stage_t dynamic * private S = d;
    stage_t dynamic * private nextS = S->next;
    char private * private ldata;
    ...

The function takes and returns a private void pointer to dynamically shared data. In reality this pointer points to a stage structure–this is reflected in the definition of S. Notice that in C a void pointer can be converted to a typed pointer. In SharC, the compiler makes sure that the qualifiers match.

The next field of S inherits its qualifier private from S. The data it points to (the next stage) is dynamic, as per the definition of stage.

Finally there is a private pointer to a private buffer, which will be used to point to the shared buffer. Since in general private and shared don’t mix, a sharing cast will be performed in the body of thrFunc (see below).

Here’s the rest of thrFunc:

    ...
    while (notDone) {
        mutexLock(S->mut);
        while (S->sdata == NULL)
            condWait(S->cv,S->mut);
        ldata = SCAST(char private *, S->sdata);
        condSignal(S->cv);
        mutexUnlock(S->mut);
        S->fun(ldata);
        if (nextS) {
            mutexLock(nextS->mut);
            while (nextS->sdata)
                condWait(nextS->cv,nextS->mut);
            nextS->sdata =
                SCAST(char locked(nextS->mut) *, ldata);
            condSignal(nextS->cv);
            mutexUnlock(nextS->mut);
        }
    }
    return NULL;
}

The stage S that was passed to the thread function contains shared pointer sdata. To access this pointer we need to take the correct lock, S->mut. This is the lock that was specified in the declaration of sdata inside stage.

First we loop waiting for S->sdata to become non-null. Another thread is supposed to initialize that pointer. As long as the pointer is null, we block on the condition variable S->cv.

while (S->sdata == NULL)
    condWait(S->cv,S->mut);

Notice that condWait takes the mutex as its second argument. It has to unlock it while waiting for the signal, otherwise no other thread would have a chance to modify S->sdata (and we’d have a livelock). (In the channel analogy, we are performing a synchronous receive.)

Now we are ready to call the function S->fun to process the buffer. However, that function is declared to take a private buffer, not a shared buffer. It could, for instance, be a library function that has no provisions for sharing. It could be strlen or a Fast Fourier Transform.

The programmer knows that it is safe to call this function, because the buffer has been handed over between threads and shared access is no longer in the picture. What remains is to convince the compiler that that’s okay, and to add some code that does necessary checks at runtime. All this is done by SCAST.

The line

ldata = SCAST(char private *, S->sdata);

casts S->sdata to a pointer to a private (thread-local) buffer and stores it in the private variable ldata (local data). It also nulls the source and makes sure there are no more references to it (a very clever reference counting scheme, remember?). If this looks to you like move semantics, you get the right idea.

Still under lock, we signal the previous thread that the shared buffer has been nulled so it can re-fill it.

We are now ready to release the lock and do some local processing. We call the function S->fun with the (now private) buffer.

If there is a next stage (pointed to by S->next), we want to pass our processed buffer to it. Before we do that, we have to wait until the buffer in the next stage becomes null. (In the channel analogy, we are now doing a synchronous send.)

while (nextS->sdata)
    condWait(nextS->cv,nextS->mut);

We have to do it under the next stage lock.

We assign our current buffer to the next stage and signal the next thread. Here, again, we have to perform a cast: our buffer is private, and next->sdata is shared and locked under next->mut. The statement:

nextS->sdata = SCAST(char locked(nextS->mut) *, ldata);

casts ldata (local data) to a pointer to char locked(nextS->mut). This is exactly the opposite of what we did earlier (casting shared data to local data). As before, the runtime has to null the source and check if there are no other references to it. This way thread-local unique data may be moved to another thread. If you followed my previous posts, you notice the pattern: unique data, move semantics.

Conclusion

What is unique in SharC approach is the presence of dynamically checked casts that are used to change sharing modes at runtime. This is probably the best that can be done on top of C.

When designing a new language, we could use different mechanisms to achieve the same goals. For instance, the buffer could be declared unique. The move semantics would then guarantee the absence of dangerous aliasing thus eliminating the need for behind-the-scenes reference counting.

The problem of passing a unique object to an unsuspecting library function (without nulling the source) can be solved by declaring the function parameter as lent (which should be the default).

If you read this space on a semi-regular basis, you might have noticed that recently it turned into a series of “Cliff Notes for CS grad students.” That was because I tried to prepare myself for the task of designing a multithreading system and memory model for the D programming language. My future posts will (hopefully) track progress in that direction.