[musl] value of thread-specific data immediately after initialization

Discussion:

Benjamin Peterson

2018-09-13 20:39:38 UTC

POSIX says that after the creation of a thread-specific key, its associated value should be NULL in all threads. musl may not uphold this requirement if a particular key is deleted and later resued.

Here's an example of the bug:

#include <pthread.h>
#include <stdio.h>

int main() {
pthread_key_t k;
pthread_key_create(&k, NULL);
printf("%p\n", pthread_getspecific(k));
pthread_setspecific(k, &k);
pthread_key_delete(k);
pthread_key_create(&k, NULL);
printf("%p\n", pthread_getspecific(k));
pthread_key_delete(k);
return 0;
}

Per POSIX, I would expect this testcase to output two NULLs. However, musl prints the address of the old data even after the second initialization:

$ musl-gcc -o test test.c
$ ./test
0
0x7fff2ba3d004

Rich Felker

2018-09-17 20:54:56 UTC

Permalink

Post by Benjamin Peterson
POSIX says that after the creation of a thread-specific key, its
associated value should be NULL in all threads. musl may not uphold
this requirement if a particular key is deleted and later resued.
#include <pthread.h>
#include <stdio.h>
int main() {
pthread_key_t k;
pthread_key_create(&k, NULL);
printf("%p\n", pthread_getspecific(k));
pthread_setspecific(k, &k);
pthread_key_delete(k);
pthread_key_create(&k, NULL);
printf("%p\n", pthread_getspecific(k));
pthread_key_delete(k);
return 0;
}
Per POSIX, I would expect this testcase to output two NULLs.
However, musl prints the address of the old data even after the
$ musl-gcc -o test test.c
$ ./test
0
0x7fff2ba3d004

I'm aware of this, and was aware of it when I wrote the code. My view
was that it's a programming error to delete a key while there are
still values associated with it, especially when the intended usage
has dtors (which obviously couldn't be run) linked to the keys, with
the implication that the data stored may have nontrivial destruction
requirements. However, this is not in agreement with POSIX, which
says:

"The thread-specific data values associated with key need not be
NULL at the time pthread_key_delete() is called."

I think this should be fixed, but I'm not sure the whole thing is
sufficiently specified. For example what is supposed to happen if a
call to pthread_key_delete is made concurrently with the exit of
another thread?

I think pthread_key_create and delete need to take an rwlock for write
on the keys (dtors) array, and __pthread_tsd_run_dtors needs to hold a
read lock while walking the list, releasing and reacquiring it
whenever it finds a dtor it needs to call. This at least makes it
thread-safe. The current code is not; if a key is deleted while data
remains, the loop in __pthread_tsd_run_dtors has a TOCTOU race where
it could find (self->tsd[i] && keys[i]) true, then crash when calling
keys[i] if keys[i] has become a null pointer.

For clearing values at delete time, I don't want to use sequence
numbers and lazy deletion. They waste a lot of memory, aren't robust
unless they're very large (at least 64-bit, preferably larger), and
slow down read access to tsd values. Instead I think we can move
pthread_key_delete to a separate file (so it doesn't pull in bulky
dependencies if not linked) and have it call __synccall to zero the
value in all threads. This is very expensive, but deletion is not
something you expect to have happen often if at all.

If we maintained a linked list of all live threads we could just walk
that instead, but doing that would impose heavy synchronization costs
on reasonable operations rather than just on unreasonable ones, I
think.

One mitigation of the cost would be not to reuse keys until they're
exhausted. That is, on deletion, set the dtor value to a sentinel
rather than clearing it. Then, pthread_key_create would only allocate
new key values that haven't been recycled until they all run out. When
they run out, __synccall would be used to clear all the stale values
at once, and all the sentinel dtor pointers could be cleared for
reuse.

Rich

Rich Felker

2018-09-18 03:18:07 UTC

Permalink

Post by Rich Felker

I'm aware of this, and was aware of it when I wrote the code. My view
was that it's a programming error to delete a key while there are
still values associated with it, especially when the intended usage
has dtors (which obviously couldn't be run) linked to the keys, with
the implication that the data stored may have nontrivial destruction
requirements. However, this is not in agreement with POSIX, which
"The thread-specific data values associated with key need not be
NULL at the time pthread_key_delete() is called."
I think this should be fixed, but I'm not sure the whole thing is
sufficiently specified. For example what is supposed to happen if a
call to pthread_key_delete is made concurrently with the exit of
another thread?
I think pthread_key_create and delete need to take an rwlock for write
on the keys (dtors) array, and __pthread_tsd_run_dtors needs to hold a
read lock while walking the list, releasing and reacquiring it
whenever it finds a dtor it needs to call. This at least makes it
thread-safe. The current code is not; if a key is deleted while data
remains, the loop in __pthread_tsd_run_dtors has a TOCTOU race where
it could find (self->tsd[i] && keys[i]) true, then crash when calling
keys[i] if keys[i] has become a null pointer.
For clearing values at delete time, I don't want to use sequence
numbers and lazy deletion. They waste a lot of memory, aren't robust
unless they're very large (at least 64-bit, preferably larger), and
slow down read access to tsd values. Instead I think we can move
pthread_key_delete to a separate file (so it doesn't pull in bulky
dependencies if not linked) and have it call __synccall to zero the
value in all threads. This is very expensive, but deletion is not
something you expect to have happen often if at all.
If we maintained a linked list of all live threads we could just walk
that instead, but doing that would impose heavy synchronization costs
on reasonable operations rather than just on unreasonable ones, I
think.
One mitigation of the cost would be not to reuse keys until they're
exhausted. That is, on deletion, set the dtor value to a sentinel
rather than clearing it. Then, pthread_key_create would only allocate
new key values that haven't been recycled until they all run out. When
they run out, __synccall would be used to clear all the stale values
at once, and all the sentinel dtor pointers could be cleared for
reuse.

I have a patch I'm testing that implements this idea, including the
mitigation in the last paragraph. It seems to work fine; I've attached
it in case you or anyone else wants to take a look before I'm ready to
commit/push.

Note that your test program is insufficient to demonstrate that it's
working; it assumes the same key will be handed out again, which now
won't happen until all keys have been exhausted. The best way I know
to test is loop allocating keys until you get an error, setting a
value for all of them, then delete and recreate just one and confirm
that its value is null.

Rich