Discussion:
pthread_getattr_np
Szabolcs Nagy
2013-03-31 17:35:19 UTC
pthread_getattr_np is used by some libs (gc, sanitizer) to get the
beginning of the stack of a thread

it is a gnu extension but bsds have similar non-portable functions
(none of them are properly documented)

glibc: pthread_getattr_np
freebsd: pthread_attr_get_np
netbsd: pthread_attr_get_np and pthread_getattr_np
openbsd: pthread_stackseg_np
osx: pthread_get_stackaddr_np, pthread_get_stacksize_np
solaris: thr_stksegment
hp-ux: _pthread_stack_info_np

(glibc and freebsd use locks to synchronize the reading of
thread attributes, may matter for detach state)
(glibc and openbsd try to get correct info for the main
stack others don't)
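For reference, a minimal sketch of how a caller (a gc, say) would typically use the glibc variant to get the stack range of the calling thread. The helper name current_stack_bounds is made up for illustration; pthread_getattr_np, pthread_attr_getstack and pthread_attr_destroy are the real calls, and _GNU_SOURCE is assumed to be needed for the declaration.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: fetch the stack range [*low, *high) of the
 * calling thread. Returns 0 on success or an error number on failure. */
int current_stack_bounds(void **low, void **high)
{
    pthread_attr_t attr;
    void *addr;
    size_t size;
    int err;

    /* GNU extension: fill attr with the attributes of a running thread */
    err = pthread_getattr_np(pthread_self(), &attr);
    if (err != 0)
        return err;

    /* pthread_attr_getstack reports the lowest address and the size */
    err = pthread_attr_getstack(&attr, &addr, &size);
    pthread_attr_destroy(&attr);
    if (err != 0)
        return err;

    *low = addr;
    *high = (char *)addr + size;
    return 0;
}
```

The "beginning of the stack" that such callers want is *high here, since the stack grows downward on the usual targets.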

returning reasonable result for the main stack is not trivial
(the stack starts at some unspecified address and can grow downward
until the stack rlimit or some already mmapped region is hit)

possible ways to get the top of the main thread stack:

1) /proc/self/maps (fopen,scanf,.. this is precise, glibc does this)

2) save the top address of the stack at libc entry somewhere
(eg by keeping a pointer to the original environ which is high
up the stack) this is a good approximation but underestimates
top by a few pages

3) the previous approach can be tweaked: the real top is close
so the next few pages can be checked if they are mapped (eg with
madvise) and we declare the address of the first unmapped page
as the top (usually this gives precise result, but when the pages
above the stack are mapped it can overestimate top by a few pages)
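A rough sketch of the probing in 3), assuming Linux, where madvise fails with ENOMEM on an unmapped range. The name probe_stack_top and the max_pages parameter are made up for illustration; the starting address is expected to be something known to live on the stack (eg the saved environ pointer).

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: starting from an address known to be on the stack,
 * probe upward one page at a time (at most max_pages probes) and return
 * the address of the first page that is not mapped. If every probed page
 * is mapped, return the address just past the last probed page. */
uintptr_t probe_stack_top(uintptr_t known, int max_pages)
{
    uintptr_t pagesz = (uintptr_t)sysconf(_SC_PAGESIZE);
    /* first candidate: the page right above the one containing known */
    uintptr_t page = (known & ~(pagesz - 1)) + pagesz;
    int i;

    for (i = 0; i < max_pages; i++) {
        /* MADV_NORMAL is a harmless hint; ENOMEM means "not mapped" */
        if (madvise((void *)page, pagesz, MADV_NORMAL) == -1 &&
            errno == ENOMEM)
            return page;
        page += pagesz;
    }
    return page;
}
```

As noted above, this overestimates the top when something else happens to be mapped directly above the stack.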

then the stack is [top-rlimit,top]
(the low end should be tweaked when it overlaps with existing
mappings, the current sp is below it or rlimit is unlimited)

i looked at how the stack is laid out by the kernel:

according to linux/fs/exec.c the stack top looks like
...|args|env vars|execed filename|(void*)0|(top)
and then in linux/fs/binfmt_elf.c
(libc entry)|argc|argv|environ|auxv|16byte random|capability string|...
the top is page aligned and from args to the top it is
at most 32 pages (ARG_MAX)

the libc entry sp is saved by the kernel and it can be
read out from /proc/self/stat (the startstack field)
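A sketch of reading that value, assuming the field layout documented in proc(5), where startstack is field 28 and is printed in decimal. The function name read_startstack is made up. Some kernels or sandboxes may report 0 for this field, so a caller should treat 0 as "unknown".

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read the startstack field (field 28 per proc(5))
 * from /proc/self/stat. Returns 0 if the file cannot be read or parsed. */
uintptr_t read_startstack(void)
{
    char buf[4096];
    FILE *f = fopen("/proc/self/stat", "r");
    char *p;
    size_t n;
    int field;

    if (!f)
        return 0;
    n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* field 2 (comm) may contain spaces, so resume after the last ')' */
    p = strrchr(buf, ')');
    if (!p)
        return 0;
    p++;
    /* p points at the space before field 3; step over fields 3..27 so
     * that p ends up at the space before field 28 */
    for (field = 3; field < 28; field++) {
        p = strchr(p + 1, ' ');
        if (!p)
            return 0;
    }
    return (uintptr_t)strtoull(p, NULL, 10);
}
```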
Rich Felker
2013-03-31 18:07:17 UTC
Post by Szabolcs Nagy
pthread_getattr_np is used by some libs (gc, sanitizer) to get the
beginning of the stack of a thread
it is a gnu extension but bsds have similar non-portable functions
(none of them are properly documented)
glibc: pthread_getattr_np
freebsd: pthread_attr_get_np
netbsd: pthread_attr_get_np and pthread_getattr_np
openbsd: pthread_stackseg_np
osx: pthread_get_stackaddr_np, pthread_get_stacksize_np
solaris: thr_stksegment
hp-ux: _pthread_stack_info_np
(glibc and freebsd use locks to synchronize the reading of
thread attributes, may matter for detach state)
(glibc and openbsd try to get correct info for the main
stack others don't)
returning reasonable result for the main stack is not trivial
(the stack starts at some unspecified address and can grow downward
until the stack rlimit or some already mmapped region is hit)
1) /proc/self/maps (fopen,scanf,.. this is precise, glibc does this)
2) save the top address of the stack at libc entry somewhere
(eg by keeping a pointer to the original environ which is high
up the stack) this is a good approximation but underestimates
top by a few pages
3) the previous approach can be tweaked: the real top is close
so the next few pages can be checked if they are mapped (eg with
madvise) and we declare the address of the first unmapped page
as the top (usually this gives precise result, but when the pages
above the stack are mapped it can overestimate top by a few pages)
then the stack is [top-rlimit,top]
(the low end should be tweaked when it overlaps with existing
mappings, the current sp is below it or rlimit is unlimited)
Getting the high address (or "top" as you've called it) is trivial;
your efforts to find the end of the last page that's part of the
"stack mapping" are unnecessary. Any address that's past the address
of any automatic variable in the main thread, but such that all pages
between are valid, is a valid choice for the upper-limit address. The
hard part is getting the lower-limit. The rlimit is not a valid way to
measure this. For example, rlimit could be unlimited, or the stack
might have already grown large before the rlimit was reduced.

In practice, it seems like GC applications only care about the start
(upper limit) of the stack, not the other end; they use the current
stack pointer for the other limit. We could probe the current stack
pointer of the target thread by freezing it (with the synccall magic),
but this seems like it might be excessively costly for no practical
benefit...
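To illustrate the point, a minimal sketch of the two bounds a GC would use under this view. It assumes a downward-growing stack and that environ still points at the original environment array on the main stack (it would have to be saved at startup, before any setenv or putenv call can move it). Both function names are made up for illustration.

```c
#include <stdint.h>

extern char **environ;

/* Hypothetical: an upper-limit address for the main thread's stack.
 * The original environ array sits above every automatic variable of
 * the main thread, and all pages in between are mapped. */
uintptr_t main_stack_upper_bound(void)
{
    return (uintptr_t)environ;
}

/* Hypothetical: approximate the current stack pointer of the calling
 * thread by taking the address of a local variable. */
uintptr_t approx_sp(void)
{
    volatile char marker = 0;
    return (uintptr_t)&marker;
}
```

When called from the main thread, a scan would then cover [approx_sp(), main_stack_upper_bound()) without ever needing the lower limit of the mapping.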

Rich
Szabolcs Nagy
2013-03-31 20:51:39 UTC
Post by Rich Felker
Getting the high address (or "top" as you've called it) is trivial;
your efforts to find the end of the last page that's part of the
"stack mapping" are unnecessary. Any address that's past the address
of any automatic variable in the main thread, but such that all pages
between are valid, is a valid choice for the upper-limit address. The
yes but rlimit counts from the high end of the stack
so if [highend-rlimit, highend] method is used then
you have to find the real high end to have a good
lowend
Post by Rich Felker
hard part is getting the lower-limit. The rlimit is not a valid way to
measure this. For example, rlimit could be unlimited, or the stack
might have already grown large before the rlimit was reduced.
yes but there is no valid way: the libs i saw queried
this info once, even though rlimit can change and one
can map or unmap areas in the way of the stack growth

so the api only makes sense if one does not do such
things, in which case rlimit gives a useful estimate
Post by Rich Felker
In practice, it seems like GC applications only care about the start
(upper limit) of the stack, not the other end; they use the current
stack pointer for the other limit. We could probe the current stack
pointer of the target thread by freezing it (with the synccall magic),
but this seems like it might be excessively costly for no practical
benefit...
eg. address sanitizer creates a shadow map for the stack so
at least it needs a reasonably sized upper bound on the
stack size (but it does the /proc parsing magic itself for
the main thread at startup so we don't have to support that)

if the lowend is not used otherwise then we can give arbitrary
result (eg always returning highend-5MB or using the rlimit
truncated to some value when it's unlimited)

all the calls to this function seem to use pthread_self()
at thread creation or startup time, so synccall is probably
not needed to get a sp

to get a 'precise' lowend one can:
1) parse /proc/self/maps which gives the current [low,high] mapping
and 'prev' the high end of the last mapping below the stack
2) if we are the main thread check if low <= sp <= high
3) check rlimit

lowend = min(max(prev, high-rlimit, high-1G), low)

then we can return [lowend,high] or [lowend,libc_high]
(where libc_high is below the real high, but we need the
real one for the calculations)
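The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached implementation: the names find_mapping, stack_lowend and estimate_main_stack are made up, an unlimited rlimit is treated as "no constraint", and the sp check in 2) is covered by looking up the mapping that contains an address already known to be on the main stack.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

/* Find the mapping in /proc/self/maps that contains addr. Stores its
 * bounds in *low and *high, and the high end of the mapping just below
 * it in *prev (0 if none). Returns 0 on success, -1 otherwise. */
int find_mapping(uintptr_t addr, uintptr_t *low, uintptr_t *high,
                 uintptr_t *prev)
{
    FILE *f = fopen("/proc/self/maps", "r");
    char line[512];
    uintptr_t last_end = 0;
    int ret = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        uintmax_t start, end;
        /* each line begins with "start-end" in hexadecimal */
        int ok = sscanf(line, "%jx-%jx", &start, &end) == 2;

        /* discard the tail of an overlong line (long path names) */
        if (!strchr(line, '\n')) {
            int c;
            while ((c = fgetc(f)) != EOF && c != '\n')
                ;
        }
        if (!ok)
            continue;
        if (addr >= start && addr < end) {
            *low = (uintptr_t)start;
            *high = (uintptr_t)end;
            *prev = last_end;
            ret = 0;
            break;
        }
        last_end = (uintptr_t)end;
    }
    fclose(f);
    return ret;
}

/* lowend = min(max(prev, high-rlimit, high-1G), low) */
uintptr_t stack_lowend(uintptr_t low, uintptr_t high, uintptr_t prev,
                       rlim_t rlimit)
{
    const uintptr_t cap = (uintptr_t)1 << 30;   /* the 1G cap */
    uintptr_t lowend = prev;

    if (high > cap && high - cap > lowend)
        lowend = high - cap;
    /* skip the rlimit term when it is unlimited or would underflow */
    if (rlimit != RLIM_INFINITY && rlimit < high &&
        high - (uintptr_t)rlimit > lowend)
        lowend = high - (uintptr_t)rlimit;
    return low < lowend ? low : lowend;
}

/* Estimate [*lowend, *high) for the stack that contains addr. */
int estimate_main_stack(uintptr_t addr, uintptr_t *lowend, uintptr_t *high)
{
    uintptr_t low, prev;
    struct rlimit rl;

    if (find_mapping(addr, &low, high, &prev) != 0)
        return -1;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *lowend = stack_lowend(low, *high, prev, rl.rlim_cur);
    return 0;
}
```

For example, with low = 0x7000, high = 0x9000, prev = 0x1000 and an rlimit of 0x4000, stack_lowend returns 0x5000: the rlimit term wins the max, and it is still below the current low end of the mapping.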
Rich Felker
2013-03-31 21:00:05 UTC
Post by Szabolcs Nagy
Post by Rich Felker
In practice, it seems like GC applications only care about the start
(upper limit) of the stack, not the other end; they use the current
stack pointer for the other limit. We could probe the current stack
pointer of the target thread by freezing it (with the synccall magic),
but this seems like it might be excessively costly for no practical
benefit...
eg. address sanitizer creates a shadow map for the stack so
at least it needs a reasonably sized upper bound on the
stack size (but it does the /proc parsing magic itself for
the main thread at startup so we don't have to support that)
if the lowend is not used otherwise then we can give arbitrary
result (eg always returning highend-5MB or using the rlimit
truncated to some value when it's unlimited)
all the calls to this function seem to use pthread_self()
at thread creation or startup time, so synccall is probably
not needed to get a sp
I just meant if we want the API to work in general...
Post by Szabolcs Nagy
1) parse /proc/self/maps which gives the current [low,high] mapping
and 'prev' the high end of the last mapping below the stack
2) if we are the main thread check if low <= sp <= high
3) check rlimit
Parsing /proc/self/maps is utterly useless for non-main-thread. Unless
the thread has a guard page, its stack mapping can be adjacent to
another thread's stack mapping, and thus they can get merged into a
single mapping.

Rich
Szabolcs Nagy
2013-03-31 21:37:59 UTC
Post by Rich Felker
Post by Szabolcs Nagy
1) parse /proc/self/maps which gives the current [low,high] mapping
and 'prev' the high end of the last mapping below the stack
2) if we are the main thread check if low <= sp <= high
3) check rlimit
Parsing /proc/self/maps is utterly useless for non-main-thread. Unless
the thread has a guard page, its stack mapping can be adjacent to
another thread's stack mapping, and thus they can get merged into a
single mapping.
i was only talking about the main thread case,
because you said the other case is simple

what i meant in 2) is if another thread tries
to query the stack of the main thread

the /proc/self/maps works for that too, but
then you cannot check if sp is really in the
given interval
Szabolcs Nagy
2013-03-31 23:31:35 UTC
Post by Szabolcs Nagy
1) parse /proc/self/maps which gives the current [low,high] mapping
and 'prev' the high end of the last mapping below the stack
2) if we are the main thread check if low <= sp <= high
3) check rlimit
lowend = min(max(prev, high-rlimit, high-1G), low)
attached a getstack for the main thread
