Todo for release?
Rich Felker
2012-08-13 18:53:30 UTC
Starting a new thread since the old one got too long and OT.. :-)

New stuff we have so far:

- Blowfish crypt
- MIPS dynamic linker
- Major MIPS bug fixes
- ARM hard float support
- BSD fgetln function
- Major bug fix in wcsstr
- Optimized memcpy for i386 and x86_64
- Public exposure for getdents
- Added significand function to math lib

The requested stuff that's still pending (with person who requested or
is working on it):

- Exception behavior in i386/x86_64 exponential asm (nsz).
- Finer-grained _XOPEN_SOURCE (Luca)
- Support for __progname (Daniel)
- MD5 and SHA crypt (nsz?)

In addition, there are the GNU hash and dladdr patches (Boris)
pending, which I don't want to overlook, but I think unless I manage
to get a discussion of simplifying them finished really soon and find
some time to thoroughly test them, it's best to hold off on this until
just after the next release so we have plenty of time to test and tune
it before another release.

Anything else I missed?

Rich
Szabolcs Nagy
2012-08-13 21:31:54 UTC
Post by Rich Felker
- MD5 and SHA crypt (nsz?)
i only have code for the hashes, not crypt

it seems these crypt schemes are fairly ugly
i don't mind if their implementation is delayed


md5 based crypt is not recommended anymore
http://phk.freebsd.dk/sagas/md5crypt_eol.html

the sha2 based crypt seems to be designed recently
and the spec has a public domain implementation
http://www.akkadia.org/drepper/SHA-crypt.txt
Rich Felker
2012-08-13 21:53:44 UTC
Post by Szabolcs Nagy
Post by Rich Felker
- MD5 and SHA crypt (nsz?)
i only have code for the hashes, not crypt
it seems these crypt schemes are fairly ugly
i don't mind if their implementation is delayed
It looks like the API the hash functions provide matches closely what
the BSD crypt functions expect, so I think we could potentially just
use or adapt one of them..
Post by Szabolcs Nagy
md5 based crypt is not recommended anymore
http://phk.freebsd.dk/sagas/md5crypt_eol.html
Indeed. But is it used in existing Linux user databases on any
significant scale? If not, I agree we can just drop it.
Post by Szabolcs Nagy
the sha2 based crypt seems to be designed recently
and the spec has a public domain implementation
http://www.akkadia.org/drepper/SHA-crypt.txt
I'm confused by all the SHA names (1/2/256/512)...

Rich
Solar Designer
2012-08-13 22:06:01 UTC
Post by Rich Felker
Post by Szabolcs Nagy
md5 based crypt is not recommended anymore
http://phk.freebsd.dk/sagas/md5crypt_eol.html
Indeed. But is it used in existing Linux user databases on any
significant scale?
It is.
Post by Rich Felker
If not, I agree we can just drop it.
We should support it.

Maybe use my MD5 code, but for md5crypt write new code to avoid the
beerware license (I would be happy to buy phk a beer, but having to
mention another license for a component in musl's license is not nice).
Post by Rich Felker
I'm confused by all the SHA names (1/2/256/512)...
You need sha512crypt and sha256crypt. SHA-1 is irrelevant (not used in
any common crypt(3) flavor). SHA-2 is a common name for the
SHA-224/256/384/512 primitives (although these are actually different).

Of sha512crypt and sha256crypt, only the former is commonly used, but
you may choose to support both anyway (systems generally support both).

The high-level structure of md5crypt, sha512crypt, and sha256crypt is
similar, but it'd be tricky/unreasonable to exploit that for reduced
code size as you'd likely increase source code complexity and make the
code slower (important in case of sha512crypt and sha256crypt, which
support variable iteration counts).

Alexander
Szabolcs Nagy
2012-08-14 15:02:37 UTC
Post by Solar Designer
Maybe use my MD5 code, but for md5crypt write new code to avoid the
beerware license (I would be happy to buy phk a beer, but having to
mention another license for a component in musl's license is not nice).
the license seems to have been changed to 2-clause bsd in 2003
http://svnweb.FreeBSD.org/base/head/lib/libcrypt/crypt-md5.c?view=log

(openbsd still uses the beerware version though)
Szabolcs Nagy
2012-08-15 00:30:35 UTC
Post by Szabolcs Nagy
Post by Solar Designer
Maybe use my MD5 code, but for md5crypt write new code to avoid the
beerware license (I would be happy to buy phk a beer, but having to
mention another license for a component in musl's license is not nice).
the license seems to have been changed to 2-clause bsd in 2003
http://svnweb.FreeBSD.org/base/head/lib/libcrypt/crypt-md5.c?view=log
(openbsd still uses the beerware version though)
i attached a crypt_md5 based on this
(not complete)
Solar Designer
2012-08-13 22:20:58 UTC
Post by Szabolcs Nagy
the sha2 based crypt seems to be designed recently
and the spec has a public domain implementation
http://www.akkadia.org/drepper/SHA-crypt.txt
Unfortunately, the reference implementation uses alloca() on both salt
and key strings. glibc has recently fixed that by using malloc() and
returning NULL on its failure, but that's not great.

Also, if potentially unreasonably long running time is a concern, it
should be noted that for md5crypt and sha*crypt it is roughly
proportional to password length (modulo block size of the underlying
primitive). So e.g. a 1 million char password (which may realistically
be passed to libc's crypt() e.g. via a scripting language) may take
thousands of times longer to be hashed than the sysadmin had intended by
tuning the iteration count.

I'm not sure whether and how a libc should deal with that. In a sense,
it is similar to the issue of high iteration counts, but it's worse in
that the input that may trigger the issue very often comes from a remote
system.

For the extended DES-based crypt() hashes that we now support, this
issue mostly does not arise since the password (even if very long, which
is supported) is passed through just one instance of DES block-by-block,
which is quick. The multiple iterations loop is then applied to the
"compressed" version of the password.

For bcrypt hashes, the issue does not arise because they truncate
passwords at 72 characters (not great, but that's how they're defined,
and it's good enough for practical purposes so far).

Alexander
Rich Felker
2012-08-14 01:46:53 UTC
Post by Solar Designer
Post by Szabolcs Nagy
the sha2 based crypt seems to be designed recently
and the spec has a public domain implementation
http://www.akkadia.org/drepper/SHA-crypt.txt
Unfortunately, the reference implementation uses alloca() on both salt
and key strings.
Why? Does it need working space proportional to the input length?
Post by Solar Designer
glibc has recently fixed that by using malloc() and
returning NULL on its failure, but that's not great.
Hmm, and as you've pointed out several times, that means the daemon
will almost surely crash, since few of them check for failure...
Post by Solar Designer
Also, if potentially unreasonably long running time is a concern, it
should be noted that for md5crypt and sha*crypt it is roughly
proportional to password length (modulo block size of the underlying
primitive). So e.g. a 1 million char password (which may realistically
be passed to libc's crypt() e.g. via a scripting language) may take
thousands of times longer to be hashed than the sysadmin had intended by
tuning the iteration count.
I'm not sure whether and how a libc should deal with that. In a sense,
it is similar to the issue of high iteration counts, but it's worse in
that the input that may trigger the issue very often comes from a remote
system.
In light of both the alloca issue and the way runtime scales with key
length, I think we should just put an arbitrary limit on the key
length and return failure for longer keys. This should not affect any
real-world authentication systems, since the daemon you're attempting
to login to will also be placing a (probably much lower) limit on the
input buffer size for passwords (if it's not, you can trivially DoS
the server by sending gigabyte-long passwords for random users).

Something like 128-256 bytes would probably be a very generous limit.
Post by Solar Designer
For the extended DES-based crypt() hashes that we now support, this
issue mostly does not arise since the password (even if very long, which
is supported) is passed through just one instance of DES block-by-block,
which is quick. The multiple iterations loop is then applied to the
"compressed" version of the password.
For bcrypt hashes, the issue does not arise because they truncate
passwords at 72 characters (not great, but that's how they're defined,
and it's good enough for practical purposes so far).
Thanks for explaining the issues well.

Rich
Solar Designer
2012-08-14 02:13:17 UTC
Post by Rich Felker
Post by Solar Designer
Post by Szabolcs Nagy
the sha2 based crypt seems to be designed recently
and the spec has a public domain implementation
http://www.akkadia.org/drepper/SHA-crypt.txt
Unfortunately, the reference implementation uses alloca() on both salt
and key strings.
Why? Does it need working space proportional to the input length?
It uses implementations of SHA-512 and SHA-256 that assume alignment, so
it provides such alignment by copying the inputs to aligned buffers if
the inputs to crypt() don't happen to be already aligned.

The same applies to glibc's md5crypt (but we're not going to use that
implementation of md5crypt anyway).
Post by Rich Felker
In light of both the alloca issue and the way runtime scales with key
length, I think we should just put an arbitrary limit on the key
length and return failure for longer keys. This should not affect any
real-world authentication systems, since the daemon you're attempting
to login to will also be placing a (probably much lower) limit on the
input buffer size for passwords (if it's not, you can trivially DoS
the server by sending gigabyte-long passwords for random users).
Something like 128-256 bytes would probably be a very generous limit.
Yes, but the failure should be indicated in the way we discussed - those
"*0" and "*1" strings, not NULL. Some real-world authentication systems
may be affected; it is not unrealistic even for a C program to use a
buffer several kilobytes large.
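
A minimal sketch of how a length cap and the "*0"/"*1" failure strings could fit together. The names KEY_MAX, crypt_failure_token and reject_long_key are hypothetical, and the 256-byte cap is only an assumption taken from the range floated above; nothing here is musl's actual code.

```c
#include <assert.h>
#include <string.h>

#define KEY_MAX 256  /* assumed cap, not a value anyone has settled on */

/* Returns a string that can never be a valid hash and never equals the
 * setting: "*0" normally, "*1" when the setting itself starts with "*0". */
static const char *crypt_failure_token(const char *setting)
{
    assert(setting != NULL);
    return strncmp(setting, "*0", 2) == 0 ? "*1" : "*0";
}

/* Returns the failure token when key is longer than KEY_MAX bytes, or
 * NULL when the key is short enough for the caller to go on and hash it. */
static const char *reject_long_key(const char *key, const char *setting)
{
    assert(key != NULL);
    /* memchr stops at the terminator or after KEY_MAX + 1 bytes, so the
     * cost of this check does not grow with the length of a huge key. */
    if (memchr(key, '\0', KEY_MAX + 1) == NULL)
        return crypt_failure_token(setting);
    return NULL;
}
```

Because the token is never NULL and never equal to the setting, a caller that compares the result against the stored hash without checking for errors still sees a mismatch rather than a crash.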

Alexander
Rich Felker
2012-08-14 02:35:08 UTC
Post by Solar Designer
Post by Rich Felker
Post by Solar Designer
Post by Szabolcs Nagy
the sha2 based crypt seems to be designed recently
and the spec has a public domain implementation
http://www.akkadia.org/drepper/SHA-crypt.txt
Unfortunately, the reference implementation uses alloca() on both salt
and key strings.
Why? Does it need working space proportional to the input length?
It uses implementations of SHA-512 and SHA-256 that assume alignment, so
it provides such alignment by copying the inputs to aligned buffers if
the inputs to crypt() don't happen to be already aligned.
This could be solved by doing the copy a block at a time, and
submitting the blocks to the encryption code a block at a time.
Failure to do so is just laziness, and it's the same type of laziness
that's all over glibc.
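
A rough sketch of the block-at-a-time copy, assuming a 64-byte block as in MD5 and SHA-256. The callback type block_fn, the function feed_blocks and the demo callback count_block are all hypothetical names for illustration; a real hash would also buffer the trailing partial block, which this sketch leaves to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 64  /* bytes per block, as in MD5 and SHA-256 */

/* Hypothetical callback: consumes one aligned block of input. */
typedef void (*block_fn)(void *ctx, const uint32_t block[BLOCK_SIZE / 4]);

/* Feeds every complete block of in to fn through one fixed, aligned
 * bounce buffer, so stack use stays constant however long the input is.
 * Returns the number of bytes consumed (a multiple of BLOCK_SIZE). */
static size_t feed_blocks(void *ctx, block_fn fn,
                          const unsigned char *in, size_t len)
{
    uint32_t aligned[BLOCK_SIZE / 4];
    size_t done = 0;

    assert(fn != NULL && in != NULL);
    while (len - done >= BLOCK_SIZE) {
        memcpy(aligned, in + done, BLOCK_SIZE);  /* in may be unaligned */
        fn(ctx, aligned);
        done += BLOCK_SIZE;
    }
    return done;
}

/* Demo callback: counts how many blocks were delivered. */
static void count_block(void *ctx, const uint32_t block[BLOCK_SIZE / 4])
{
    (void)block;
    ++*(size_t *)ctx;
}
```

The working space is one block regardless of key length, which is what removes the need for alloca() or malloc().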

If this issue is as simple to solve as it sounds, it might make sense
to allow arbitrary key sizes. After all, programs that could be DoS'd
by long keys are already going to be limiting key length themselves.
If time grew superlinearly in key length, I'd say it should definitely
be limited, but since the growth is just linear (the expected growth
rate for any interface that takes a string argument), I think it's
less clear what should be done.
Post by Solar Designer
Post by Rich Felker
In light of both the alloca issue and the way runtime scales with key
length, I think we should just put an arbitrary limit on the key
length and return failure for longer keys. This should not affect any
real-world authentication systems, since the daemon you're attempting
to login to will also be placing a (probably much lower) limit on the
input buffer size for passwords (if it's not, you can trivially DoS
the server by sending gigabyte-long passwords for random users).
Something like 128-256 bytes would probably be a very generous limit.
Yes, but the failure should be indicated in the way we discussed - those
"*0" and "*1" strings, not NULL.
Yes, it should be treated like any other invalid input and hashed to
something that can never match. By the way, would you agree that all
programs that generate new password hashes should do so by calling
crypt twice, the second time using the output of the first as the
setting/salt, and verify that the results match? This seems to be the
only safe/portable way to make sure you got a valid hash and not an
error.

Rich
Solar Designer
2012-08-14 02:49:03 UTC
Post by Rich Felker
Post by Solar Designer
Post by Rich Felker
Post by Solar Designer
Post by Szabolcs Nagy
http://www.akkadia.org/drepper/SHA-crypt.txt
Unfortunately, the reference implementation uses alloca() on both salt
and key strings.
Why? Does it need working space proportional to the input length?
It uses implementations of SHA-512 and SHA-256 that assume alignment, so
it provides such alignment by copying the inputs to aligned buffers if
the inputs to crypt() don't happen to be already aligned.
This could be solved by doing the copy a block at a time, and
submitting the blocks to the encryption code a block at a time.
Note that this moves the copying inside the many-iterations loop, so
there will be a difference in speed.

That said, yes, e.g. my public domain implementation of MD5 avoids the
alignment requirement by possibly copying the block the first time it's
used. There may be some performance hit from using it to implement
md5crypt as compared to using glibc's implementation of MD5 (because the
copying will in fact be in the 1000 iterations loop), but not much - and
overall my implementation might be faster (for other reasons),
especially on x86, where the copying may be avoided altogether in favor
of possibly unaligned accesses.
Post by Rich Felker
If this issue is as simple to solve as it sounds, it might make sense
to allow arbitrary key sizes. After all, programs that could be DoS'd
by long keys are already going to be limiting key length themselves.
That's wishful thinking.
Post by Rich Felker
If time grew superlinearly in key length, I'd say it should definitely
be limited, but since the growth is just linear (the expected growth
rate for any interface that takes a string argument), I think it's
less clear what should be done.
Yes, it is not clear what should be done.
Post by Rich Felker
Post by Solar Designer
Yes, but the failure should be indicated in the way we discussed - those
"*0" and "*1" strings, not NULL.
Yes, it should be treated like any other invalid input and hashed to
something that can never match. By the way, would you agree that all
programs that generate new password hashes should do so by calling
crypt twice, the second time using the output of the first as the
setting/salt, and verify that the results match? This seems to be the
only safe/portable way to make sure you got a valid hash and not an
error.
This makes sense, yet it sounds overkill to me. I'd simply check for
hash && strlen(hash) >= 13.
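
Both checks can be sketched side by side. To keep the example free of any particular libc, the hashing function is passed in as a pointer with the same shape as crypt(); the typedef crypt_fn, the helper names and the two fake_* stand-ins are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same shape as crypt(3); a real caller would pass crypt itself. */
typedef char *(*crypt_fn)(const char *key, const char *setting);

/* Cheap check: non-NULL and at least as long as a traditional DES hash. */
static int hash_plausible(const char *hash)
{
    return hash != NULL && strlen(hash) >= 13;
}

/* Thorough check: hash again with the first output as the setting and
 * require the same result. The first output is copied into out because
 * crypt() may reuse a static buffer on the second call. */
static int hash_roundtrips(crypt_fn do_crypt, const char *key,
                           const char *setting, char *out, size_t out_size)
{
    assert(do_crypt != NULL && out != NULL && out_size > 0);
    const char *first = do_crypt(key, setting);
    if (first == NULL || strlen(first) >= out_size)
        return 0;
    strcpy(out, first);
    const char *second = do_crypt(key, out);
    return second != NULL && strcmp(second, out) == 0;
}

/* Stand-in that always fails with the "*0"/"*1" convention. */
static char *fake_failing_crypt(const char *key, const char *setting)
{
    static char buf[3];
    (void)key;
    strcpy(buf, strncmp(setting, "*0", 2) == 0 ? "*1" : "*0");
    return buf;
}

/* Stand-in that always "succeeds" with a fixed string. */
static char *fake_ok_crypt(const char *key, const char *setting)
{
    static char buf[] = "$x$salt$0123456789abcdef";
    (void)key;
    (void)setting;
    return buf;
}
```

With the failing stand-in, the first call yields "*0" and the second yields "*1", so the round trip reports a mismatch without the caller needing to know what an error string looks like.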

Alexander
Rich Felker
2012-08-14 02:58:05 UTC
Post by Solar Designer
Post by Rich Felker
If this issue is as simple to solve as it sounds, it might make sense
to allow arbitrary key sizes. After all, programs that could be DoS'd
by long keys are already going to be limiting key length themselves.
That's wishful thinking.
Are you sure? I haven't read the code lately, but I can't imagine any
login daemon is going to be calling realloc() in a loop to read an
arbitrarily long password before authentication. That just sounds
gratuitously broken (i.e. someone went out of their way to write
painful code that does nothing useful and makes their daemon
susceptible to DoS).
Post by Solar Designer
Post by Rich Felker
If time grew superlinearly in key length, I'd say it should definitely
be limited, but since the growth is just linear (the expected growth
rate for any interface that takes a string argument), I think it's
less clear what should be done.
Yes, it is not clear what should be done.
If it's a toss-up on whether we should limit key length for runtime
considerations, I might just go on a basis of how it affects code
complexity for handling long keys and thus still limit them.
Post by Solar Designer
Post by Rich Felker
something that can never match. By the way, would you agree that all
programs that generate new password hashes should do so by calling
crypt twice, the second time using the output of the first as the
setting/salt, and verify that the results match? This seems to be the
only safe/portable way to make sure you got a valid hash and not an
error.
This makes sense, yet it sounds overkill to me. I'd simply check for
hash && strlen(hash) >= 13.
Indeed, strlen(hash)>=13 is certainly a necessary condition, but is it
sufficient? I could imagine a hypothetical crypt implementation that
puts error messages in the unmatchable hash as a debugging aid to why
generating the hash failed, but I agree they probably don't exist.
Still, changing your password should not be a frequent action, so it
might make the most sense to do the check the way I suggested.

Rich
Solar Designer
2012-08-14 03:35:14 UTC
Post by Rich Felker
Post by Solar Designer
Post by Rich Felker
If this issue is as simple to solve as it sounds, it might make sense
to allow arbitrary key sizes. After all, programs that could be DoS'd
by long keys are already going to be limiting key length themselves.
That's wishful thinking.
Are you sure? I haven't read the code lately, but I can't imagine any
login daemon is going to be calling realloc() in a loop to read an
arbitrarily long password before authentication. That just sounds
gratuitously broken (i.e. someone went out of their way to write
painful code that does nothing useful and makes their daemon
susceptible to DoS).
I guess many daemons written in C limit the length at a few kilobytes -
which may allow for about 100 times greater than intended (by sysadmin)
crypt() running time. For md5crypt and sha*crypt, the first slowdown
occurs between length 15 and 16.
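
The 15/16 boundary for md5crypt can be checked with a little arithmetic. This is a sketch under two assumptions: MD5 padding adds one 0x80 byte plus an 8-byte length field, and the longest message hashed in a single md5crypt round is password + salt + password + 16-byte digest. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of 64-byte MD5 blocks needed for a message of len bytes,
 * counting the 9 bytes of mandatory padding. */
static size_t md5_blocks(size_t len)
{
    return (len + 9 + 63) / 64;
}

/* Blocks processed in the most expensive round, assuming that round
 * hashes password + salt + password + 16-byte digest. */
static size_t md5crypt_worst_round_blocks(size_t pwlen, size_t saltlen)
{
    assert(saltlen <= 8);  /* md5crypt uses at most 8 salt characters */
    return md5_blocks(2 * pwlen + saltlen + 16);
}
```

With an 8-character salt, a 15-character password gives 54 bytes and fits in one block, while a 16-character password gives 56 bytes and spills into a second block. Under the same assumptions, a 4096-character password needs 129 blocks in that round.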

Then, it does not take explicit realloc() for just the password string
to support arbitrarily long passwords. The daemon may be using an
abstraction layer for all strings - e.g., qmail, Postfix, and vsftpd
have such dynamic string libraries of their own, and overall this is
good (it avoids buffer overflows and artificial limits in other places).
I don't know if vsftpd would in fact pass arbitrarily long passwords to
crypt() - this is worth checking.

Finally, some services are written in languages that support dynamically
allocated strings natively. I recall that OpenStack's Python code was
patched to impose a limit of 4096 chars on passwords recently,
specifically in response to risks like what we're discussing here.
(And 4096 is still a lot - may allow for some attacks.)
Post by Rich Felker
Post by Solar Designer
Post by Rich Felker
something that can never match. By the way, would you agree that all
programs that generate new password hashes should do so by calling
crypt twice, the second time using the output of the first as the
setting/salt, and verify that the results match? This seems to be the
only safe/portable way to make sure you got a valid hash and not an
error.
This makes sense, yet it sounds overkill to me. I'd simply check for
hash && strlen(hash) >= 13.
Indeed, strlen(hash)>=13 is certainly a necessary condition, but is it
sufficient? I could imagine a hypothetical crypt implementation that
puts error messages in the unmatchable hash as a debugging aid to why
generating the hash failed, but I agree they probably don't exist.
Still, changing your password should not be a frequent action, so it
might make the most sense to do the check the way I suggested.
An aspect to consider is that if you call crypt() twice for just one
request from the user (to set a password), you effectively double the
iteration count as it relates to potential CPU time exhaustion DoS
attacks (twice as much CPU time is consumed per request). So if these
attacks are the limiting factor in a sysadmin's ability to set a higher
iteration count, this will result in the iteration count being half as
high (and accordingly weaker hashes being used on the system). (If the
protocol is such that the same request contains the old password to be
checked first, then the difference may be 3/2.) While each individual
sysadmin is somewhat unlikely to apply this reasoning and consider
password setting as the worst DoS attack vector, I think that overall
this may in fact result in somewhat lower iteration counts being used
(average across many systems).

Alexander
Rich Felker
2012-08-14 04:49:27 UTC
Post by Solar Designer
I guess many daemons written in C limit the length at a few kilobytes -
which may allow for about 100 times greater than intended (by sysadmin)
crypt() running time. For md5crypt and sha*crypt, the first slowdown
occurs between length 15 and 16.
Then, it does not take explicit realloc() for just the password string
to support arbitrarily long passwords. The daemon may be using an
abstraction layer for all strings - e.g., qmail, Postfix, and vsftpd
have such dynamic string libraries of their own, and overall this is
good (it avoids buffer overflows and artificial limits in other places).
I disagree. Avoiding artificial limits almost always means creating
difficult-to-debug corner cases when resources are exhausted. It was a
popular mantra of the GNU folks in the 80s and 90s, when they boasted
how superior their system was to Unix with its hard-coded limits. Of
course traditional Unices did have very bad, very low arbitrary
limits, and this is what allowed the GNU philosophy to look good, but
on a conceptual level, the difference was that the traditional tools
with arbitrary limits were able to promise that they would ALWAYS work
on conforming input (e.g. text files that met the line-length limit),
whereas the GNU utilities would work, well, whenever they didn't run
out of memory.

Back to the point about logins and daemons that run as root prior to
authentication: I consider it a moderate-level security bug for any
such program to allow unbounded resource allocation by an
unauthenticated client or prior to dropping privileges to the
authenticated user.

I'm also pretty cold to the idea of "safe string libraries". Just
recently I got to looking inside the MaraDNS code, and its string
library, promoted as being extremely secure, just fails to handle
allocation failures at all. Sadly this kind of attitude seems to be
common. My idea of a safe string library is snprintf. If you use plain
C strings, most of them in fixed-size buffers, and never use any
function but snprintf (with the correct length argument) to write to
them, you're not going to have string overflow exploits. And your code
is going to be a lot simpler and more robust than code that's trying
to emulate Python/JavaScript/etc.-style strings in C...
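
A small sketch of the snprintf-only style, assuming a fixed-size caller-supplied buffer. The helper name join_path is hypothetical; the point is only that the return value of snprintf tells the caller when the result did not fit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Writes "dir/name" into buf. Returns 0 on success, or -1 if the result
 * did not fit (buf then holds a truncated, NUL-terminated string). */
static int join_path(char *buf, size_t size, const char *dir, const char *name)
{
    assert(buf != NULL && size > 0);
    int n = snprintf(buf, size, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= size)
        return -1;  /* encoding error or truncation */
    return 0;
}
```

Truncation becomes an ordinary error return that the caller has to handle, rather than an overflow or a hidden allocation failure.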
Post by Solar Designer
I don't know if vsftpd would in fact pass arbitrarily long passwords to
crypt() - this is worth checking.
I actually just checked DropBear, and had a hard time finding if/how
it limits the length, but it turned out to be a simple 35000-byte
limit on packet size in the packet reception code. Presumably the vast
majority of that can be the password, if an attacker so desired.
Post by Solar Designer
Finally, some services are written in languages that support dynamically
allocated strings natively. I recall that OpenStack's Python code was
patched to impose a limit of 4096 chars on passwords recently,
specifically in response to risks like what we're discussing here.
(And 4096 is still a lot - may allow for some attacks.)
This is a great example of why the idea that higher-level languages
are more secure than C is such a fallacy... Different language idioms
just lead to different things that are easy to get wrong in
security-critical ways if you're not careful... Perhaps an ideal
language without security issues could be designed, but it would
require scrapping the idea that you can pretend resources are infinite
and that the runtime magically manages object lifetimes for you.

Rich
Rich Felker
2012-08-15 04:08:37 UTC
Post by Rich Felker
The requested stuff that's still pending (with person who requested or
- Exception behavior in i386/x86_64 exponential asm (nsz).
Committed.
Post by Rich Felker
- Finer-grained _XOPEN_SOURCE (Luca)
On hold pending details on what the real problem is; unlikely to make
it in for next release.

Luca, if you still want this, please provide details on what issues
you're facing that could be solved. I don't want to target old
versions of standards unless there's a concrete practical goal. I
mentioned one possible approach (using old versions only as a way to
reenable stuff that was removed from the standard, not a way to get
the entire outdated-standard behavior that would also require removing
new symbols) but I haven't heard back on whether that would meet your
needs.
Post by Rich Felker
- Support for __progname (Daniel)
Daniel, any more thoughts on this? Are there lots of programs that
want it that can't easily be patched to simply use argv[0] themselves?
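
For reference, the kind of patch meant here is small. A sketch, assuming the program only wants the last path component of argv[0]; the function name progname_from_argv is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Returns the last path component of argv[0], or "" if it is missing.
 * The result points into argv[0] and must not be freed. */
static const char *progname_from_argv(int argc, char **argv)
{
    assert(argv != NULL);
    if (argc < 1 || argv[0] == NULL || argv[0][0] == '\0')
        return "";
    const char *slash = strrchr(argv[0], '/');
    return slash != NULL ? slash + 1 : argv[0];
}
```

A program would call this once at the top of main() and store the result in its own variable, replacing each use of __progname.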
Post by Rich Felker
- MD5 and SHA crypt (nsz?)
Thanks nsz for the further work on MD5. It's looking like this won't
be ready in the next few days, but I'd be happy to be proven wrong.
Post by Rich Felker
Anything else I missed?
I got Landley's request for $CROSS_COMPILER prefix support into
configure too. I'd forgotten about that long-standing one.

Anything else that's been overlooked for a while? One thing that comes
to my mind is updating credits. I actually realized just after the
last release I forgot to add rdp and Solar to the COPYRIGHT file for
their contributions, and credit for rdp is actually missing from most
(all?) of the source files too - I felt really bad about this
omission. Please accept my apologies, and let me know if there are any
other credits I've forgotten to add.

Rich
Daniel Cegiełka
2012-08-15 08:55:06 UTC
Post by Rich Felker
Post by Rich Felker
- Support for __progname (Daniel)
Daniel, any more thoughts on this? Are there lots of programs that
want it that can't easily be patched to simply use argv[0] themselves?
This is not something that is absolutely necessary. __progname quite
often is used on *BSD and less on Linux (eg. Owl's msulogin,
popa3d)... but __progname is always easy to fix.

Here is the OpenBSD repo with the contents of the /bin directory:

http://www.openbsd.org/cgi-bin/cvsweb/src/bin/

And here's a list of programs (from /bin/) that require __progname (70% of all):

mv
systrace
md5
cp
chmod
cat
rmail
kill
sleep
rmdir
mkdir
extern
ps
df
rcp
ln
date
chio
domainname
stty
rm
pwd
hostname

For __progname we would probably need to modify the (asm) files in musl/crt/.

Best regards,
Daniel
Szabolcs Nagy
2012-08-15 10:20:29 UTC
Post by Daniel Cegiełka
Post by Rich Felker
Post by Rich Felker
- Support for __progname (Daniel)
Daniel, any more thoughts on this? Are there lots of programs that
want it that can't easily be patched to simply use argv[0] themselves?
This is not something that is absolutely necessary. __progname quite
often is used on *BSD and less on Linux (eg. Owl's msulogin,
popa3d)... but __progname is always easy to fix.
i think the fact that *bsd uses it
is not enough justification

openbsd uses it because it's part of
their style guide for whatever reason

"The __progname string may be used instead
of hard-coding the program name."
http://www.openbsd.org/cgi-bin/man.cgi?query=style&sektion=9

but we don't support many things from
there (like sys/queue.h)


i don't think many linux tools use it
as it's not part of the lsb and glibc
has its own silly
program_invocation_name and
program_invocation_short_name
(which are aliases to __progname_full
and __progname respectively)

the main justification i see is that
we already support bsd err and warn
apis which are required to print
the __progname as well
(currently they don't and actually
a simple warn("hi"); segfaults here
with musl but i haven't investigated
it)
Daniel Cegiełka
2012-08-15 10:53:37 UTC
Post by Szabolcs Nagy
Post by Daniel Cegiełka
Post by Rich Felker
Post by Rich Felker
- Support for __progname (Daniel)
Daniel, any more thoughts on this? Are there lots of programs that
want it that can't easily be patched to simply use argv[0] themselves?
This is not something that is absolutely necessary. __progname quite
often is used on *BSD and less on Linux (eg. Owl's msulogin,
popa3d)... but __progname is always easy to fix.
i think the fact that *bsd uses it
is not enough justification
openbsd uses it because it's part of
their style guide for whatever reason
"The __progname string may be used instead
of hard-coding the program name."
http://www.openbsd.org/cgi-bin/man.cgi?query=style&sektion=9
but we don't support many things from
there (like sys/queue.h)
i don't think many linux tools use it
as it's not part of the lsb and glibc
has its own silly
program_invocation_name and
program_invocation_short_name
(which are aliases to __progname_full
and __progname respectively)
the main justification i see is that
we already support bsd err and warn
apis which are required to print
the __progname as well
(currently they don't and actually
a simple warn("hi"); segfaults here
with musl but i haven't investigated
it)
I understand that and that's why my first sentence was: This is not
something that is absolutely necessary.

We often say that we don't want to reproduce 'ugly stuff from glibc
etc.' (e.g. __progname). This does not change the fact that a lot of
code will require patches to fix the __progname problem. If Rich has
taken the effort to rewrite/fix libc, we can fix __progname... if needed
(it's a really small patch, discussed on the list).

Daniel
John Spencer
2012-08-15 13:10:36 UTC
Post by Daniel Cegiełka
Post by Rich Felker
- Support for __progname (Daniel)
We often say that we don't want to reproduce 'ugly stuff from glibc
etc.' (eg. __progname). This does not change the fact that a lot of
code will require patches to fix __progname problem.
i don't think that "a lot" of programs need it. actually i never
encountered it in the 200 packages i ported to sabotage,
and i even doubt that it is in pkgsrc as i never heard anything about it
in the irc channel.
since this feature would add bloat to every program, i am strongly
opposed to adding it.
instead you can just do a quick sed over BSD coreutils that use it.
Post by Daniel Cegiełka
If Rich has taken
the effort to rewrite/fix libc, we can fix __progname... if needed
(it's really small patch.. discussed on the list).
Daniel
Daniel Cegiełka
2012-08-15 13:23:03 UTC
Post by John Spencer
i don't think that "a lot" of programs need it. actually i never encountered
it in the 200 packages i ported to sabotage,
and i even doubt that it is in pkgsrc as i never heard anything about it in
the irc channel.
I gave a list of packages from src/bin from OpenBSD repo (grep -r
progname src/bin).
Post by John Spencer
since this feature would add bloat to every program, i am strongly opposed
to adding it.
instead you can just do a quick sed over BSD coreutils that use it.
And this may be the best solution... I also dislike this __progname.

Daniel
Szabolcs Nagy
2012-08-15 13:32:57 UTC
Post by Szabolcs Nagy
the main justification i see is that
we already support bsd err and warn
apis which are required to print
the __progname as well
(currently they don't and actually
a simple warn("hi"); segfaults here
with musl but i haven't investigated
it)
it seems warn(0) and err(1,0) segfault
(they should handle fmt==0 before passing
it to vfprintf)
and they do not print the ': ' nor the
__progname

(perror works correctly)

test program:

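#include <stdio.h>
#include <err.h>

int main(void)
{
    warn("warntest");      /* expected: "<progname>: warntest: <error string>" */
    warn(0);               /* NULL format: expected "<progname>: <error string>" */
    perror("perrortest");
    perror("");            /* empty string: no "prefix: " is printed */
    perror(0);             /* NULL: treated like the empty string */
    return 0;
}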


$ gcc err.c && ./a.out
a.out: warntest: Success
a.out: Success
perrortest: Success
Success
Success

$ musl-gcc err.c && ./a.out
warntestNo error information
Segmentation fault (core dumped)
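
For illustration, the handling of fmt==0 and the missing "progname: " prefix described above can be sketched as follows. This is a minimal sketch and not musl's actual code: sketch_vwarn, sketch_warn, and the explicit stream, prog and err parameters are hypothetical names introduced here so the behavior can be exercised on any stream (the real functions use stderr and errno).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for vwarn(): the stream, program name and
 * error number are explicit parameters (an assumption of this sketch). */
void sketch_vwarn(FILE *f, const char *prog, int err,
                  const char *fmt, va_list ap)
{
    assert(f != NULL && prog != NULL);
    fprintf(f, "%s: ", prog);
    if (fmt) {                  /* a NULL fmt must never reach vfprintf */
        vfprintf(f, fmt, ap);
        fputs(": ", f);         /* separator before the error string */
    }
    fprintf(f, "%s\n", strerror(err));
}

/* Hypothetical stand-in for warn(), forwarding to sketch_vwarn(). */
void sketch_warn(FILE *f, const char *prog, int err, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    sketch_vwarn(f, prog, err, fmt, ap);
    va_end(ap);
}
```

With a NULL format the sketch prints only the program name and the error string, which matches the glibc output shown in the test run.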
Rich Felker
2012-08-15 14:36:40 UTC
Permalink
Post by Szabolcs Nagy
Post by Szabolcs Nagy
the main justification i see is that
we already support bsd err and warn
apis which are required to print
the __progname as well
(currently they don't and actually
a simple warn("hi"); segfaults here
with musl but i havent investigated
it)
it seems warn(0) and err(1,0) segfault
(they should handle fmt==0 before passing
it to vfprintf)
and they do not print the ': ' nor the
__progname
Thanks for the report. This should be easy to fix. By the way, is it
worth making these functions take a lock on the file for the whole
operation (to make it atomic)? I'm leaning towards no, since they seem
to only be used in legacy junk that's all single-threaded anyway.

Rich
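
For readers unfamiliar with the locking question raised here, the following is a rough sketch of what taking a lock on the file for the whole operation could look like, using the POSIX flockfile/funlockfile pair. The function name sketch_locked_message and its parameters are hypothetical; nothing here is taken from musl.

```c
#define _POSIX_C_SOURCE 200809L  /* for flockfile/funlockfile */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: emits "prog: msg\n" as one unit with respect to
 * other threads that write to the same FILE. */
void sketch_locked_message(FILE *f, const char *prog, const char *msg)
{
    assert(f != NULL && prog != NULL && msg != NULL);
    flockfile(f);           /* hold the stream lock across all writes */
    fputs(prog, f);
    fputs(": ", f);
    fputs(msg, f);
    fputc('\n', f);
    funlockfile(f);         /* other threads may write again */
}
```

Without the outer lock each individual stdio call is still safe on its own, but output from another thread could land between the pieces of the message.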
Szabolcs Nagy
2012-08-17 09:49:01 UTC
Permalink
Post by Rich Felker
Post by Szabolcs Nagy
it seems warn(0) and err(1,0) segfault
(they should handle fmt==0 before passing
it to vfprintf)
and they do not print the ': ' nor the
__progname
Thanks for the report. This should be easy to fix. By the way, is it
worth making these functions take a lock on the file for the whole
operation (to make it atomic)? I'm leaning towards no, since they seem
to only be used in legacy junk that's all single-threaded anyway.
i don't mind if the file is not locked, although locking would not
introduce too much overhead..

the ": " is not fixed, i think that should be added
Rich Felker
2012-08-17 12:10:59 UTC
Permalink
Post by Szabolcs Nagy
Post by Rich Felker
Post by Szabolcs Nagy
it seems warn(0) and err(1,0) segfault
(they should handle fmt==0 before passing
it to vfprintf)
and they do not print the ': ' nor the
__progname
Thanks for the report. This should be easy to fix. By the way, is it
worth making these functions take a lock on the file for the whole
operation (to make it atomic)? I'm leaning towards no, since they seem
to only be used in legacy junk that's all single-threaded anyway.
i don't mind if the file is not locked, although locking would not
introduce too much overhead..
the ": " is not fixed, i think that should be added
Oh, my intent was for perror to do the printing of it. I forgot perror
special cases not only NULL but also ""...

Rich
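
The special case mentioned here (perror omits the "prefix: " part for both NULL and the empty string) can be sketched like this. It is only an illustration under stated assumptions: sketch_perror is a hypothetical name, and the stream and error number are passed explicitly, whereas the real perror uses stderr and errno.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical perror() look-alike with an explicit stream and error
 * number so that it can be exercised in isolation. */
void sketch_perror(FILE *f, const char *s, int err)
{
    assert(f != NULL);
    if (s && *s)                /* skip the prefix for NULL and for "" */
        fprintf(f, "%s: ", s);
    fprintf(f, "%s\n", strerror(err));
}
```

A warn-style function that delegates its final output to such a routine would therefore lose the ": " separator whenever the string it passes on is empty, which seems to be the situation described above.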
Daniel Cegiełka
2012-08-22 17:45:01 UTC
Permalink
Hi,
Support for fts.h in musl would be very helpful:

http://www.kernel.org/doc/man-pages/online/pages/man3/fts.3.html

Daniel
Rich Felker
2012-08-22 18:57:09 UTC
Permalink
Post by Daniel Cegiełka
Hi,
http://www.kernel.org/doc/man-pages/online/pages/man3/fts.3.html
Can these functions be implemented purely as library functionality,
without any hooks into libc internals? I was thinking it might be nice
to isolate all such code into a single subtree in musl, so that folks
wanting to use it outside musl (either directly in application source
trees, or in libraries) could easily find and use it.

Of course there's also the option of putting such code in a separate
library outside of libc (i.e. not including it in libc) which some
people might prefer, but if it's small and historically in libcs and
doesn't create maintenance burden, I'm not opposed to having it in
libc..

Rich
Daniel Cegiełka
2012-08-22 19:15:16 UTC
Permalink
Post by Rich Felker
Post by Daniel Cegiełka
Hi,
http://www.kernel.org/doc/man-pages/online/pages/man3/fts.3.html
Can these functions be implemented purely as library functionality,
without any hooks into libc internals? I was thinking it might be nice
to isolate all such code into a single subtree in musl, so that folks
wanting to use it outside musl (either directly in application source
trees, or in libraries) could easily find and use it.
This is a very good idea!

Daniel
Post by Rich Felker
Of course there's also the option of putting such code in a separate
library outside of libc (i.e. not including it in libc) which some
people might prefer, but if it's small and historically in libcs and
doesn't create maintenance burden, I'm not opposed to having it in
libc..
Rich
Richard Pennington
2012-08-22 20:24:13 UTC
Permalink
Post by Rich Felker
Post by Daniel Cegiełka
Hi,
http://www.kernel.org/doc/man-pages/online/pages/man3/fts.3.html
Can these functions be implemented purely as library functionality,
without any hooks into libc internals? I was thinking it might be nice
to isolate all such code into a single subtree in musl, so that folks
wanting to use it outside musl (either directly in application source
trees, or in libraries) could easily find and use it.
Of course there's also the option of putting such code in a separate
library outside of libc (i.e. not including it in libc) which some
people might prefer, but if it's small and historically in libcs and
doesn't create maintenance burden, I'm not opposed to having it in
libc..
Rich
I used the NetBSD version and it built with musl just fine with no internal
stuff, as I recall.

http://ellcc.org/viewvc/svn/ellcc/trunk/libecc/src/musl/src/bsd/

-Rich
Rich Felker
2012-08-22 22:44:42 UTC
Permalink
Post by Richard Pennington
Post by Rich Felker
Post by Daniel Cegiełka
Hi,
http://www.kernel.org/doc/man-pages/online/pages/man3/fts.3.html
Can these functions be implemented purely as library functionality,
without any hooks into libc internals? I was thinking it might be nice
to isolate all such code into a single subtree in musl, so that folks
wanting to use it outside musl (either directly in application source
trees, or in libraries) could easily find and use it.
Of course there's also the option of putting such code in a separate
library outside of libc (i.e. not including it in libc) which some
people might prefer, but if it's small and historically in libcs and
doesn't create maintenance burden, I'm not opposed to having it in
libc..
Rich
I used the NetBSD version and it built with musl just fine with no internal
stuff, as I recall.
http://ellcc.org/viewvc/svn/ellcc/trunk/libecc/src/musl/src/bsd/
How big is it? I looked at it casually earlier today and didn't see
anything horribly offensive in the code; it looks to even get
allocation failure error cases right.

Rich

Rich Felker
2012-08-15 12:36:40 UTC
Permalink
Post by Daniel Cegiełka
Post by Rich Felker
Post by Rich Felker
- Support for __progname (Daniel)
Daniel, any more thoughts on this? Are there lots of programs that
want it that can't easily be patched to simply use argv[0] themselves?
This is not something that is absolutely necessary. __progname quite
often is used on *BSD and less on Linux (eg. Owl's msulogin,
popa3d)... but __progname is always easy to fix.
My leaning is to omit it at least for now then.
Post by Daniel Cegiełka
http://www.openbsd.org/cgi-bin/cvsweb/src/bin/
This might be a little bit inflated if it includes programs which
detected the presence of __progname at build time and only used it
because of that.
Post by Daniel Cegiełka
For __progname we probably need to modify (asm) files in the musl/crt/.
No, the only thing that belongs there is the minimum code to get argc,
argv, the addresses of main, _init, _fini, and jump to
__libc_start_main. The latter is responsible for things like
__progname. If the code were put in crt1.o, all programs would have a
reference to __progname encoded into them, which is not something
desirable; it would also increase the amount of per-arch code that
must be maintained.

Rich
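
To make the argv[0] alternative concrete: a program that currently reads __progname could compute the same string itself, roughly as sketched below. The helper name sketch_progname is hypothetical, and this is not a description of what musl's __libc_start_main does.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: derive a program name from argv[0] by taking
 * the part after the last '/'. */
const char *sketch_progname(const char *argv0)
{
    if (!argv0)
        return "";              /* argv[0] may legitimately be NULL */
    const char *slash = strrchr(argv0, '/');
    return slash ? slash + 1 : argv0;
}
```

A patched program would call this once at the top of main and use the result wherever it previously referred to __progname.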
Luca Barbato
2012-08-15 12:57:26 UTC
Permalink
Post by Rich Felker
On hold pending details on what the real problem is; unlikely to make
it in for next release.
Luca, if you still want this, please provide details on what issues
you're facing that could be solved. I don't want to target old
versions of standards unless there's a concrete practical goal. I
mentioned one possible approach (using old versions only as a way to
reenable stuff that was removed from the standard, not a way to get
the entire outdated-standard behavior that would also require removing
new symbols) but I haven't heard back on whether that would meet your
needs.
Your approach would be fine; even something technically wrong, like
exposing all the symbols, would be OK for my specific purposes.

Alternatively, forcing one of the two SOURCES would work as well.

lu
Rich Felker
2012-08-15 14:34:36 UTC
Permalink
Post by Luca Barbato
Post by Rich Felker
On hold pending details on what the real problem is; unlikely to make
it in for next release.
Luca, if you still want this, please provide details on what issues
you're facing that could be solved. I don't want to target old
versions of standards unless there's a concrete practical goal. I
mentioned one possible approach (using old versions only as a way to
reenable stuff that was removed from the standard, not a way to get
the entire outdated-standard behavior that would also require removing
new symbols) but I haven't heard back on whether that would meet your
needs.
Your approach would be fine; even something technically wrong, like
exposing all the symbols, would be OK for my specific purposes.
Could you explain a little bit what the problem is, like "I'm trying
to build X and function Y is undeclared" or similar? Before trying to
address the issue, I'd like to know what problem it's solving. :-)

Rich
Luca Barbato
2012-08-15 18:28:18 UTC
Permalink
Post by Rich Felker
Could you explain a little bit what the problem is, like "I'm trying
to build X and function Y is undeclared" or similar? Before trying to
address the issue, I'd like to know what problem it's solving. :-)
One is libav. I need to finish setting up my Gentoo chroot before I can
name exactly which other programs rely on XOPEN, but I recall that there
are others.

lu
Rich Felker
2012-08-15 18:35:35 UTC
Permalink
Post by Luca Barbato
Post by Rich Felker
Could you explain a little bit what the problem is, like "I'm trying
to build X and function Y is undeclared" or similar? Before trying to
address the issue, I'd like to know what problem it's solving. :-)
One is libav, I need to complete my gentoo chroot to name exactly
other programs relying on XOPEN but I recall that there are others.
I'm sorry, I think we're not communicating well. :-( What happens when
you try to build libav against musl as-is, with -D_XOPEN_SOURCE=700 or
600 or whatever you're using?

Rich
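
For context, passing -D_XOPEN_SOURCE=700 on the command line has the same effect as defining the macro before the first #include, as in the sketch below. The thread does not say which declarations libav was missing, so strdup is used purely as a stand-in, and sketch_copy is a hypothetical name.

```c
#define _XOPEN_SOURCE 700   /* must precede every #include */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: strdup() is not part of ISO C99/C11; its
 * declaration comes from the POSIX/XSI level requested above. */
char *sketch_copy(const char *s)
{
    assert(s != NULL);
    return strdup(s);           /* caller frees the result */
}
```

If a header hides a declaration at the requested level, a build typically fails with an "implicit declaration" diagnostic, which is the kind of concrete symptom being asked about here.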
Rich Felker
2012-08-15 21:25:25 UTC
Permalink
Post by Luca Barbato
Post by Rich Felker
Could you explain a little bit what the problem is, like "I'm trying
to build X and function Y is undeclared" or similar? Before trying to
address the issue, I'd like to know what problem it's solving. :-)
One is libav, I need to complete my gentoo chroot to name exactly
other programs relying on XOPEN but I recall that there are others.
Does the latest git solve your problem?

Rich
Luca Barbato
2012-08-16 17:11:54 UTC
Permalink
Post by Rich Felker
Post by Luca Barbato
Post by Rich Felker
Could you explain a little bit what the problem is, like "I'm trying
to build X and function Y is undeclared" or similar? Before trying to
address the issue, I'd like to know what problem it's solving. :-)
One is libav, I need to complete my gentoo chroot to name exactly
other programs relying on XOPEN but I recall that there are others.
Does the latest git solve your problem?
Yes it does. Thank you.
Richard Pennington
2012-08-15 13:27:54 UTC
Permalink
Post by Rich Felker
Anything else that's been overlooked for a while? One thing that comes
to my mind is updating credits. I actually realized just after the
last release I forgot to add rdp and Solar to the COPYRIGHT file for
their contributions, and credit for rdp is actually missing from most
(all?) of the source files too - I felt really bad about this
omission. Please accept my apologies, and let me know if there are any
other credits I've forgotten to add.
No apology required. I'm sorry that I haven't been more engaged.
Unfortunately, my day job is in 24/7 mode (working on a new product release)
and I haven't been able to focus on the important stuff as much as I'd like.

Plus, I got way more from musl than I contributed. ;-)

-Rich
boris brezillon
2012-08-15 22:44:39 UTC
Permalink
Post by Rich Felker
Starting a new thread since the old one got too long and OT.. :-)
New stuff we have so far:
- Blowfish crypt
- MIPS dynamic linker
- Major MIPS bug fixes
- ARM hard float support
- BSD fgetln function
- Major bug fix in wcsstr
- Optimized memcpy for i386 and x86_64
- Public exposure for getdents
- Added significand function to math lib
The requested stuff that's still pending (with person who requested or
is working on it):
- Exception behavior in i386/x86_64 exponential asm (nsz).
- Finer-grained _XOPEN_SOURCE (Luca)
- Support for __progname (Daniel)
- MD5 and SHA crypt (nsz?)
In addition, there are the GNU hash and dladdr patches (Boris)
pending, which I don't want to overlook, but I think unless I manage
to get a discussion of simplifying them finished really soon and find
some time to thoroughly test them, it's best to hold off on this until
just after the next release so we have plenty of time to test and tune
it before another release.
I agree. We should wait for the next release.
Post by Rich Felker
Anything else I missed?
Rich