Discussion:
NULL
John Spencer
2013-01-09 11:02:29 UTC
with glibc, NULL in C++ expands to __null: a magic keyword supplied by GCC
and compatibles which always has the size of a pointer.

musl defines NULL to 0 in C++.
this is correct per the standard, but breaks a lot of software on 64bit
archs, because a plain 0 is passed to a variadic function as a 32-bit int,
not as a 64-bit pointer.

for example there are 140 variadic functions in glib/gtk, most of them
depend on a so-called sentinel value (a null pointer end marker).
examples:
http://developer.gnome.org/gobject/2.35/gobject-The-Base-Object-Type.html#g-object-new
http://developer.gnome.org/gobject/2.35/gobject-The-Base-Object-Type.html#g-object-get

any C++ program that uses those variadic functions and uses NULL as the
sentinel, will invoke UB and consequently segfault on musl, with a very
hard-to-debug backtrace (you cannot inspect variadic arguments in gdb
(unless you know exactly what kind of asm gcc creates and how to read
it), but exactly those arguments will end up being bogus).
the first such bug i encountered (in audacity) cost me several hours
of debugging distributed over 2 days.
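
To make the failure mode concrete, here is a minimal sketch in C. It is not taken from glib: count_strings is a hypothetical stand-in for sentinel-terminated functions such as g_object_new. The callee reads every variadic argument as a pointer, so the terminator has to be passed as a pointer too.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical sentinel-terminated function (an assumption for this
 * sketch): counts the strings passed before the terminating null
 * pointer. */
size_t count_strings(const char *first, ...)
{
    va_list ap;
    size_t n = 0;

    va_start(ap, first);
    /* Every variadic argument is fetched as a char *, so the caller
     * must really pass a pointer-typed terminator. */
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}
```

A call such as count_strings("a", "b", (char *)0) is well defined. Writing a bare NULL or 0 as the last argument is not: where NULL expands to plain 0, only an int is passed, while the callee reads a full pointer.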

note that most example code involving these variadic functions is
written in C, and happily uses NULL, as the problems surrounding NULL
are not exactly well-known.
http://gustedt.wordpress.com/2010/11/07/dont-use-null/

so the C++ guys basically just copypaste existing C code snippets (which
are theoretically broken, but work in any real world implementation),
and since it works in their environment they assume the code is correct.

after having fixed audacity, i tackled the segfaults encountered in evince.
here not only evince itself is broken, but also poppler, on which it
depends.
i've fixed the bugs in evince since, but still need to tackle the UB in
poppler.

frankly speaking i'm not very willing to go into the trouble of
debugging any C++ gtk application, and reporting the bugs i found.
when i discussed this with evince developers, they were asking all the
time "why does it work for me ? why didn't it break for anyone ?"
i told them that in C++ NULL expands to 0 and it worked because of luck.
after discussing with them for about half an hour, i got them to the
point where they agreed to merge a patch if i come up with one.

had i told them that i use another libc which does things differently
regarding NULL, i am not sure if i could have convinced them to change
their code.
at this point the issue in evince is not fixed, i still need to create a
patch based on their current git master, subscribe to their ML, send
them the patch and hope that they haven't changed their mind.

so for me, there are 3 options for dealing with this issue in the future:

1) go into the trouble of debugging all future C++ apps i will port,
potentially spending hours to fix each of them, and then even more hours
discussing this topic on their mailing lists. since sabotage will very
likely only support a small subset of these programs, there will remain
a lot of others that stay broken.
the gentoo guys or whoever is going to port them will have a *very* hard
time figuring out why it crashes for them on musl.
this approach is in my eyes insane, as it will need a huge investment of
time and nerves, and the number of broken C++ apps is theoretically
infinite.

(as a side note: when i googled for g_signal_emit, i found multiple bug
reports to multiple projects, of which the backtraces showed exactly the
kind of varargs UB that i encountered.
all of them were not fixed, but merely worked around. apparently the
developers of these projects didnt find the real cause of their bugs.
this is not unlikely, as it is very hard to find out whats going on.
https://bugs.launchpad.net/compiz/+bug/932382
https://trac.transmissionbt.com/ticket/1135 ... )


2) change musl so it is compatible with those apps. this would mean:
#if defined(__GNUC__) && defined(__cplusplus)
#define NULL __null
#elif defined(__cplusplus)
#define NULL 0
#else
#define NULL ((void *) 0) /* for C code */
#endif
this change is the easiest solution: any problem will be magically fixed.

3) create some kind of code analysis tool that will scan C++ code for
usage of NULL and emit a warning for each occurrence of NULL used in a
variadic function call. the results could then be used to create
automated patches, which could be used in distros based on musl and sent
to the corresponding package maintainers.
i suspect it is possible to convince the authors of the "cppcheck"
program to add such a check, as i already contacted them in the past
and they seemed very cooperative.
this could be tackled as a side-project even if we decide for option 2.

i'm welcoming any comments on the issue. i'm especially interested what
the gentoo developers think.
Szabolcs Nagy
2013-01-09 12:18:32 UTC
Post by John Spencer
at this point the issue in evince is not fixed, i still need to
create a patch based on their current git master, subscribe to their
ML, send them the patch and hope that they haven't changed their
mind.
i think you are overcomplicating this issue

this is clearly a violation of the standard and that's what
you should tell to the maintainers:

using NULL in the argument of variadic functions is ub both
in c and c++

the problem is worse in c++ because it actually breaks code
in practice with the usual c++ definition of NULL, in c the
usual definition happens to work
Post by John Spencer
1) go into the trouble of debugging all future C++ apps i will port,
i think this issue should be caught with automated tools, not debugging:
static code analyzer or runtime instrumentation of vararg functions
Post by John Spencer
(as a side note: when i googled for g_signal_emit, i found multiple
bug reports to multiple projects, of which the backtraces showed
exactly the kind of varargs UB that i encountered.
all of them were not fixed, but merely worked around. apparently the
developers of these projects didnt find the real cause of their bugs.
this is not unlikely, as it is very hard to find out whats going on.
https://bugs.launchpad.net/compiz/+bug/932382
https://trac.transmissionbt.com/ticket/1135 ... )
maybe this should get a bit more publicity,
it's an easy to fix bug
Post by John Spencer
#if defined(__GNUC__) && defined(__cplusplus)
#define NULL __null
i think this is not needed, you can have a definition
in c++ that "happens to work" just like the (void*)0
in c:

#define NULL 0L

but this is just a workaround, the bugs still need to be fixed
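
Why 0L "happens to work" comes down to argument size. The sketch below is an illustration only, with hypothetical helper names that are not from the thread. It reports the size of the object each spelling of a null pointer constant passes through "...". On typical LP64 Linux targets int is 4 bytes while long and pointers are 8, which is why 0L lines up with what the callee reads and plain 0 does not.

```c
#include <stddef.h>

/* Size in bytes of the object each spelling would pass through "...".
 * Helper names are assumptions made for this sketch. */
size_t size_of_plain_zero(void)    { return sizeof(0); }          /* int    */
size_t size_of_long_zero(void)     { return sizeof(0L); }         /* long   */
size_t size_of_void_ptr_zero(void) { return sizeof((void *)0); }  /* void * */
```

Matching sizes only make the call work in practice. The argument type is still not the pointer type the callee asks for.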

(in c++11 we could use nullptr which has std::nullptr_t type
which converts to (void*)0 in vararg context, but c++11 is not
widely used yet)
Post by John Spencer
3) create some kind of code analysis tool that will scan C++ code
i don't know any good open source static code analyzer for
c and c++, maybe the clang based one can do the job

i tried splint but that does not catch it

catching null pointers in general can be tricky as 0 may be a
valid argument to a variadic function and the code analyzer has
no way to tell if it's meant to be a pointer or not

but if NULL is used without a cast then that's an error both in
c and c++ as both languages allow various different definitions
of NULL which may have different size and representation in
vararg context
Post by John Spencer
i'm welcoming any comments on the issue. i'm especially interested
what the gentoo developers think.
i wonder what makes them special :)
John Spencer
2013-01-09 13:36:43 UTC
Post by Szabolcs Nagy
Post by John Spencer
at this point the issue in evince is not fixed, i still need to
create a patch based on their current git master, subscribe to their
ML, send them the patch and hope that they haven't changed their
mind.
i think you are overcomplicating this issue
this is clearly a violation of the standard and that's what
using NULL in the argument of variadic functions is ub both
in c and c++
many developers don't care about the standard. they take the stance:
"works for me, if you want it patched then do it yourself and we'll
eventually merge"
Post by Szabolcs Nagy
the problem is worse in c++ because it actually breaks code
in practice with the usual c++ definition of NULL, in c the
usual definition happens to work
Post by John Spencer
1) go into the trouble of debugging all future C++ apps i will port,
static code analyzer or runtime instrumentation of vararg functions
yes, as i proposed in 3)
Post by Szabolcs Nagy
Post by John Spencer
(as a side note: when i googled for g_signal_emit, i found multiple
bug reports to multiple projects, of which the backtraces showed
exactly the kind of varargs UB that i encountered.
all of them were not fixed, but merely worked around. apparently the
developers of these projects didnt find the real cause of their bugs.
this is not unlikely, as it is very hard to find out whats going on.
https://bugs.launchpad.net/compiz/+bug/932382
https://trac.transmissionbt.com/ticket/1135 ... )
maybe this should get a bit more publicity,
it's an easy to fix bug
yes, once you know the details, it's easy to fix.
but when you don't, you'll have a hard time figuring out where the
segfault comes from.
i agree that this should get more publicity.
Post by Szabolcs Nagy
Post by John Spencer
#if defined(__GNUC__) && defined(__cplusplus)
#define NULL __null
i think this is not needed, you can have a definition
in c++ that "happens to work" just like the (void*)0
#define NULL 0L
yes, that'll work as well.
Post by Szabolcs Nagy
but this is just a workaround, the bugs still need to be fixed
again, some ppl don't agree, even if you cite the standard.
i even heard things like "using a cast is ugly (or C'ish), we prefer NULL"
Post by Szabolcs Nagy
(in c++11 we could use nullptr which has std::nullptr_t type
which converts to (void*)0 in vararg context, but c++11 is not
widely used yet)
Post by John Spencer
3) create some kind of code analysis tool that will scan C++ code
i don't know any good open source static code analyzer for
c and c++, maybe the clang based one can do the job
did you look at cppcheck ? i think this could be the right tool for the job.
if we raise awareness of the issue, i'm sure they'll add a check for this.
Post by Szabolcs Nagy
i tried splint but that does not catch it
catching null pointers in general can be tricky as 0 may be a
valid argument to a variadic function and the code analyzer has
no way to tell if it's meant to be a pointer or not
but if NULL is used without a cast then that's an error both in
c and c++ as both languages allow various different definitions
of NULL which may have different size and representation in
vararg context
Post by John Spencer
i'm welcoming any comments on the issue. i'm especially interested
what the gentoo developers think.
i wonder what makes them special :)
well, from what i heard on IRC they started to work on a musl port 2
weeks ago (but it got silent since...).
since they have likely more packages than sabotage (350) this issue
could cause them major pain.
Rob Landley
2013-01-12 06:32:44 UTC
Post by John Spencer
Post by Szabolcs Nagy
using NULL in the argument of variadic functions is ub both
in c and c++
"works for me, if you want it patched then do it yourself and we'll
eventually merge"
Why is it UB? The standard says it's a pointer. If you pull %p off in
printf, feeding NULL in that slot should work fine.
Post by John Spencer
yes, once you know the details, it's easy to fix.
but when you don't, you'll have a hard time figuring out where the
segfault comes
from. i agree that this should get more publicity.
"C++ is hard to debug and requires you to know how nested template
expansion gets implemented down to the bare metal" is not a new problem.

Programming in C++ and hitting seemingly-trivial problems you can't
debug without reading the compiler's source code is like riding a
motorcycle and winding up maimed for life. (There's a reason medical
personnel call them "donorcycles".)
Post by John Spencer
Post by Szabolcs Nagy
i think this is not needed, you can have a definition
in c++ that "happens to work" just like the (void*)0
#define NULL 0L
yes, that'll work as well.
Post by Szabolcs Nagy
but this is just a workaround, the bugs still need to be fixed
It's not a workaround, it's what C99+LP64 explicitly specifies.

If doing something well-defined in C99 on Linux goes nuts on C++ in
Windows, how is this our problem?
Post by John Spencer
Post by Szabolcs Nagy
(in c++11 we could use nullptr which has std::nullptr_t type
which converts to (void*)0 in vararg context, but c++11 is not
widely used yet)
Is there actually a point to the C++1!!1one! standard? The only person
I've heard actually be happy with it is the author of uClibc++, but he
liked the previous C++ standards and thinks Corba is a good idea, so...
Post by John Spencer
well, from what i heard on IRC they started to work on a musl port 2
weeks ago (but
it got silent since...). since they have likely more packages than
sabotage (350)
this issue could cause them major pain.
I wouldn't be too impressed by this.

There are somewhere between 200 and 900 packages that cross compile
"easily", for a decreasingly obvious definition of "easily" depending
on how many rocket engines you want to strap to the turtle. Projects
like OpenEmbedded and Beyond Linux From Scratch recapitulate phylogeny
with these packages, and then hit the point where your volunteers' time
is entirely consumed dealing with package upgrades to hold the existing
turf against bit-rot. (Personally, I refer to this as "the buildroot
event horizon".)

Actual distributions eventually separate "the OS" from "the
repository", where they have a core team who does work on the operating
system and a separate (much, much larger) set of package maintainers
who keep their packages of interest working but don't generally work on
the base OS other than complaining when something breaks.

You only get to the "real distro" stage when the base OS stops being
interesting. While the base OS remains a moving target, package
maintainers can't do their jobs without also being OS maintainers,
which is a much bigger time commitment and has Brooks' Law problems
with coordination overhead scaling your core team.

There are plenty of existing interesting repositories: Debian, Ubuntu,
Red Hat, SuSE, Gentoo... How much work do they do maintaining those
repos? According to
https://admin.fedoraproject.org/pkgdb/stats/?_csrf_token=1048fa94db94990f5c39ed12c7ca4cd8cb840ca7
Fedora has 150,000 packages (but then they break packages into several
smaller packages for no apparent reason, and this may treat x86 and
i686 versions of the same thing as separate packages). A much cleaner
reading is "wget
http://packages.debian.org/stable/allpackages?format=txt.gz -O - | zcat
| wc -l" which gives around 35,000 packages. (You can get larger
numbers by checking what ubuntu adds, looking at testing instead of
stable, adding in the external repositories that debian's
ultraconservative definition of "proprietary" kicks stuff to, and so
on. But this is a good ballpark.)

A more recent attempt at being a real <strike>boy</strike> distro would
be Arch Linux, and
https://www.archlinux.org/packages/?sort=&arch=i686&q=&maintainer=&last_update=&flagged=&limit=50
finds 4300 packages for the i686 target, and they've been doing this
since 2002.

Reinventing the wheel because you have a new libc: not very
interesting. Trying to get a musl version of debian or gentoo that you
can push "upstream": a lot more interesting.

Rob
Rich Felker
2013-01-12 06:46:11 UTC
Post by Rob Landley
Post by John Spencer
Post by Szabolcs Nagy
using NULL in the argument of variadic functions is ub both
in c and c++
"works for me, if you want it patched then do it yourself and
we'll eventually merge"
Why is it UB? The standard says it's a pointer. If you pull %p off
in printf, feeding NULL in that slot should work fine.
See my other message. NULL is not required to have pointer type. It
can be any null pointer constant, which includes things like 0, 0L,
0ULL, (sizeof 1 - sizeof 2), (void *)(1ULL/2ULL), etc.

The %p specifier, on the other hand, requires an argument of type void
*; passing any other type yields UB.

Rich
Luca Barbato
2013-01-12 07:15:21 UTC
Post by Rich Felker
Post by Rob Landley
Post by John Spencer
Post by Szabolcs Nagy
using NULL in the argument of variadic functions is ub both
in c and c++
"works for me, if you want it patched then do it yourself and
we'll eventually merge"
Why is it UB? The standard says it's a pointer. If you pull %p off
in printf, feeding NULL in that slot should work fine.
See my other message. NULL is not required to have pointer type. It
can be any null pointer constant, which includes things like 0, 0L,
0ULL, (sizeof 1 - sizeof 2), (void *)(1ULL/2ULL), etc.
The %p specifier, on the other hand, requires an argument of type void
*; passing any other type yields UB.
so printf("%s", NULL) would lead to UB if NULL is 0L ?

lu
Rich Felker
2013-01-12 13:33:43 UTC
Post by Luca Barbato
Post by Rich Felker
Post by Rob Landley
Post by John Spencer
Post by Szabolcs Nagy
using NULL in the argument of variadic functions is ub both
in c and c++
"works for me, if you want it patched then do it yourself and
we'll eventually merge"
Why is it UB? The standard says it's a pointer. If you pull %p off
in printf, feeding NULL in that slot should work fine.
See my other message. NULL is not required to have pointer type. It
can be any null pointer constant, which includes things like 0, 0L,
0ULL, (sizeof 1 - sizeof 2), (void *)(1ULL/2ULL), etc.
The %p specifier, on the other hand, requires an argument of type void
*; passing any other type yields UB.
so printf("%s", NULL) would lead to UB if NULL is 0L ?
printf("%s", (void *)0) leads to UB too. The %s specifier requires a
pointer to a string, not a null pointer.

Perhaps you meant printf("%p", NULL), and in that case yes, it could
also be UB. It will _work_ if NULL is defined to 0L (since the size,
representation, and argument passing convention is the same for long
and pointers on all relevant systems) but it's still UB. On the other
hand, it subtly breaks if NULL is simply 0, which is what this whole
thread is about: whether we should work around such broken programs.
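
A minimal sketch of the portable spelling, assuming nothing beyond standard C. The helper name format_null_pointer is hypothetical. The explicit cast gives the argument type void * no matter which null pointer constant NULL expands to.

```c
#include <stdio.h>
#include <stddef.h>

/* Formats a null pointer with %p into buf and returns snprintf's
 * result. The helper name is an assumption for this sketch. */
int format_null_pointer(char *buf, size_t size)
{
    /* The cast guarantees the argument has type void *, whatever NULL
     * expands to on this implementation. */
    return snprintf(buf, size, "%p", (void *)NULL);
}
```

The text produced for a null pointer is implementation-defined, so the sketch makes no claim about what ends up in buf.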

Rich
Jens Staal
2013-01-12 11:39:20 UTC
Post by Rob Landley
I wouldn't be too impressed by this.
There are somewhere between 200 and 900 packages that cross compile
"easily", for a decreasingly obvious definition of "easily" depending
on how many rocket engines you want to strap to the turtle. Projects
like OpenEmbedded and Beyond Linux From Scratch recapitulate phylogeny
with these packages, and then hit the point where your volunteers' time
is entirely consumed dealing with package upgrades to hold the existing
turf against bit-rot. (Personally, I refer to this as "the buildroot
event horizon".)
Actual distributions eventually separate "the OS" from "the
repository", where they have a core team who does work on the operating
system and a separate (much, much larger) set of package maintainers
who keep their packages of interest working but don't generally work on
the base OS other than complaining when something breaks.
You only get to the "real distro" stage when the base OS stops being
interesting. While the base OS remains a moving target, package
maintainers can't do their jobs without also being OS maintainers,
which is a much bigger time commitment and has Brooks' Law problems
with coordination overhead scaling your core team.
pkgsrc is already doing quite well with musl libc and Gregor's "per package
namespace" ideas in Snowflake seem very interesting*, also utilized in
Sabotage. Source-based or combined source/binary-based distributions
like pkgsrc or gentoo are probably the fastest and "easiest" to
get going. Hooking up to a binary distribution like the debian-based or
rpm-based ones still means that one needs separate repositories for the new
libc (so the number of repositories will then be supported archs * supported
libcs) - probably a more difficult proposition.

* Even cooler would of course be to be able to use union mounts and private
namespaces instead of symlinks to a default PATH like /bin, but that is not
really relevant for this particular discussion.
c***@openwall.com
2013-01-09 13:09:27 UTC
Hi folks,
Post by John Spencer
#if defined(__GNUC__) && defined(__cplusplus)
#define NULL __null
#elif defined(__cplusplus)
#define NULL 0
#else
#define NULL ((void *) 0) /* for C code */
#endif
this change is the easiest solution: any problem will be magically fixed.
Actually, the problem is not fixed this way; this just gives the problem
the right (hmmm.. license, heh) to exist forever. My experience of
teaching students shows that when smart people introduce something to help
stupid people to remain useful without learning, we end up with more people
who don't want to learn, and that's all (arhgggg, I hate the
one-unknown-to-me who decided one day that two object modules having global
variables of the same name and type should link successfully... so that
newbie can simply place 'int my_damn_var;' line into his/her header not
bothering with understanding of 'extern' and the linker as such, and then
they tend to ask the well-known thing 'why do you say this is incorrect if
it works')

However, sometimes practice forces us to do the wrong thing just because
we have no time or resources to do what is right, and it looks like
this is exactly the case. So perhaps "option 2" will finally be
chosen, even though we don't like it. However, I'd suggest at least to let
the people know this is a WORKAROUND for the bugs THEY introduce: make this
hack disabled by default, enabled by a compile-time option, and issue a
warning which points them to this discussion or something similar.
Something like "Okay, if your program doesn't work without this workaround,
then you can use the workaround, but you'd better fix your program". This
will not have much influence while musl is not so popular, but I hope it will
become popular one day (I really do... let's give the damn world a chance),
and then the people will have something to think about.


Thanks!

Andrey Stolyarov
John Spencer
2013-01-09 13:47:42 UTC
Post by c***@openwall.com
Hi folks,
Post by John Spencer
this change is the easiest solution: any problem will be magically fixed.
[...]
However, sometimes practice forces us to do the wrong thing just because
we have no time or resources to do what is right, and it looks like
this is exactly the case. So perhaps "option 2" will finally be
chosen, even though we don't like it. However, I'd suggest at least to let
the people know this is a WORKAROUND for the bugs THEY introduce: make this
hack disabled by default, enabled by a compile-time option, and issue a
warning which points them to this discussion or something similar.
as of now, musl only supports a single configuration.
having 2 different versions of musl in the wild, one that works with
their apps and another one that does not, is definitely not desirable
Post by c***@openwall.com
Something like "Okay, if your program doesn't work without this workaround,
then you can use the workaround, but you'd better fix your program". This
will not have much influence while musl is not so popular, but I hope it will
become popular one day (I really do... let's give the damn world a chance),
and then the people will have something to think about.
here we have the typical chicken-and-egg problem: as long as
applications compiled with musl just crash, while they work perfectly
well with glibc, i think most contributors will become discouraged soon
and continue using what they're familiar with.
c***@openwall.com
2013-01-09 14:49:25 UTC
Post by John Spencer
Post by c***@openwall.com
chosen, even though we don't like it. However, I'd suggest at least to let
the people know this is a WORKAROUND for the bugs THEY introduce: make this
hack disabled by default, enabled by a compile-time option, and issue a
warning which points them to this discussion or something similar.
as of now, musl only supports a single configuration.
having 2 different versions of musl in the wild, one that works with
their apps and another one that does not, is definitely not desirable
Well, the matter we discuss here doesn't affect the compiled code of musl,
does it? And the header can have these #ifdef's so that people have to
compile their _buggy_ apps using smth. like -DMUSL_CPLUSPLUS_NULL_WORKAROUND=1
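
As a rough sketch of what such an opt-in could look like (a hypothetical header fragment, not part of musl): the macro name MUSL_CPLUSPLUS_NULL_WORKAROUND comes from the suggestion above, while DEMO_NULL and demo_null_size are assumptions, used so that the example does not clash with the real NULL.

```c
#include <stddef.h>

/* Hypothetical header fragment, not part of musl. DEMO_NULL stands in
 * for NULL so the sketch does not clash with <stddef.h>. */
#if defined(__cplusplus) && defined(MUSL_CPLUSPLUS_NULL_WORKAROUND)
#define DEMO_NULL 0L            /* opt-in: pointer-sized on LP64 */
#elif defined(__cplusplus)
#define DEMO_NULL 0             /* default for C++ */
#else
#define DEMO_NULL ((void *)0)   /* C */
#endif

/* Size of the object DEMO_NULL would pass through a variadic "...". */
size_t demo_null_size(void) { return sizeof(DEMO_NULL); }
```

Because the switch lives entirely in the header, it changes only how applications are compiled and not the compiled library itself.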

If I'm wrong and missing something (so that my suggestion requires maintaining
separate musl binaries), then... heh... Well, then I'd vote against such a
workaround (for the reasons I already mentioned about the more and more
people who don't want to learn), but my opinion perhaps shouldn't count for
much as I'm not a musl developer.
Post by John Spencer
Post by c***@openwall.com
Something like "Okay, if your program doesn't work without this workaround,
then you can use the workaround, but you'd better fix your program". This
will not have much influence while musl is not so popular, but I hope it will
become popular one day (I really do... let's give the damn world a chance),
and then the people will have something to think about.
here we have the typical chicken-and-egg problem: as long as
applications compiled with musl just crash, while they work perfectly
well with glibc, i think most contributors will become discouraged soon
and continue using what they're familiar with.
Definitely you're right. That's why I don't suggest to "ignore them all"
or something like that -- but maybe at least we can let them know they do
something wrong.



Thanks!

Andrey Stolyarov
Luca Barbato
2013-01-09 14:42:07 UTC
Post by John Spencer
#if defined(__GNUC__) && defined(__cplusplus)
#define NULL __null
#elif defined(__cplusplus)
#define NULL 0
#else
#define NULL ((void *) 0) /* for C code */
#endif
this change is the easiest solution: any problem will be magically fixed.
I'm not sure if there is a way to warn properly at compile time for that
specific usage.

IMHO going with 2+3 is the only safe way to grant musl more support

Having a flag to turn those compatibility hacks off would be good.

I wonder why in the hell C++ can't use the (void *) 0 definition or
equivalent.

lu
Rich Felker
2013-01-09 14:47:12 UTC
Post by Luca Barbato
Post by John Spencer
#if defined(__GNUC__) && defined(__cplusplus)
#define NULL __null
#elif defined(__cplusplus)
#define NULL 0
#else
#define NULL ((void *) 0) /* for C code */
#endif
this change is the easiest solution: any problem will be magically fixed.
I'm not sure if there is a way to warn properly at compile time for that
specific usage.
__attribute__ ((sentinel)) may be used. Adding this to the appropriate
gtk headers (even just as a temporary debugging measure if it's not
desirable permanently) would catch all the bugs calling gtk variadic
functions.
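
A minimal sketch of the attribute in use, assuming GCC or a compatible compiler. The names count_names and DEMO_SENTINEL are hypothetical, and this is not how any gtk header is actually written. With the attribute in place the compiler can warn when the last argument of a call is not a null pointer.

```c
#include <stdarg.h>
#include <stddef.h>

/* DEMO_SENTINEL is an assumption for this sketch: it expands to the
 * GCC attribute where available and to nothing elsewhere. */
#if defined(__GNUC__)
#define DEMO_SENTINEL __attribute__((sentinel))
#else
#define DEMO_SENTINEL
#endif

/* Hypothetical stand-in for a gtk-style variadic function. */
size_t count_names(const char *first, ...) DEMO_SENTINEL;

size_t count_names(const char *first, ...)
{
    va_list ap;
    size_t n = 0;

    va_start(ap, first);
    /* Stop at the pointer-typed terminator the attribute asks for. */
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}
```

A call like count_names("a", "b", (char *)0) satisfies the check. One that ends in an integer 0 is the kind of call the warning is meant to flag.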
Post by Luca Barbato
IMHO going with 2+3 is the only safe way to grant musl more support
2 is not appropriate as written (it's more complexity, and ugly, and
in multiple locations). 3 already exists; it's called GCC.

If we decide something is needed at the musl level, in my opinion the
only acceptable solution is just replacing 0 with 0L unconditionally.
Actually I'd like to remove the special-case for C++ and make NULL
_always_ be defined to 0 or 0L, but I worry too many people would
complain...
Post by Luca Barbato
I wonder why in the hell C++ can't use the (void *) 0 definition or
equivalent.
Because then char *s = NULL; would be a constraint violation.

Rich
Luca Barbato
2013-01-09 15:03:52 UTC
Post by Rich Felker
Post by Luca Barbato
Post by John Spencer
#if defined(__GNUC__) && defined(__cplusplus)
#define NULL __null
#elif defined(__cplusplus)
#define NULL 0
#else
#define NULL ((void *) 0) /* for C code */
#endif
this change is the easiest solution: any problem will be magically fixed.
I'm not sure if there is a way to warn properly at compile time for that
specific usage.
__attribute__ ((sentinel)) may be used. Adding this to the appropriate
gtk headers (even just as a temporary debugging measure if it's not
desirable permanently) would catch all the bugs calling gtk variadic
functions.
That would be worthwhile regardless.
Post by Rich Felker
Post by Luca Barbato
IMHO going with 2+3 is the only safe way to grant musl more support
2 is not appropriate as written (it's more complexity, and ugly, and
in multiple locations). 3 already exists; it's called GCC.
=/
Post by Rich Felker
Post by Luca Barbato
I wonder why in the hell C++ can't use the (void *) 0 definition or
equivalent.
Because then char *s = NULL; would be a constraint violation.
Indeed, how foolish of me.

lu
John Spencer
2013-01-09 15:18:12 UTC
Post by Rich Felker
Post by Luca Barbato
Post by John Spencer
#if defined(__GNUC__) && defined(__cplusplus)
#define NULL __null
#elif defined(__cplusplus)
#define NULL 0
#else
#define NULL ((void *) 0) /* for C code */
#endif
this change is the easiest solution: any problem will be magically fixed.
I'm not sure if there is a way to warn properly at compile time for that
specific usage.
__attribute__ ((sentinel)) may be used. Adding this to the appropriate
gtk headers (even just as a temporary debugging measure if it's not
desirable permanently) would catch all the bugs calling gtk variadic
functions.
indeed this does emit a warning. however, it will only detect sentinels,
not other variadic arguments that are expected to be pointers but will
be passed as int instead. i haven't tested, but it will most likely also
cause crashes.
Post by Rich Felker
Post by Luca Barbato
IMHO going with 2+3 is the only safe way to grant musl more support
2 is not appropriate as written (it's more complexity, and ugly, and
in multiple locations). 3 already exists; it's called GCC.
If we decide something is needed at the musl level, in my opinion the
only acceptable solution is just replacing 0 with 0L unconditionally.
Actually I'd like to remove the special-case for C++ and make NULL
_always_ be defined to 0 or 0L, but I worry too many people would
complain...
yes, 0L is definitely nicer.
regarding C code, it would in fact be more consistent if you make it 0/0L
there as well.
what issues could arise in C code when (void* ) 0 is replaced with 0L ?
Rich Felker
2013-01-09 15:36:30 UTC
Post by John Spencer
Post by Rich Felker
__attribute__ ((sentinel)) may be used. Adding this to the appropriate
gtk headers (even just as a temporary debugging measure if it's not
desirable permanently) would catch all the bugs calling gtk variadic
functions.
indeed this does emit a warning. however, it will only detect
sentinels, not other variadic arguments that are expected to be
pointers but will be passed as int instead. i haven't tested, but it
will most likely also cause crashes.
Indeed, you are correct. I suspect most such uses _also_ have a
sentinel where incorrect code will also mess up the sentinel, but in
principle it's possible that this does not happen.

A good system of static analysis for variadic functions would need
some way to express the interface contract in terms of a predicate on
the argument types (and possibly contents) of arguments. I'm not aware
of any such system existing, so the matter falls back to whether we
would really need it to avoid all these bugs.
Post by John Spencer
Post by Rich Felker
If we decide something is needed at the musl level, in my opinion the
only acceptable solution is just replacing 0 with 0L unconditionally.
Actually I'd like to remove the special-case for C++ and make NULL
_always_ be defined to 0 or 0L, but I worry too many people would
complain...
yes, 0L is definitely nicer.
regarding C code, it would in fact be more consistent if you make it
0/0L there as well.
what issues could arise in C code when (void* ) 0 is replaced with 0L ?
The original reason I left NULL with pointer type was to catch the
other idiotic error:

str[len]=NULL;

i.e. confusion of NULL with ASCII NUL. However, this raises a good
question: short of C11 _Generic, is it even possible for a program to
detect whether NULL has integer or pointer type?

I know of one way, but it's very obscure:

int null_is_ptr_type()
{
    char s[1][1+(int)NULL];
    int i = 0;
    return sizeof s[i++], i;
}

Are there any ways that might actually affect/break programs?
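
For comparison, the C11 _Generic route mentioned above might look roughly like this. The enum and macro names are invented for the example; it classifies an expression by type without evaluating it:

```c
#include <stddef.h>

/* Illustrative names only; none of these come from a real header. */
enum zero_kind { KIND_INT, KIND_LONG, KIND_VOID_PTR, KIND_OTHER };

/* Classify an expression by its type without evaluating it (C11). */
#define KIND_OF(x) _Generic((x),   \
        int:     KIND_INT,         \
        long:    KIND_LONG,        \
        void *:  KIND_VOID_PTR,    \
        default: KIND_OTHER)

/* Reports how this implementation happens to define NULL. */
enum zero_kind null_kind(void)
{
    return KIND_OF(NULL);
}
```

What null_kind() returns depends on the libc and compiler in use, which is exactly the point of the question.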

Rich
Rob
2013-01-09 21:11:28 UTC
Permalink
Post by Rich Felker
i.e. confusion of NULL with ASCII NUL. However, this raises a good
question: short of C11 _Generic, is it even possible for a program to
detect whether NULL has integer or pointer type?
int null_is_ptr_type()
{
char s[1][1+(int)NULL];
int i = 0;
return sizeof s[i++], i;
}
Magic... is `s' a VLA here? My mind is boggled because
__builtin_constant_p(1+(int)NULL) returns 1, and I can't think of any
reason why the sizeof is evaluated.

Also, seeing that clang and tcc return 0 in all cases, is this a bug in
both of them?

Cheers,
Rob
Szabolcs Nagy
2013-01-09 21:53:27 UTC
Permalink
Post by Rob
Post by Rich Felker
char s[1][1+(int)NULL];
int i = 0;
return sizeof s[i++], i;
Magic... is `s' a VLA here? My mind is boggled because
__builtin_constant_p(1+(int)NULL) returns 1, and I can't think of any
reason why the sizeof is evaluated.
Also, seeing that clang and tcc return 0 in all cases, is this a bug in
both of them?
sizeof evaluates its argument if and only if it is a vla
(c11 6.5.3.4p2)

in c99 (and c11) vla is created if the size in the array
declarator is not an "integer constant expression"
(c11 6.7.6.2p4)

eg '1 + (int)(void*)0' is not an integer constant expression
because of the pointer cast, but '1 + (int)0' is
(c11 6.6p6)

hence sizeof s[i++] evaluates the argument if NULL has a pointer
cast in it
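
the rule can be seen in isolation, without NULL, by making the array bound an ordinary variable. a small sketch (function names are made up); a compiler that follows 6.5.3.4p2 should return 1 from the first function and 0 from the second:

```c
/* Element type char[n] is variably modified because n is not an
 * integer constant expression, so sizeof must evaluate its operand. */
int sizeof_with_vla_operand(void)
{
    int n = 1;
    char s[1][n];
    int i = 0;

    (void)sizeof s[i++];
    return i;
}

/* Same shape with a constant bound: the operand is not evaluated. */
int sizeof_with_fixed_operand(void)
{
    char s[1][1];
    int i = 0;

    (void)sizeof s[i++];
    return i;
}
```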
Rob
2013-01-09 22:17:28 UTC
Permalink
Post by Szabolcs Nagy
Post by Rob
Post by Rich Felker
char s[1][1+(int)NULL];
int i = 0;
return sizeof s[i++], i;
Magic... is `s' a VLA here? My mind is boggled because
__builtin_constant_p(1+(int)NULL) returns 1, and I can't think of any
reason why the sizeof is evaluated.
Also, seeing that clang and tcc return 0 in all cases, is this a bug in
both of them?
sizeof evaluates its argument if and only if it is a vla
(c11 6.5.3.4p2)
in c99 (and c11) vla is created if the size in the array
declarator is not an "integer constant expression"
(c11 6.7.6.2p4)
eg '1 + (int)(void*)0' is not an integer constant expression
because of the pointer cast, but '1 + (int)0' is
(c11 6.6p6)
hence sizeof s[i++] evaluates the argument if NULL has a pointer
cast in it
Ah, thanks for the explanation.
Szabolcs Nagy
2013-01-09 23:42:01 UTC
Permalink
Post by Rich Felker
question: short of C11 _Generic, is it even possible for a program to
detect whether NULL has integer or pointer type?
if it's ok to detect at compile time then it's easy:

int a = -NULL;
int b = NULL+NULL;
int c = NULL+.0f;
int d[NULL+1];
int e = 0?NULL:1;

these are compile time errors if NULL has pointer type
but ok when NULL is integer
Rob Landley
2013-01-12 06:56:08 UTC
Permalink
Post by Rich Felker
Post by John Spencer
Post by Rich Felker
__attribute__ ((sentinel)) may be used. Adding this to the appropriate
gtk headers (even just as a temporary debugging measure if it's not
desirable permanently) would catch all the bugs calling gtk variadic
functions.
indeed this does emit a warning. however, it will only detect
sentinels, not other variadic arguments that are expected to be
pointers but will be passed as int instead. i haven't tested, but it
will most likely also cause crashes.
Indeed, you are correct. I suspect most such uses _also_ have a
sentinel where incorrect code will also mess up the sentinel, but in
principle it's possible that this does not happen.
A good system of static analysis for variadic functions would need
some way to express the interface contract in terms of a predicate on
the argument types (and possibly contents) of arguments. I'm not aware
of any such system existing, so the matter falls back to whether we
would really need it to avoid all these bugs.
You mean like "__attribute__ ((__format__ (__printf__, 2, 3)));"?

This is the job of lint's descendants (a la the stanford checker or
sparse). This is really not the compiler's job. (The printf checking is
bad enough and it's hardwiring in constraints expressed in the C
standard. If you write your own function that's implementing its own
rules, the compiler has no idea what you did. Working _out_ what you
did equals solving the halting problem.)
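
For reference, a sketch of how that attribute is attached to a user-written wrapper. format_into and format_demo are hypothetical names; the attribute is the GCC/clang extension quoted above, and it only covers printf-style contracts:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: argument 3 is the format string and the values it
 * consumes start at argument 4, so gcc/clang can check calls like printf. */
__attribute__((format(printf, 3, 4)))
int format_into(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}

/* Usage: formats "7-x" and returns its length. */
int format_demo(void)
{
    char buf[16];

    return format_into(buf, sizeof buf, "%d-%s", 7, "x");
}
```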
Post by Rich Felker
Post by John Spencer
Post by Rich Felker
If we decide something is needed at the musl level, in my opinion the
only acceptable solution is just replacing 0 with 0L unconditionally.
Actually I'd like to remove the special-case for C++ and make NULL
_always_ be defined to 0 or 0L, but I worry too many people would
complain...
yes, 0L is definitely nicer.
regarding C code, it would in fact be more consistent if you make it
0/0L there as well.
what issues could arise in C code when (void* ) 0 is replaced with 0L ?
The original reason I left NULL with pointer type was to catch the
str[len]=NULL;
i.e. confusion of NULL with ASCII NUL.
They're both 0. If the optimizer can't convert the type down when
handed a constant assignment, the optimizer should be shot.

(That said, I use 0 in my code instead of NUL or '\0' because 0 is 0.)
Post by Rich Felker
However, this raises a good
question: short of C11 _Generic, is it even possible for a program to
detect whether NULL has integer or pointer type?
The C99 standard says that NULL has pointer type. Thus when you pass it
in varargs, it should be a long on any LP64 system which is basically
"everything but windows" for about 20 years now.
You can do sizeof(NULL) and (char *)(NULL+1)-(char *)(NULL) to get the
size of the type it points to?

Not sure what question you're asking...
Post by Rich Felker
int null_is_ptr_type()
{
char s[1][1+(int)NULL];
int i = 0;
return sizeof s[i++], i;
}
(int)NULL is 0 according to C99 so the NULL in there has no effect. And
referring to "i++" and "i" in the same statement is explicitly
undefined behavior (comma is not a sequence point, the compiler is free
to evaluate those in any order and different optimization flags _will_
change that order; I got bit by that many moons ago).

Rob
Bobby Bingham
2013-01-12 07:07:46 UTC
Permalink
Post by Rob Landley
Post by Rich Felker
int null_is_ptr_type()
{
char s[1][1+(int)NULL];
int i = 0;
return sizeof s[i++], i;
}
(int)NULL is 0 according to C99 so the NULL in there has no effect. And
referring to "i++" and "i" in the same statement is explicitly undefined
behavior (comma is not a sequence point, the compiler is free to evaluate
those in any order and different optimization flags _will_ change that
order; I got bit by that many moons ago).
"The left operand of a comma operator is evaluated as a void
expression; there is a sequence point after its evaluation. Then the
right operand is evaluated; the result has its type and value"

--
Bobby Bingham
Rich Felker
2013-01-12 13:31:14 UTC
Permalink
Post by Rob Landley
Post by Rich Felker
The original reason I left NULL with pointer type was to catch the
str[len]=NULL;
i.e. confusion of NULL with ASCII NUL.
They're both 0. If the optimizer can't convert the type down when
handed a constant assignment, the optimizer should be shot.
No. ASCII nul is an integer 0. NULL is a null pointer constant, which
may be an integer constant expression 0 or may be (void *)0. This has
nothing to do with the optimizer and everything to do with constraint
violations. The error was mainly made by C++ programmers (or C
programmers wrongly compiling their programs with C++ compilers...) on
implementations that used 0 as the definition of NULL; when compiled
on most proper C implementations, the code yields an error, because
assignment of a pointer to an integer is a constraint violation. (On
gcc, it's just a warning by default.)
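
A minimal sketch of the distinction (truncate_demo is just an illustrative name):

```c
#include <string.h>

/* Cut a string short by storing the character NUL, written '\0' (or 0). */
size_t truncate_demo(void)
{
    char str[] = "hello";
    size_t len = 2;

    str[len] = '\0';        /* an integer constant: always fine */
    /* str[len] = NULL; */  /* constraint violation when NULL is (void *)0 */
    return strlen(str);
}
```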

I don't think there's a lot of value in catching this error anymore.
Post by Rob Landley
Post by Rich Felker
However, this raises a good
question: short of C11 _Generic, is it even possible for a program to
detect whether NULL has integer or pointer type?
The C99 standard says that NULL has pointer type. Thus when you pass
No it does not. We have addressed this multiple times already.
Post by Rob Landley
it in varargs, it should be a long on any LP64 system which is
basically "everything but windows" for about 20 years now.
Actually the type doesn't matter to correct programs. The question is
whether we want to coddle incorrect programs, and the answer folks
seem to be leaning towards is yes, in which case 0L would be the right
definition to accomplish this.
Post by Rob Landley
You can do sizeof(NULL) and (char *)(NULL+1)-(char *)(NULL) to get
the size of the type it points to?
NULL+1 is a constraint violation if NULL has pointer type (since the
only pointer type it's permitted to have is void *).
Post by Rob Landley
Not sure what question you're asking...
Post by Rich Felker
int null_is_ptr_type()
{
char s[1][1+(int)NULL];
int i = 0;
return sizeof s[i++], i;
}
(int)NULL is 0 according to C99 so the NULL in there has no effect.
It does. (int)0 is an integer constant expression. (int)(void *)0
happens to be semantically constant, but it's not an integer constant
expression. Therefore, depending on the definition of NULL, s may be a
regular array type or a variable-length array type. In the latter
case, s[i++] has VLA type and thus sizeof is required to evaluate its
argument. GCC versions prior to 4.5 were buggy in this regard.
Post by Rob Landley
And referring to "i++" and "i" in the same statement is explicitly
undefined behavior (comma is not a sequence point, the compiler is
Comma is a sequence point.
Post by Rob Landley
free to evaluate those in any order and different optimization flags
_will_ change that order; I got bit by that many moons ago).
No, you were doing something else wrong. To my knowledge there has
never been a compiler that did not honor the comma operator sequence
point, certainly not any GCC or clang.

Rich
Rob Landley
2013-01-13 14:29:20 UTC
Permalink
Post by Rich Felker
Post by Rob Landley
Post by Rich Felker
The original reason I left NULL with pointer type was to catch the
str[len]=NULL;
i.e. confusion of NULL with ASCII NUL.
They're both 0. If the optimizer can't convert the type down when
handed a constant assignment, the optimizer should be shot.
No. ASCII nul is an integer 0. NULL is a null pointer constant, which
may be an integer constant expression 0 or may be (void *)0.
Ah right, possible warning. (Behavior's guaranteed correct in this
instance, but the compiler complains anyway.)
Post by Rich Felker
Post by Rob Landley
Post by Rich Felker
However, this raises a good
question: short of C11 _Generic, is it even possible for a program to
detect whether NULL has integer or pointer type?
The C99 standard says that NULL has pointer type. Thus when you pass
No it does not. We have addressed this multiple times already.
You read the standard as saying a pointer constant does not need
pointer type. *shrug* Ok...
Post by Rich Felker
Post by Rob Landley
it in varargs, it should be a long on any LP64 system which is
basically "everything but windows" for about 20 years now.
Actually the type doesn't matter to correct programs. The question is
whether we want to coddle incorrect programs, and the answer folks
seem to be leaning towards is yes, in which case 0L would be the right
definition to accomplish this.
I read "incorrect programs" and "c++ programs" as synonymous, but I'm
biased.
Post by Rich Felker
Post by Rob Landley
You can do sizeof(NULL) and (char *)(NULL+1)-(char *)(NULL) to get
the size of the type it points to?
NULL+1 is a constraint violation if NULL has pointer type (since the
only pointer type it's permitted to have is void *).
Compile time probe to set a constant with 0 for void? (I've lost track
of the entrance to the rathole: why do we need this info?)

So it's not required to be a pointer, but it's required to be a void
pointer. Weird. (I'm something like 3 years stale on any sort of deep
reading of the standards.)
Post by Rich Felker
Post by Rob Landley
Not sure what question you're asking...
Post by Rich Felker
int null_is_ptr_type()
{
char s[1][1+(int)NULL];
int i = 0;
return sizeof s[i++], i;
}
(int)NULL is 0 according to C99 so the NULL in there has no effect.
It does. (int)0 is an integer constant expression. (int)(void *)0
happens to be semantically constant, but it's not an integer constant
expression. Therefore, depending on the definition of NULL, s may be a
regular array type or a variable-length array type. In the latter
case, s[i++] has VLA type and thus sizeof is required to evaluate its
argument. GCC versions prior to 4.5 were buggy in this regard.
Really?

Toybox main.c is doing:

#define NEWTOY(name, opts, flags) opts ||
#define OLDTOY(name, oldname, opts, flags) opts ||
static const int NEED_OPTIONS =
#include "generated/newtoys.h"
0; // Ends the opts || opts || opts...

Which basically boils down to either:

NEED_OPTIONS = "STRING" || NULL || "STRING";

Or:

NEED_OPTIONS = NULL || NULL || NULL;

Then it does:

if (NEED_OPTIONS) call_option_parsing_stuff();

And then dead code elimination zaps the option parsing stuff if it's
only ever called behind an if (0). I tested this to make sure it
worked. Years ago I actually upgraded tinycc to make that behave the
same way gcc behaved so it could build this. (Yes, I could make it a
compile probe setting a config symbol before the main build, but I
didn't _need_ to.)

So what I think you're saying is that the behavior I'm depending on changed?

Sigh. Yup. When I build toybox with just "true", gcc 4.2.1 (last gpl
release) drops out parse_optflag() but the ubuntu host toolchain no
longer does.

So A) your test is unreliable, B) the optimization I'm depending on is
unreliable. And it's unreliable _even_ if I replace "NULL" with 0 so it
_is_ clearly saying if (0) as a constant? What the...?

Hang on, I'm initializing this as a global so that it will FAIL if it's
not a constant. (And it still doesn't fail. What's failed is dead code
elimination in current gcc is no longer removing functions that can
never be accessed, so I have to go figure out what the right compiler
flags are called this week. SIGH.)

So fond of gcc. The LFS guys are currently discussing the 4.7.2 release
or whatever it is that just came out and requires a C++ compiler on the
host. (I'd link to the archives but their website is half-migrated
right now.)
Post by Rich Felker
Post by Rob Landley
And referring to "i++" and "i" in the same statement is explicitly
undefined behavior (comma is not a sequence point, the compiler is
Comma is a sequence point.
Post by Rob Landley
free to evaluate those in any order and different optimization flags
_will_ change that order; I got bit by that many moons ago).
No, you were doing something else wrong. To my knowledge there has
never been a compiler that did not honor the comma operator sequence
point, certainly not any GCC or clang.
I went and looked it up again: the comma operator and the commas in
function arguments are not the same thing. (They _used_ to be, but bad
optimizers broke it often enough the standards body caved.) Commas in
function calls are not sequence points (c99 3.19), commas outside of
function calls are. Wheee...
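
A small sketch of the two cases, with invented function names:

```c
/* Comma operator: there is a sequence point between the operands, so
 * i++ completes before i is read. */
int comma_operator_demo(void)
{
    int i = 0;

    return (i++, i);
}

static int second_of(int a, int b)
{
    (void)a;
    return b;
}

/* Function-call commas are only separators: the order in which the two
 * arguments are evaluated is unspecified, so avoid second_of(i++, i). */
int call_demo(void)
{
    int i = 0;
    int first = i++;       /* sequence the side effect explicitly */

    return second_of(first, i);
}
```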
Post by Rich Felker
Rich
Rob
Luca Barbato
2013-01-13 14:56:15 UTC
Permalink
Post by Rob Landley
So fond of gcc. The LFS guys are currently discussing the 4.7.2 release
or whatever it is that just came out and requires a C++ compiler on the
host. (I'd link to the archives but their website is half-migrated right
now.)
It is known, if the people working on that are confident that having C++
as core language is what boosted clang instead of having a clear
separation of layers, good reusability and a clean API so be it.

(Meanwhile gccgo and gold still have a good chunk of neat shortcomings,
making a good point that the language isn't a magic bullet.)

And we are discussing how to bend a C runtime to fit the C++ runtime.

I do really hope Go will win more people and useful code and integration
will come up to make C++ less important.

lu
Rob Landley
2013-01-13 16:29:39 UTC
Permalink
Post by Rob Landley
Post by Rob Landley
So fond of gcc. The LFS guys are currently discussing the 4.7.2 release
or whatever it is that just came out and requires a C++ compiler on the
host. (I'd link to the archives but their website is half-migrated right
now.)
It is known, if the people working on that are confident that having C++
as core language is what boosted clang instead of having a clear
separation of layers, good reusability and a clean API so be it.
No, gcc was a hairball because Richard Stallman explicitly wanted it to
be (for example see https://lwn.net/Articles/259157/), he feared
allowing the pieces to be cleanly separated because then you could
decouple them and use a proprietary back-end with the gcc front-end,
and vice versa. (Which happened anyway, it's how llvm was developed in
the first place, the clang front-end was a replacement for the gcc
front end in llvm/gcc.)
Post by Rob Landley
(Meanwhile gccgo and gold still has a good chunk of neat shortcomings
making a good point that the language isn't a magic bullet)
When your code is a pile of scar tissue, starting over from scratch
provides massive initial progress, regardless of implementation details
of the new one. Of course code is often a mass of scar tissue for a
_reason_:

http://www.joelonsoftware.com/articles/fog0000000069.html

Then again, Spolsky isn't particularly versed in open source
development. Wasting huge quantities of effort that just gets thrown
away is _how_ we defeated Brooks' law. We scale the same way genetic
algorithms do: try everything with no coordination and then do an
editorial pass on the slush pile to fight off Sturgeon's law. Rinse,
repeat. So his horror at wasted effort is misplaced for us. The insight
that a mature code base embodies a bunch of hidden knowledge in the
pattern of scars is correct, but that just means when we do the next
one we need to update the standards so they _do_ properly document the
current requirements and rationale. And have a massive corpus of real
world test data to run through it, plus be prepared to receive
<strike>endless complaints</strike> feedback from an army of testers
who will break it in ingenious ways.
Post by Rob Landley
And we are discussing how to bend a C runtime to fit the C++ runtime.
I do really hope Go will win more people and useful code and integration
will come up to make C++ less important.
C is a good language. Go doesn't need to replace C, no matter how much
C++ FUDs it.

C++ containing C and calling itself a good language is about as
relevant as a mud pie containing a glass of water and calling itself a
good beverage. (If you think all additions are improvements explain CSS
and region locking in DVDs.)

That said, people wrote useful programs in Cobol and ADA for many
years, and even after they sober up they'll still need legacy support
to run the results.

Rob
Luca Barbato
2013-01-13 17:14:48 UTC
Permalink
Post by Rob Landley
No, gcc was a hairball because Richard Stallman explicitly wanted it to
be (for example see https://lwn.net/Articles/259157/), he feared
allowing the pieces to be cleanly separated because then you could
decouple them and use a proprietary back-end with the gcc front-end, and
vice versa. (Which happened anyway, it's how llvm was developed in the
first place, the clang front-end was a replacement for the gcc front end
in llvm/gcc.)
I know and given how they discuss about using C++ exotic features in gcc
(or not) I really wonder if they do.
Post by Rob Landley
Post by Luca Barbato
And we are discussing on how bend a C runtime to fit the C++ runtime.
I do really hope Go will win more people and useful code and integration
will come up to make C++ less important.
C is a good language. Go doesn't need to replace C, no matter how much
C++ FUDs it.
Go needs to replace C++, at least it is plan9-sane/mirror-image-sane.
Post by Rob Landley
That said, people wrote useful programs in Cobol and ADA for many years,
and even after they sober up they'll still need legacy support to run
the results.
And that brings us back to why we are picking our collective brains on
supporting one of the many C++ mistakes.

lu
Strake
2013-01-13 15:23:56 UTC
Permalink
Post by Rob Landley
I read "incorrect programs" and "c++ programs" as synonymous, but I'm
biased.
A strong bias indeed, to assume that anything written in another
language is fault-free.

Cheers,
Strake
Luca Barbato
2013-01-13 17:17:53 UTC
Permalink
Post by Strake
Post by Rob Landley
I read "incorrect programs" and "c++ programs" as synonymous, but I'm
biased.
A strong bias indeed, to assume that anything written in another
language is fault-free.
Given a bit of preprocessed C++ code you can't tell if it is right or
wrong syntactically w/out having additional knowledge. That is one of
the funny gripes from the people trying to implement a C++ compiler.

So in a way it is always incorrect.

lu
Szabolcs Nagy
2013-01-13 17:47:32 UTC
Permalink
Post by Rob Landley
Post by Rich Felker
It does. (int)0 is an integer constant expression. (int)(void *)0
happens to be semantically constant, but it's not an integer constant
expression.
Really?
#define NEWTOY(name, opts, flags) opts ||
#define OLDTOY(name, oldname, opts, flags) opts ||
static const int NEED_OPTIONS =
#include "generated/newtoys.h"
0; // Ends the opts || opts || opts...
NEED_OPTIONS = "STRING" || NULL || "STRING";
NEED_OPTIONS = NULL || NULL || NULL;
if (NEED_OPTIONS) call_option_parsing_stuff();
And then dead code elimination zaps the option parsing stuff if it's
only ever called behind and if (0). I tested this to make sure it
worked. Years ago I actually upgraded tinycc to make that behave the
same way gcc behaved so it could build this. (Yes, I could make it a
compile probe setting a config symbol before the main build, but I
didn't _need_ to.)
So I think you're saying is that the behavior I'm depending on changed?
well,

(int)(void*)0 is not an "integer constant expression" and it
is not a "null pointer constant", it is not an "arithmetic
constant expression" nor an "address constant", but an
implementation is allowed to accept it as a "constant expression"
anyway
(as far as i can see it is not required to though)

!(void*)0 and (void*)0 || (void*)0 are similar

in initializers they may be accepted, but the standard
does not require them to be

gcc used to be less strict about integer constant expressions,
but recently it follows the standard more closely
Post by Rob Landley
Sigh. Yup. When I build toybox with just "true", gcc 4.2.1 (last gpl
release) drops out parse_optflag() but the ubuntu host toolchain no
longer does.
i think gcc should be able to do the optimization

i guess gcc assumes that the value of 'static const' objects
may change or may not be available at compile-time for some
reason

maybe the following works:

#define NEWTOY(name, opts, flags) opts ||
#define OLDTOY(name, oldname, opts, flags) opts ||
if (
#include "generated/newtoys.h"
0) call_option_parsing_stuff();
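
a self-contained sketch of that shape, with the include replaced by two made-up entries per function (toy_true, toy_ls and toy_false are placeholders, not the real generated list):

```c
/* Stand-in for generated/newtoys.h: each entry contributes "opts ||". */
#define NEWTOY(name, opts, flags) opts ||

/* One entry has an option string, so the chain is nonzero. */
int needs_parsing_with_ls(void)
{
    if (
        NEWTOY(toy_true, 0, 0)
        NEWTOY(toy_ls, "la", 0)
        0)  /* ends the opts || opts || ... chain */
        return 1;
    return 0;
}

/* No entry has options, so the condition folds to if (0). */
int needs_parsing_without_options(void)
{
    if (
        NEWTOY(toy_true, 0, 0)
        NEWTOY(toy_false, 0, 0)
        0)
        return 1;
    return 0;
}
```

whether the compiler then drops the unreachable call is still up to the optimizer, as discussed above.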
Rob Landley
2013-01-13 19:46:28 UTC
Permalink
Post by Rob Landley
Post by Rob Landley
So I think you're saying is that the behavior I'm depending on
changed?
well,
(int)(void*)0 is not an "integer constant expression" and it
is not a "null pointer constant",
C99 6.3.2.3: An integer constant expression with the value 0, or such
an expression cast to type void *, is called a null pointer constant.

7.17 #3: The macros are NULL which expands to an implementation-defined
null pointer constant;

So it uses "constant" in the name but either it's not a constant or
typecasting it twice makes it stop being a constant.
Post by Rob Landley
it is not an "arithmetic
constant expression" nor an "address constant", but an
implementation is allowed to accept it as a "constant expression"
anyway
(as far as i can see it is not required to though)
Actually it turned out the problem was I accidentally checked in some
debug code, which set -O0 (which apparently disables --gc-sections even
when explicitly specified).

So it's still working fine. Whether or not it should depends on whether
or not "constant" means it's a constant.
Post by Rob Landley
!(void*)0 and (void*)0 || (void*)0 are similar
in initializers they may be accepted, but the standard
does not require them to be
At one point, I had to dig through all this stuff:

http://lists.gnu.org/archive/html/tinycc-devel/2007-09/msg00128.html

Alas, that was about 5 years ago and I no longer remember the details.
Post by Rob Landley
gcc used to be less strict integer constant expressions,
but recently it follows the standard more closely
Post by Rob Landley
Sigh. Yup. When I build toybox with just "true", gcc 4.2.1 (last gpl
release) drops out parse_optflag() but the ubuntu host toolchain no
longer does.
i think gcc should be able to do the optimization
It can, it was -O0. Pilot error. :)

Rob
Rich Felker
2013-01-14 06:11:35 UTC
Permalink
Post by Rob Landley
Post by Rob Landley
Post by Rob Landley
So I think you're saying is that the behavior I'm depending on
changed?
well,
(int)(void*)0 is not an "integer constant expression" and it
is not a "null pointer constant",
C99 6.3.2.3: An integer constant expression with the value 0, or
such an expression cast to type void *, is called a null pointer
constant.
7.17 #3: The macros are NULL which expands to an
implementation-defined null pointer constant;
So it uses "constant" in the name but either it's not a constant or
typecasting it twice makes it stop being a constant.
Basically, the latter. It may still be a constant, but it's neither an
integer constant expression (this is a very restricted category of
expressions) nor a null pointer constant.

In any case, this thread has gotten WAY off-topic, going all over the
place into territory about the merits and demerits of different
languages and anti-FSF politics. Those topics may be worth discussing
in some contexts, but it seems to have left everybody really confused
about the issues at hand, which are:

- whether we should work around broken programs that pass NULL to
variadic functions

- and if so, how

The emerging consensus seems to be using

#define NULL 0L

unconditionally in both C and C++ mode.
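
A quick sketch of why 0L papers over the problem in practice. This assumes an LP64 or ILP32 target; on LLP64 (64-bit Windows) the first function would return 0, and passing 0L where a pointer is read remains formally undefined either way. The function names are illustrative:

```c
/* Under LP64 (and ILP32) long and pointers have the same width, so a
 * stray 0L occupies a full pointer-sized variadic slot. */
int zero_long_is_pointer_sized(void)
{
    return sizeof(0L) == sizeof(void *);
}

/* Plain 0 is an int, which is narrower than a pointer on LP64. */
int zero_int_is_pointer_sized(void)
{
    return sizeof(0) == sizeof(void *);
}
```

Correct callers still cast the sentinel, e.g. (char *)NULL or (void *)0.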

Rich
Vasily Kulikov
2013-01-14 08:45:27 UTC
Permalink
Hi,
Post by Rich Felker
In any case, this thread has gotten WAY off-topic, going all over the
place into territory about the merits and demerits of different
languages and anti-FSF politics. Those topics may be worth discussing
in some contexts, but it seems to have left everybody really confused
- whether we should work around broken programs that pass NULL to
variadic functions
- and if so, how
The emerging consensus seems to be using
#define NULL 0L
unconditionally in both C and C++ mode.
If such subtle and non-obvious corners of C/POSIX/C++/gcc/etc. are
being explicitly detected and handled, then is it perhaps worth implementing
some checker in libc/toolchain that detects them (probably at runtime)
and emits a warning at runtime/compile-time? gcc'isms, UBs, etc.

In musl libc it can be implemented as -DI_WANT_TO_DETECT_GCCISMS.

Thanks,
--
Vasily Kulikov
http://www.openwall.com - bringing security into open computing environments
Rich Felker
2013-01-14 14:03:34 UTC
Permalink
Post by Vasily Kulikov
Hi,
Post by Rich Felker
In any case, this thread has gotten WAY off-topic, going all over the
place into territory about the merits and demerits of different
languages and anti-FSF politics. Those topics may be worth discussing
in some contexts, but it seems to have left everybody really confused
- whether we should work around broken programs that pass NULL to
variadic functions
- and if so, how
The emerging consensus seems to be using
#define NULL 0L
unconditionally in both C and C++ mode.
If such slick and unobvious places of C/POSIX/C++/gcc/etc. applications
are explicitly detected and handled, then probably it worth implementing
some checker in libc/toolchain which is detected (probably at runtime)
and warning is emitted at runtime/compile-time? gcc'isms, UBs, etc.
In musl libc it can be implemented as -DI_WANT_TO_DETECT_GCCISMS.
At the very least, this would have to be a macro in the reserved
namespace. However, I'm skeptical of using musl as a tool for checking
this, especially since the check only works on 64-bit systems and does
not help the compiler produce a warning/error, but only causes random,
hard-to-diagnose crashes. It looks like cppcheck is adding (or has
already added?) a test for incorrectly passing NULL to variadic
functions, which is probably where the check belongs.

Rich
Vasily Kulikov
2013-01-14 14:30:25 UTC
Permalink
Post by Rich Felker
Post by Vasily Kulikov
In musl libc it can be implemented as -DI_WANT_TO_DETECT_GCCISMS.
At the very least, this would have to be a macro in the reserved
namespace. However, I'm skeptical of using musl as a tool for checking
this, especially since the check only works on 64-bit systems and does
not help the compiler produce a warning/error, but only causes random,
hard-to-diagnose crashes. It looks like cppcheck is adding (or has
already added?) a test for incorrectly passing NULL to variadic
functions, which is probably where the check belongs.
My thought related to this specific bug was a bit more complex:

1) on each call of a variadic function save the list of all types
2) on each call to va_arg(ap, T) check whether the current argument was
pushed as T in the saved list

It would catch not only NULL/(void *)NULL, but also int/long or
void*/long bugs.

Now I see that while it is possible to implement (2) in libc by redefining
the va_XXX() macros, it looks like (1) has to be implemented in the compiler.

So, yeah, it is not a musl issue.

Thanks,
--
Vasily Kulikov
http://www.openwall.com - bringing security into open computing environments
Szabolcs Nagy
2013-01-14 15:02:24 UTC
Permalink
Post by Vasily Kulikov
1) on each call of a variadic function save the list of all types
2) on each call to va_arg(ap, T) check whether the current argument was
pushed as T in the saved list
It would catch not only NULL/(void *)NULL, but also int/long or
void*/long bugs.
Now I see that while it is possible to implement (2) in libc redefining
va_XXX() macros, but it looks like (1) has to be implemented in compiler.
this is what i meant when i wrote 'instrumentation tool' earlier
and it can probably be done in asan if it's not already there

and i agree with

#define NULL 0L

it is a valid definition for c and c++
and it does not cause unexpected failures in broken code

the only drawback i see is that some trivial errors are
not caught by the type checker in c with this definition
(using NULL in int/arithmetic context), but that's not a
big loss probably
Rich Felker
2013-01-14 15:14:17 UTC
Permalink
Post by Vasily Kulikov
Post by Rich Felker
Post by Vasily Kulikov
In musl libc it can be implemented as -DI_WANT_TO_DETECT_GCCISMS.
At the very least, this would have to be a macro in the reserved
namespace. However, I'm skeptical of using musl as a tool for checking
this, especially since the check only works on 64-bit systems and does
not help the compiler produce a warning/error, but only causes random,
hard-to-diagnose crashes. It looks like cppcheck is adding (or has
already added?) a test for incorrectly passing NULL to variadic
functions, which is probably where the check belongs.
1) on each call of a variadic function save the list of all types
2) on each call to va_arg(ap, T) check whether the current argument was
pushed as T in the saved list
It would catch not only NULL/(void *)NULL, but also int/long or
void*/long bugs.
Now I see that while it is possible to implement (2) in libc by redefining
the va_XXX() macros, it looks like (1) has to be implemented in the compiler.
Both require support by the compiler. It's impossible to implement
va_arg without a compiler builtin. Traditionally it was done with
UB-invoking pointer arithmetic on archs where the arguments are passed
on the stack, but even this is invalid and will break under compiler
optimizations such as inlining or reordering local vars for cache
locality purposes.
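For reference, a sketch of the portable way to walk a sentinel-terminated argument list, using only the `va_start`/`va_arg`/`va_end` macros that the compiler supplies through `<stdarg.h>`. The function name `count_strings()` is made up for this example.

```c
#include <stdarg.h>
#include <stddef.h>

/* Counts the strings passed before the terminating null pointer.
 * Every variadic argument, including the sentinel, must be a char *. */
static int count_strings(const char *first, ...)
{
    va_list ap;
    int n = 0;

    va_start(ap, first);
    /* va_arg reads each argument as char *, so the caller has to push
     * the sentinel as a pointer too, not as a bare integer 0. */
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}
```

A correct call is `count_strings("a", "b", (char *)NULL)`. The explicit cast on the sentinel is what keeps the call valid no matter how NULL is defined.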

Also, unfortunately, it's an ABI issue. Making this change would
create a completely separate ABI.

I think static analysis is really the way to go; unfortunately, it
requires some degree of additional markup to specify the argument
types contract.
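One existing form of such markup, shown as a sketch: GCC and Clang accept a `sentinel` function attribute that lets the compiler warn when a call does not end in a null pointer (in GCC the warning is documented as part of -Wformat). It only describes the terminator, not the types of the other arguments. The `NULL_TERMINATED` macro and `total_length()` are hypothetical names for this example.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Expand to the attribute only where it is understood. */
#if defined(__GNUC__)
#define NULL_TERMINATED __attribute__((sentinel))
#else
#define NULL_TERMINATED
#endif

/* The attribute goes on a declaration, so declare before defining. */
static size_t total_length(const char *first, ...) NULL_TERMINATED;

/* Returns the combined length of all strings before the null sentinel. */
static size_t total_length(const char *first, ...)
{
    va_list ap;
    size_t total = 0;

    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);
    return total;
}
```

With this declaration, a call like `total_length("ab", "cde", 0)` should draw a "missing sentinel" warning at compile time, while `total_length("ab", "cde", (char *)NULL)` is accepted.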

Rich

Rob Landley
2013-01-14 13:19:31 UTC
Permalink
Post by Rich Felker
The emerging consensus seems to be using
#define NULL 0L
unconditionally in both C and C++ mode.
Works for me. (Supplementing C99 with LP64 is valid for Linux, BSD, and
MacOS X. Windows getting it wrong is their problem.)

Rob
Rob Landley
2013-01-12 05:56:02 UTC
Permalink
Post by John Spencer
glibc defines NULL as __null: a magic variable supplied by GCC and
compatibles which always has pointer context.
musl defines NULL to 0 in C++.
this is correct per the standard, but breaks a lot of software on
64bit archs,
because it promotes to int.
The C99 standard section 7.17 defines the NULL macro as:

expands to an implementation-defined null pointer constant

Which means it has pointer type. So either we can typecast it to void
*, or we can rely on the LP64 standard (which Linux, FreeBSD, and Mac OS X
all support), which says that long and pointer are always the same size on
both 32 bit and 64 bit, so the trivial fix would be to #define NULL as (0L)

http://www.unix.org/whitepapers/64bit.html
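The size argument can be checked directly. This is a sketch with made-up helper names; it assumes nothing beyond standard C. On ILP32 and LP64 systems the last function is expected to return 1, and on LLP64 (64-bit Windows) it returns 0. Equal size is what makes `0L` work as a sentinel on the common ABIs, although the C standard itself does not promise that a long can be read back as a pointer.

```c
#include <stddef.h>

/* Width in bytes of a bare 0, which is what a variadic call pushes
 * when NULL is defined as plain 0: an int. */
static size_t size_of_plain_zero(void)
{
    return sizeof(0);
}

/* Width in bytes of 0L: a long. */
static size_t size_of_long_zero(void)
{
    return sizeof(0L);
}

/* Nonzero when long and object pointers have the same width. */
static int long_matches_pointer(void)
{
    return sizeof(long) == sizeof(void *);
}
```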

Rob
Rich Felker
2013-01-12 06:42:54 UTC
Permalink
Post by Rob Landley
Post by John Spencer
glibc defines NULL as __null: a magic variable supplied by GCC and
compatibles which always has pointer context.
musl defines NULL to 0 in C++.
this is correct per the standard, but breaks a lot of software on
64bit archs,
because it promotes to int.
expands to an implementation-defined null pointer constant
Which means it has pointer type. So either we can typecast it to
Nope, C is weirder than you think. A "null pointer constant" is
defined as an integer constant expression with value zero, or such an
expression cast to void *. So it need not have pointer type.
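A short sketch of that point, using C11 `_Generic` to show the type of each form. The `TYPE_NAME` macro and the function names are invented for this example.

```c
/* Names the type of an expression (C11 _Generic). */
#define TYPE_NAME(x) _Generic((x), \
    int: "int",                    \
    long: "long",                  \
    void *: "void *",              \
    default: "other")

/* Each of these is a valid null pointer constant in C,
 * yet only the last one has pointer type. */
static const char *type_of_zero(void)      { return TYPE_NAME(0); }
static const char *type_of_long_zero(void) { return TYPE_NAME(0L); }
static const char *type_of_cast_zero(void) { return TYPE_NAME((void *)0); }

/* In a pointer context all three convert to the same null pointer. */
static int all_convert_to_null(void)
{
    char *a = 0;
    char *b = 0L;
    char *c = (void *)0;
    return a == b && b == c && !a;
}
```

The conversion to a null pointer happens only where the compiler knows a pointer is wanted. In a variadic call there is no such context, so the argument keeps its integer type.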
Post by Rob Landley
void *, or we can rely on the LP64 standard (which Linux, FreeBSD, and
Mac OS X all support), which says that long and pointer are always the
same size on both 32 bit and 64 bit, so the trivial fix would be to
#define NULL as (0L)
Yes, using 0L in both C and C++ is the solution I'm leaning towards.

Rich