[musl] broken shared executables on armeb (illegal instruction)
Jason A. Donenfeld
2018-09-30 21:53:19 UTC
Hello,

There appears to be a problem with shared linking on big-endian
ARM (armeb).

First I'll show that static linking works correctly:

$ armeb-pc-linux-gnueabi-gcc -v
Using built-in specs.
COLLECT_GCC=armeb-pc-linux-gnueabi-gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/armeb-pc-linux-gnueabi/8.2.0/lto-wrapper
Target: armeb-pc-linux-gnueabi
Configured with: /var/tmp/portage/cross-armeb-pc-linux-gnueabi/gcc-8.2.0-r3/work/gcc-8.2.0/configure --host=x86_64-pc-linux-gnu --target=armeb-pc-linux-gnueabi --build=x86_64-pc-linux-gnu --prefix=/usr --bindir=/usr/x86_64-pc-linux-gnu/armeb-pc-linux-gnueabi/gcc-bin/8.2.0 --includedir=/usr/lib/gcc/armeb-pc-linux-gnueabi/8.2.0/include --datadir=/usr/share/gcc-data/armeb-pc-linux-gnueabi/8.2.0 --mandir=/usr/share/gcc-data/armeb-pc-linux-gnueabi/8.2.0/man --infodir=/usr/share/gcc-data/armeb-pc-linux-gnueabi/8.2.0/info --with-gxx-include-dir=/usr/lib/gcc/armeb-pc-linux-gnueabi/8.2.0/include/g++-v8 --with-python-dir=/share/gcc-data/armeb-pc-linux-gnueabi/8.2.0/python --enable-languages=c,c++,fortran --enable-obsolete --enable-secureplt --disable-werror --with-system-zlib --enable-nls --without-included-gettext --enable-checking=release --with-bugurl=https://bugs.gentoo.org/ --with-pkgversion='Gentoo 8.2.0-r3 p1.3' --disable-esp --enable-libstdcxx-time --enable-poison-system-directories --with-sysroot=/usr/armeb-pc-linux-gnueabi --disable-bootstrap --enable-__cxa_atexit --enable-clocale=gnu --disable-multilib --disable-altivec --disable-fixed-point --with-float=soft --enable-libgomp --disable-libmudflap --disable-libssp --disable-libmpx --disable-systemtap --enable-vtable-verify --enable-libvtv --enable-lto --without-isl --enable-libsanitizer --enable-default-pie --enable-default-ssp
Thread model: posix
gcc version 8.2.0 (Gentoo 8.2.0-r3 p1.3)
$ tar xf musl-1.1.20.tar.gz
$ cd musl-1.1.20/
$ export CFLAGS="-O2 -march=armv7-a -mtune=cortex-a15 -mabi=aapcs-linux"
$ CC=armeb-pc-linux-gnueabi-gcc ./configure --prefix=$PWD/prefix --enable-static --disable-shared --build=armeb-pc-linux-gnueabi
[...]
$ make -j$(nproc)
[...]
$ make install
[...]
$ cd prefix/
$ printf '#include <stdio.h>\nint main(){puts("hello world");}' | bin/musl-gcc -xc -o helloworld $CFLAGS -
/usr/libexec/gcc/armeb-pc-linux-gnueabi/ld: /usr/lib/gcc/armeb-pc-linux-gnueabi/8.2.0/libgcc.a(_dvmd_lnx.o): in function `__aeabi_idiv0':
/var/tmp/portage/cross-armeb-pc-linux-gnueabi/gcc-8.2.0-r3/work/gcc-8.2.0/libgcc/config/arm/lib1funcs.S:1545: undefined reference to `raise'
collect2: error: ld returned 1 exit status
[This appears to be a well-known bug in some other mailing list post. Working around with the next command:]
$ printf '#include <stdio.h>\nint main(){puts("hello world");}' | bin/musl-gcc -xc -o helloworld $CFLAGS -static -
$ cp /usr/bin/qemu-armeb .
$ sudo chroot $(readlink -f .) /qemu-armeb /helloworld
hello world

Now let's try with shared linking, and you'll see it generates a
broken binary:

$ tar xf musl-1.1.20.tar.gz
$ cd musl-1.1.20/
$ export CFLAGS="-O2 -march=armv7-a -mtune=cortex-a15 -mabi=aapcs-linux"
$ CC=armeb-pc-linux-gnueabi-gcc ./configure --prefix=$PWD/prefix --disable-static --enable-shared --build=armeb-pc-linux-gnueabi
[...]
$ make -j$(nproc)
[...]
$ make install
[...]
$ cd prefix/
$ printf '#include <stdio.h>\nint main(){puts("hello world");}' | bin/musl-gcc -xc -o helloworld $CFLAGS -
$ cd lib/
$ ln -s libc.so ld-musl-armeb.so.1
$ cd ..
$ cp /usr/bin/qemu-armeb .
$ sudo chroot $(readlink -f .) /qemu-armeb /helloworld
Illegal instruction

I've seen similar failures when booting a real kernel with armeb
executables generated this way as init.
I've also experienced this with both my own toolchain (above) and
with linaro's toolchain.

I expect the commands above should result in an easily reproducible bug.

Any idea what's up?

Thanks,
Jason
Rich Felker
2018-09-30 22:17:54 UTC
Post by Jason A. Donenfeld
Hello,
There appears to be a problem with shared linking on big-endian
ARM (armeb).
$ armeb-pc-linux-gnueabi-gcc -v
Using built-in specs.
COLLECT_GCC=armeb-pc-linux-gnueabi-gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/armeb-pc-linux-gnueabi/8.2.0/lto-wrapper
Target: armeb-pc-linux-gnueabi
Configured with: /var/tmp/portage/cross-armeb-pc-linux-gnueabi/gcc-8.2.0-r3/work/gcc-8.2.0/configure --host=x86_64-pc-linux-gnu --target=armeb-pc-linux-gnueabi --build=x86_64-pc-linux-gnu --prefix=/usr --bindir=/usr/x86_64-pc-linux-gnu/armeb-pc-linux-gnueabi/gcc-bin/8.2.0 --includedir=/usr/lib/gcc/armeb-pc-linux-gnueabi/8.2.0/include --datadir=/usr/share/gcc-data/armeb-pc-linux-gnueabi/8.2.0 --mandir=/usr/share/gcc-data/armeb-pc-linux-gnueabi/8.2.0/man --infodir=/usr/share/gcc-data/armeb-pc-linux-gnueabi/8.2.0/info --with-gxx-include-dir=/usr/lib/gcc/armeb-pc-linux-gnueabi/8.2.0/include/g++-v8 --with-python-dir=/share/gcc-data/armeb-pc-linux-gnueabi/8.2.0/python --enable-languages=c,c++,fortran --enable-obsolete --enable-secureplt --disable-werror --with-system-zlib --enable-nls --without-included-gettext --enable-checking=release --with-bugurl=https://bugs.gentoo.org/ --with-pkgversion='Gentoo 8.2.0-r3 p1.3' --disable-esp --enable-libstdcxx-time --enable-poison-system-directories --with-sysroot=/usr/armeb-pc-linux-gnueabi --disable-bootstrap --enable-__cxa_atexit --enable-clocale=gnu --disable-multilib --disable-altivec --disable-fixed-point --with-float=soft --enable-libgomp --disable-libmudflap --disable-libssp --disable-libmpx --disable-systemtap --enable-vtable-verify --enable-libvtv --enable-lto --without-isl --enable-libsanitizer --enable-default-pie --enable-default-ssp
Thread model: posix
gcc version 8.2.0 (Gentoo 8.2.0-r3 p1.3)
$ tar xf musl-1.1.20.tar.gz
$ cd musl-1.1.20/
$ export CFLAGS="-O2 -march=armv7-a -mtune=cortex-a15 -mabi=aapcs-linux"
$ CC=armeb-pc-linux-gnueabi-gcc ./configure --prefix=$PWD/prefix --enable-static --disable-shared --build=armeb-pc-linux-gnueabi
[...]
$ make -j$(nproc)
[...]
$ make install
[...]
$ cd prefix/
$ printf '#include <stdio.h>\nint main(){puts("hello world");}' | bin/musl-gcc -xc -o helloworld $CFLAGS -
/var/tmp/portage/cross-armeb-pc-linux-gnueabi/gcc-8.2.0-r3/work/gcc-8.2.0/libgcc/config/arm/lib1funcs.S:1545: undefined reference to `raise'
collect2: error: ld returned 1 exit status
[This appears to be a well-known bug in some other mailing list post. Working around with the next command:]
This looks like you're trying to dynamic-link anyway...?
Post by Jason A. Donenfeld
$ printf '#include <stdio.h>\nint main(){puts("hello world");}' | bin/musl-gcc -xc -o helloworld $CFLAGS -static -
$ cp /usr/bin/qemu-armeb .
$ sudo chroot $(readlink -f .) /qemu-armeb /helloworld
hello world
Now let's try with shared linking, and you'll see it generates a
broken binary:
$ tar xf musl-1.1.20.tar.gz
$ cd musl-1.1.20/
$ export CFLAGS="-O2 -march=armv7-a -mtune=cortex-a15 -mabi=aapcs-linux"
Overriding the ABI seems like a really bad idea. What ABI is your
toolchain defaulting to?
Post by Jason A. Donenfeld
$ CC=armeb-pc-linux-gnueabi-gcc ./configure --prefix=$PWD/prefix --disable-static --enable-shared --build=armeb-pc-linux-gnueabi
[...]
$ make -j$(nproc)
[...]
$ make install
[...]
$ cd prefix/
$ printf '#include <stdio.h>\nint main(){puts("hello world");}' | bin/musl-gcc -xc -o helloworld $CFLAGS -
$ cd lib/
$ ln -s libc.so ld-musl-armeb.so.1
$ cd ..
$ cp /usr/bin/qemu-armeb .
$ sudo chroot $(readlink -f .) /qemu-armeb /helloworld
Illegal instruction
Have you run with -singlestep -d in_asm,nochain so you can see what
instruction it faults on?
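
The suggestion above could be tried roughly as follows. This is only a sketch: it assumes the chroot layout from the reproduction steps (a copy of qemu-armeb and the helloworld binary in the current directory), and QEMU_TRACE_FLAGS is a hypothetical variable name used here for readability.

```shell
# Flags named in the reply above: one guest instruction per translation
# block, log the guest code as it is translated, and do not chain blocks.
QEMU_TRACE_FLAGS="-singlestep -d in_asm,nochain"

# Only run if the files from the reproduction steps are present.
if [ -x ./qemu-armeb ] && [ -f ./helloworld ]; then
    # The -d log goes to stderr by default; keep the last entries before the fault.
    sudo chroot "$(readlink -f .)" /qemu-armeb $QEMU_TRACE_FLAGS /helloworld 2>&1 | tail -n 20
fi
```

The last "IN:" entries in the log should show the instruction that triggers the fault.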
Post by Jason A. Donenfeld
I've seen similar failures when booting a real kernel with armeb
executables generated this way as init.
I've also experienced this with both my own toolchain (above) and
with linaro's toolchain.
I expect the commands above should result in an easily reproducible bug.
Maybe, but I'm not aware of anyone else having seen it. It very well
could be specific to your toolchain. The gcc wrapper is not the
recommended way to use musl.
Post by Jason A. Donenfeld
Any idea what's up?
Might it possibly be qemu-armeb defaulting to some very primitive ISA
level (not armv7) whereas you built for armv7-a?

Rich
Jason A. Donenfeld
2018-09-30 23:11:15 UTC
Hey Rich,

Thanks for the insight.
Post by Rich Felker
Post by Jason A. Donenfeld
$ printf '#include <stdio.h>\nint main(){puts("hello world");}' | bin/musl-gcc -xc -o helloworld $CFLAGS -
/var/tmp/portage/cross-armeb-pc-linux-gnueabi/gcc-8.2.0-r3/work/gcc-8.2.0/libgcc/config/arm/lib1funcs.S:1545: undefined reference to `raise'
collect2: error: ld returned 1 exit status
[This appears to be a well-known bug in some other mailing list post. Working around with the next command:]
This looks like you're trying to dynamic-link anyway...?
Yes. It's this old "bug", fwiw: https://www.openwall.com/lists/musl/2018/05/09/1
Post by Rich Felker
Overriding the ABI seems like a really bad idea. What ABI is your
toolchain defaulting to?
Good thinking: armv5. Though notably I don't have the same issue with
little endian. And passing '-cpu cortex-a15' or the like to qemu-user
doesn't fix that. Maybe I'll ask around over on the qemu mailing list,
though.
Post by Rich Felker
Have you run with -singlestep -d in_asm,nochain so you can see what
instruction it faults on?
Interestingly, in armv5 mode it works fine and the disassembly looks
correct. But in armv7 mode, it seems to be decoding all of the
instructions with the wrong endianness, right up to the failing one:

----------------
IN:
0xff79561c: 00b0a0e3 adcseq sl, r0, r3, ror #1

----------------
IN:
0xff795620: 00e0a0e3 rsceq sl, r0, r3, ror #1

----------------
IN:
0xff795624: 10109fe5 andsne r9, r0, r5, ror #31

----------------
IN:
0xff795628: 01108fe0 tsteq r0, r0, ror #31

----------------
IN:
0xff79562c: 0d00a0e1 stceq 0, cr10, [r0, #-900]

So this looks like what's actually happening is gcc goes into -mbe8
mode with armv7-a, which is to be expected. But QEMU is always in BE32
mode. Passing -mbe32 to the cflags "fixes" the issue, though it's
still unclear how to run BE8 code in qemu. But anyway, it's clear this
is probably not a musl issue anymore at this point. So thanks for the
pointer.
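
One way to sanity-check the endianness theory is to byte-reverse the instruction words from the trace above and see whether they become sensible ARM instructions. A minimal sketch, where swap32 is a hypothetical helper written for this illustration and not part of any tool mentioned in this thread:

```shell
# swap32 (hypothetical helper): reverse the byte order of one 32-bit hex word.
swap32() {
    printf '%s\n' "$1" | sed 's/^\(..\)\(..\)\(..\)\(..\)$/\4\3\2\1/'
}

# Instruction words copied from the qemu trace above.
for w in 00b0a0e3 00e0a0e3 10109fe5 01108fe0 0d00a0e1; do
    printf '%s -> %s\n' "$w" "$(swap32 "$w")"
done
```

The reversed words (e3a0b000, e3a0e000, e59f1010, e08f1001, e1a0000d) should decode as mov fp, #0; mov lr, #0; ldr r1, [pc, #16]; add r1, pc, r1; mov r0, sp. That looks like a plausible program entry sequence, which fits an instruction-endianness mismatch between the binary and the emulator.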

The real issue I'm facing is not being able to start a userland in the
kernel in big endian mode, and this persists even using the above
tricks (-mbe8 and -march=armv5 and so forth). I'll keep plugging away,
but indeed this probably isn't musl related.

Thanks again,
Jason
Jason A. Donenfeld
2018-09-30 23:36:37 UTC
Got it all sorted. For future reference for readers of this thread,
the issue appeared to be a linker one, where the -mbe8 doesn't appear
to translate into --be8 for the linker, perhaps due to the musl specs
file or maybe for another reason. The fix is easy enough: add
"-Wl,--be8" to CFLAGS.

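For anyone wanting to try the fix, a rough sketch of what it might look like with the commands from earlier in the thread. It assumes the same prefix/ layout as the reproduction steps and that a readelf binary is on the PATH. Whether musl itself also needs rebuilding with these flags is not stated in the thread.

```shell
# Same flags as in the reproduction steps, plus the linker option from the fix.
export CFLAGS="-O2 -march=armv7-a -mtune=cortex-a15 -mabi=aapcs-linux -Wl,--be8"

# Rebuild the test program only if the musl-gcc wrapper from the steps above exists.
if [ -x bin/musl-gcc ]; then
    printf '#include <stdio.h>\nint main(){puts("hello world");}' | bin/musl-gcc -xc -o helloworld $CFLAGS -
    # readelf lists "BE8" among the ELF header flags when the linker applied --be8.
    readelf -h helloworld | grep Flags
fi
```

If the Flags line does not mention BE8, the linker option presumably did not take effect.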