LFS/BLFS Multilib/Multiarch Note
Introduction
Linux From Scratch is a great project that provides you with step-by-step instructions for building your own custom Linux system, entirely from source code. To keep things simple as an educational project, LFS does not contain instructions for building multilib.
However, for those who use LFS as their main system, multilib support may be essential. For example, some non-free software is only provided as 32-bit binaries, and some users may want to use the x32 ABI to improve system performance.
There are various modifications that add multilib support to the LFS procedure. Basically they just add multilib build instructions to the LFS book. But since LFS is an educational project, we should focus on why these instructions are needed. And for a complete multilib environment we also have to build multilib for the BLFS book. A multilib BLFS book would be a massive amount of work, and I don’t think anyone can maintain it.
So I will explain how a multilib system works and show the basic idea of how to add multilib to an LFS system.
Multilib/Multiarch Basics
How Does Multilib Work?
The basic idea of multilib is simple. The kernel parses the header of an executable to find out whether the code is 64-bit x86-64 (LP64), 32-bit x86 (ILP32), or x32 (ILP32 on the x86-64 instruction set). The kernel then arranges the address space and provides the syscall interface for that ABI, so code for all three ABIs can run. For example, suppose we have a statically linked x32 program compiled on a multilib system with
gcc -mx32 foo.c -o foo -static
To run the program foo in a 64-bit LFS system, just enable X86_X32 in the kernel config of the LFS system, and recompile the kernel. Then you can copy foo into the LFS system and run it. It will work. For running traditional x86 code in a 64-bit LFS system, just enable another kernel config option, IA32_EMULATION.
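The check the kernel makes on the ELF header can be imitated from userspace. The following is a minimal sketch, not the kernel’s actual logic: the helper name elf_abi is hypothetical, it does not verify the ELF magic, and it only looks at the EI_CLASS byte and the low byte of e_machine, which is enough to tell the three x86 ABIs apart in a little-endian file.

```shell
# Hypothetical helper: classify an ELF file as 32-bit, 64-bit or x32.
elf_abi() {
    # Byte 4 of e_ident is EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64.
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -dc '0-9')
    # e_machine is a 16-bit field at offset 18. For little-endian x86
    # files its low byte is 3 (EM_386) or 62 (EM_X86_64).
    machine=$(od -An -tu1 -j18 -N1 "$1" | tr -dc '0-9')
    case "$class:$machine" in
        1:3)  echo "32-bit" ;;
        2:62) echo "64-bit" ;;
        1:62) echo "x32" ;;
        *)    echo "unknown" ;;
    esac
}
```

For the program foo built with -mx32 as described, elf_abi foo should report x32.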
Unfortunately many programs are dynamically linked. When we try to run a dynamically linked 32-bit program bar on a 64-bit LFS system, the kernel will parse the header of bar. The path to the dynamic linker is hard coded in the ELF file bar:
$ readelf -l bar | grep INTERP -A1
INTERP 0x000154 0x08048154 0x08048154 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
But we don’t have a /lib/ld-linux.so.2 in the 64-bit LFS system. So the execve syscall that tries to execute bar just returns ENOENT.
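To see in advance whether execve will fail this way, the requested interpreter can be pulled out of the readelf output and tested for existence. This is a small sketch: the function name interp_from_readelf is made up for illustration, and it simply parses the text format printed by readelf -l.

```shell
# Hypothetical helper: read `readelf -l` output on stdin and print the
# requested program interpreter, if there is one.
interp_from_readelf() {
    sed -n 's/.*\[Requesting program interpreter: \(.*\)\]$/\1/p'
}

# Example use (commented out because it needs a real binary named bar):
# interp=$(readelf -l bar | interp_from_readelf)
# [ -e "$interp" ] || echo "missing dynamic linker: $interp"
```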
Now forget LFS for a minute. On a typical binary distribution, we solve the problem by installing ld-linux.so.2 to /lib with a package manager, or manually. Then we’ll hit another problem. The shared object name libc.so.6 is hard coded in the ELF file bar:
$ readelf -d bar | grep NEEDED
0x00000001 (NEEDED) Shared library: [libc.so.6]
So ld-linux.so.2 immediately knows bar needs a shared library named libc.so.6. But the path to it is not hard coded, so ld-linux.so.2 has to find it. If bar has no rpath and LD_LIBRARY_PATH is not set, ld-linux.so.2 just searches the library paths hard coded in itself and those specified by /etc/ld.so.conf (through the cache built by ldconfig). If we just copy ld-linux.so.2 into a brand new 64-bit LFS system and try to execute bar, ld-linux.so.2 won’t find a 32-bit libc.so.6. This time execve itself succeeds, because the kernel found the interpreter, but ld-linux.so.2 prints an error about the missing shared object and exits with a non-zero status. We have to copy a 32-bit libc.so.6 into a place ld-linux.so.2 searches, and do the same for all shared libraries bar needs. Finally we can execute bar successfully :).
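The search that ld-linux.so.2 performs can be approximated by hand to see which libraries are still missing. The sketch below uses two hypothetical helpers, needed_from_readelf and locate_lib. It only checks that a file with the right name exists in one of the given directories; it does not verify that the file has a compatible ABI.

```shell
# Hypothetical helper: read `readelf -d` output on stdin and print the
# name of every NEEDED shared library, one per line.
needed_from_readelf() {
    sed -n 's/.*(NEEDED).*\[\(.*\)\]$/\1/p'
}

# Hypothetical helper: print the first path among the given directories
# that contains the library; return non-zero if none does.
locate_lib() {
    lib=$1
    shift
    for dir in "$@"; do
        if [ -e "$dir/$lib" ]; then
            echo "$dir/$lib"
            return 0
        fi
    done
    return 1
}

# Example use (commented out because it needs a real binary named bar):
# readelf -d bar | needed_from_readelf | while read -r lib; do
#     locate_lib "$lib" /lib /usr/lib || echo "not found: $lib"
# done
```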
So, a multilib setup sufficient to run a foreign 32-bit binary should contain a working 32-bit dynamic linker at /lib/ld-linux.so.2 and the 32-bit shared libraries that binary needs. A multilib setup sufficient to run a foreign x32 binary should contain a working x32 dynamic linker at /libx32/ld-linux-x32.so.2 and the necessary x32 shared libraries. Now our task is to build them from source to get a Multilib Linux From Scratch.
Where to put the multilib?
The dynamic linker path is specified by the ABI and hard coded in ELF files. The three ABIs supported by a 64-bit Linux kernel on x86-64 processors and their dynamic linker paths are:
ABI | Dynamic linker path |
---|---|
32-bit | /lib/ld-linux.so.2 |
64-bit | /lib64/ld-linux-x86-64.so.2 |
x32 | /libx32/ld-linux-x32.so.2 |
We can choose to install the dynamic linkers to another location, but then we must create a symlink to each of them at the specified location. For example, the original LFS book installs ld-linux-x86-64.so.2 to /lib, then symlinks it into /lib64.
However, the location of the other libraries is arbitrary. You just can’t put 32-bit and 64-bit libraries in the same directory. After all, the main Glibc library is named libc.so.6 for every ABI, so 32-bit and 64-bit Glibc would certainly collide in the same directory. Just choose three different directories to contain the libraries for the three ABIs. For example, you can make the library locations match the dynamic linker locations:
ABI | Library path in / | Library path in /usr |
---|---|---|
32-bit | /lib | /usr/lib |
64-bit | /lib64 | /usr/lib64 |
x32 | /libx32 | /usr/libx32 |
For another possibility, you may have a mostly 64-bit system with a few 32-bit applications and want the “main” library path to be 64-bit:
ABI | Library path in / | Library path in /usr |
---|---|---|
32-bit | /lib32 | /usr/lib32 |
64-bit | /lib | /usr/lib |
x32 | /libx32 | /usr/libx32 |
But then there will be a 32-bit ld-linux.so.2 in /lib, along with many 64-bit libraries. Seems strange :(.
Other users (for example me) prefer Debian-like multiarch directories:
ABI | Library path in / | Library path in /usr |
---|---|---|
32-bit | /lib/i386-linux-gnu | /usr/lib/i386-linux-gnu |
64-bit | /lib/x86_64-linux-gnu | /usr/lib/x86_64-linux-gnu |
x32 | /lib/x86_64-linux-gnux32 | /usr/lib/x86_64-linux-gnux32 |
And even the following “strange” totally non-FHS layout is possible:
ABI | Library path |
---|---|
32-bit | /system32/lib |
64-bit | /system64/lib |
x32 | /systemx32/lib |
Then how do we let the dynamic linkers find them? The intuitive way is to create an /etc/ld.so.conf (or a file included from it) that lists all the directories:
# Begin /etc/ld.so.conf.d/01-multilib.conf
/usr/lib64
/usr/lib
/usr/libx32
# End
Then all three dynamic linkers will search all three directories. But that’s not a problem since they’ll ignore ABI incompatible libraries.
A better way is hard coding the correct location into the dynamic linkers. LFS builds them from source (in Glibc package) so it is possible. Then each dynamic linker will only search the directory containing the compatible libraries specified at build time. We’ll show how to do that later.
Changes in LFS
Do We Need Temporary Multilib?
At first you may wonder whether you can skip multilib in Chapter 5 and only build multilib in Chapter 6. It’s possible in theory but rather tricky. To build the 32-bit multilib of Glibc in Chapter 6, we need a 32-bit libgcc, which is part of GCC Pass 2 in Chapter 5. So we have to use --enable-multilib for GCC Pass 2. But then the multilib of libstdc++ and other GCC runtime libraries would also be enabled. They rely on the multilib of Glibc in Chapter 5, so we have to build 32-bit and x32 Glibc in Chapter 5 to provide it. Also, some libraries from Chapter 5 are temporarily linked against in Chapter 6. So we should build all multilib in Chapter 5 unless we know the multilib from one package is absolutely unnecessary (for example libmagic in the File package).
DJ’s multilib LFS book builds multilib in an additional Chapter 10, so Chapter 5 need not be modified. But in Chapter 10 a temporary toolchain for multilib still has to be built.
If you are tough enough, you can try to hack the GCC build procedure to build 32-bit and x32 libgcc manually after GCC Pass 2. I have not done this and I don’t suggest doing it. I just chose to build all multilib of temporary packages in Chapter 5.
Toolchain packages
Binutils
The symlink from lib64 to lib should be skipped.
If the library directories are the same as the dynamic linker directories (32-bit in lib, 64-bit in lib64, and x32 in libx32), no other changes are needed. Otherwise, we should edit ld/genscripts.sh in the binutils source code to ensure correct library directories in the linker scripts. Manually fixing the generated linker scripts is also possible.
GCC
The option --disable-multilib should be changed to --enable-multilib --with-multilib-list=m64,m32,mx32. Then GCC will build the runtime libraries (libgcc, libstdc++, etc.) for all three ABIs automatically. This change should also be applied to Libstdc++ in Chapter 5.
If the library directories are the same as the dynamic linker directories, no other changes are needed. Otherwise, we should edit gcc/config/i386/t-linux64 in the source code to let GCC know them.
Even if the multilib directory layout is not multiarch, we should use --enable-multiarch. Then the Python 3 package can use gcc -print-multiarch to distinguish shared objects and other platform specific files by giving them different suffixes in their names.
Glibc
Glibc needs to be built three times, once for each ABI. The complete configure line in Chapter 5 should be changed to something like:
CC="$LFS_TGT-gcc $BUILD_MULTI" \
CXX="$LFS_TGT-g++ $BUILD_MULTI" \
../configure \
--prefix=/tools \
--host=$LFS_TGT_MULTI \
--libdir=/tools/$LIBDIR_MULTI \
--build=$(../scripts/config.guess) \
--enable-kernel=3.2 \
--with-headers=/tools/include \
libc_cv_forced_unwind=yes \
libc_cv_c_cleanup=yes
The variables with the _MULTI suffix vary with the ABI:
ABI | BUILD_MULTI | LFS_TGT_MULTI | LIBDIR_MULTI |
---|---|---|---|
32-bit | -m32 | i686-lfs-linux-gnu | 32-bit library directory |
64-bit | -m64 | x86_64-lfs-linux-gnu | 64-bit library directory |
x32 | -mx32 | x86_64-lfs-linux-gnu | x32 library directory |
Explanation of variables and options:
--host=$LFS_TGT_MULTI: The LFS book has already explained that this is necessary for cross compiling. For 32-bit multilib we have to use the value i686-lfs-linux-gnu to tell the build system to use i686 (32-bit) assembly code instead of 64-bit assembly, which is incompatible with the 32-bit ABI.
CC="$LFS_TGT-gcc $BUILD_MULTI" and CXX="$LFS_TGT-g++ $BUILD_MULTI": By default --host=$LFS_TGT_MULTI makes the build system use $LFS_TGT_MULTI-gcc as the C compiler. That’s enough for the original LFS book without multilib, but we don’t have an i686-lfs-linux-gnu-gcc now, and x86_64-lfs-linux-gnu-gcc would normally generate code for the 64-bit ABI. So we have to override the C/C++ compiler for 32-bit and x32. The -m64 for 64-bit could be omitted.
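The table of _MULTI variables can be turned into a small shell helper so that each of the three Glibc builds picks up consistent settings. This is only a sketch: the function name set_multi_abi is hypothetical, and the LIBDIR_MULTI values assume the lib / lib64 / libx32 layout. Substitute your own directories if you chose another layout.

```shell
# Hypothetical helper: set the *_MULTI variables for one ABI.
# The library directory names assume the lib / lib64 / libx32 layout.
set_multi_abi() {
    case "$1" in
        32)
            BUILD_MULTI="-m32"
            LFS_TGT_MULTI="i686-lfs-linux-gnu"
            LIBDIR_MULTI="lib"
            ;;
        64)
            BUILD_MULTI="-m64"
            LFS_TGT_MULTI="x86_64-lfs-linux-gnu"
            LIBDIR_MULTI="lib64"
            ;;
        x32)
            BUILD_MULTI="-mx32"
            LFS_TGT_MULTI="x86_64-lfs-linux-gnu"
            LIBDIR_MULTI="libx32"
            ;;
        *)
            echo "unknown ABI: $1" >&2
            return 1
            ;;
    esac
}
```

Calling set_multi_abi x32 before running the Chapter 5 configure command sets the three variables for the x32 build.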
And, for the final Glibc in Chapter 6:
CC="gcc $BUILD_MULTI -isystem $GCC_INCDIR -isystem /usr/include" \
CXX="g++ $BUILD_MULTI" \
../configure --prefix=/usr \
--host=$HOST_MULTI \
--libdir=/usr/$LIBDIR_MULTI \
--disable-werror \
--enable-kernel=3.2 \
--enable-stack-protector=strong \
libc_cv_slibdir=/$LIBDIR_MULTI \
libc_cv_complocaledir=/usr/lib/locale
Explanation of new options and variables:
--host=$HOST_MULTI: Should be i686-pc-linux-gnu for 32-bit, and x86_64-pc-linux-gnu for x32 and 64-bit. It tells Glibc to use i686 assembly for the 32-bit version. It’s necessary for 32-bit and can be omitted for 64-bit and x32 (since config.guess returns x86_64-pc-linux-gnu or x86_64-unknown-linux-gnu when the kernel is 64-bit).
libc_cv_slibdir=/$LIBDIR_MULTI: Tells Glibc to install shared libraries to the correct (customized) location.
libc_cv_complocaledir=/usr/lib/locale: Tells Glibc to use the standard /usr/lib/locale for locale archives, instead of /usr/$LIBDIR_MULTI/locale. This saves disk space and makes the locale archive consistent for 32-bit, 64-bit and x32 applications.
We should install the 64-bit Glibc after the 32-bit and x32 versions. Then the 64-bit executables will overwrite the 32-bit and x32 ones. After that we need to edit the ldd script. For security reasons it only handles executables whose dynamic linker path is listed in the RTLDLIST variable. Since the 64-bit version was installed last, RTLDLIST only contains the path to the 64-bit dynamic linker. Open /usr/bin/ldd with an editor and set RTLDLIST to the three dynamic linker paths separated by spaces.
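The edit can also be scripted. A minimal sketch, assuming the installed ldd script contains a single line starting with RTLDLIST= (the helper name set_rtldlist is made up), and using the three standard dynamic linker paths listed earlier:

```shell
# Hypothetical helper: rewrite the RTLDLIST line of an ldd script so it
# lists the 32-bit, 64-bit and x32 dynamic linkers.
set_rtldlist() {
    file=$1
    list="/lib/ld-linux.so.2 /lib64/ld-linux-x86-64.so.2 /libx32/ld-linux-x32.so.2"
    tmp=$(mktemp) || return 1
    # Edit into a temporary file, then copy the content back so the
    # original file keeps its owner and permissions.
    sed "s|^RTLDLIST=.*|RTLDLIST=\"$list\"|" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example use (commented out because it modifies a system file):
# set_rtldlist /usr/bin/ldd
```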
If we are not using the standard library directories, the dynamic linkers will be in the wrong place (/$LIBDIR_MULTI/ld-*.so). We should symlink the dynamic linkers to the correct locations.
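For example, with the Debian-like multiarch layout the symlinks could be created as follows. This is a sketch: link_ldso is a hypothetical helper, and its first argument is a root directory prefix (empty for the running system) that is only there so the commands can be tried in a scratch directory first.

```shell
# Hypothetical helper: create the ABI-mandated dynamic linker path as a
# symlink to the place where Glibc actually installed the linker.
link_ldso() {
    root=$1   # root prefix: "" for the live system, or a scratch directory
    real=$2   # where the dynamic linker was installed
    std=$3    # path hard coded in ELF files for this ABI
    mkdir -p "$root$(dirname "$std")" &&
        ln -sfn "$real" "$root$std"
}

# Example use for the Debian-like layout (commented out, needs root):
# link_ldso "" /lib/i386-linux-gnu/ld-linux.so.2 /lib/ld-linux.so.2
# link_ldso "" /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 /lib64/ld-linux-x86-64.so.2
# link_ldso "" /lib/x86_64-linux-gnux32/ld-linux-x32.so.2 /libx32/ld-linux-x32.so.2
```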
Other packages
Most packages can be configured for multilib with:
CC="gcc $BUILD_MULTI" CXX="g++ $BUILD_MULTI" \
${original_configure_line_in_book} \
--libdir=/usr/$LIBDIR_MULTI \
--host=$HOST_MULTI
One may think CC="gcc -m32" is enough for 32-bit, but several packages have platform specific code, so it’s better to use --host=i686-pc-linux-gnu. Again, --host can be omitted for 64-bit and x32.
Here ${original_configure_line_in_book} is the original configure line from the LFS book. Actually we can simplify the line by creating some pseudo-cross compiler wrappers like:
#!/bin/sh
exec gcc -m32 "$@"
If you install this shell script as /usr/bin/i686-pc-linux-gnu-gcc, and do a similar thing for i686-pc-linux-gnu-g++, you can skip setting CC and CXX, since configure will automatically find the wrapper scripts for the i686-pc-linux-gnu host. For the x32 ABI the canonical host triplet is also x86_64-pc-linux-gnu, but we can use a customized triplet like x86_64-x32-linux-gnu or x86_64-pc-linux-gnux32.
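Writing the wrappers by hand for every tool gets repetitive, so they can be generated. The following sketch uses a hypothetical make_wrapper function; the destination directory argument is an addition for illustration, so the result can be inspected before it is installed to /usr/bin.

```shell
# Hypothetical helper: write a pseudo-cross wrapper named
# <triplet>-<tool> that runs <tool> with the given ABI flag.
make_wrapper() {
    dest=$1
    triplet=$2
    tool=$3
    flag=$4
    wrapper="$dest/$triplet-$tool"
    printf '#!/bin/sh\nexec %s %s "$@"\n' "$tool" "$flag" > "$wrapper" &&
        chmod 755 "$wrapper"
}

# Example use (commented out because it writes to /usr/bin):
# make_wrapper /usr/bin i686-pc-linux-gnu gcc -m32
# make_wrapper /usr/bin i686-pc-linux-gnu g++ -m32
# make_wrapper /usr/bin x86_64-x32-linux-gnu gcc -mx32
# make_wrapper /usr/bin x86_64-x32-linux-gnu g++ -mx32
```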
Multilib of most LFS packages can be built and installed this way. But some packages need special handling.
GMP
The GMP configure script recognizes “customized” host triplets for optimization depending on the hardware. For example, when I run config.guess from the GMP package on my laptop I get the triplet ivybridge-pc-linux-gnu. So if we use --host=x86_64-pc-linux-gnu or --host=i686-pc-linux-gnu, the script will configure for a generic x86-64 or x86 CPU. This will make GMP slower. So, instead of using --host, we should set the environment variable ABI for GMP. The GMP configure script recognizes ABI=32, ABI=64, and ABI=x32. For example, ABI=32 tells GMP to use 32-bit x86 assembly code and adds -m32 to the compiler flags.
Also, gmp.h is platform specific. We must rename it to gmp-64.h, gmp-32.h and gmp-x32.h for the three ABIs and create a gmp.h that wraps them:
#if defined(__x86_64__) && defined(__LP64__)
#include "gmp-64.h"
#elif defined(__x86_64__)
#include "gmp-x32.h"
#else
#include "gmp-32.h"
#endif
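The rename has to happen after each ABI’s install step, before the next ABI overwrites gmp.h. A possible sketch: stash_abi_header is a hypothetical helper that takes the include directory, the header name and the ABI suffix used in the wrapper.

```shell
# Hypothetical helper: rename an ABI-specific header right after it was
# installed, e.g. gmp.h -> gmp-32.h.
stash_abi_header() {
    incdir=$1
    header=$2
    abi=$3
    base=${header%.h}
    mv "$incdir/$header" "$incdir/$base-$abi.h"
}

# Example use after `make install` of the 32-bit build (commented out
# because it modifies /usr/include):
# stash_abi_header /usr/include gmp.h 32
```

Once all three builds are installed and renamed, the wrapper gmp.h is written last.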
Bzip2
Bzip2 has no configure script. Most annoyingly, the library path (relative to the install prefix) is hard coded as lib in the Makefile. We have to use sed to edit it, or install the library manually.
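A sed edit along these lines might look like the sketch below. It assumes the Makefile refers to the library directory as $(PREFIX)/lib, which should be checked against the Bzip2 version being built; retarget_libdir is a hypothetical filter name.

```shell
# Hypothetical filter: rewrite $(PREFIX)/lib to $(PREFIX)/<libdir> in a
# Makefile read from stdin. Assumes the Makefile spells the library
# directory exactly as $(PREFIX)/lib.
retarget_libdir() {
    sed 's@\$(PREFIX)/lib@$(PREFIX)/'"$1"'@g'
}

# Example use (commented out because it needs the Bzip2 source tree):
# retarget_libdir lib32 < Makefile > Makefile.multilib
# make -f Makefile.multilib PREFIX=/usr install
```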
Pkg-config
By default pkg-config only knows one .pc file directory among the library paths, which is /usr/lib/pkgconfig. So the .pc files in /usr/lib64/pkgconfig will be ignored. Unfortunately most modern systems are mainly 64-bit. Many of their libraries exist only in a 64-bit version, so their pkgconfig files are only in /usr/lib64/pkgconfig. We can override the default using the configure option --with-pc-path=.... We can add multiple directories and separate them with :.
Also, if pkg-config finds a pkgconfig file for a 64-bit library, it will output -L/usr/lib64 -lfoo for libfoo. The -L/usr/lib64 is unnecessary and may cause problems. We can use --with-system-library-path=... to tell pkg-config which -L ldflags should be skipped.
I recommend building i686-pc-linux-gnu-pkg-config and x86_64-x32-linux-gnu-pkg-config for 32-bit and x32. Some BLFS packages have additional ABI-specific information in their pkgconfig files (for example glib and gobject-introspection). i686-pc-linux-gnu-pkg-config should be configured to search 32-bit and shared (/usr/share/pkgconfig) pkgconfig files only, and x86_64-x32-linux-gnu-pkg-config should be configured to search x32 and shared pkgconfig files only. --host=i686-pc-linux-gnu tells the configure scripts of other packages to use i686-pc-linux-gnu-pkg-config instead of pkg-config, if it exists.
How do we add the prefix i686-pc-linux-gnu-? Use the configure option --program-prefix when building pkg-config.
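Putting the three configure options together for one prefixed pkg-config might look like this. It is only a sketch: pkgconfig_args is a hypothetical helper, and the directory names assume the lib / lib64 / libx32 layout under /usr.

```shell
# Hypothetical helper: print the pkg-config configure options for one
# ABI, given the host triplet used as prefix and the library directory
# name under /usr.
pkgconfig_args() {
    triplet=$1
    libdir=$2
    printf '%s\n' \
        "--program-prefix=$triplet-" \
        "--with-pc-path=/usr/$libdir/pkgconfig:/usr/share/pkgconfig" \
        "--with-system-library-path=/usr/$libdir"
}

# Example use in the pkg-config source tree (commented out):
# ./configure --prefix=/usr $(pkgconfig_args i686-pc-linux-gnu lib)
# ./configure --prefix=/usr $(pkgconfig_args x86_64-x32-linux-gnu libx32)
```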
Ncurses
Ncurses installs ncursesw6-config. Since we install the 64-bit version last, /usr/bin/ncursesw6-config will be the one from the 64-bit build. Its output contains -L/usr/lib64, which is annoying for 32-bit and x32. We have to use sed to edit it and remove the -L output.
Some other packages also have *-config scripts. They need to be modified too.
Libffi
Do not modify the header install path, since the headers are ABI specific. If you don’t have i686-pc-linux-gnu-pkg-config, you may need PKG_CONFIG_PATH=/usr/lib/pkgconfig for packages that need libffi (for example, Python 3).
OpenSSL
OpenSSL has a custom configure system. Fortunately it’s easy to use. Its Configure script (not config) accepts linux-x86_64 for 64-bit, linux-x86 for 32-bit, and linux-x32 for x32. We still have to remember to change --libdir.
Python 3
Python tends to have problems when cross compiling, so do not use --host for it.
Python 3 only uses /usr/lib/python3.x as its package directory. If we use --libdir=/usr/lib64 for Python, we get a broken installation. To distinguish shared objects for different ABIs within one package, Python 3 hard codes the multiarch name from gcc -print-multiarch at build time and adds it as a suffix to the shared objects. For example, there are _ssl.cpython-36m-x86_64-linux-gnu.so for 64-bit and _ssl.cpython-36m-i386-linux-gnu.so for 32-bit.
The header pyconfig.h is ABI specific and needs to be renamed and wrapped.
And, since we can’t use --libdir, we have to manually move libpython3.6m.so to the correct library path.
Meson
Meson itself is pure Python and has no binary libraries. But to use the Meson build system to build multilib, we need to give Meson some information about the ABI. After installing Meson, create /usr/share/meson/native/x86 for 32-bit:
[binaries]
c = '/usr/bin/i686-pc-linux-gnu-gcc'
cpp = '/usr/bin/i686-pc-linux-gnu-g++'
pkgconfig = '/usr/bin/i686-pc-linux-gnu-pkg-config'
ar = '/usr/bin/ar'
strip = '/usr/bin/strip'
exe_wrapper = ''
[properties]
sizeof_void* = 4
sizeof_long = 4
[host_machine]
system = 'linux'
cpu_family = 'x86'
cpu = 'i686'
endian = 'little'
Then we can use meson --native-file x86 for 32-bit. exe_wrapper = '' tells Meson we can run the generated executables natively. Without it some BLFS packages refuse to build.
An alternative is to use /usr/share/meson/cross and --cross-file, just like a pseudo-cross build with an autoconf configure script. But it turned out that some packages refuse to build certain parts (for example gir files) when they are cross compiled. And meson now uses the host pkg-config to locate g-ir-scanner and g-ir-compiler. So we have to stop pretending to cross build for 32-bit.
Changes in BLFS
Most BLFS packages can be built for multilib like normal LFS packages. Still, some packages need special handling.
Python 2
I have a multiarch patch for Python 2. With it we can build Python 2 just like building Python 3 in LFS.
Cmake
I suggest editing the default library paths in /usr/share/cmake-${version}/Modules/GNUInstallDirs.cmake so we don’t need to set them manually each time. But it only supports 32-bit and 64-bit (no x32 support) now. So we still need -DCMAKE_INSTALL_LIBDIR=/usr/libx32 for x32. I don’t know how to hack cmake to install x32 libraries to /usr/libx32 by default.
Some packages using cmake do not use GNUInstallDirs.cmake but a config variable LIB_SUFFIX, which can be used to specify the library path, like -DLIB_SUFFIX=64 (resulting in /usr/lib64). But there are still packages hard coding lib in CMakeLists.txt. They are quite annoying and need some sed.
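Since it is not always obvious which of the two mechanisms a package uses, one option is to pass both settings on every cmake invocation. A sketch, with a hypothetical helper cmake_libdir_args that derives LIB_SUFFIX from the library directory name (this assumes the directory name starts with lib, as in lib64 or libx32):

```shell
# Hypothetical helper: print both cmake library directory settings for
# a library directory name such as lib, lib64 or libx32.
cmake_libdir_args() {
    libdir=$1
    suffix=${libdir#lib}   # lib64 -> 64, libx32 -> x32, lib -> empty
    printf '%s\n' \
        "-DCMAKE_INSTALL_LIBDIR=/usr/$libdir" \
        "-DLIB_SUFFIX=$suffix"
}

# Example use in a build directory (commented out):
# cmake -DCMAKE_INSTALL_PREFIX=/usr $(cmake_libdir_args libx32) ..
```

Packages that use neither mechanism still need the sed treatment.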
Gstreamer
We have to set the libexecdir of Gstreamer to be the same as libdir, because it has some ABI specific helper programs.
Gobject-introspection
It’s very tricky. My approach is to install i686-pc-linux-gnu-g-ir-scanner etc. alongside the normal g-ir-scanner. The Python code of gobject-introspection is installed in the library path, so we can keep all three versions. Then hack the code of the 32-bit and x32 versions so they find the correct compiler and pkg-config. And edit gobject-introspection-1.0.pc so other packages can find the correct gobject-introspection with the (prefixed) pkg-config.
However, the Meson build system always searches for g-ir-scanner etc. in $PATH instead of calling pkg-config. I hacked the Meson code to force it to search for the gobject-introspection tools using pkg-config. It seems to work well.
Rustc and librsvg
Rustc is a compiler so it doesn’t need multilib itself. But we have to build multilib for its runtime libraries (just like libstdc++ from GCC). Simply modifying the config.toml from the BLFS book will do the job:
# see config.toml.example for more possible options
[llvm]
targets = "X86"
# When using system llvm prefer shared libraries
link-shared = true
[build]
# install cargo as well as rust
extended = true
target = ["x86_64-unknown-linux-gnu", "i686-unknown-linux-gnu"]
[install]
prefix = "/usr"
docdir = "share/doc/rustc-1.25.0"
[rust]
channel = "stable"
rpath = false
# get reasonably clean output from the test harness
quiet-tests = true
# BLFS does not install the FileCheck executable from llvm,
# so disable codegen tests
codegen-tests = false
[target.x86_64-unknown-linux-gnu]
# delete this *section* if you are not using system llvm.
# NB the output of llvm-config (i.e. help options) may be
# dumped to the screen when config.toml is parsed.
llvm-config = "/usr/bin/llvm-config"
[target.i686-unknown-linux-gnu]
llvm-config = "/usr/bin/llvm-config"
linker = "i386-linux-gnu-gcc"
But rustc will install many 64-bit libraries in /usr/lib. I don’t like this behavior, so I remove all of them and add /usr/lib/rustlib/${arch}/lib to /etc/ld.so.conf. Rustc also seems to support x32 now, but I’ve not tested it.
After a rustc with multilib has been installed, cross compiling multilib for librsvg is simple.
Other Packages
Google Go Compiler
It needs no multilib: the compiler will compile the 32-bit runtime as needed. Set GOARCH=386 and it will produce 32-bit code. But it doesn’t support x32 now.