toolchaintechnotes: Large overhaul

- Document the autoconf behavior about "the cross-compilation mode," to
  explain the necessity of --build=$(path/to/config.guess) added for
  #5304.
- Mention the libtool fallout regarding cross-compilation.
- Remove the explanation for CC_FOR_TARGET, which was already removed
  much earlier.
- Note the cross-toolchain cannot be used anymore after installing gcc
  pass 2.
- "Stage 3" (i.e. the final LFS system) is NOT optional.
Xi Ruoyao 2025-03-25 21:32:54 +08:00
parent 7cd3a3fec1
commit 5e3bef69d1
GPG Key ID: ACAAD20E19E710E3


@ -3,6 +3,9 @@
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [ "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % general-entities SYSTEM "../general.ent"> <!ENTITY % general-entities SYSTEM "../general.ent">
%general-entities; %general-entities;
<!ENTITY host-triplet
"<replaceable>&lt;the host triplet&gt;</replaceable>">
]> ]>
<sect1 id="ch-tools-toolchaintechnotes" xreflabel="Toolchain Technical Notes"> <sect1 id="ch-tools-toolchaintechnotes" xreflabel="Toolchain Technical Notes">
@ -44,6 +47,14 @@
book for a cross-toolchain for some purpose other book for a cross-toolchain for some purpose other
than building LFS, unless you really understand what you are doing. than building LFS, unless you really understand what you are doing.
</para> </para>
<para>
It's known that installing GCC pass 2 will break the cross-toolchain.
We don't consider it a bug because GCC pass 2 is the last package
to be cross-compiled in the book, and we won't <quote>fix</quote>
it until we really need to cross-compile some package after GCC
pass 2 in the future.
</para>
</note> </note>
<para>Cross-compilation involves some concepts that deserve a section of <para>Cross-compilation involves some concepts that deserve a section of
@ -197,14 +208,104 @@
page</ulink>.</para> page</ulink>.</para>
</note> </note>
<para>In order to fake a cross-compilation in LFS, the name of the host triplet <para>
is slightly adjusted by changing the &quot;vendor&quot; field in the There are two key points for a cross-compilation:
<envar>LFS_TGT</envar> variable so it says &quot;lfs&quot;. We also use the </para>
<parameter>--with-sysroot</parameter> option when building the cross-linker and
cross-compiler, to tell them where to find the needed host files. This <itemizedlist>
ensures that none of the other programs built in <xref <listitem>
linkend="chapter-temporary-tools"/> can link to libraries on the build <para>
machine. Only two stages are mandatory, plus one more for tests.</para> When producing and processing the machine code supposed to be
executed on <quote>the host,</quote> the cross-toolchain must be
used. Note that the native toolchain from <quote>the build</quote>
may still be invoked to generate machine code supposed to be
executed on <quote>the build.</quote> For example, the build system
may compile a generator with the native toolchain, then generate
a C source file with the generator, and finally compile the C
source file with the cross-toolchain so the generated code will
be able to run on <quote>the host.</quote>
</para>
<para>
With an autoconf-based build system, this requirement is ensured
by using the <parameter>--host</parameter> switch to specify
<quote>the host</quote> triplet. With this switch the build
system will use the toolchain components prefixed
with <literal>&host-triplet;</literal>
for generating and processing the machine code for
<quote>the host</quote>; e.g. the compiler will be
<command>&host-triplet;-gcc</command> and the
<command>readelf</command> tool will be
<command>&host-triplet;-readelf</command>.
</para>
</listitem>
<listitem>
<para>
The build system should not attempt to run any generated machine
code supposed to be executed on <quote>the host.</quote> For
example, when building a utility natively, its man page can be
generated by running the utility with the
<parameter>--help</parameter> switch and processing the output,
but when cross-compiling this is generally not possible, as the
utility may fail to run on <quote>the build</quote>: it's obviously
impossible to run ARM64 machine code on an x86 CPU (without an emulator).
</para>
<para>
With an autoconf-based build system, this requirement is
satisfied in <quote>the cross-compilation mode</quote> where
the optional features that require running machine code for
<quote>the host</quote> are disabled. When <quote>the
host</quote> triplet is explicitly specified, <quote>the
cross-compilation mode</quote> is enabled if and only if either
the <command>configure</command> script fails to run a dummy
program compiled into <quote>the host</quote> machine code, or
<quote>the build</quote> triplet is explicitly specified via the
<parameter>--build</parameter> switch and it's different from
<quote>the host</quote> triplet.
</para>
</listitem>
</itemizedlist>
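The rule in the second item can be summarized as a small decision function. This is only a sketch of the rule as worded above, assuming <parameter>--host</parameter> is given explicitly; the name cross_mode and its three arguments are hypothetical and are not part of autoconf.

```shell
# cross_mode BUILD HOST DUMMY_RUNS   (hypothetical helper, not an autoconf API)
#   BUILD       explicit --build triplet, or empty if --build was not given
#   HOST        explicit --host triplet
#   DUMMY_RUNS  "yes" if the dummy host program ran on the build, else "no"
# Prints "yes" if the cross-compilation mode would be enabled, else "no".
cross_mode() {
    build=$1 host=$2 dummy_runs=$3
    if [ -n "$build" ] && [ "$build" != "$host" ]; then
        echo yes    # --build is explicit and differs from --host
    elif [ "$dummy_runs" != yes ]; then
        echo yes    # the dummy program could not be run
    else
        echo no
    fi
}

# Only --host given, and the dummy happens to run on the build:
cross_mode "" x86_64-lfs-linux-gnu yes
# --build given explicitly and different from --host:
cross_mode x86_64-pc-linux-gnu x86_64-lfs-linux-gnu yes
```

The first call illustrates the case this section warns about: without an explicit <parameter>--build</parameter>, the outcome depends entirely on whether the dummy program happens to run.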
<para>In order to cross-compile a package for the LFS temporary system,
the name of the system triplet is slightly adjusted by changing the
&quot;vendor&quot; field in the <envar>LFS_TGT</envar> variable so it
says &quot;lfs&quot; and <envar>LFS_TGT</envar> is then specified as
<quote>the host</quote> triplet via <parameter>--host</parameter>, so
the cross-toolchain will be used for generating and processing the
machine code running as a part of the LFS temporary system. We
also need to enable <quote>the cross-compilation mode</quote>: although
<quote>the host</quote> machine code, i.e. the machine code for the LFS
temporary system, is able to execute on the current CPU, it may refer
to a library not available on <quote>the build</quote> (the host
distro), or to some code or data that does not exist or is defined
differently in the library even if it happens to be available. When
cross-compiling a package for the LFS temporary system, we cannot rely
on the <command>configure</command> script to detect this issue with the
dummy program: the dummy only uses a few components in
<systemitem class='library'>libc</systemitem> that the host distro
<systemitem class='library'>libc</systemitem> likely provides (unless,
perhaps, the host distro uses a different
<systemitem class='library'>libc</systemitem> implementation like Musl),
so it won't fail the way real, useful programs likely would.
Thus we must explicitly specify <quote>the build</quote> triplet to
enable <quote>the cross-compilation mode.</quote> The value we use is
just the default, i.e. the original system triplet from
<command>config.guess</command> output, but <quote>the cross-compilation
mode</quote> depends on an explicit specification as we've
discussed.</para>
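The vendor-field adjustment and the resulting pair of switches can be sketched as follows. This is an illustration only: lfs_triplet is a hypothetical helper, the triplet x86_64-pc-linux-gnu is a stand-in for whatever config.guess prints on a given machine, and the book may set <envar>LFS_TGT</envar> by other means.

```shell
# lfs_triplet TRIPLET   (hypothetical helper)
# Print TRIPLET with its vendor field replaced by "lfs".
# Assumes the cpu-vendor-os form, e.g. x86_64-pc-linux-gnu.
lfs_triplet() {
    cpu=${1%%-*}     # text before the first "-"
    rest=${1#*-}     # drop the "cpu-" prefix
    os=${rest#*-}    # drop the "vendor-" prefix
    printf '%s-lfs-%s\n' "$cpu" "$os"
}

build=x86_64-pc-linux-gnu          # stand-in for $(path/to/config.guess)
LFS_TGT=$(lfs_triplet "$build")

# Both switches are given explicitly; since the two triplets differ,
# the cross-compilation mode is enabled. The command is only printed here.
echo "./configure --host=$LFS_TGT --build=$build"
```

Passing the default value for <parameter>--build</parameter> looks redundant, but as the paragraph above explains, it is the explicit specification that matters.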
<para>We use the <parameter>--with-sysroot</parameter> option when
building the cross-linker and cross-compiler, to tell them where to find
the needed files for <quote>the host.</quote> This nearly ensures that
none of the other programs built in
<xref linkend="chapter-temporary-tools"/> can link to libraries on
<quote>the build.</quote> The word <quote>nearly</quote> is used because
<command>libtool</command>, a <quote>compatibility</quote> wrapper of
the compiler and the linker for autoconf-based build systems,
can try to be too clever and mistakenly pass options allowing the linker
to find libraries of <quote>the build.</quote>
To prevent this fallout, we need to delete the libtool archive
(<filename class='extension'>.la</filename>) files and fix up an
outdated libtool copy shipped with the Binutils code.</para>
<informaltable align="center"> <informaltable align="center">
<tgroup cols="5"> <tgroup cols="5">
@ -228,7 +329,7 @@
</row> </row>
<row> <row>
<entry>3</entry><entry>lfs</entry><entry>lfs</entry><entry>lfs</entry> <entry>3</entry><entry>lfs</entry><entry>lfs</entry><entry>lfs</entry>
<entry>Rebuild and test cc-lfs using cc-lfs on lfs.</entry> <entry>Rebuild (and maybe test) cc-lfs using cc-lfs on lfs.</entry>
</row> </row>
</tbody> </tbody>
</tgroup> </tgroup>
@ -256,30 +357,12 @@
<para>The upshot of the preceding paragraph is that cc1 is unable to <para>The upshot of the preceding paragraph is that cc1 is unable to
build a fully functional libstdc++ with the degraded libgcc, but cc1 build a fully functional libstdc++ with the degraded libgcc, but cc1
is the only compiler available for building the C/C++ libraries is the only compiler available for building the C/C++ libraries
during stage 2. There are two reasons we don't immediately use the during stage 2. As we've discussed, we cannot run cc-lfs on pc (the
compiler built in stage 2, cc-lfs, to build those libraries.</para> host distro) because it may require some library, code, or data not
available on <quote>the build</quote> (the host distro).
<itemizedlist> So when we build gcc stage 2, we instruct the building system to
<listitem> rebuild libgcc and libstdc++ with cc1, but we also override the library
<para> search path to link libstdc++ against the newly
Generally speaking, cc-lfs cannot run on pc (the host system). Even though the
triplets for pc and lfs are compatible with each other, an executable
for lfs must depend on glibc-&glibc-version;; the host distro
may utilize either a different implementation of libc (for example, musl), or
a previous release of glibc (for example, glibc-2.13).
</para>
</listitem>
<listitem>
<para>
Even if cc-lfs can run on pc, using it on pc would create
a risk of linking to the pc libraries, since cc-lfs is a native
compiler.
</para>
</listitem>
</itemizedlist>
<para>So when we build gcc stage 2, we instruct the building system to
rebuild libgcc and libstdc++ with cc1, but we link libstdc++ to the newly
rebuilt libgcc instead of the old, degraded build. This makes the rebuilt rebuilt libgcc instead of the old, degraded build. This makes the rebuilt
libstdc++ fully functional.</para> libstdc++ fully functional.</para>
@ -290,12 +373,11 @@
package on a completed LFS system, the reinstalled content of the package package on a completed LFS system, the reinstalled content of the package
should be the same as the content of the same package when first installed in should be the same as the content of the same package when first installed in
&ch-final;. The temporary packages installed in &ch-tmp-cross; or &ch-final;. The temporary packages installed in &ch-tmp-cross; or
&ch-tmp-chroot; cannot satisfy this requirement, because some of them &ch-tmp-chroot; cannot satisfy this requirement, because some optional
are built without optional dependencies, and autoconf cannot features of them are disabled because of either the missing
perform some feature checks in &ch-tmp-cross; because of cross-compilation, dependencies or the <quote>cross-compilation mode.</quote>
causing the temporary packages to lack optional features, Additionally, a minor reason for rebuilding the packages is to run the
or use suboptimal code routines. Additionally, a minor reason for test suites.</para>
rebuilding the packages is to run the test suites.</para>
</sect2> </sect2>
@ -359,11 +441,10 @@ checking what linker to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/ld</compute
<para>Next comes glibc. The most important <para>Next comes glibc. The most important
considerations for building glibc are the compiler, binary tools, and considerations for building glibc are the compiler, binary tools, and
kernel headers. The compiler and binary tools are generally not an issue kernel headers. The compiler and binary tools are not an issue
since glibc will always use those relating to the <parameter>--host</parameter> as <parameter>--host=$LFS_TGT</parameter> makes the build system use
parameter passed to its configure script; e.g., in our case, the compiler those tools prefixed with <literal>$LFS_TGT-</literal> as we've
will be <command>$LFS_TGT-gcc</command> and the <command>readelf</command> discussed. The kernel headers can
tool will be <command>$LFS_TGT-readelf</command>. The kernel headers can
be a bit more complicated. Therefore, we take no risks and use be a bit more complicated. Therefore, we take no risks and use
the available configure switch to enforce the correct selection. After the available configure switch to enforce the correct selection. After
the run of <command>configure</command>, check the contents of the the run of <command>configure</command>, check the contents of the
@ -384,12 +465,7 @@ checking what linker to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/ld</compute
LFS compiler is installed. First binutils-pass2 is built, LFS compiler is installed. First binutils-pass2 is built,
in the same <envar>DESTDIR</envar> directory as the other programs, in the same <envar>DESTDIR</envar> directory as the other programs,
then the second pass of gcc is constructed, omitting some then the second pass of gcc is constructed, omitting some
non-critical libraries. Due to some weird logic in gcc's non-critical libraries.</para>
configure script, <envar>CC_FOR_TARGET</envar> ends up as
<command>cc</command> when the host is the same as the target, but
different from the build system. This is why
<parameter>CC_FOR_TARGET=$LFS_TGT-gcc</parameter> is declared explicitly
as one of the configuration options.</para>
<para>Upon entering the chroot environment in <xref <para>Upon entering the chroot environment in <xref
linkend="chapter-chroot-temporary-tools"/>, linkend="chapter-chroot-temporary-tools"/>,