mirror of
https://git.linuxfromscratch.org/lfs.git
synced 2025-06-18 19:29:21 +01:00
474 lines
24 KiB
XML
474 lines
24 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
|
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
|
|
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
|
|
<!ENTITY % general-entities SYSTEM "../general.ent">
|
|
%general-entities;
|
|
|
|
<!ENTITY host-triplet
|
|
"<replaceable><the host triplet></replaceable>">
|
|
]>
|
|
|
|
<sect1 id="ch-tools-toolchaintechnotes" xreflabel="Toolchain Technical Notes">
|
|
<?dbhtml filename="toolchaintechnotes.html"?>
|
|
|
|
<title>Toolchain Technical Notes</title>
|
|
|
|
<para>This section explains some of the rationale and technical details
|
|
behind the overall build method. Don't try to immediately
|
|
understand everything in this section. Most of this information will be
|
|
clearer after performing an actual build. Come back and re-read this chapter
|
|
at any time during the build process.</para>
|
|
|
|
<para>The overall goal of <xref linkend="chapter-cross-tools"/> and <xref
|
|
linkend="chapter-temporary-tools"/> is to produce a temporary area
|
|
containing a set of tools that are known to be good, and that are isolated from the host system.
|
|
By using the <command>chroot</command> command, the compilations in the remaining chapters
|
|
will be isolated within that environment, ensuring a clean, trouble-free
|
|
build of the target LFS system. The build process has been designed to
|
|
minimize the risks for new readers, and to provide the most educational value
|
|
at the same time.</para>
|
|
|
|
<para>This build process is based on
|
|
<emphasis>cross-compilation</emphasis>. Cross-compilation is normally used
|
|
to build a compiler and its associated toolchain for a machine different from
|
|
the one that is used for the build. This is not strictly necessary for LFS,
|
|
since the machine where the new system will run is the same as the one
|
|
used for the build. But cross-compilation has one great advantage:
|
|
anything that is cross-compiled cannot depend on the host environment.</para>
|
|
|
|
<sect2 id="cross-compile" xreflabel="About Cross-Compilation">
|
|
|
|
<title>About Cross-Compilation</title>
|
|
|
|
<note>
|
|
<para>
|
|
The LFS book is not (and does not contain) a general tutorial to
|
|
build a cross- (or native) toolchain. Don't use the commands in the
|
|
book for a cross-toolchain for some purpose other
|
|
than building LFS, unless you really understand what you are doing.
|
|
</para>
|
|
|
|
<para>
|
|
It's known installing GCC pass 2 will break the cross-toolchain.
|
|
We don't consider it a bug because GCC pass 2 is the last package
|
|
to be cross-compiled in the book, and we won't <quote>fix</quote>
|
|
it until we really need to cross-compile some package after GCC
|
|
pass 2 in the future.
|
|
</para>
|
|
</note>
|
|
|
|
<para>Cross-compilation involves some concepts that deserve a section of
|
|
their own. Although this section may be omitted on a first reading,
|
|
coming back to it later will help you gain a fuller understanding of
|
|
the process.</para>
|
|
|
|
<para>Let us first define some terms used in this context.</para>
|
|
|
|
<variablelist>
|
|
<varlistentry><term>The build</term><listitem>
|
|
<para>is the machine where we build programs. Note that this machine
|
|
is also referred to as the <quote>host.</quote></para></listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry><term>The host</term><listitem>
|
|
<para>is the machine/system where the built programs will run. Note
|
|
that this use of <quote>host</quote> is not the same as in other
|
|
sections.</para></listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry><term>The target</term><listitem>
|
|
<para>is only used for compilers. It is the machine the compiler
|
|
produces code for. It may be different from both the build and
|
|
the host.</para></listitem>
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
|
|
<para>As an example, let us imagine the following scenario (sometimes
|
|
referred to as <quote>Canadian Cross</quote>). We have a
|
|
compiler on a slow machine only, let's call it machine A, and the compiler
|
|
ccA. We also have a fast machine (B), but no compiler for (B), and we
|
|
want to produce code for a third, slow machine (C). We will build a
|
|
compiler for machine C in three stages.</para>
|
|
|
|
<informaltable align="center">
|
|
<tgroup cols="5">
|
|
<colspec colnum="1" align="center"/>
|
|
<colspec colnum="2" align="center"/>
|
|
<colspec colnum="3" align="center"/>
|
|
<colspec colnum="4" align="center"/>
|
|
<colspec colnum="5" align="left"/>
|
|
<thead>
|
|
<row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
|
|
<entry>Target</entry><entry>Action</entry></row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>1</entry><entry>A</entry><entry>A</entry><entry>B</entry>
|
|
<entry>Build cross-compiler cc1 using ccA on machine A.</entry>
|
|
</row>
|
|
<row>
|
|
<entry>2</entry><entry>A</entry><entry>B</entry><entry>C</entry>
|
|
<entry>Build cross-compiler cc2 using cc1 on machine A.</entry>
|
|
</row>
|
|
<row>
|
|
<entry>3</entry><entry>B</entry><entry>C</entry><entry>C</entry>
|
|
<entry>Build compiler ccC using cc2 on machine B.</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</informaltable>
|
|
|
|
<para>Then, all the programs needed by machine C can be compiled
|
|
using cc2 on the fast machine B. Note that unless B can run programs
|
|
produced for C, there is no way to test the newly built programs until machine
|
|
C itself is running. For example, to run a test suite on ccC, we may want to add a
|
|
fourth stage:</para>
|
|
|
|
<informaltable align="center">
|
|
<tgroup cols="5">
|
|
<colspec colnum="1" align="center"/>
|
|
<colspec colnum="2" align="center"/>
|
|
<colspec colnum="3" align="center"/>
|
|
<colspec colnum="4" align="center"/>
|
|
<colspec colnum="5" align="left"/>
|
|
<thead>
|
|
<row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
|
|
<entry>Target</entry><entry>Action</entry></row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>4</entry><entry>C</entry><entry>C</entry><entry>C</entry>
|
|
<entry>Rebuild and test ccC using ccC on machine C.</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</informaltable>
|
|
|
|
<para>In the example above, only cc1 and cc2 are cross-compilers, that is,
|
|
they produce code for a machine different from the one they are run on.
|
|
The other compilers ccA and ccC produce code for the machine they are run
|
|
on. Such compilers are called <emphasis>native</emphasis> compilers.</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="lfs-cross">
|
|
<title>Implementation of Cross-Compilation for LFS</title>
|
|
|
|
<note>
|
|
<para>All the cross-compiled packages in this book use an
|
|
autoconf-based building system. The autoconf-based building system
|
|
accepts system types in the form cpu-vendor-kernel-os,
|
|
referred to as the system triplet. Since the vendor field is often
|
|
irrelevant, autoconf lets you omit it.</para>
|
|
|
|
<para>An astute reader may wonder
|
|
why a <quote>triplet</quote> refers to a four component name. The
|
|
kernel field and the os field began as a single
|
|
<quote>system</quote> field. Such a three-field form is still valid
|
|
today for some systems, for example,
|
|
<literal>x86_64-unknown-freebsd</literal>. But
|
|
two systems can share the same kernel and still be too different to
|
|
use the same triplet to describe them. For example, Android running on a
|
|
mobile phone is completely different from Ubuntu running on an ARM64
|
|
server, even though they are both running on the same type of CPU (ARM64) and
|
|
using the same kernel (Linux).</para>
|
|
|
|
<para>Without an emulation layer, you cannot run an
|
|
executable for a server on a mobile phone or vice versa. So the
|
|
<quote>system</quote> field has been divided into kernel and os fields, to
|
|
designate these systems unambiguously. In our example, the Android
|
|
system is designated <literal>aarch64-unknown-linux-android</literal>,
|
|
and the Ubuntu system is designated
|
|
<literal>aarch64-unknown-linux-gnu</literal>.</para>
|
|
|
|
<para>The word <quote>triplet</quote> remains embedded in the lexicon. A simple way to determine your
|
|
system triplet is to run the <command>config.guess</command>
|
|
script that comes with the source for many packages. Unpack the binutils
|
|
sources, run the script <userinput>./config.guess</userinput>, and note
|
|
the output. For example, for a 32-bit Intel processor the
|
|
output will be <emphasis>i686-pc-linux-gnu</emphasis>. On a 64-bit
|
|
system it will be <emphasis>x86_64-pc-linux-gnu</emphasis>. On most
|
|
Linux systems the even simpler <command>gcc -dumpmachine</command> command
|
|
will give you similar information.</para>
|
|
|
|
<para>You should also be aware of the name of the platform's dynamic linker, often
|
|
referred to as the dynamic loader (not to be confused with the standard
|
|
linker <command>ld</command> that is part of binutils). The dynamic linker
|
|
provided by package glibc finds and loads the shared libraries needed by a
|
|
program, prepares the program to run, and then runs it. The name of the
|
|
dynamic linker for a 32-bit Intel machine is <filename
|
|
class="libraryfile">ld-linux.so.2</filename>; it's <filename
|
|
class="libraryfile">ld-linux-x86-64.so.2</filename> on 64-bit systems. A
|
|
sure-fire way to determine the name of the dynamic linker is to inspect a
|
|
random binary from the host system by running: <userinput>readelf -l
|
|
<name of binary> | grep interpreter</userinput> and noting the
|
|
output. The authoritative reference covering all platforms is in
|
|
<ulink url='https://sourceware.org/glibc/wiki/ABIList'>a Glibc wiki
|
|
page</ulink>.</para>
|
|
</note>
|
|
|
|
<para>
|
|
There are two key points for a cross-compilation:
|
|
</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
When producing and processing the machine code supposed to be
|
|
executed on <quote>the host,</quote> the cross-toolchain must be
|
|
used. Note that the native toolchain from <quote>the build</quote>
|
|
may be still invoked to generate machine code supposed to be
|
|
executed on <quote>the build.</quote> For example, the build system
|
|
may compile a generator with the native toolchain, then generate
|
|
a C source file with the generator, and finally compile the C
|
|
source file with the cross-toolchain so the generated code will
|
|
be able to run on <quote>the host.</quote>
|
|
</para>
|
|
<para>
|
|
With an autoconf-based build system, this requirement is ensured
|
|
by using the <parameter>--host</parameter> switch to specify
|
|
<quote>the host</quote> triplet. With this switch the build
|
|
system will use the toolchain components prefixed
|
|
with <literal>&host-triplet;</literal>
|
|
for generating and processing the machine code for
|
|
<quote>the host</quote>; e.g. the compiler will be
|
|
<command>&host-triplet;-gcc</command> and the
|
|
<command>readelf</command> tool will be
|
|
<command>&host-triplet;-readelf</command>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
The build system should not attempt to run any generated machine
|
|
code supposed to be executed on <quote>the host.</quote> For
|
|
example, when building a utility natively, its man page can be
|
|
generated by running the utility with the
|
|
<parameter>--help</parameter> switch and processing the output,
|
|
but generally it's not possible to do so for a cross-compilation
|
|
as the utility may fail
|
|
to run on <quote>the build</quote>: it's obviously impossible to
|
|
run ARM64 machine code on a x86 CPU (without an emulator).
|
|
</para>
|
|
<para>
|
|
With an autoconf-based build system, this requirement is
|
|
satisfied in <quote>the cross-compilation mode</quote> where
|
|
the optional features requiring to run machine code for
|
|
<quote>the host</quote> during the build time are disabled. When <quote>the
|
|
host</quote> triplet is explicitly specified, <quote>the
|
|
cross-compilation mode</quote> is enabled if and only if either
|
|
the <command>configure</command> script fails to run a dummy
|
|
program compiled into <quote>the host</quote> machine code, or
|
|
<quote>the build</quote> triplet is explicitly specified via the
|
|
<parameter>--build</parameter> switch and it's different from
|
|
<quote>the host</quote> triplet.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>In order to cross-compile a package for the LFS temporary system,
|
|
the name of the system triplet is slightly adjusted by changing the
|
|
"vendor" field in the <envar>LFS_TGT</envar> variable so it
|
|
says "lfs" and <envar>LFS_TGT</envar> is then specified as
|
|
<quote>the host</quote> triplet via <parameter>--host</parameter>, so
|
|
the cross-toolchain will be used for generating and processing the
|
|
machine code running as a part of the LFS temporary system. And, we
|
|
also need to enable <quote>the cross-compilation mode</quote>: despite
|
|
<quote>the host</quote> machine code, i.e. the machine code for the LFS
|
|
temporary system, is able to execute on the current CPU, it may refer
|
|
to a library not available on the <quote>the build</quote> (the host
|
|
distro), or some code or data non-exist or defined differently in the
|
|
library even if it happens to be available. When cross-compiling a
|
|
package for the LFS temporary system, we cannot rely on the
|
|
<command>configure</command> script to detect this issue with the
|
|
dummy program: the dummy only uses a few components in
|
|
<systemitem class='library'>libc</systemitem> that the host distro
|
|
<systemitem class='library'>libc</systemitem> likely provides (unless,
|
|
maybe the host distro uses a different
|
|
<systemitem class='library'>libc</systemitem> implementation like Musl),
|
|
so it won't fail like how the really useful programs would likely.
|
|
Thus we must explicitly specify <quote>the build</quote> triplet to
|
|
enable <quote>the cross-compilation mode.</quote> The value we use is
|
|
just the default, i.e. the original system triplet from
|
|
<command>config.guess</command> output, but <quote>the cross-compilation
|
|
mode</quote> depends on an explicit specification as we've
|
|
discussed.</para>
|
|
|
|
<para>We use the <parameter>--with-sysroot</parameter> option when
|
|
building the cross-linker and cross-compiler, to tell them where to find
|
|
the needed files for <quote>the host.</quote> This nearly ensures that
|
|
none of the other programs built in
|
|
<xref linkend="chapter-temporary-tools"/> can link to libraries on
|
|
<quote>the build.</quote> The word <quote>nearly</quote> is used because
|
|
<command>libtool</command>, a <quote>compatibility</quote> wrapper of
|
|
the compiler and the linker for autoconf-based build systems,
|
|
can try to be too clever and mistakenly pass options allowing the linker
|
|
to find libraries of <quote>the build.</quote>
|
|
To prevent this fallout, we need to delete the libtool archive
|
|
(<filename class='extension'>.la</filename>) files and fix up an
|
|
outdated libtool copy shipped with the Binutils code.</para>
|
|
|
|
<informaltable align="center">
|
|
<tgroup cols="5">
|
|
<colspec colnum="1" align="center"/>
|
|
<colspec colnum="2" align="center"/>
|
|
<colspec colnum="3" align="center"/>
|
|
<colspec colnum="4" align="center"/>
|
|
<colspec colnum="5" align="left"/>
|
|
<thead>
|
|
<row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
|
|
<entry>Target</entry><entry>Action</entry></row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>1</entry><entry>pc</entry><entry>pc</entry><entry>lfs</entry>
|
|
<entry>Build cross-compiler cc1 using cc-pc on pc.</entry>
|
|
</row>
|
|
<row>
|
|
<entry>2</entry><entry>pc</entry><entry>lfs</entry><entry>lfs</entry>
|
|
<entry>Build compiler cc-lfs using cc1 on pc.</entry>
|
|
</row>
|
|
<row>
|
|
<entry>3</entry><entry>lfs</entry><entry>lfs</entry><entry>lfs</entry>
|
|
<entry>Rebuild (and maybe test) cc-lfs using cc-lfs on lfs.</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</informaltable>
|
|
|
|
<para>In the preceding table, <quote>on pc</quote> means the commands are run
|
|
on a machine using the already installed distribution. <quote>On
|
|
lfs</quote> means the commands are run in a chrooted environment.</para>
|
|
|
|
<para>This is not yet the end of the story. The C language is not
|
|
merely a compiler; it also defines a standard library. In this book, the
|
|
GNU C library, named glibc, is used (there is an alternative, "musl"). This library must
|
|
be compiled for the LFS machine; that is, using the cross-compiler cc1.
|
|
But the compiler itself uses an internal library providing complex
|
|
subroutines for functions not available in the assembler instruction set. This
|
|
internal library is named libgcc, and it must be linked to the glibc
|
|
library to be fully functional. Furthermore, the standard library for
|
|
C++ (libstdc++) must also be linked with glibc. The solution to this
|
|
chicken and egg problem is first to build a degraded cc1-based libgcc,
|
|
lacking some functionalities such as threads and exception handling, and then
|
|
to build glibc using this degraded compiler (glibc itself is not
|
|
degraded), and also to build libstdc++. This last library will lack some of the
|
|
functionality of libgcc.</para>
|
|
|
|
<para>The upshot of the preceding paragraph is that cc1 is unable to
|
|
build a fully functional libstdc++ with the degraded libgcc, but cc1
|
|
is the only compiler available for building the C/C++ libraries
|
|
during stage 2. As we've discussed, we cannot run cc-lfs on pc (the
|
|
host distro) because it may require some library, code, or data not
|
|
available on <quote>the build</quote> (the host distro).
|
|
So when we build gcc stage 2, we override the library
|
|
search path to link libstdc++ against the newly
|
|
rebuilt libgcc instead of the old, degraded build. This makes the rebuilt
|
|
libstdc++ fully functional.</para>
|
|
|
|
<para>In &ch-final; (or <quote>stage 3</quote>), all the packages needed for
|
|
the LFS system are built. Even if a package has already been installed into
|
|
the LFS system in a previous chapter, we still rebuild the package. The main reason for
|
|
rebuilding these packages is to make them stable: if we reinstall an LFS
|
|
package on a completed LFS system, the reinstalled content of the package
|
|
should be the same as the content of the same package when first installed in
|
|
&ch-final;. The temporary packages installed in &ch-tmp-cross; or
|
|
&ch-tmp-chroot; cannot satisfy this requirement, because some optional
|
|
features of them are disabled because of either the missing
|
|
dependencies or the <quote>cross-compilation mode.</quote>
|
|
Additionally, a minor reason for rebuilding the packages is to run the
|
|
test suites.</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="other-details">
|
|
|
|
<title>Other Procedural Details</title>
|
|
|
|
<para>The cross-compiler will be installed in a separate <filename
|
|
class="directory">$LFS/tools</filename> directory, since it will not
|
|
be part of the final system.</para>
|
|
|
|
<para>Binutils is installed first because the <command>configure</command>
|
|
runs of both gcc and glibc perform various feature tests on the assembler
|
|
and linker to determine which software features to enable or disable. This
|
|
is more important than one might realize at first. An incorrectly configured
|
|
gcc or glibc can result in a subtly broken toolchain, where the impact of
|
|
such breakage might not show up until near the end of the build of an
|
|
entire distribution. A test suite failure will usually highlight this error
|
|
before too much additional work is performed.</para>
|
|
|
|
<para>Binutils installs its assembler and linker in two locations,
|
|
<filename class="directory">$LFS/tools/bin</filename> and <filename
|
|
class="directory">$LFS/tools/$LFS_TGT/bin</filename>. The tools in one
|
|
location are hard linked to the other. An important facet of the linker is
|
|
its library search order. Detailed information can be obtained from
|
|
<command>ld</command> by passing it the <parameter>--verbose</parameter>
|
|
flag. For example, <command>$LFS_TGT-ld --verbose | grep SEARCH</command>
|
|
will illustrate the current search paths and their order. (Note that this
|
|
example can be run as shown only while logged in as user
|
|
<systemitem class="username">lfs</systemitem>. If you come back to this
|
|
page later, replace <command>$LFS_TGT-ld</command> with
|
|
<command>ld</command>).</para>
|
|
|
|
<para>The next package installed is gcc. An example of what can be
|
|
seen during its run of <command>configure</command> is:</para>
|
|
|
|
<screen><computeroutput>checking what assembler to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/as
|
|
checking what linker to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/ld</computeroutput></screen>
|
|
|
|
<para>This is important for the reasons mentioned above. It also
|
|
demonstrates that gcc's configure script does not search the PATH
|
|
directories to find which tools to use. However, during the actual
|
|
operation of <command>gcc</command> itself, the same search paths are not
|
|
necessarily used. To find out which standard linker <command>gcc</command>
|
|
will use, run: <command>$LFS_TGT-gcc -print-prog-name=ld</command>. (Again,
|
|
remove the <command>$LFS_TGT-</command> prefix if coming back to this
|
|
later.)</para>
|
|
|
|
<para>Detailed information can be obtained from <command>gcc</command> by
|
|
passing it the <parameter>-v</parameter> command line option while compiling
|
|
a program. For example, <command>$LFS_TGT-gcc -v
|
|
<replaceable>example.c</replaceable></command> (or without <command>
|
|
$LFS_TGT-</command> if coming back later) will show
|
|
detailed information about the preprocessor, compilation, and assembly
|
|
stages, including <command>gcc</command>'s search paths for included
|
|
headers and their order.</para>
|
|
|
|
<para>Next up: sanitized Linux API headers. These allow the
|
|
standard C library (glibc) to interface with features that the Linux
|
|
kernel will provide.</para>
|
|
|
|
<para>Next comes glibc. This is the first package that we cross-compile.
|
|
We use the <parameter>--host=$LFS_TGT</parameter> option to make
|
|
the build system to use those tools prefixed with
|
|
<literal>$LFS_TGT-</literal>, and the
|
|
<parameter>--build=$(../scripts/config.guess)</parameter> option to
|
|
enable <quote>the cross-compilation mode</quote> as we've discussed.
|
|
The <envar>DESTDIR</envar> variable is used to force installation into
|
|
the LFS file system.</para>
|
|
|
|
<para>As mentioned above, the standard C++ library is compiled next, followed in
|
|
<xref linkend="chapter-temporary-tools"/> by other programs that must
|
|
be cross-compiled to break circular dependencies at build time.
|
|
The steps for those packages are similar to the steps for glibc.</para>
|
|
|
|
<para>At the end of <xref linkend="chapter-temporary-tools"/> the native
|
|
LFS compiler is installed. First binutils-pass2 is built,
|
|
in the same <envar>DESTDIR</envar> directory as the other programs,
|
|
then the second pass of gcc is constructed, omitting some
|
|
non-critical libraries.</para>
|
|
|
|
<para>Upon entering the chroot environment in <xref
|
|
linkend="chapter-chroot-temporary-tools"/>,
|
|
the temporary installations of programs needed for the proper
|
|
operation of the toolchain are performed. From this point onwards, the
|
|
core toolchain is self-contained and self-hosted. In
|
|
<xref linkend="chapter-building-system"/>, final versions of all the
|
|
packages needed for a fully functional system are built, tested, and
|
|
installed.</para>
|
|
|
|
</sect2>
|
|
|
|
</sect1>
|