From 5e3bef69d18bfd5618e9697f405f82fb7e292687 Mon Sep 17 00:00:00 2001 From: Xi Ruoyao Date: Tue, 25 Mar 2025 21:32:54 +0800 Subject: [PATCH 1/6] toolchaintechnotes: Large overhaul - Document the autoconf behavior about "the cross-compilation mode," to explain the necessity of --build=$(path/to/config.guess) added for #5304. - Mention the libtool fallout regarding cross-compilation. - Remove the explanation for CC_FOR_TARGET, which was already removed much earlier. - Note the cross-toolchain cannot be used anymore after installing gcc pass 2. - "Stage 3" (i.e. the final LFS system) is NOT optional. --- part3intro/toolchaintechnotes.xml | 176 +++++++++++++++++++++--------- 1 file changed, 126 insertions(+), 50 deletions(-) diff --git a/part3intro/toolchaintechnotes.xml b/part3intro/toolchaintechnotes.xml index 1dad94103..616af0bd9 100644 --- a/part3intro/toolchaintechnotes.xml +++ b/part3intro/toolchaintechnotes.xml @@ -3,6 +3,9 @@ "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [ %general-entities; + + <the host triplet>"> ]> @@ -44,6 +47,14 @@ book for a cross-toolchain for some purpose other than building LFS, unless you really understand what you are doing. + + + It is known that installing GCC pass 2 will break the cross-toolchain. + We don't consider it a bug because GCC pass 2 is the last package + to be cross-compiled in the book, and we won't fix + it until we really need to cross-compile some package after GCC + pass 2 in the future. + Cross-compilation involves some concepts that deserve a section of @@ -197,14 +208,104 @@ page. - In order to fake a cross-compilation in LFS, the name of the host triplet - is slightly adjusted by changing the "vendor" field in the - LFS_TGT variable so it says "lfs". We also use the - --with-sysroot option when building the cross-linker and - cross-compiler, to tell them where to find the needed host files. This - ensures that none of the other programs built in can link to libraries on the build - machine. 
Only two stages are mandatory, plus one more for tests. + + There are two key points for a cross-compilation: + + + + + + When producing and processing the machine code supposed to be + executed on the host, the cross-toolchain must be + used. Note that the native toolchain from the build + may be still invoked to generate machine code supposed to be + executed on the build. For example, the build system + may compile a generator with the native toolchain, then generate + a C source file with the generator, and finally compile the C + source file with the cross-toolchain so the generated code will + be able to run on the host. + + + With an autoconf-based build system, this requirement is ensured + by using the --host switch to specify + the host triplet. With this switch the build + system will use the toolchain components prefixed + with &host-triplet; + for generating and processing the machine code for + the host; e.g. the compiler will be + &host-triplet;-gcc and the + readelf tool will be + &host-triplet;-readelf. + + + + + The build system should not attempt to run any generated machine + code supposed to be executed on the host. For + example, when building an utility natively, its man page can be + generated by running the utility with the + --help switch and processing the output, + but generally it's not possible to do so as the utility may fail + to run on the build: it's obviously impossible to + run ARM64 machine code on a x86 CPU (without an emulator). + + + With an autoconf-based build system, this requirement is + satisfied in the cross-compilation mode where + the optional features requiring to run machine code for + the host are disabled. 
When the + host triplet is explicitly specified, the + cross-compilation mode is enabled if and only if either + the configure script fails to run a dummy + program compiled into the host machine code, or + the build triplet is explicitly specified via the + --build switch and it's different from + the host triplet. + + + + + In order to cross-compile a package for the LFS temporary system, + the name of the system triplet is slightly adjusted by changing the + "vendor" field in the LFS_TGT variable so it + says "lfs" and LFS_TGT is then specified as + the host triplet via --host, so + the cross-toolchain will be used for generating and processing the + machine code running as a part of the LFS temporary system. And, we + also need to enable the cross-compilation mode: although + the host machine code, i.e. the machine code for the LFS + temporary system, is able to execute on the current CPU, it may refer + to a library not available on the build (the host + distro), or to some code or data that does not exist or is defined differently in the + library even if it happens to be available. When cross-compiling a + package for the LFS temporary system, we cannot rely on the + configure script to detect this issue with the + dummy program: the dummy only uses a few components in + libc that the host distro + libc likely provides (unless, + maybe the host distro uses a different + libc implementation like Musl), + so it is unlikely to fail the way really useful programs would. + Thus we must explicitly specify the build triplet to + enable the cross-compilation mode. The value we use is + just the default, i.e. the original system triplet from + config.guess output, but the cross-compilation + mode depends on an explicit specification as we've + discussed. + + We use the --with-sysroot option when + building the cross-linker and cross-compiler, to tell them where to find + the needed files for the host. 
This nearly ensures that + none of the other programs built in + can link to libraries on + the build. The word nearly is used because + libtool, a compatibility wrapper of + the compiler and the linker for autoconf-based build systems, + can try to be too clever and mistakenly pass options allowing the linker + to find libraries of the host. + To prevent this fallout, we need to delete the libtool archive + (.la) files and fix up an + outdated libtool copy shipped with the Binutils code. @@ -228,7 +329,7 @@ 3lfslfslfs - Rebuild and test cc-lfs using cc-lfs on lfs. + Rebuild (and maybe test) cc-lfs using cc-lfs on lfs. @@ -256,30 +357,12 @@ The upshot of the preceding paragraph is that cc1 is unable to build a fully functional libstdc++ with the degraded libgcc, but cc1 is the only compiler available for building the C/C++ libraries - during stage 2. There are two reasons we don't immediately use the - compiler built in stage 2, cc-lfs, to build those libraries. - - - - - Generally speaking, cc-lfs cannot run on pc (the host system). Even though the - triplets for pc and lfs are compatible with each other, an executable - for lfs must depend on glibc-&glibc-version;; the host distro - may utilize either a different implementation of libc (for example, musl), or - a previous release of glibc (for example, glibc-2.13). - - - - - Even if cc-lfs can run on pc, using it on pc would create - a risk of linking to the pc libraries, since cc-lfs is a native - compiler. - - - - - So when we build gcc stage 2, we instruct the building system to - rebuild libgcc and libstdc++ with cc1, but we link libstdc++ to the newly + during stage 2. As we've discussed, we cannot run cc-lfs on pc (the + host distro) because it may require some library, code, or data not + available on the build (the host distro). 
+ So when we build gcc stage 2, we instruct the building system to + rebuild libgcc and libstdc++ with cc1, but we also override the library + search path to link libstdc++ against the newly rebuilt libgcc instead of the old, degraded build. This makes the rebuilt libstdc++ fully functional. @@ -290,12 +373,11 @@ package on a completed LFS system, the reinstalled content of the package should be the same as the content of the same package when first installed in &ch-final;. The temporary packages installed in &ch-tmp-cross; or - &ch-tmp-chroot; cannot satisfy this requirement, because some of them - are built without optional dependencies, and autoconf cannot - perform some feature checks in &ch-tmp-cross; because of cross-compilation, - causing the temporary packages to lack optional features, - or use suboptimal code routines. Additionally, a minor reason for - rebuilding the packages is to run the test suites. + &ch-tmp-chroot; cannot satisfy this requirement, because some optional + features of them are disabled because of either the missing + dependencies or the cross-compilation mode. + Additionally, a minor reason for rebuilding the packages is to run the + test suites. @@ -359,11 +441,10 @@ checking what linker to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/ldNext comes glibc. The most important considerations for building glibc are the compiler, binary tools, and - kernel headers. The compiler and binary tools are generally not an issue - since glibc will always use those relating to the --host - parameter passed to its configure script; e.g., in our case, the compiler - will be $LFS_TGT-gcc and the readelf - tool will be $LFS_TGT-readelf. The kernel headers can + kernel headers. The compiler and binary tools are not an issue + as --host=$LFS_TGT makes the build system to use + those tools prefixed with $LFS_TGT- as we've + discussed. The kernel headers can be a bit more complicated. 
Therefore, we take no risks and use the available configure switch to enforce the correct selection. After the run of configure, check the contents of the @@ -384,12 +465,7 @@ checking what linker to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/ldDESTDIR directory as the other programs, then the second pass of gcc is constructed, omitting some - non-critical libraries. Due to some weird logic in gcc's - configure script, CC_FOR_TARGET ends up as - cc when the host is the same as the target, but - different from the build system. This is why - CC_FOR_TARGET=$LFS_TGT-gcc is declared explicitly - as one of the configuration options. + non-critical libraries. Upon entering the chroot environment in , From 7e4fd2e198b3811c3df3b61b9d27e2fe84e27969 Mon Sep 17 00:00:00 2001 From: Xi Ruoyao Date: Thu, 27 Mar 2025 20:56:55 +0800 Subject: [PATCH 2/6] glibc: Drop --with-headers in pass 1 I cannot see why this is ever needed. The default is "the compiler default" which should be correct as the compiler has been configured --with-sysroot. And the explanation for this switch is just repeating a common misunderstanding. In fact glibc **never** attempts to figure out what features the kernel has from the headers. Instead it depends on the kernel-features.h files in the source tree and the --with-kernel value to determine the kernel features that it can rely on. 
--- chapter05/glibc.xml | 11 ----------- part3intro/toolchaintechnotes.xml | 26 +++++++++----------------- 2 files changed, 9 insertions(+), 28 deletions(-) diff --git a/chapter05/glibc.xml b/chapter05/glibc.xml index 2dccf93bf..2c7de998c 100644 --- a/chapter05/glibc.xml +++ b/chapter05/glibc.xml @@ -91,7 +91,6 @@ cd build --host=$LFS_TGT \ --build=$(../scripts/config.guess) \ --enable-kernel=&min-kernel; \ - --with-headers=$LFS/usr/include \ --disable-nscd \ libc_cv_slibdir=/usr/lib @@ -116,16 +115,6 @@ cd build - - --with-headers=$LFS/usr/include - - This tells Glibc to compile itself against the headers - recently installed to the $LFS/usr/include directory, so that - it knows exactly what features the kernel has and can optimize - itself accordingly. - - - libc_cv_slibdir=/usr/lib diff --git a/part3intro/toolchaintechnotes.xml b/part3intro/toolchaintechnotes.xml index 616af0bd9..7afb2d9ac 100644 --- a/part3intro/toolchaintechnotes.xml +++ b/part3intro/toolchaintechnotes.xml @@ -439,27 +439,19 @@ checking what linker to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/ld - Next comes glibc. The most important - considerations for building glibc are the compiler, binary tools, and - kernel headers. The compiler and binary tools are not an issue - as --host=$LFS_TGT makes the build system to use - those tools prefixed with $LFS_TGT- as we've - discussed. The kernel headers can - be a bit more complicated. Therefore, we take no risks and use - the available configure switch to enforce the correct selection. After - the run of configure, check the contents of the - config.make file in the build directory for all important details. - These items highlight an important aspect of the glibc - package—it is very self-sufficient in terms of its build machinery, - and generally does not rely on toolchain defaults. + Next comes glibc. This is the first package that we cross-compile. 
+ We use the --host=$LFS_TGT option to make + the build system use those tools prefixed with + $LFS_TGT-, and the + --build=$(../scripts/config.guess) option to + enable the cross-compilation mode as we've discussed. + The DESTDIR variable is used to force installation into + the LFS file system. As mentioned above, the standard C++ library is compiled next, followed in by other programs that must be cross-compiled to break circular dependencies at build time. - The install step of all those packages uses the - DESTDIR variable to force installation - in the LFS filesystem. + The steps for those packages are similar to the steps for glibc. At the end of the native LFS compiler is installed. First binutils-pass2 is built, From 87e90fb63352a28de548a3eb8688b3b343733464 Mon Sep 17 00:00:00 2001 From: Xi Ruoyao Date: Thu, 27 Mar 2025 23:06:24 +0800 Subject: [PATCH 3/6] glibc: Make the sanity check more complete for pass 1 Fixes #5651. --- chapter05/glibc.xml | 100 ++++++++++++++++++++++++++++++++++++-------- 1 file changed, 83 insertions(+), 17 deletions(-) diff --git a/chapter05/glibc.xml b/chapter05/glibc.xml index 2c7de998c..baec8f3a8 100644 --- a/chapter05/glibc.xml +++ b/chapter05/glibc.xml @@ -189,32 +189,98 @@ cd build sed '/RTLDLIST=/s@/usr@@g' -i $LFS/usr/bin/ldd - - At this point, it is imperative to stop and ensure that the basic - functions (compiling and linking) of the new toolchain are working as - expected. To perform a sanity check, run the following commands: + Now that our cross toolchain is in place, it is important to ensure + that compiling and linking will work as expected. 
We do this by performing + some sanity checks: -echo 'int main(){}' | $LFS_TGT-gcc -xc - -readelf -l a.out | grep ld-linux +echo 'int main(){}' | $LFS_TGT-gcc -x c - -v -Wl,--verbose &> dummy.log +readelf -l a.out | grep ': /lib' - If everything is working correctly, there should be no errors, - and the output of the last command will be of the form: + There should be no errors, + and the output of the last command will be (allowing for + platform-specific differences in the dynamic linker name): [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2] - Note that for 32-bit machines, the interpreter name will be - /lib/ld-linux.so.2. + Note that this path should not contain + /mnt/lfs (or the value of + the LFS variable if you used a different one). The path is + resolved when the compiled program is executed, and that should only happen + after we enter the chroot environment where the kernel would consider + $LFS as the root directory + (/). - If the output is not as shown above, or there is no output at all, - then something is wrong. Investigate and retrace the steps to find out - where the problem is and correct it. This issue must be resolved before - continuing. 
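The interpreter check described above can also be scripted. The following is a minimal sketch, assuming the `readelf -l` output format quoted in this patch; the helper names `interp_path` and `interp_ok` are hypothetical (they are not part of the book), and the `/mnt/lfs` fallback only mirrors the default value of `LFS` mentioned in the text.

```shell
# Hypothetical helper: read `readelf -l` output on stdin and print the
# requested program interpreter path, if there is one.
interp_path() {
    sed -n 's/.*Requesting program interpreter: \(.*\)]$/\1/p'
}

# Hypothetical helper: succeed only if the path in $1 starts with /lib
# and is not prefixed with $LFS (falling back to /mnt/lfs when LFS is unset).
interp_ok() {
    case $1 in
        "${LFS:-/mnt/lfs}"/*) return 1 ;;  # sysroot leaked into the path
        /lib*)                return 0 ;;  # e.g. /lib64/ld-linux-x86-64.so.2
        *)                    return 1 ;;  # empty or unexpected path
    esac
}
```

One possible use is `interp_ok "$(readelf -l a.out | interp_path)" || echo "unexpected interpreter path"`, which stays silent when the path has the expected form.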
+ Now make sure that we're set up to use the correct start files: - Once all is well, clean up the test file: +grep -E -o "$LFS/lib.*/S?crt[1in].*succeeded" dummy.log -rm -v a.out + The output of the last command should be: - +/mnt/lfs/lib/../lib/Scrt1.o succeeded +/mnt/lfs/lib/../lib/crti.o succeeded +/mnt/lfs/lib/../lib/crtn.o succeeded + + Verify that the compiler is searching for the correct header + files: + +grep -B3 "^ $LFS/usr/include" dummy.log + + This command should return the following output: + +#include <...> search starts here: + /mnt/lfs/tools/lib/gcc/x86_64-lfs-linux-gnu/&gcc-version;/include + /mnt/lfs/tools/lib/gcc/x86_64-lfs-linux-gnu/&gcc-version;/include-fixed + /mnt/lfs/usr/include + + Again, the directory named after your target triplet may be + different than the above, depending on your system architecture. + + Next, verify that the new linker is being used with the correct search paths: + +grep 'SEARCH.*/usr/lib' dummy.log |sed 's|; |\n|g' + + References to paths that have components with '-linux-gnu' should + be ignored, but otherwise the output of the last command should be: + +SEARCH_DIR("=/mnt/lfs/tools/x86_64-lfs-linux-gnu/lib64") +SEARCH_DIR("=/usr/local/lib64") +SEARCH_DIR("=/lib64") +SEARCH_DIR("=/usr/lib64") +SEARCH_DIR("=/mnt/lfs/tools/x86_64-lfs-linux-gnu/lib") +SEARCH_DIR("=/usr/local/lib") +SEARCH_DIR("=/lib") +SEARCH_DIR("=/usr/lib"); + + A 32-bit system may use a few other directories, but anyway + the important facet here is all the pathes should begin with an equal sign + (=), which would be replaced with the sysroot + directory that we've configured for the linker. 
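The requirement that every search path begins with an equal sign can be checked mechanically instead of by eye. This is only a sketch under the assumption that the linker output contains entries of the form `SEARCH_DIR("...")` as shown above; the function name `unprefixed_search_dirs` is hypothetical, and `grep -o` is an extension provided by GNU grep (and some other implementations), not a POSIX feature.

```shell
# Hypothetical helper: read linker output on stdin and print every
# SEARCH_DIR entry whose path does not start with "=" (the sysroot marker).
# Nothing is printed when all entries carry the prefix.
unprefixed_search_dirs() {
    # Split the input into one SEARCH_DIR("...") entry per line,
    # then drop the entries that have the "=" prefix.
    grep -o 'SEARCH_DIR("[^"]*")' | grep -v '^SEARCH_DIR("='
    return 0  # an empty result is the good case, not an error
}
```

Running `unprefixed_search_dirs < dummy.log` should then print nothing if the linker was configured with the sysroot as intended.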
+ + Next make sure that we're using the correct libc: + +grep "/lib.*/libc.so.6 " dummy.log + + The output of the last command should be: + +attempt to open /mnt/lfs/usr/lib/libc.so.6 succeeded + + Make sure GCC is using the correct dynamic linker: + +grep found dummy.log + + The output of the last command should be (allowing for + platform-specific differences in dynamic linker name): + +found ld-linux-x86-64.so.2 at /mnt/lfs/usr/lib/ld-linux-x86-64.so.2 + + If the output does not appear as shown above or is not received + at all, then something is seriously wrong. Investigate and retrace the + steps to find out where the problem is and correct it. Any + issues should be resolved before continuing with the process. + + Once everything is working correctly, clean up the test files: + +rm -v a.out dummy.log Building the packages in the next chapter will serve as an additional check that the toolchain has been built properly. If some From e55a481032fdd993414e43ef86ea0661b25a1762 Mon Sep 17 00:00:00 2001 From: Xi Ruoyao Date: Thu, 27 Mar 2025 23:09:40 +0800 Subject: [PATCH 4/6] gcc: Use the same style for the sanity check compile command as glibc pass1 There seems no valid reason to use a different style here. --- chapter08/gcc.xml | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/chapter08/gcc.xml b/chapter08/gcc.xml index f0616e2cd..763f55ced 100644 --- a/chapter08/gcc.xml +++ b/chapter08/gcc.xml @@ -223,8 +223,7 @@ su tester -c "PATH=$PATH make -k check" that compiling and linking will work as expected. 
We do this by performing some sanity checks: -echo 'int main(){}' > dummy.c -cc dummy.c -v -Wl,--verbose &> dummy.log +echo 'int main(){}' | cc -x c - -v -Wl,--verbose &> dummy.log readelf -l a.out | grep ': /lib' There should be no errors, @@ -319,7 +318,7 @@ SEARCH_DIR("/usr/lib"); Once everything is working correctly, clean up the test files: -rm -v dummy.c a.out dummy.log +rm -v a.out dummy.log Finally, move a misplaced file: From 576a368232298ace4680bb4254203558f7a47f03 Mon Sep 17 00:00:00 2001 From: Xi Ruoyao Date: Thu, 27 Mar 2025 23:13:58 +0800 Subject: [PATCH 5/6] glibc: Fix a full stop vs. quote issue in pass 1 --- chapter05/glibc.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/chapter05/glibc.xml b/chapter05/glibc.xml index baec8f3a8..34138e432 100644 --- a/chapter05/glibc.xml +++ b/chapter05/glibc.xml @@ -178,7 +178,7 @@ cd build class="directory">/) directory. Here we specify that the package is installed in $LFS, which will become the root directory in . + "ch-tools-chroot" role='.'/> From 8cd3ab533d7680caf60cb18d3ea2f1d639c02f04 Mon Sep 17 00:00:00 2001 From: Xi Ruoyao Date: Thu, 27 Mar 2025 23:30:41 +0800 Subject: [PATCH 6/6] toolchaintechnotes: Typos Just found the typos translating my own words :(. --- chapter05/glibc.xml | 2 +- part3intro/toolchaintechnotes.xml | 10 +++++----- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/chapter05/glibc.xml b/chapter05/glibc.xml index 34138e432..11d4cfce4 100644 --- a/chapter05/glibc.xml +++ b/chapter05/glibc.xml @@ -252,7 +252,7 @@ SEARCH_DIR("=/lib") SEARCH_DIR("=/usr/lib"); A 32-bit system may use a few other directories, but anyway - the important facet here is all the pathes should begin with an equal sign + the important facet here is all the paths should begin with an equal sign (=), which would be replaced with the sysroot directory that we've configured for the linker. 
diff --git a/part3intro/toolchaintechnotes.xml b/part3intro/toolchaintechnotes.xml index 7afb2d9ac..a9af3c542 100644 --- a/part3intro/toolchaintechnotes.xml +++ b/part3intro/toolchaintechnotes.xml @@ -245,7 +245,8 @@ example, when building an utility natively, its man page can be generated by running the utility with the --help switch and processing the output, - but generally it's not possible to do so as the utility may fail + but generally it's not possible to do so for a cross-compilation + as the utility may fail to run on the build: it's obviously impossible to run ARM64 machine code on a x86 CPU (without an emulator). @@ -253,7 +254,7 @@ With an autoconf-based build system, this requirement is satisfied in the cross-compilation mode where the optional features requiring to run machine code for - the host are disabled. When the + the host during the build time are disabled. When the host triplet is explicitly specified, the cross-compilation mode is enabled if and only if either the configure script fails to run a dummy @@ -302,7 +303,7 @@ libtool, a compatibility wrapper of the compiler and the linker for autoconf-based build systems, can try to be too clever and mistakenly pass options allowing the linker - to find libraries of the host. + to find libraries of the build. To prevent this fallout, we need to delete the libtool archive (.la) files and fix up an outdated libtool copy shipped with the Binutils code. @@ -360,8 +361,7 @@ during stage 2. As we've discussed, we cannot run cc-lfs on pc (the host distro) because it may require some library, code, or data not available on the build (the host distro). - So when we build gcc stage 2, we instruct the building system to - rebuild libgcc and libstdc++ with cc1, but we also override the library + So when we build gcc stage 2, we override the library search path to link libstdc++ against the newly rebuilt libgcc instead of the old, degraded build. This makes the rebuilt libstdc++ fully functional.
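The rule for entering the cross-compilation mode, as worded in the hunks above, can be sketched in shell. This is a minimal illustration and not the code that autoconf generates: the function name `cross_mode` and its three arguments are hypothetical, the result of the dummy-program test is passed in as a flag rather than actually run, and the triplet `x86_64-pc-linux-gnu` used below is only an example of what `config.guess` might print.

```shell
# Hypothetical sketch: decide whether configure would enter the
# cross-compilation mode, following the rule described in the text.
#   $1: value given to --host   (empty if the switch is absent)
#   $2: value given to --build  (empty if the switch is absent)
#   $3: "yes" if the dummy program built for the host runs on the build
cross_mode() {
    host=$1 build=$2 dummy_runs=$3
    if [ -z "$host" ]; then
        # The text only covers the case where --host is given explicitly
        echo no
    elif [ -n "$build" ] && [ "$build" != "$host" ]; then
        # An explicit --build that differs from --host
        echo yes
    elif [ "$dummy_runs" != yes ]; then
        # The dummy program failed to run on the build
        echo yes
    else
        echo no
    fi
}
```

With `cross_mode x86_64-lfs-linux-gnu "" yes` the sketch prints `no`, which corresponds to the situation the text warns about: the dummy happens to run on the host distro, so the mode is not enabled. With `cross_mode x86_64-lfs-linux-gnu x86_64-pc-linux-gnu yes` it prints `yes`, which corresponds to passing an explicit `--build`.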