From 5536f7440f2f4a12782e8d741cbbba5f1c3cfea8 Mon Sep 17 00:00:00 2001 From: Archaic Date: Mon, 26 Dec 2005 19:00:06 +0000 Subject: [PATCH] Applied Alexander Patrakov's patch which adds UTF-8 capability to the development branch of the LFS Book. git-svn-id: http://svn.linuxfromscratch.org/LFS/trunk/BOOK@7235 4aa44e1e-78dd-0310-a6d2-fbcd4c07a689 --- chapter01/changelog.xml | 13 +- chapter03/packages.xml | 46 ++++++- chapter03/patches.xml | 57 ++++++++ chapter05/gawk.xml | 13 +- chapter05/glibc.xml | 17 ++- chapter06/chapter06.xml | 3 +- chapter06/coreutils.xml | 20 +++ chapter06/diffutils.xml | 6 + chapter06/gawk.xml | 16 ++- chapter06/gdbm.xml | 73 +++++++++++ chapter06/glibc.xml | 45 ++++--- chapter06/grep.xml | 10 ++ chapter06/groff.xml | 21 ++- chapter06/kbd.xml | 19 ++- chapter06/man-db.xml | 268 +++++++++++++++++++++++++++++++++++++ chapter06/man.xml | 180 ------------------------- chapter06/ncurses.xml | 90 +++++++++++-- chapter06/readline.xml | 4 +- chapter06/sysklogd.xml | 6 + chapter06/sysvinit.xml | 23 +--- chapter06/texinfo.xml | 8 ++ chapter06/udev.xml | 1 + chapter06/vim.xml | 14 +- chapter07/bootscripts.xml | 6 + chapter07/console.xml | 269 +++++++++++++++++++++++++++----------- chapter07/profile.xml | 53 +++++--- chapter08/fstab.xml | 39 ++++++ chapter08/kernel.xml | 16 +-- general.ent | 8 +- patches.ent | 22 +++- 30 files changed, 1011 insertions(+), 355 deletions(-) create mode 100644 chapter06/gdbm.xml create mode 100644 chapter06/man-db.xml diff --git a/chapter01/changelog.xml b/chapter01/changelog.xml index e5be147f8..3f0245f2c 100644 --- a/chapter01/changelog.xml +++ b/chapter01/changelog.xml @@ -44,7 +44,7 @@ First a summary, then a detailed log. Gettext &gettext-version; Glibc &glibc-version; -Groff &groff-version; + GRUB &grub-version; @@ -59,7 +59,7 @@ First a summary, then a detailed log. 
Linux-Libc-Headers &linux-libc-headers-version; M4 &m4-version; -Man &man-version; +Man-DB &man-db-version; Man-pages &man-pages-version; @@ -83,14 +83,22 @@ First a summary, then a detailed log. +Downgraded to: + +Groff &groff-version;-&groff-patchlevel; + + + Added: &bzip2-bzgrep-patch; &bzip2-docs-patch; &gawk-segfault-patch; &gcc-specs-patch; +GDBM-&gdbm-version; &inetutils-gcc4_fixes-patch; &kbd-gcc4_fixes-patch; +MAN-DB-&man-db-version; &mktemp-tempfile-patch; &perl-libc-patch; &shadow-configure-patch; @@ -107,6 +115,7 @@ First a summary, then a detailed log. glibc-2.3.4-fix_test-1.patch inetutils-1.4.2-kernel_headers-1.patch iproute2-2.6.11-050330-remove_db-1.patch +Man-1.6b mktemp-1.5-add_tempfile-2.patch perl-5.8.6-libc-1.patch vim-6.3-security_fix-1.patch diff --git a/chapter03/packages.xml b/chapter03/packages.xml index 61b0fb82c..a08517671 100644 --- a/chapter03/packages.xml +++ b/chapter03/packages.xml @@ -136,6 +136,13 @@ url="http://www.linuxfromscratch.org/lfs/download.html#ftp"/>. + +GDBM (&gdbm-version;) - 228 KB: + + + + + Gettext (&gettext-version;) - 4,668 KB: @@ -158,12 +165,25 @@ url="http://www.linuxfromscratch.org/lfs/download.html#ftp"/>. -Groff (&groff-version;) - 2,096 KB: +Groff (&groff-version;) - 2,260 KB: + +Groff Debian Patch - 129 KB: + + +Groff Debian Patch (&groff-version;-&groff-patchlevel;) +may no longer be available at the +listed location. The site administrators of the master download +location occasionally remove older versions when new ones are +released. There is no alternative download location yet. + + + + GRUB (&grub-version;) - 772 KB: @@ -228,6 +248,13 @@ url="http://www.linuxfromscratch.org/lfs/download.html#ftp"/>. + +Replacement console script for LFS-Bootscripts (&lfs-bootscripts-version;) - 3 KB: + + + + + Libtool (&libtool-version;) - 1,642 KB: @@ -264,9 +291,9 @@ url="http://www.linuxfromscratch.org/lfs/download.html#ftp"/>. 
-Man (&man-version;) - 205 KB: +Man-DB (&man-db-version;) - 816 KB: - + @@ -298,6 +325,19 @@ url="http://www.linuxfromscratch.org/lfs/download.html#ftp"/>. + + Patch (&patch-version;) - 156 KB: diff --git a/chapter03/patches.xml b/chapter03/patches.xml index 85a728d11..239a4c58e 100644 --- a/chapter03/patches.xml +++ b/chapter03/patches.xml @@ -29,6 +29,13 @@ needed to build an LFS system: + +Coreutils Internationalization Fixes Patch - 110 KB: + + + + + Coreutils Suppress Uptime, Kill, Su Patch - 15 KB: @@ -43,6 +50,13 @@ needed to build an LFS system: + +Diffutils Internationalization Fixes Patch - 18 KB: + + + + + Expect Spawn Patch - 7 KB: @@ -71,12 +85,26 @@ needed to build an LFS system: + +Grep RedHat Fixes Patch - 56 KB: + + + + + Gzip Security Patch - 2 KB: + +Kbd Backspace/Delete Fix Patch - 1 KB: + + + + + Kbd GCC-4.x Fix Patch - 1 KB: @@ -98,6 +126,13 @@ needed to build an LFS system: + +Linux kernel UTF-8 Composing Patch - 3 KB: + + + + + Mktemp Tempfile Patch - 4 KB: @@ -105,6 +140,13 @@ needed to build an LFS system: + +Ncurses Fixes Patch - 9 KB: + + + + + Perl Libc Patch - 1 KB: @@ -112,6 +154,13 @@ needed to build an LFS system: + +Sysklogd 8-Bit Cleanness Patch - 1 KB: + + + + + Shadow Configure Script Patch - 1KB: @@ -140,6 +189,14 @@ needed to build an LFS system: + +Texinfo Multibyte Fixes Patch - 1 KB: + + + + + + Texinfo Tempfile Fix Patch - 2 KB: diff --git a/chapter05/gawk.xml b/chapter05/gawk.xml index c70d6d1cd..c017de61b 100644 --- a/chapter05/gawk.xml +++ b/chapter05/gawk.xml @@ -31,11 +31,14 @@ ./configure --prefix=/tools -The configure script doesn't detect some functionality correctly. The -following commands correct this problem: - -echo "#define HAVE_LANGINFO_CODESET 1" >> config.h -echo "#define HAVE_LC_MESSAGES 1" >> config.h +Due to a bug in the ./configure script, Gawk fails +to detect certain aspects of locale support in glibc. This +bug leads to, e.g., Gettext testsuite failures. 
Work around this issue +by appending the missing macro definitions to config.h: +cat >>config.h <<"EOF" +#define HAVE_LANGINFO_CODESET 1 +#define HAVE_LC_MESSAGES 1 +EOF Compile the package: diff --git a/chapter05/glibc.xml b/chapter05/glibc.xml index 51d44ad24..68fd4ce01 100644 --- a/chapter05/glibc.xml +++ b/chapter05/glibc.xml @@ -27,6 +27,17 @@ Installation of Glibc +The glibc-libidn tarball adds support for internationalized +(non-ASCII) domain names to Glibc. While this facility is not +useful in this chapter, the installation commands of + +glibc (wrongly) check glibc for +this feature. Unpack the tarball from within the Glibc source +directory in order to avoid this bogus failure: + +tar jxf ../glibc-libidn-&glibc-version;.tar.bz2 + + The Glibc documentation recommends building Glibc outside of the source directory in a dedicated build directory: @@ -36,7 +47,7 @@ cd ../glibc-build Next, prepare Glibc for compilation: ../glibc-&glibc-version;/configure --prefix=/tools \ - --disable-profile --enable-add-ons \ + --disable-profile --enable-add-ons=nptl,libidn \ --enable-kernel=2.6.0 --with-binutils=/tools/bin \ --without-gd --with-headers=/tools/include \ --without-selinux @@ -52,9 +63,9 @@ necessary. ---enable-add-ons +--enable-add-ons=nptl,libidn This tells Glibc to use the NPTL add-on as its threading -library. +library, and adds support for non-ASCII domain names. diff --git a/chapter06/chapter06.xml b/chapter06/chapter06.xml index 82d457cbb..6c5b872ae 100644 --- a/chapter06/chapter06.xml +++ b/chapter06/chapter06.xml @@ -34,6 +34,7 @@ + @@ -55,7 +56,7 @@ - + diff --git a/chapter06/coreutils.xml b/chapter06/coreutils.xml index 8b06c567b..5c3980f2f 100644 --- a/chapter06/coreutils.xml +++ b/chapter06/coreutils.xml @@ -41,6 +41,26 @@ other packages later: patch -Np1 -i ../&coreutils-suppress-patch; +POSIX requires that programs from Coreutils recognize character +boundaries correctly even in multibyte locales. 
The following patch +fixes this non-compliance and other internationalization-related bugs: + +patch -Np1 -i ../&coreutils-i18n-patch; + +In order for the tests added by this patch to pass, the permissions for +the test file have to be changed: + +chmod +x tests/sort/sort-mb-tests + +In the past, many bugs were found in this patch. When reporting +new bugs to Coreutils maintainers, please check first if they are reproducible +without this patch. + +It has been found that translated messages sometimes overflow a buffer +in the who -Hu command. Increase the buffer size: + +sed -i 's,_LEN 6,_LEN 20,' src/who.c + Now prepare Coreutils for compilation: ./configure --prefix=/usr diff --git a/chapter06/diffutils.xml b/chapter06/diffutils.xml index ade0ece11..016e9f7e0 100644 --- a/chapter06/diffutils.xml +++ b/chapter06/diffutils.xml @@ -29,6 +29,12 @@ Gettext, Glibc, Grep, Make, and Sed Installation of Diffutils +POSIX requires that the diff command treat whitespace +characters according to the current locale. The following patch fixes the +non-compliance issue: + +patch -Np1 -i ../&diffutils-i18n-patch; + Prepare Diffutils for compilation: ./configure --prefix=/usr diff --git a/chapter06/gawk.xml b/chapter06/gawk.xml index 41df7f965..d25196e41 100644 --- a/chapter06/gawk.xml +++ b/chapter06/gawk.xml @@ -28,8 +28,8 @@ Diffutils, GCC, Gettext, Glibc, Grep, Make, and Sed Installation of Gawk -Patch Gawk to fix a bug which causes it to segfault when invoked on a -non-existent file: +Under some circumstances, Gawk-&gawk-version; attempts to free a chunk +of memory that was not allocated. This bug is fixed by the following patch: patch -Np1 -i ../&gawk-segfault-patch; @@ -37,11 +37,15 @@ non-existent file: ./configure --prefix=/usr --libexecdir=/usr/lib -The configure script doesn't detect some functionality correctly. The -following commands correct this problem: +Due to a bug in the ./configure script, Gawk fails +to detect certain aspects of locale support in glibc.
This +bug leads to, e.g., Gettext testsuite failures. Work around this issue +by appending the missing macro definitions to config.h: -echo "#define HAVE_LANGINFO_CODESET 1" >> config.h -echo "#define HAVE_LC_MESSAGES 1" >> config.h +cat >>config.h <<"EOF" +#define HAVE_LANGINFO_CODESET 1 +#define HAVE_LC_MESSAGES 1 +EOF Compile the package: diff --git a/chapter06/gdbm.xml b/chapter06/gdbm.xml new file mode 100644 index 000000000..a2eee4fac --- /dev/null +++ b/chapter06/gdbm.xml @@ -0,0 +1,73 @@ + + + %general-entities; +]> + +GDBM-&gdbm-version; + + +GDBM + + +<para>The GDBM package contains the GNU Database Manager.</para> + +<segmentedlist> +<segtitle>&buildtime;</segtitle> +<segtitle>&diskspace;</segtitle> +<seglistitem><seg>0.08 SBU</seg><seg>2.75 MB</seg></seglistitem> +</segmentedlist> + +<segmentedlist> +<segtitle>&dependencies;</segtitle> +<seglistitem><seg>Not checked yet.</seg></seglistitem> +</segmentedlist> +</sect2> + +<sect2 role="installation"> +<title>Installation of GDBM + +Prepare GDBM for compilation: + +./configure --prefix=/usr + +Compile the package: + +make + +Install the package: + +make BINOWN=root BINGRP=root install install-compat + + + + +Contents of GDBM + + +Installed libraries +libgdbm.[so,a] and libgdbm_compat.[so,a] + + +Short Descriptions + + + + +libgdbm.[so,a] + +contains functions to manipulate a hashed database. + + + + +libgdbm_compat.[so,a] + +provide compatibility with older dbm routines + + + + + + + diff --git a/chapter06/glibc.xml b/chapter06/glibc.xml index c09759789..8ab39c726 100644 --- a/chapter06/glibc.xml +++ b/chapter06/glibc.xml @@ -47,6 +47,21 @@ and linker cannot be adjusted before the Glibc install because the Glibc autoconf tests would give false results and defeat the goal of achieving a clean build. +The glibc-libidn tarball adds support for internationalized +domain names (IDN) to Glibc. Note that many programs that +support IDN require the full libidn library from +, not this add-on. 
+Unpack the tarball from within the Glibc source +directory: + +tar jxf ../glibc-libidn-&glibc-version;.tar.bz2 + +In the vi_VN.TCVN locale, bash enters an infinite loop at startup. It is +unknown whether this is a bash bug or a glibc problem. Disable installation +of this locale in order to avoid the problem: + +sed -i '/vi_VN.TCVN/d' localedata/SUPPORTED + The Glibc documentation recommends building Glibc outside of the source directory in a dedicated build directory: @@ -56,7 +71,7 @@ cd ../glibc-build Prepare Glibc for compilation: ../glibc-&glibc-version;/configure --prefix=/usr \ - --disable-profile --enable-add-ons \ + --disable-profile --enable-add-ons=nptl,libidn \ --enable-kernel=2.6.0 --libexecdir=/usr/lib/glibc The meaning of the new configure options: @@ -129,6 +144,11 @@ with: make localedata/install-locales +It is possible to create and install additional locales such as +ru_RU.CP1251 by means of the localedef command, as +explained in the INSTALL file in the Glibc source. + + To save time, an alternative to running the previous command (which generates and installs every locale listed in the glibc-&glibc-version;/localedata/SUPPORTED file) is to install only those @@ -142,6 +162,7 @@ instructions, instead of the install-locales target used above, will install the minimum set of locales necessary for the tests to run successfully: + mkdir -pv /usr/lib/locale localedef -i de_DE -f ISO-8859-1 de_DE localedef -i de_DE@euro -f ISO-8859-15 de_DE@euro @@ -152,24 +173,16 @@ localedef -i es_MX -f ISO-8859-1 es_MX localedef -i fa_IR -f UTF-8 fa_IR localedef -i fr_FR -f ISO-8859-1 fr_FR localedef -i fr_FR@euro -f ISO-8859-15 fr_FR@euro +localedef -i fr_FR -f UTF-8 fr_FR.UTF-8 localedef -i it_IT -f ISO-8859-1 it_IT localedef -i ja_JP -f EUC-JP ja_JP -Some locales installed by the make -localedata/install-locales command above are not properly -supported by some applications that are in the LFS and BLFS books.
-Because of the various problems that arise due to application -programmers making assumptions that break in such locales, LFS should -not be used in locales that utilize multibyte character sets -(including UTF-8) or right-to-left writing order. Numerous unofficial -and unstable patches are required to fix these problems, and it has -been decided by the LFS developers not to support such complex locales at this -time. This applies to the ja_JP and fa_IR locales as well—they have been -installed only for GCC and Gettext tests to pass, and the -watch program (part of the Procps package) does not work -properly in them. Various attempts to circumvent these restrictions are -documented in internationalization-related hints. - +The first localedef above combines the +/usr/share/i18n/locales/de_DE charset-independent +locale definition with the +/usr/share/i18n/charmaps/ISO-8859-1.gz charmap definition +and appends the result to the +/usr/lib/locale/locale-archive file. Configuring Glibc diff --git a/chapter06/grep.xml b/chapter06/grep.xml index 8322c6d45..b772ad294 100644 --- a/chapter06/grep.xml +++ b/chapter06/grep.xml @@ -28,6 +28,16 @@ Diffutils, GCC, Gettext, Glibc, Make, Sed, and Texinfo Installation of Grep +The original Grep package has many bugs, especially in the support of +multibyte locales. 
RedHat fixed some of them by the following patch: + +patch -Np1 -i ../&grep-fixes-patch; + +In order for the tests added by this patch to pass, the permissions for +the test file have to be changed: + +chmod +x tests/fmbtest.sh + Prepare Grep for compilation: ./configure --prefix=/usr --bindir=/bin diff --git a/chapter06/groff.xml b/chapter06/groff.xml index a2d8cb7e2..a8883e930 100644 --- a/chapter06/groff.xml +++ b/chapter06/groff.xml @@ -28,10 +28,29 @@ Gawk, GCC, Glibc, Grep, Make, and Sed Installation of Groff +Apply the patch that adds the "ascii8" and "nippon" devices to Groff: + +zcat ../&groff-debian-patch; | patch -Np1 + +These devices are used by Man-DB when formatting non-English manual +pages that are not in the ISO-8859-1 encoding. There is no working patch for +Groff-1.19.x that adds this functionality at the time of this writing. + + + +Many screen fonts don't have Unicode dashes in them. Tell groff to use +the ASCII hyphen instead: + +sed -i 's,2010,002D,' font/devutf8/R.proto +sed -i 's,2212,002D,' font/devutf8/R.proto + Groff expects the environment variable PAGE to contain the default paper size. For users in the United States, PAGE=letter is appropriate. Elsewhere, -PAGE=A4 may be more suitable. +PAGE=A4 may be more suitable. +The default paper size can be changed after installation by writing +the word "A4" or "letter" to the /etc/papersize +file. Prepare Groff for compilation: diff --git a/chapter06/kbd.xml b/chapter06/kbd.xml index 52e8a734c..703f959df 100644 --- a/chapter06/kbd.xml +++ b/chapter06/kbd.xml @@ -28,6 +28,15 @@ Diffutils, Flex, GCC, Gettext, Glibc, Grep, Gzip, M4, Make, and Sed Installation of Kbd +The behaviour of Backspace and Delete keys is not consistent across the +keymaps in the Kbd package. The following patch fixes this issue for +i386 keymaps: + +patch -Np1 -i ../&kbd-backspace-patch; + +After patching, the Backspace key generates the character with code 127, +and the Delete key generates a well-known escape sequence. 
+ Patch Kbd to fix a bug in setfont that is triggered when compiling with GCC-&gcc-version;: @@ -47,6 +56,11 @@ when compiling with GCC-&gcc-version;: make install +For some languages, e.g. Belarusian, the Kbd package doesn't provide +a useful keymap (the stock "by" keymap assumes the ISO-8859-5 encoding, +while everybody uses CP1251 instead). Users of such languages +have to download working keymaps separately. + Contents of Kbd @@ -274,8 +288,9 @@ pressed on the keyboard unicode_start -Puts the keyboard and console in UNICODE mode. Never use it on LFS, -because applications are not configured to support UNICODE. +Puts the keyboard and console in UNICODE mode. Don't use this program +unless your keymap file is in the ISO-8859-1 encoding. For other encodings, +this utility produces incorrect results. unicode_start diff --git a/chapter06/man-db.xml b/chapter06/man-db.xml new file mode 100644 index 000000000..13a421a21 --- /dev/null +++ b/chapter06/man-db.xml @@ -0,0 +1,268 @@ + + + %general-entities; +]> + +Man-DB-&man-db-version; + + +Man-DB + + +<para>The Man-DB package contains programs for finding and viewing man pages.</para> + +<segmentedlist> +<segtitle>&buildtime;</segtitle> +<segtitle>&diskspace;</segtitle> +<seglistitem><seg>0.1 SBU</seg><seg>1.1 MB</seg></seglistitem> +</segmentedlist> + +<segmentedlist> +<segtitle>&dependencies;</segtitle> +<seglistitem><seg>Bash, Binutils, Coreutils, Gawk, GCC, +Glibc, Gettext, GDBM, Grep, Make, and Sed</seg></seglistitem> +</segmentedlist> +</sect2> + +<sect2 role="installation"> +<title>Installation of Man-DB + +Three adjustments need to be made to the sources of Man-DB. 
+ +The first one changes the location of translated manual pages that come +with Man-DB, in order for them to be accessible in both traditional and +UTF-8 locales: + +mv man/de{_DE.88591,} && +mv man/es{_ES.88591,} && +mv man/it{_IT.88591,} && +mv man/ja{_JP.eucJP,} && +sed -i 's,\*_\*,??,' man/Makefile.in + +The second change is a sed substitution to delete the +/usr/man lines in the +man_db.conf file to prevent redundant results when +using programs such as whatis: + +sed -i '/\t\/usr\/man/d' src/man_db.conf.in + +The third change accounts for programs that Man-DB should be able +to find at runtime, but that haven't been installed yet: + +cat >>include/manconfig.h.in <<"EOF" +#define WEB_BROWSER "exec /usr/bin/lynx" +#define COL "/usr/bin/col" +#define VGRIND "/usr/bin/vgrind" +#define GRAP "/usr/bin/grap" +EOF + +The col program is a part of the Util-linux package, +lynx is a text-based web browser +(see BLFS for installation instructions), +vgrind converts program sources to Groff input, +and grap is useful for typesetting graphs in Groff documents. +The vgrind and grap programs are +not normally needed for viewing manual pages. They are +not part of LFS or BLFS, but you should be able to install them yourself +after finishing LFS if you wish to do so. + +Prepare Man-DB for compilation: + +./configure --prefix=/usr --enable-mb-groff --disable-setuid + +The meaning of the configure options: + + + +--enable-mb-groff +This tells the man program to +use the "ascii8" and "nippon" Groff devices for formatting non-ISO-8859-1 +manual pages. + + +--disable-setuid +This disables making the man program +setuid to user "man". + + + +Compile the package: + +make + +Install the package: + +make install + +Additional information with regards to the compression of +man and info pages can be found in the BLFS book at +. 
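The first Man-DB adjustment above is terse, so here is a small, hedged illustration of what its two commands do. It only prints results and touches no files. The input line fed to sed is a made-up stand-in, not the real contents of man/Makefile.in, and reading the substitution as "match two-letter directory names instead of names containing an underscore" is an interpretation rather than something the patch states.

```shell
# Brace expansion is a bash feature, so bash is invoked explicitly here.
# "echo" prints the command line that would run; nothing is renamed.
expanded=$(bash -c 'echo mv man/de{_DE.88591,}')
echo "$expanded"

# The sed expression replaces the literal glob *_* with ??.
# The input below is a hypothetical line, used only to show the effect.
globbed=$(printf '%s\n' 'for i in *_*' | sed 's,\*_\*,??,')
echo "$globbed"
```

Each `mv` in the book therefore renames a directory such as `man/de_DE.88591` to the bare language code `man/de`.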
+
+
+
+Non-English Manual Pages in LFS
+
+Linux distributions have different policies concerning the character
+encoding in which manual pages are stored on the hard disk. E.g., RedHat
+stores all manual pages in UTF-8, while Debian uses language-specific
+(mostly 8-bit) encodings. This leads to incompatibility of packages with
+manual pages designed for different distributions.
+
+LFS uses the same conventions as Debian. The correspondence between
+language codes and the expected encoding of manual pages is listed below.
+Man-DB automatically converts them to the locale encoding "on the fly"
+while viewing.
+
+
+Expected character encoding of manual pages
+
+
+Language (code)    Encoding
+
+
+Danish (da)        ISO-8859-1
+German (de)        ISO-8859-1
+English (en)       ISO-8859-1
+Spanish (es)       ISO-8859-1
+Finnish (fi)       ISO-8859-1
+French (fr)        ISO-8859-1
+Irish (ga)         ISO-8859-1
+Galician (gl)      ISO-8859-1
+Indonesian (id)    ISO-8859-1
+Icelandic (is)     ISO-8859-1
+Italian (it)       ISO-8859-1
+Dutch (nl)         ISO-8859-1
+
+Norwegian (no)     ISO-8859-1
+
+Portuguese (pt)    ISO-8859-1
+Swedish (sv)       ISO-8859-1
+
+Czech (cs)         ISO-8859-2
+Croatian (hr)      ISO-8859-2
+Hungarian (hu)     ISO-8859-2
+Japanese (ja)      EUC-JP
+Korean (ko)        EUC-KR
+Polish (pl)        ISO-8859-2
+Russian (ru)       KOI8-R
+Slovak (sk)        ISO-8859-2
+Turkish (tr)       ISO-8859-9
+
+
+
+
+Manual pages in languages not in the list are not supported.
+Norwegian doesn't work now because of the transition from the no_NO to the
+nb_NO locale, and Korean is non-functional because of an incomplete Groff patch.
+
+
+If upstream distributes the manual pages in the same encoding as
+Man-DB expects, the manual pages can be copied to
+/usr/share/man/[language code].
+E.g., French manual pages
+()
+can be installed with the following command:
+
+mkdir -p /usr/share/man/fr &&
+cp -r man? /usr/share/man/fr
+
+If upstream distributes manual pages in UTF-8 (i.e. "for RedHat")
+instead of the encoding listed in the table above, they have to be
+downconverted from UTF-8 to the encoding listed in the table before
+installation. E.g., Spanish manual pages
+()
+can be installed with the following commands:
+
+mkdir -p /usr/share/man/es &&
+find man? -type f | \
+grep -v 'man7/iso_8859-2.7' | grep -v 'man7/iso_8859-7.7' | \
+while read F ; do
+  iconv -f UTF-8 -t ISO-8859-1 $F >tmp ; mv tmp $F
+done &&
+cp -r man? /usr/share/man/es
+
+The need to exclude the man7/iso_8859-2.7
+and man7/iso_8859-7.7 files from the conversion process
+because they are already in ISO-8859-1 is a packaging bug in
+man-pages-es-1.55. Future versions should not require this kludge.
+
+
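The conversion loop above can be written more defensively. What follows is a minimal sketch and not part of the book's instructions: `convert_tree` is a hypothetical helper name, and leaving a file untouched when iconv rejects it is an assumed design choice, meant to avoid listing already-converted files by hand. It is not foolproof, since a legacy-encoded file that happens to be valid UTF-8 would still be converted.

```shell
# Hypothetical helper: convert every regular file under a directory from
# UTF-8 to the given encoding. A file is rewritten only when iconv
# succeeds; otherwise it is left exactly as it was.
convert_tree() {
    dir=$1
    enc=$2
    # Temporary file outside the tree, so find never lists it.
    tmp=$(mktemp) || return 1
    find "$dir" -type f | while IFS= read -r f; do
        if iconv -f UTF-8 -t "$enc" "$f" > "$tmp" 2>/dev/null; then
            # Overwrite in place; this keeps the file's permissions.
            cat "$tmp" > "$f"
        else
            echo "left unchanged: $f" >&2
        fi
    done
    rm -f "$tmp"
}
```

For the Spanish pages this might be called as `for d in man?; do convert_tree "$d" ISO-8859-1; done` before the `cp -r` step. Files that are not valid UTF-8, or that contain characters with no ISO-8859-1 equivalent, are reported on standard error and skipped.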
+ +Contents of Man-DB + + +Installed programs +accessdb, apropos, catman, lexgrog, man, mandb, manpath, +and whatis + + +Short Descriptions + + + + + +accessdb + +Dumps the whatis database contents in human-readable form. +accessdb + + + + +apropos + +Searches the whatis database and displays the short descriptions +of system commands that contain a given string +apropos + + + + +catman + +Creates or updates the pre-formatted manual pages +catman + + + + +lexgrog + +Displays one-line summary information about a given manual page. +lexgrog + + + + +man + +Formats and displays the requested on-line man page +man + + + + +mandb + +Creates or updates the whatis database +mandb + + + + +whatis + +Searches the whatis database and displays the short descriptions +of system commands that contain the given keyword as a separate +word +whatis + + + + + + +
+ diff --git a/chapter06/man.xml b/chapter06/man.xml index 371985d2e..e69de29bb 100644 --- a/chapter06/man.xml +++ b/chapter06/man.xml @@ -1,180 +0,0 @@ - - - %general-entities; -]> - -Man-&man-version; - - -Man - - -<para>The Man package contains programs for finding and viewing man pages.</para> - -<segmentedlist> -<segtitle>&buildtime;</segtitle> -<segtitle>&diskspace;</segtitle> -<seglistitem><seg>0.1 SBU</seg><seg>1.3 MB</seg></seglistitem> -</segmentedlist> - -<segmentedlist> -<segtitle>&dependencies;</segtitle> -<seglistitem><seg>Bash, Binutils, Coreutils, Gawk, GCC, -Glibc, Grep, Make, and Sed</seg></seglistitem> -</segmentedlist> -</sect2> - -<sect2 role="installation"> -<title>Installation of Man - -Two adjustments need to be made to the sources of Man. - -The first is a sed substitution to add the --R switch to the PAGER -variable so that escape sequences are properly handled by Less: - -sed -i 's@-is@&R@g' configure - -The second is also a sed substitution to comment out the -MANPATH /usr/man line in the -man.conf file to prevent redundant results when -using programs such as whatis: - -sed -i 's@MANPATH./usr/man@#&@g' src/man.conf.in - -Prepare Man for compilation: - -./configure -confdir=/etc - -The meaning of the configure options: - - - --confdir=/etc -This tells the man program to look for the -man.conf configuration file in the /etc directory. - - - -Compile the package: - -make - -This package does not come with a test suite. - -Install the package: - -make install - -If you will be working on a terminal that does not support text -attributes such as color and bold, you can disable Select Graphic Rendition -(SGR) escape sequences by editing the man.conf file and -adding the -c option to the NROFF -variable. If you use multiple terminal types for one computer it may be better -to selectively add the GROFF_NO_SGR environment variable for the -terminals that do not support SGR. 
- -If the character set of the locale uses 8-bit characters, search for the -line beginning with NROFF in /etc/man.conf, -and verify that it matches the following: - -NROFF /usr/bin/nroff -Tlatin1 -mandoc - -Note that latin1 should be used even if it is not -the character set of the locale. The reason is that, according to the -specification, groff has no means of typesetting -characters outside International Organization for Standards -(ISO) 8859-1 without some strange escape codes. When formatting man -pages, groff thinks that they are in the ISO 8859-1 -encoding and this -Tlatin1 switch tells -groff to use the same encoding for output. Since -groff does no recoding of input characters, the -formatted result is really in the same encoding as input, and therefore -it is usable as the input for a pager. - -This does not solve the problem of a non-working -man2dvi program for localized man pages in -non-ISO 8859-1 locales. Also, it does not work with multibyte -character sets. The first problem does not currently have a solution. -The second issue is not of concern because the LFS installation does -not support multibyte character sets. - -Additional information with regards to the compression of -man and info pages can be found in the BLFS book at -. 
- - - - -Contents of Man - - -Installed programs -apropos, makewhatis, man, -man2dvi, man2html, and whatis - - -Short Descriptions - - - - -apropos - -Searches the whatis database and displays the short descriptions -of system commands that contain a given string -apropos - - - - -makewhatis - -Builds the whatis database; it reads all the man pages -in the MANPATH and writes the name and a short description in the -whatis database for each page -makewhatis - - - - -man - -Formats and displays the requested on-line man page -man - - - - -man2dvi - -Converts a man page into dvi format -man2dvi - - - - -man2html - -Converts a man page into HTML -man2html - - - - -whatis - -Searches the whatis database and displays the short descriptions -of system commands that contain the given keyword as a separate -word -whatis - - - - - - - - diff --git a/chapter06/ncurses.xml b/chapter06/ncurses.xml index 50e2bdc33..be459b36d 100644 --- a/chapter06/ncurses.xml +++ b/chapter06/ncurses.xml @@ -28,10 +28,49 @@ Gawk, GCC, Glibc, Grep, Make, and Sed Installation of Ncurses + + +Since the release of Ncurses-&ncurses-version;, a memory leak and some +display bugs were found and fixed upstream. Apply those fixes: + +patch -Np1 -i ../&ncurses-fixes-patch; Prepare Ncurses for compilation: -./configure --prefix=/usr --with-shared --without-debug +./configure --prefix=/usr --with-shared --without-debug --enable-widec + +The meaning of the configure options: + + + +--enable-widec +This switch causes wide-character libraries +(e.g. libncursesw.so.&ncurses-version;) +to be built instead of normal ones +(e.g. libncurses.so.&ncurses-version;). +Those wide-character libraries are usable in both multibyte and traditional 8-bit +locales, while normal libraries work properly only in 8-bit locales. +Wide-character and normal libraries are source-compatible, but not +binary-compatible. 
+ + + + Compile the package: @@ -49,18 +88,48 @@ Gawk, GCC, Glibc, Grep, Make, and Sed Fix a library that should not be executable: -chmod -v 644 /usr/lib/libncurses++.a +chmod -v 644 /usr/lib/libncurses++w.a Move the libraries to the /lib directory, where they are expected to reside: -mv -v /usr/lib/libncurses.so.5* /lib +mv -v /usr/lib/libncursesw.so.5* /lib -Because the libraries have been moved, a few symlinks point to -non-existent files. Recreate those symlinks: +Because the libraries have been moved, one symlink points to +a non-existent file. Recreate it: -ln -sfv ../../lib/libncurses.so.5 /usr/lib/libncurses.so -ln -sfv libncurses.so /usr/lib/libcurses.so +ln -sfv ../../lib/libncursesw.so.5 /usr/lib/libncursesw.so + +Many applications still expect the linker to be able to find +non-wide-character Ncurses libraries. Trick such applications into linking with +wide-character libraries by means of symlinks and linker scripts: + +for lib in curses ncurses form panel menu ; do \ + rm -vf /usr/lib/lib${lib}.so ; \ + echo "INPUT(-l${lib}w)" >/usr/lib/lib${lib}.so ; \ + ln -sfv lib${lib}w.a /usr/lib/lib${lib}.a ; \ +done && +ln -sfv libncurses++w.a /usr/lib/libncurses++.a + +Finally, make sure that really old applications that look for +-lcurses at build time are still +buildable: + +echo "INPUT(-lncursesw)" >/usr/lib/libcursesw.so && +ln -sfv libncurses.so /usr/lib/libcurses.so && +ln -sfv libncursesw.a /usr/lib/libcursesw.a && +ln -sfv libncurses.a /usr/lib/libcurses.a + +The instructions above don't create non-wide-character Ncurses +libraries since nothing in LFS and BLFS would link against them at runtime. 
+If you must have such libraries because of some binary-only application, +build them with the following commands: +make distclean && +./configure --prefix=/usr --with-shared --without-normal \ + --without-debug --without-cxx-binding && +make sources libs && +cp -av lib/lib*.so.5* /usr/lib + @@ -71,8 +140,10 @@ ln -sfv libncurses.so /usr/lib/libcurses.so Installed libraries captoinfo (link to tic), clear, infocmp, infotocap (link to tic), reset (link to tset), tack, tic, toe, tput, and tset -libcurses.[a,so] (link to libncurses.[a,so]), libform.[a,so], libmenu.[a,so], -libncurses++.a, libncurses.[a,so], and libpanel.[a,so] +libcursesw.[a,so] (symlink and linker script to libncursesw.[a,so]), +libformw.[a,so], libmenuw.[a,so], +libncurses++w.a, libncursesw.[a,so], libpanelw.[a,so] and their +non-wide-character counterparts without "w" in the library names. Short Descriptions @@ -212,4 +283,3 @@ menu displayed during the kernel's make menuconfig
- diff --git a/chapter06/readline.xml b/chapter06/readline.xml index 5d4e08c29..235f38e8c 100644 --- a/chapter06/readline.xml +++ b/chapter06/readline.xml @@ -43,7 +43,9 @@ GCC, Glibc, Grep, Make, Ncurses, and Sed SHLIB_LIBS=-lncurses This option forces Readline to link against the -libncurses library. +libncurses +(really, libncursesw) +library. diff --git a/chapter06/sysklogd.xml b/chapter06/sysklogd.xml index fb422e1c8..c7b1176ff 100644 --- a/chapter06/sysklogd.xml +++ b/chapter06/sysklogd.xml @@ -33,6 +33,12 @@ Sysklogd with Linux 2.6 series kernels patch -Np1 -i ../&sysklogd-fixes-patch; +The following patch makes sysklogd treat bytes in the 0x80--0x9f range +literally in the messages being logged, instead of replacing them with octal +codes. Such replacement caused damage to messages in UTF-8 encoding. + +patch -Np1 -i ../&sysklogd-8bit-patch; + Compile the package: make diff --git a/chapter06/sysvinit.xml b/chapter06/sysvinit.xml index a322e69ec..a82fdd9a8 100644 --- a/chapter06/sysvinit.xml +++ b/chapter06/sysvinit.xml @@ -84,26 +84,15 @@ ca:12345:ctrlaltdel:/sbin/shutdown -t1 -a -r now su:S016:once:/sbin/sulogin -1:2345:respawn:/sbin/agetty -I '\033(K' tty1 9600 -2:2345:respawn:/sbin/agetty -I '\033(K' tty2 9600 -3:2345:respawn:/sbin/agetty -I '\033(K' tty3 9600 -4:2345:respawn:/sbin/agetty -I '\033(K' tty4 9600 -5:2345:respawn:/sbin/agetty -I '\033(K' tty5 9600 -6:2345:respawn:/sbin/agetty -I '\033(K' tty6 9600 +1:2345:respawn:/sbin/agetty tty1 9600 +2:2345:respawn:/sbin/agetty tty2 9600 +3:2345:respawn:/sbin/agetty tty3 9600 +4:2345:respawn:/sbin/agetty tty4 9600 +5:2345:respawn:/sbin/agetty tty5 9600 +6:2345:respawn:/sbin/agetty tty6 9600 # End /etc/inittab EOF - -The -I '\033(K' option tells -agetty to send this escape sequence to the terminal -before doing anything else. This escape sequence switches the console -character set to a user-defined one, which can be modified by running -the setfont program. 
The console -initscript from the LFS-Bootscripts package calls the setfont -program during system startup. Sending this escape sequence is -necessary for people who use non-ISO 8859-1 screen fonts, but it does -not affect native English speakers. -
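Reviewer note on the sysklogd hunk earlier in this patch: the claim that replacing bytes in the 0x80--0x9f range damages UTF-8 text can be checked with a small shell sketch. The function name `count_c1_range_bytes` is hypothetical and is not part of sysklogd or of the LFS Book; it only counts how many bytes of a string fall in that range, assuming a POSIX `printf`, `od`, `tr`, and `grep`.

```shell
#!/bin/sh
# Hypothetical helper (not part of sysklogd or LFS): print how many bytes of
# the string given as $1 fall in the 0x80-0x9f range that the unpatched
# sysklogd is described as replacing with octal codes.
count_c1_range_bytes() {
    # Dump the string as one hex byte per line, then count bytes 80..9f.
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    printf '%s' "$1" | od -An -v -tx1 | tr -s ' ' '\n' \
        | grep -c '^[89][0-9a-f]$' || true
}

# U+0410 (Cyrillic capital A) is encoded in UTF-8 as the bytes d0 90.
count_c1_range_bytes "$(printf '\320\220')"

# U+20AC (Euro sign) is encoded in UTF-8 as the bytes e2 82 ac.
count_c1_range_bytes "$(printf '\342\202\254')"

# Plain ASCII text contains no such bytes.
count_c1_range_bytes "abc"
```

Any non-zero count means that at least one byte of a multibyte sequence would be rewritten, which leaves the logged character unreadable.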
diff --git a/chapter06/texinfo.xml b/chapter06/texinfo.xml index fc3138517..0a1ff24c7 100644 --- a/chapter06/texinfo.xml +++ b/chapter06/texinfo.xml @@ -29,6 +29,14 @@ Diffutils, GCC, Gettext, Glibc, Grep, Make, Ncurses, and Sed Installation of Texinfo +The info program makes assumptions such as "a string +occupies the same number of character cells on the screen and bytes in memory" +and "one can break the string anywhere" that are incorrect in UTF-8 locales. +While the patch below is not the proper solution, it at least hides the problem +by falling back to English messages when a multibyte locale is in use: + +patch -Np1 -i ../&texinfo-multibyte-patch; + Texinfo allows local users to overwrite arbitrary files via a symlink attack on temporary files. Apply the following patch to fix this: diff --git a/chapter06/udev.xml b/chapter06/udev.xml index 3d89e3187..8b77b1b58 100644 --- a/chapter06/udev.xml +++ b/chapter06/udev.xml @@ -78,6 +78,7 @@ the configuration files here: install -m644 -D -v docs/writing_udev_rules/index.html /usr/share/doc/udev-&udev-version;/index.html + Run the udevstart program to create our full complement of device nodes. diff --git a/chapter06/vim.xml b/chapter06/vim.xml index 2083d4786..bd21f2f32 100644 --- a/chapter06/vim.xml +++ b/chapter06/vim.xml @@ -53,7 +53,7 @@ class="directory">/etc: --enable-multibyte -This optional but highly recommended switch enables support for +This switch enables support for editing files in multibyte character encodings. This is needed if using a locale with a multibyte character set. This switch is also helpful to be able to edit text files initially created in Linux distributions like Fedora Core that @@ -75,6 +75,18 @@ redirecting the output to a log file. make install +In UTF-8 locales, the vimtutor program +tries to convert the tutorials from ISO-8859-1 to UTF-8. Since +some tutorials are not in ISO-8859-1, the text in them is thus made unreadable. 
+If you unpacked the vim-&vim-version;-lang.tar.gz +archive and are going to use a UTF-8 based locale, remove non-ISO-8859-1 +tutorials. An English tutorial will be used instead. + + +rm -f /usr/share/vim/vim64/tutor/tutor.{gr,pl,ru,sk} +rm -f /usr/share/vim/vim64/tutor/tutor.??.* + Many users are used to using vi instead of vim. To allow execution of vim when users habitually enter vi, create a diff --git a/chapter07/bootscripts.xml b/chapter07/bootscripts.xml index 775215e7e..6e884ac77 100644 --- a/chapter07/bootscripts.xml +++ b/chapter07/bootscripts.xml @@ -47,6 +47,12 @@ make install + The console script that comes with + LFS-Bootscripts-&lfs-bootscripts-version; doesn't support Unicode. Install + a replacement version: + +install -m755 ../console /etc/rc.d/init.d + diff --git a/chapter07/console.xml b/chapter07/console.xml index 315112366..b0b9417a3 100644 --- a/chapter07/console.xml +++ b/chapter07/console.xml @@ -17,96 +17,207 @@ This section discusses how to configure the console bootscript that sets up the keyboard map and the console font. If non-ASCII - characters (e.g., the British pound sign and Euro character) will not be used - and the keyboard is a U.S. one, skip this section. Without the configuration - file, the console bootscript will do nothing. + characters (e.g., the copyright sign, the British pound sign and Euro symbol) + will not be used and the keyboard is a U.S. one, skip this section. Without + the configuration file, the console bootscript will do + nothing. The console script reads the /etc/sysconfig/console file for configuration information. Decide which keymap and screen font will be used. Various language-specific - HOWTO's can also help with this (see . A pre-made - /etc/sysconfig/console file with known settings for several - countries was installed with the LFS-Bootscripts package, so the relevant - section can be uncommented if the country is supported. 
If still in doubt, look - in the /usr/share/kbd directory for valid - keymaps and screen fonts. Read loadkeys(1) and - setfont(8) to determine the correct arguments for - these programs. Once decided, create the configuration file with the following - command: + HOWTO's can also help with this, see . If still in + doubt, look in the /usr/share/kbd + directory for valid keymaps and screen fonts. Read + loadkeys(1) and setfont(8) manual + pages to determine the correct arguments for these programs. -cat >/etc/sysconfig/console <<"EOF" -KEYMAP="[arguments for loadkeys]" -FONT="[arguments for setfont]" + The /etc/sysconfig/console file should contain lines + of the form: VARIABLE="value". The following variables are recognized: + + + + + KEYMAP + + This variable specifies the arguments for the + loadkeys program, typically, the name of keymap + to load, e.g. "es". If this variable is not set, the bootscript will + not run the loadkeys program, and the default kernel + keymap will be used. + + + + + KEYMAP_CORRECTIONS + + This (rarely used) variable + specifies the arguments for the second call to the + loadkeys program. This is useful if the stock keymap + is not completely satisfactory and a small adjustment has to be made. E.g., + to include the Euro sign into a keymap that normally doesn't have it, + set this variable to "euro2". + + + + + FONT + + This variable specifies the arguments for the + setfont program. Typically, this includes the font + name, "-m", and the name of the application character map to load. + E.g., in order to load the "lat1-16" font together with the "8859-1" + application character map, set this variable to "lat1-16 -m 8859-1". + If this variable is not set, the bootscript will not run the + setfont program, and the default VGA font will be + used together with the default application character map. + + + + + UNICODE + + Set this variable to "1", "yes" or "true" in order to put the + console into UTF-8 mode. 
This is useful in UTF-8 based locales and + harmful otherwise. + + + + + LEGACY_CHARSET + + For many keyboard layouts, there is no stock Unicode keymap in + the Kbd package. The console bootscript will + convert an available keymap to UTF-8 on the fly if this variable is + set to the encoding of the available non-UTF-8 keymap. Note, however, + that dead keys and composing will not work in UTF-8 mode without the + special kernel patch. + + + + + BROKEN_COMPOSE + + Set this to "0" if you are going to apply that kernel patch in + Chapter 8. Note that you also have to add the character set expected + by composition rules in your keymap to the FONT variable after the + "-m" switch. + + + + + + Support for compiling the keymap directly into the kernel has been + removed because there were reports that it leads to incorrect results. + + Some examples: + + + + + For a non-Unicode setup, only the KEYMAP and FONT variables are + generally needed. E.g., for a Polish setup, one would use: + +cat > /etc/sysconfig/console << "EOF" +# Begin /etc/sysconfig/console + +KEYMAP="pl2" +FONT="lat2a-16 -m 8859-2" + +# End /etc/sysconfig/console EOF + - For example, for Spanish users who also want to use the Euro - character (accessible by pressing AltGr+E), the following settings are - correct: + + As mentioned above, it is sometimes necessary to adjust a + stock keymap slightly. 
The following example adds the Euro symbol to the + German keymap: -cat >/etc/sysconfig/console <<"EOF" -KEYMAP="es euro2" -FONT="lat9-16 -u iso01" +cat > /etc/sysconfig/console << "EOF" +# Begin /etc/sysconfig/console + +KEYMAP="de-latin1" +KEYMAP_CORRECTIONS="euro2" +FONT="lat0-16 -m 8859-15" + +# End /etc/sysconfig/console EOF + + + Here is a Unicode-enabled example for Bulgarian, where a stock + UTF-8 keymap exists and defines no dead keys or composition rules: + +cat > /etc/sysconfig/console << "EOF" +# Begin /etc/sysconfig/console + +UNICODE="1" +KEYMAP="bg_bds-utf8" +FONT="LatArCyrHeb-16" + +# End /etc/sysconfig/console +EOF + + + + Due to the use of a 512-glyph LatArCyrHeb-16 font in the previous + example, bright colors are no longer available on the Linux console unless + a framebuffer is used. If one wants to have bright colors without + a framebuffer and can live without characters not belonging to his language, + it is still possible to use a language-specific 256-glyph font, as + illustrated below. This would, however, also break single quotes in manual + pages. + + + +cat > /etc/sysconfig/console << "EOF" +# Begin /etc/sysconfig/console + +UNICODE="1" +KEYMAP="bg_bds-utf8" +FONT="cyr-sun16" + +# End /etc/sysconfig/console +EOF + + + + The following example illustrates keymap autoconversion from + ISO-8859-15 to UTF-8 and enabling dead keys in Unicode mode: + +cat > /etc/sysconfig/console << "EOF" +# Begin /etc/sysconfig/console + +UNICODE="1" +KEYMAP="de-latin1" +KEYMAP_CORRECTIONS="euro2" +LEGACY_CHARSET="iso-8859-15" +BROKEN_COMPOSE="0" +FONT="LatArCyrHeb-16 -m 8859-15" + +# End /etc/sysconfig/console +EOF + + + + For Chinese, Japanese, Korean and some other languages, the Linux + console cannot be configured to display the needed characters. Users + who need such languages should install the X Window System, fonts that + cover the necessary character ranges, and the proper input method (e.g. + SCIM, which supports a wide variety of languages).
+ + + + + - The FONT line above is correct only for the ISO 8859-15 - character set. If using ISO 8859-1 and, therefore, a pound sign - instead of Euro, the correct FONT line would be: - -FONT="lat1-16" + The /etc/sysconfig/console file only controls + Linux text console localization. It has nothing to do with setting the proper + keyboard layout and terminal fonts in X Window System. - If the KEYMAP or FONT variable is not set, - the console initscript will not run the corresponding - program. - - In some keymaps, the Backspace and Delete keys send characters different - from ones in the default keymap built into the kernel. This confuses some - applications. For example, Emacs displays its help (instead of erasing the - character before the cursor) when Backspace is pressed. To check if the keymap - in use is affected (this works only for i386 keymaps): - -zgrep '\W14\W' [/path/to/your/keymap] - - If the keycode 14 is Backspace instead of Delete, create the - following keymap snippet to fix this issue: - -mkdir -pv /etc/kbd && cat > /etc/kbd/bs-sends-del <<"EOF" - keycode 14 = Delete Delete Delete Delete - alt keycode 14 = Meta_Delete - altgr alt keycode 14 = Meta_Delete - keycode 111 = Remove - altgr control keycode 111 = Boot - control alt keycode 111 = Boot -altgr control alt keycode 111 = Boot -EOF - - Tell the console script to load this - snippet after the main keymap: - -cat >>/etc/sysconfig/console <<"EOF" -KEYMAP_CORRECTIONS="/etc/kbd/bs-sends-del" -EOF - - To compile the keymap directly into the kernel instead of - setting it every time from the console bootscript, - follow the instructions given in - Doing this ensures that the keyboard will always work as expected, - even when booting into maintenance mode (by passing - init=/bin/sh to the kernel), because the - console bootscript will not be run in that - situation. Additionally, the kernel will not set the screen font - automatically. 
This should not pose many problems because ASCII characters - will be handled correctly, and it is unlikely that a user would need - to rely on non-ASCII characters while in maintenance mode. - - Since the kernel will set up the keymap, it is possible to omit - the KEYMAP variable from the - /etc/sysconfig/console configuration file. It can - also be left in place, if desired, without consequence. Keeping it - could be beneficial if running several different kernels where it is - difficult to ensure that the keymap is compiled into every one of - them. - diff --git a/chapter07/profile.xml b/chapter07/profile.xml index dd53a5141..ae7617ba7 100644 --- a/chapter07/profile.xml +++ b/chapter07/profile.xml @@ -69,17 +69,19 @@ for the desired language (e.g., en) and [CC] with the two-letter code for the appropriate country (e.g., GB). [charmap] should - be replaced with the canonical charmap for your chosen locale. + be replaced with the canonical charmap for your chosen locale. Optional + modifiers such as @euro may also be present. The list of all locales supported by Glibc can be obtained by running the following command: locale -a - Locales can have a number of synonyms, e.g. ISO-8859-1 + Charmaps can have a number of aliases, e.g. ISO-8859-1 is also referred to as iso8859-1 and iso88591. - Some applications cannot handle the various synonyms correctly, so it is - safest to choose the canonical name for a particular locale. To determine + Some applications cannot handle the various synonyms correctly (e.g. require + that "UTF-8" is written as "UTF-8", not "utf8"), so it is safest in most + cases to choose the canonical name for a particular locale. To determine the canonical name, run the following command, where [locale name] is the output given by locale -a for your preferred locale (en_GB.iso88591 in our example). @@ -115,6 +117,7 @@ LC_ALL=[locale name] locale int_prefix Further instructions assume that there are no such error messages from Glibc. 
+ Some packages beyond LFS may also lack support for your chosen locale. One example is the X library (part of the X Window System), which outputs the following error message: @@ -139,23 +142,43 @@ LC_ALL=[locale name] locale int_prefix cat > /etc/profile << "EOF" # Begin /etc/profile -export LANG=[ll]_[CC].[charmap] +export LANG=[ll]_[CC].[charmap][@modifiers] export INPUTRC=/etc/inputrc # End /etc/profile EOF + The C (default) and en_US (the recommended + one for United States English users) locales are different. C + uses the US-ASCII 7-bit character set, and treats bytes with the high bit set + as invalid characters. That's why, e.g., the ls command + substitutes them with question marks in that locale. Also, an attempt to send + mail with such characters from Mutt or Pine results in non-RFC-conforming + messages being sent (the charset in the outgoing mail is indicated as "unknown + 8-bit"). So you can use the C locale only if you are sure that + you will never need 8-bit characters. + + UTF-8 based locales are not supported well by many programs. E.g., the + watch program displays only ASCII characters in UTF-8 + locales and has no such restriction in traditional 8-bit locales like en_US. + Without patches and/or installing software beyond BLFS, in UTF-8 based locales + you will not be able to do such basic tasks as printing plain-text files from + the command line, recording Windows-readable CDs with filenames containing + non-ASCII characters, viewing ID3v1 tags in MP3 files and so on. It is also + impossible (without damaging non-ASCII characters) to connect using ssh from + the system using a UTF-8 based locale to a host that still uses a traditional + 8-bit locale, and vice versa. In short, use UTF-8 only if you are going to + use KDE or GNOME and never open the terminal, or if you are going to tolerate + bugs. + + - The C (default) and en_US (the - recommended one for United States English users) locales are different.
+ Bug reports reproducible only in UTF-8 locales and for which there + is no patch or other fix mentioned in the report, will be closed immediately, + without investigation, with the "WONTFIX" resolution and a "don't use this + program or revert to non-UTF-8 locale" comment. Patches that have ill + effects in non-UTF-8 locales (other than replacement of translated program + messages with English ones) will be rejected. - Setting the keyboard layout, screen font, and locale-related environment - variables are the only internationalization steps needed to support locales - that use ordinary single-byte encodings and left-to-right writing direction. - More complex cases (including UTF-8 based locales) require additional steps - and additional patches because many applications tend to not work properly - under such conditions. These steps and patches are not included in the LFS - book and such locales are not yet supported by LFS. - diff --git a/chapter08/fstab.xml b/chapter08/fstab.xml index 439057b4f..1487bbbea 100644 --- a/chapter08/fstab.xml +++ b/chapter08/fstab.xml @@ -65,4 +65,43 @@ EOF usbcore must be listed in /etc/sysconfig/modules. + Filesystems with MS-DOS or Windows origin (i.e.: vfat, ntfs, smbfs, cifs, + iso9660, udf) need the iocharset mount option in order for + non-ASCII characters in file names to be interpreted properly. The value + of this option should be the same as the character set of your locale, + adjusted in such a way that the kernel understands it. This works if the + relevant character set definition (found under File systems -> + Native Language Support) has been compiled into the kernel + or built as a module. The codepage option is also needed for + vfat and smbfs filesystems. It + should be set to the codepage number used under MS-DOS in your country. 
E.g., + in order to mount USB flash drives, a ru_RU.KOI8-R user would need the + following line in /etc/fstab: + +/dev/sda1 /media/flash vfat noauto,user,quiet,showexec,iocharset=koi8r,codepage=866 0 0 + + The corresponding line for ru_RU.UTF-8 users is: + +/dev/sda1 /media/flash vfat noauto,user,quiet,showexec,iocharset=utf8,codepage=866 0 0 + + In the latter case, the kernel emits the following message: + +FAT: utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive! + + This negative recommendation should be ignored, since all other values + of the iocharset option result in wrong display of filenames in + UTF-8 locales. + + It is also possible to specify default codepage and iocharset values for + some filesystems during kernel configuration, the relevant parameters + are named + Default NLS Option (CONFIG_NLS_DEFAULT), + Default Remote NLS Option (CONFIG_SMB_NLS_DEFAULT), + Default codepage for FAT (CONFIG_FAT_DEFAULT_CODEPAGE), and + Default iocharset for FAT (CONFIG_FAT_DEFAULT_IOCHARSET). + There is no way to specify these settings for the + ntfs filesystem at kernel compilation time. + + diff --git a/chapter08/kernel.xml b/chapter08/kernel.xml index fcac33a39..4b9f0bcfd 100644 --- a/chapter08/kernel.xml +++ b/chapter08/kernel.xml @@ -48,6 +48,13 @@ in the kernel source tree for alternative methods to the way this book configures the kernel. + By default, the Linux kernel generates wrong sequences of bytes when + dead keys are used in UTF-8 keyboard mode. Also, one cannot copy and paste + non-ASCII characters when UTF-8 mode is active. Fix these issues with the + patch: + +patch -Np1 -i ../&linux-utf8-patch; + Prepare for compilation by running the following command: make mrproper @@ -57,14 +64,7 @@ kernel compilation. Do not rely on the source tree being clean after un-tarring.
- If, in it was decided to - compile the keymap into the kernel, issue the command below: - -loadkeys -m /usr/share/kbd/keymaps/[path to keymap] > \ - drivers/char/defkeymap.c - - For example, if using a Dutch keyboard, use - /usr/share/kbd/keymaps/i386/qwerty/nl.map.gz. + Configure the kernel via a menu-driven interface. BLFS has some information regarding particular kernel configuration requirements of diff --git a/general.ent b/general.ent index c7a92b259..b9538002e 100644 --- a/general.ent +++ b/general.ent @@ -22,6 +22,7 @@ + %patches-entities; @@ -44,10 +45,12 @@ + - + + @@ -63,11 +66,12 @@ - + + diff --git a/patches.ent b/patches.ent index 4d1331cd2..1f4fafa50 100644 --- a/patches.ent +++ b/patches.ent @@ -1,12 +1,15 @@ - + + + + @@ -15,24 +18,37 @@ + + + + + + + + + + + + - + - + +
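Reviewer note on the chapter07/console.xml hunk in this patch: the documented meaning of the variables in /etc/sysconfig/console can be summarised with a small shell sketch. This is not the replacement console bootscript shipped with the patch, whose contents are not shown here. The function name `describe_console_config` and its output wording are hypothetical, and the sketch only restates the rules given in the text: an unset KEYMAP or FONT means the corresponding program is not run, and UNICODE set to "1", "yes" or "true" selects UTF-8 mode.

```shell
#!/bin/sh
# Hypothetical helper (not the LFS console bootscript): read a file in the
# VARIABLE="value" format described for /etc/sysconfig/console and report
# what the documented rules imply for it.
describe_console_config() {
    # Use a subshell so the sourced variables do not leak into the caller.
    (
        unset KEYMAP KEYMAP_CORRECTIONS FONT UNICODE LEGACY_CHARSET BROKEN_COMPOSE
        . "$1" || exit 1

        case "$UNICODE" in
            1|yes|true) echo "console mode: UTF-8" ;;
            *)          echo "console mode: not UTF-8" ;;
        esac

        if [ -n "$KEYMAP" ]; then
            echo "loadkeys arguments: $KEYMAP"
        else
            echo "loadkeys: not run, default kernel keymap is used"
        fi

        if [ -n "$KEYMAP_CORRECTIONS" ]; then
            echo "second loadkeys call arguments: $KEYMAP_CORRECTIONS"
        fi

        if [ -n "$FONT" ]; then
            echo "setfont arguments: $FONT"
        else
            echo "setfont: not run, default VGA font is used"
        fi
    )
}

# Usage: describe the Bulgarian Unicode example from the patch, written to a
# temporary file so that the real /etc/sysconfig/console is not touched.
example=$(mktemp)
cat > "$example" << "EOF"
UNICODE="1"
KEYMAP="bg_bds-utf8"
FONT="LatArCyrHeb-16"
EOF
describe_console_config "$example"
rm -f "$example"
```

The file is sourced rather than parsed because the documented format is plain shell variable assignment. That is an assumption about how such a file is typically consumed, not a statement about the actual script.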