Updated Man-DB text to account for recent Man-DB development. Many thanks to Alexander Patrakov for patiently guiding me through this.

git-svn-id: http://svn.linuxfromscratch.org/LFS/trunk/BOOK@8698 4aa44e1e-78dd-0310-a6d2-fbcd4c07a689
DJ Lucas 2008-10-25 21:31:14 +00:00
parent 7d6c9a64b5
commit 7f89db8a15
2 changed files with 129 additions and 76 deletions


@ -36,6 +36,17 @@
</listitem> </listitem>
--> -->
<listitem>
<para>2008-10-25</para>
<itemizedlist>
<listitem>
<para>[dj] - Updated the text on the Man-DB page to account for recent
changes in Man-DB. Thanks to Alexander Patrakov for providing most
of the included text, explanations, and examples.</para>
</listitem>
</itemizedlist>
</listitem>
<listitem> <listitem>
<para>2008-10-23</para> <para>2008-10-23</para>
<itemizedlist> <itemizedlist>


@ -111,71 +111,95 @@
<screen><userinput remap="install">make install</userinput></screen> <screen><userinput remap="install">make install</userinput></screen>
<para>Some packages provide UTF-8 manual pages, which previous versions of
<application>Man-DB</application> were unable to display. This limitation
has been fixed in recent versions, and <application>Man-DB</application>
can now convert manual pages from legacy encodings to UTF-8
(and vice-versa) on the fly. This used to be a rather annoying
problem across different distributions, as packages written for one
distribution would require changes to work on another. The following
script will allow you to convert manual pages to and from legacy and UTF-8
encodings.</para>
<screen><userinput remap="install">cat &gt;&gt; convert-mans &lt;&lt; "EOF"
<literal>#!/bin/sh -e
FROM="$1"
TO="$2"
shift ; shift
while [ $# -gt 0 ]
do
FILE="$1"
shift
iconv -f "$FROM" -t "$TO" "$FILE" >.tmp.iconv
mv .tmp.iconv "$FILE"
done</literal>
EOF
install -m755 convert-mans /usr/bin</userinput></screen>
<para>Additional information regarding the compression of
man and info pages can be found in the BLFS book at
<ulink url="&blfs-root;view/cvs/postlfs/compressdoc.html"/>.</para>
</sect2> </sect2>
<sect2> <sect2>
<title>Non-English Manual Pages in LFS</title> <title>Non-English Manual Pages in LFS</title>
<!--
<para>Some packages provide UTF-8 manual pages, which previous versions of
<application>Man-DB</application> were unable to display correctly because
the expected (8-bit) encoding for each language was hard-coded in the
source of <application>Man-DB</application>.
<application>Man-DB</application> now uses the extension of the directory
name in order to determine the encoding of the manual pages stored within.
If no extension exists, <application>Man-DB</application> uses a built-in
table (see below) to determine the encoding. E.g., because of "UTF-8" in
the directory name, it knows that all manual pages residing in
<filename class="directory">/usr/share/man/fr.UTF-8</filename> are UTF-8
encoded and, according to the built-in table, expects all manual pages
residing in <filename class="directory">/usr/share/man/ru</filename> to
be encoded using KOI8-R.</para>
<para>Linux distributions have different policies concerning the character <para>Linux distributions have different policies concerning the character
encoding in which manual pages are stored in the filesystem. E.g., RedHat encoding in which manual pages are stored in the filesystem. E.g., RedHat
stores all manual pages in UTF-8, while Debian previously used stores all manual pages in UTF-8, while Debian previously used
language-specific (mostly 8-bit) encodings. As mentioned above, this leads language-specific (mostly 8-bit) encodings. Many other distributions simply
to incompatibility of packages with manual pages designed for different ignore the problem altogether. LFS also used the legacy encodings in
distributions.</para> previous versions of the book. This was chosen because of the ease of
configuration associated with <application>Man-DB</application>.
Additionally, <application>Man-DB</application> provided support for
Chinese and Japanese locales, and limited support for Korean, whereas
<application>Man</application> did not at that time.</para>
<para>LFS previously used the same convention as Debian. This was chosen <para>In contrast, the setup in Fedora Core expects all manual pages
because <application>Man-DB</application> did not understand manual pages to be UTF-8 encoded, and stored in directories without suffixes.
stored in UTF-8 at the time of its introduction into LFS. For our purposes Disagreement about the expected encoding of manual pages amongst
at that time, <application>Man-DB</application> was preferable to distribution vendors has led to confusion for upstream package maintainers.
<application>Man</application> as it worked without any additional Some packages contain UTF-8 manual pages, while others ship with manual
configuration in any locale. This is still true today as pages in legacy encodings. Unlike the
<application>Man-DB</application> with Debian patched <application>Man</application>/<application>Groff</application> setup in
<application>Groff</application> will now dynamically convert UTF-8 encoded Fedora Core, <application>Man-DB</application> can make very good decisions
manual pages to the user's locale. Additionally, this combination provides about the on-disk encoding and present the information to the user in their
support for Chinese and Japanese locales, and limited support for Korean, preferred format, without complex configurations.</para>
whereas <application>Man</application> does not. The current offering of
<application>Man</application> as used in RedHat requires major
modifications to both the <application>Man</application> and
<application>Groff</application> packages, and still falls short on
Chinese, Japanese, and Korean encodings.</para>
<para>Finally, most distributions, including Debian, are rapidly migrating <para><application>Man-DB</application> has, for the most part, made this
to all UTF-8 encoded manual pages. Upstream packagers will very likely drop problem completely transparent to end users, as long as the manual pages
legacy encodings in favor of UTF-8, though adoption has been slow due to are installed into the correct directory. There may be times, however,
the hacks required to make the current <application>Man</application> and where one encoding is preferred over the other. For this purpose, the
<application>Groff</application> packages work correctly together.</para> <command>convert-mans</command> script was written. It will convert manual
pages to another encoding before (or after) installation. Install the
<command>convert-mans</command> script with the following
instructions:</para>
-->
<para>Some packages provide non-English manual pages. They are displayed
correctly only if their location and encoding match the expectations of
the <command>man</command> program. However, different Linux distributions have different
policies (expressed in the choice of the <command>man</command> program,
its configuration and patches applied to it) concerning the character
encoding in which manual pages are stored in the filesystem.</para>
<para>The relationship between language codes and the expected encoding <para>E.g., Debian previously required Russian manual pages to be encoded
of legacy manual pages is listed below.</para> in KOI8-R and to be placed in
<filename class="directory">/usr/share/man/ru</filename>. Now, in addition,
their <command>man</command> program (<application>Man-DB</application>)
searches for UTF-8 encoded Russian manual pages in
<filename class="directory">/usr/share/man/ru.UTF-8</filename>. On the
other hand, Fedora uses UTF-8 encoded manual pages exclusively. Russian
manual pages are found in
<filename class="directory">/usr/share/man/ru</filename> and their
<command>man</command> program doesn't acknowledge
<filename class="directory">/usr/share/man/ru.UTF-8</filename>. Many
other distributions ignore the on-disk encodings completely, leaving the
end user with a mix of improperly encoded manual pages for their
configuration. When <command>man</command> processes the requested page,
it will display the contents as configured, resulting in completely
unreadable text if the on-disk encoding is not what is expected for that
configuration.</para>
<para>Disagreement about the expected encoding of manual pages amongst
distribution vendors has led to confusion for upstream package
maintainers. One package may contain UTF-8 manual pages, while another
ships with manual pages in legacy encodings. <command>man</command>
searches for manual pages based on the user's locale settings.
If a directory name has an extension that describes the encoding,
<application>Man-DB</application> uses it; otherwise it falls back to a
built-in table (see below) to determine the on-disk encoding of the
manual pages found for the user's locale. E.g., because of ".UTF-8" in the directory name,
<application>Man-DB</application> knows that all manual pages residing in
<filename class="directory">/usr/share/man/fr.UTF-8</filename> are UTF-8
encoded and, according to the built-in table, expects all manual pages
residing in <filename class="directory">/usr/share/man/ru</filename> to
be encoded using KOI8-R.</para>
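The lookup order described above can be sketched in shell. This is only an illustration of the rule, not Man-DB's actual code: the function name guess_man_encoding is hypothetical, and the table holds only the few entries mentioned on this page (the real table lives in src/encodings.c).

```shell
#!/bin/sh
# Hypothetical helper (not part of Man-DB): guess the on-disk encoding of
# the manual pages in a directory such as /usr/share/man/fr.UTF-8.
guess_man_encoding() {
    # Keep only the last path component, e.g. "fr.UTF-8" or "ru".
    dir=$(basename "$1")
    case "$dir" in
        # An extension in the directory name wins: print what follows the dot.
        *.*)   printf '%s\n' "${dir#*.}" ;;
        # Otherwise fall back to a (deliberately tiny) built-in table.
        ru)    echo KOI8-R ;;
        fr)    echo ISO-8859-1 ;;
        zh_SG) echo GBK ;;
        # Languages missing from this sketch's table.
        *)     echo unknown ;;
    esac
}

guess_man_encoding /usr/share/man/fr.UTF-8   # prints UTF-8
guess_man_encoding /usr/share/man/ru         # prints KOI8-R
```

The extension check comes first on purpose, so that a directory such as ru.UTF-8 is never looked up in the legacy table.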
<!-- Origin: man-db-2.5.2/src/encodings.c --> <!-- Origin: man-db-2.5.2/src/encodings.c -->
<table> <table>
@ -308,7 +332,7 @@ install -m755 convert-mans /usr/bin</userinput></screen>
<entry>GBK</entry> <entry>GBK</entry>
</row> </row>
<row> <row>
<entry>Simplified Chinese,Singapore} (zh_SG)</entry> <entry>Simplified Chinese, Singapore (zh_SG)</entry>
<entry>GBK</entry> <entry>GBK</entry>
</row> </row>
<row> <row>
@ -330,12 +354,36 @@ install -m755 convert-mans /usr/bin</userinput></screen>
Norwegian does not work because of the transition from no_NO to Norwegian does not work because of the transition from no_NO to
nb_NO locale, and will be fixed in the next release of nb_NO locale, and will be fixed in the next release of
<application>Man-DB</application>. Korean is currently non-functional <application>Man-DB</application>. Korean is currently non-functional
because of incomplete fixes in the Groff patch.</para> because of incomplete fixes in the Debian
<application>Groff</application> patch applied in LFS.</para>
</note> </note>
<para>Packages may install manual pages into an improperly named directory,
depending on which distributions the author develops the package for. To
assist in the conversion of the manual pages to the proper encoding for the
directory in which they are installed, the <command>convert-mans</command>
script was written. It will convert manual pages to another encoding before
(or after) installation. Install the <command>convert-mans</command>
script with the following instructions:</para>
<para>If upstream distributes the manual pages in a legacy encoding, <screen><userinput remap="install">cat &gt;&gt; convert-mans &lt;&lt; "EOF"
the manual pages can simply be copied to <literal>#!/bin/sh -e
FROM="$1"
TO="$2"
shift ; shift
while [ $# -gt 0 ]
do
FILE="$1"
shift
iconv -f "$FROM" -t "$TO" "$FILE" >.tmp.iconv
mv .tmp.iconv "$FILE"
done</literal>
EOF
install -m755 convert-mans /usr/bin</userinput></screen>
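To see what the loop body of the script does to a single file, the following sketch converts a small sample from ISO-8859-1 to UTF-8. The function name convert_in_place and the sample file are assumptions made for illustration; the sketch uses mktemp instead of the fixed .tmp.iconv name and leaves the original file untouched if iconv fails.

```shell
#!/bin/sh
# Hypothetical helper mirroring the loop body of convert-mans:
# convert_in_place FROM TO FILE
convert_in_place() {
    from=$1
    to=$2
    file=$3
    tmp=$(mktemp) || return 1
    # Convert into a temporary file first so a failed conversion
    # cannot truncate the original.
    if iconv -f "$from" -t "$to" "$file" > "$tmp"; then
        # Overwrite the contents in place, keeping the file's permissions.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Sample data: "caf" + e-acute (byte 0xE9, octal 351, in ISO-8859-1) + newline.
sample=$(mktemp)
printf 'caf\351\n' > "$sample"

convert_in_place ISO-8859-1 UTF-8 "$sample"

# The accented letter takes two bytes in UTF-8, so the file grows
# from 5 to 6 bytes.
wc -c < "$sample"
```

Running iconv -f UTF-8 -t UTF-8 on the result is a simple way to confirm that a file is valid UTF-8, since iconv exits with a non-zero status on invalid input.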
<para>If upstream distributes the manual pages in a legacy encoding, the
manual pages can simply be copied to
<filename class="directory">/usr/share/man/<replaceable>&lt;language <filename class="directory">/usr/share/man/<replaceable>&lt;language
code&gt;</replaceable></filename>. For example, <ulink code&gt;</replaceable></filename>. For example, <ulink
url="http://www.infodrom.org/projects/manpages-de/download/manpages-de-0.5.tar.gz"> url="http://www.infodrom.org/projects/manpages-de/download/manpages-de-0.5.tar.gz">
@ -353,27 +401,21 @@ cp -rv man? /usr/share/man/de</userinput></screen>
code&gt;</replaceable>.UTF-8</filename>.</para> code&gt;</replaceable>.UTF-8</filename>.</para>
<para>For example, to install <ulink <para>For example, to install <ulink
url="http://ditec.um.es/~piernas/manpages-es/man-pages-es-1.55.tar.bz2">
Spanish manual pages</ulink> in the legacy encoding, use the following
commands:</para>
<screen role="nodump"><userinput>mv man7/iso_8859-7.7{,X}
convert-mans UTF-8 ISO-8859-1 man?/*.?
mv man7/iso_8859-7.7{X,}
make install</userinput></screen>
<note>
<para>The <filename>man7/iso_8859-7.7</filename> file needs to be
excluded from the conversion process because it is already in
ISO-8859-1 format. This is a packaging bug in man-pages-es-1.55.
Future versions should not require this workaround.</para>
</note>
<para>Finally, as an example installation of UTF-8 manual pages, the <ulink
url="http://manpagesfr.free.fr/download/man-pages-fr-2.40.0.tar.bz2"> url="http://manpagesfr.free.fr/download/man-pages-fr-2.40.0.tar.bz2">
French manual pages</ulink> can be installed with the following French manual pages</ulink> in the legacy encoding, use the following
commands:</para> commands:</para>
<screen role="nodump"><userinput>convert-mans UTF-8 ISO-8859-1 man?/*.?
mkdir -p /usr/share/man/fr
cp -rv man? /usr/share/man/fr</userinput></screen>
<note><para>The French manual pages ship with ready-made scripts to do the
same conversion. The above instructions are used only as an example for
use of the <command>convert-mans</command> script.</para></note>
<para>Finally, as an example installation of UTF-8 manual pages, again, the
French manual pages could be installed with the following commands:</para>
<screen role="nodump"><userinput>mkdir -p /usr/share/man/fr.UTF-8 <screen role="nodump"><userinput>mkdir -p /usr/share/man/fr.UTF-8
cp -rv man? /usr/share/man/fr.UTF-8</userinput></screen> cp -rv man? /usr/share/man/fr.UTF-8</userinput></screen>