mirror of
https://github.com/Stichting-MINIX-Research-Foundation/netbsd.git
synced 2025-09-12 00:24:52 -04:00
3799 lines
94 KiB
HTML
<HTML>
<HEAD>
<!-- This HTML file has been created by texi2html 1.52a
     from gettext.texi on 11 April 2005 -->

<TITLE>GNU gettext utilities - 13 Other Programming Languages</TITLE>
</HEAD>
<BODY>
|
|
Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_12.html">previous</A>, <A HREF="gettext_14.html">next</A>, <A HREF="gettext_22.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
|
|
<P><HR><P>
|
|
|
|
|
|
<H1><A NAME="SEC221" HREF="gettext_toc.html#TOC221">13 Other Programming Languages</A></H1>
|
|
|
|
<P>
While the presentation of <CODE>gettext</CODE> focuses mostly on C and
implicitly applies to C++ as well, its scope is far broader than that:
many programming languages, scripting languages and other textual data
like GUI resources or package descriptions can make use of the gettext
approach.
</P>
|
|
|
|
|
|
|
|
<H2><A NAME="SEC222" HREF="gettext_toc.html#TOC222">13.1 The Language Implementor's View</A></H2>
|
|
<P>
|
|
<A NAME="IDX1072"></A>
|
|
<A NAME="IDX1073"></A>
|
|
|
|
</P>
|
|
<P>
All programming and scripting languages that have the notion of strings
are eligible to support <CODE>gettext</CODE>. Supporting <CODE>gettext</CODE>
means the following:
</P>
|
|
|
|
<OL>
|
|
<LI>
|
|
|
|
You should add to the language a syntax for translatable strings. In
principle, a function call of <CODE>gettext</CODE> would do, but a shorthand
syntax helps keep internationalized programs legible. For
example, in C we use the syntax <CODE>_("string")</CODE>, and in GNU awk we use
the shorthand <CODE>_"string"</CODE>.
|
|
|
|
<LI>
|
|
|
|
You should arrange that evaluation of such a translatable string at
|
|
runtime calls the <CODE>gettext</CODE> function, or performs equivalent
|
|
processing.
|
|
|
|
<LI>
|
|
|
|
Similarly, you should make the functions <CODE>ngettext</CODE>,
|
|
<CODE>dcgettext</CODE>, <CODE>dcngettext</CODE> available from within the language.
|
|
These functions are less often used, but are nevertheless necessary for
|
|
particular purposes: <CODE>ngettext</CODE> for correct plural handling, and
|
|
<CODE>dcgettext</CODE> and <CODE>dcngettext</CODE> for obeying other locale
|
|
environment variables than <CODE>LC_MESSAGES</CODE>, such as <CODE>LC_TIME</CODE> or
|
|
<CODE>LC_MONETARY</CODE>. For these latter functions, you need to make the
|
|
<CODE>LC_*</CODE> constants, available in the C header <CODE><locale.h></CODE>,
|
|
referenceable from within the language, usually either as enumeration
|
|
values or as strings.
|
|
|
|
<LI>
|
|
|
|
You should allow the programmer to designate a message domain, either by
|
|
making the <CODE>textdomain</CODE> function available from within the
|
|
language, or by introducing a magic variable called <CODE>TEXTDOMAIN</CODE>.
|
|
Similarly, you should allow the programmer to designate where to search
|
|
for message catalogs, by providing access to the <CODE>bindtextdomain</CODE>
|
|
function.
|
|
|
|
<LI>
|
|
|
|
You should either perform a <CODE>setlocale (LC_ALL, "")</CODE> call during
|
|
the startup of your language runtime, or allow the programmer to do so.
|
|
Remember that gettext will act as a no-op if the <CODE>LC_MESSAGES</CODE> and
|
|
<CODE>LC_CTYPE</CODE> locale facets are not both set.
|
|
|
|
<LI>
|
|
|
|
A programmer should have a way to extract translatable strings from a
program into a PO file. The GNU <CODE>xgettext</CODE> program is being
extended to support very different programming languages. Please
contact the GNU <CODE>gettext</CODE> maintainers to help them do this. If
the string extractor is best integrated into your language's parser, GNU
<CODE>xgettext</CODE> can function as a front end to your string extractor.
|
|
|
|
<LI>
|
|
|
|
The language's library should have a string formatting facility where
the arguments of a format string are denoted by a positional number or a
name. This is needed because for some languages and some messages with
more than one substitutable argument, the translation will need to
output the substituted arguments in a different order. See section <A HREF="gettext_3.html#SEC18">3.5 Special Comments preceding Keywords</A>.
|
|
|
|
<LI>
|
|
|
|
If the language has more than one implementation, and not all of the
|
|
implementations use <CODE>gettext</CODE>, but the programs should be portable
|
|
across implementations, you should provide a no-i18n emulation, that
|
|
makes the other implementations accept programs written for yours,
|
|
without actually translating the strings.
|
|
|
|
<LI>
|
|
|
|
To help the programmer in the task of marking translatable strings,
|
|
which is usually performed using the Emacs PO mode, you are welcome to
|
|
contact the GNU <CODE>gettext</CODE> maintainers, so they can add support for
|
|
your language to <TT>`po-mode.el´</TT>.
|
|
</OL>
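<P>
The no-i18n emulation mentioned in the list above can be illustrated for
shell scripts. The following is only a sketch, assuming a script that should
keep running on systems where the <CODE>gettext</CODE> and <CODE>ngettext</CODE>
programs are not installed. The stand-in functions are hypothetical, accept
only the plain argument forms shown in the comments, and never translate
anything.
</P>
<PRE>
```shell
# Hypothetical no-i18n fallback: define stand-ins only when the real
# programs are missing. Each stand-in prints the untranslated string.
if ! command -v gettext >/dev/null 2>&1; then
  # gettext MSGID
  gettext() { printf '%s' "$1"; }
fi
if ! command -v ngettext >/dev/null 2>&1; then
  # ngettext MSGID MSGID-PLURAL COUNT
  # English-style choice: singular for exactly 1, plural otherwise.
  ngettext() {
    if [ "$3" -eq 1 ]; then printf '%s' "$1"; else printf '%s' "$2"; fi
  }
fi
```
</PRE>
<P>
With these definitions in place, a call such as
<CODE>gettext "Hello, world!"</CODE> works whether or not GNU gettext is
installed; only the translation is lost in the fallback case.
</P>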
|
|
|
|
<P>
|
|
On the implementation side, three approaches are possible, with
|
|
different effects on portability and copyright:
|
|
|
|
</P>
|
|
|
|
<UL>
|
|
<LI>
|
|
|
|
You may integrate the GNU <CODE>gettext</CODE>'s <TT>`intl/´</TT> directory in
|
|
your package, as described in section <A HREF="gettext_12.html#SEC192">12 The Maintainer's View</A>. This allows you to
|
|
have internationalization on all kinds of platforms. Note that when you
|
|
then distribute your package, it legally falls under the GNU General
|
|
Public License, and the GNU project will be glad about your contribution
|
|
to the Free Software pool.
|
|
|
|
<LI>
|
|
|
|
You may link against GNU <CODE>gettext</CODE> functions if they are found in
|
|
the C library. For example, an autoconf test for <CODE>gettext()</CODE> and
|
|
<CODE>ngettext()</CODE> will detect this situation. For the moment, this test
|
|
will succeed on GNU systems and not on other platforms. No severe
|
|
copyright restrictions apply.
|
|
|
|
<LI>
|
|
|
|
You may emulate or reimplement the GNU <CODE>gettext</CODE> functionality.
|
|
This has the advantage of full portability and no copyright
|
|
restrictions, but also the drawback that you have to reimplement the GNU
|
|
<CODE>gettext</CODE> features (such as the <CODE>LANGUAGE</CODE> environment
|
|
variable, the locale aliases database, the automatic charset conversion,
|
|
and plural handling).
|
|
</UL>
|
|
|
|
|
|
|
|
<H2><A NAME="SEC223" HREF="gettext_toc.html#TOC223">13.2 The Programmer's View</A></H2>
|
|
|
|
<P>
|
|
For the programmer, the general procedure is the same as for the C
|
|
language. The Emacs PO mode supports other languages, and the GNU
|
|
<CODE>xgettext</CODE> string extractor recognizes other languages based on the
|
|
file extension or a command-line option. In some languages,
|
|
<CODE>setlocale</CODE> is not needed because it is already performed by the
|
|
underlying language runtime.
|
|
|
|
</P>
|
|
|
|
|
|
<H2><A NAME="SEC224" HREF="gettext_toc.html#TOC224">13.3 The Translator's View</A></H2>
|
|
|
|
<P>
|
|
The translator works exactly as in the C language case. The only
|
|
difference is that when translating format strings, she has to be aware
|
|
of the language's particular syntax for positional arguments in format
|
|
strings.
|
|
|
|
</P>
|
|
|
|
|
|
|
|
<H3><A NAME="SEC225" HREF="gettext_toc.html#TOC225">13.3.1 C Format Strings</A></H3>
|
|
|
|
<P>
|
|
C format strings are described in POSIX (IEEE P1003.1 2001), section
|
|
XSH 3 fprintf(),
|
|
<A HREF="http://www.opengroup.org/onlinepubs/007904975/functions/fprintf.html">http://www.opengroup.org/onlinepubs/007904975/functions/fprintf.html</A>.
|
|
See also the fprintf(3) manual page,
|
|
<A HREF="http://www.linuxvalley.it/encyclopedia/ldp/manpage/man3/printf.3.php">http://www.linuxvalley.it/encyclopedia/ldp/manpage/man3/printf.3.php</A>,
|
|
<A HREF="http://informatik.fh-wuerzburg.de/student/i510/man/printf.html">http://informatik.fh-wuerzburg.de/student/i510/man/printf.html</A>.
|
|
|
|
</P>
|
|
<P>
|
|
Although format strings with positions that reorder arguments, such as
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
"Only %2$d bytes free on '%1$s'."
|
|
</PRE>
|
|
|
|
<P>
|
|
which is semantically equivalent to
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
"'%s' has only %d bytes free."
|
|
</PRE>
|
|
|
|
<P>
|
|
are a POSIX/XSI feature and not specified by ISO C 99, translators can rely
|
|
on this reordering ability: On the few platforms where <CODE>printf()</CODE>,
|
|
<CODE>fprintf()</CODE> etc. don't support this feature natively, <TT>`libintl.a´</TT>
|
|
or <TT>`libintl.so´</TT> provides replacement functions, and GNU <CODE><libintl.h></CODE>
|
|
activates these replacement functions automatically.
|
|
|
|
</P>
|
|
<P>
|
|
<A NAME="IDX1074"></A>
|
|
<A NAME="IDX1075"></A>
|
|
As a special feature for Farsi (Persian) and maybe Arabic, translators can
|
|
insert an <SAMP>`I´</SAMP> flag into numeric format directives. For example, the
|
|
translation of <CODE>"%d"</CODE> can be <CODE>"%Id"</CODE>. The effect of this flag,
|
|
on systems with GNU <CODE>libc</CODE>, is that in the output, the ASCII digits are
|
|
replaced with the <SAMP>`outdigits´</SAMP> defined in the <CODE>LC_CTYPE</CODE> locale
|
|
facet. On other systems, the <CODE>gettext</CODE> function removes this flag,
|
|
so that it has no effect.
|
|
|
|
</P>
|
|
<P>
|
|
Note that the programmer should <EM>not</EM> put this flag into the
|
|
untranslated string. (Putting the <SAMP>`I´</SAMP> format directive flag into an
|
|
<VAR>msgid</VAR> string would lead to undefined behaviour on platforms without
|
|
glibc when NLS is disabled.)
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC226" HREF="gettext_toc.html#TOC226">13.3.2 Objective C Format Strings</A></H3>
|
|
|
|
<P>
|
|
Objective C format strings are like C format strings. They support an
additional format directive: "%@", which when executed consumes an argument
of type <CODE>Object *</CODE>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC227" HREF="gettext_toc.html#TOC227">13.3.3 Shell Format Strings</A></H3>
|
|
|
|
<P>
|
|
Shell format strings, as supported by GNU gettext and the <SAMP>`envsubst´</SAMP>
|
|
program, are strings with references to shell variables in the form
|
|
<CODE>$<VAR>variable</VAR></CODE> or <CODE>${<VAR>variable</VAR>}</CODE>. References of the form
|
|
<CODE>${<VAR>variable</VAR>-<VAR>default</VAR>}</CODE>,
|
|
<CODE>${<VAR>variable</VAR>:-<VAR>default</VAR>}</CODE>,
|
|
<CODE>${<VAR>variable</VAR>=<VAR>default</VAR>}</CODE>,
|
|
<CODE>${<VAR>variable</VAR>:=<VAR>default</VAR>}</CODE>,
|
|
<CODE>${<VAR>variable</VAR>+<VAR>replacement</VAR>}</CODE>,
|
|
<CODE>${<VAR>variable</VAR>:+<VAR>replacement</VAR>}</CODE>,
|
|
<CODE>${<VAR>variable</VAR>?<VAR>ignored</VAR>}</CODE>,
|
|
<CODE>${<VAR>variable</VAR>:?<VAR>ignored</VAR>}</CODE>,
|
|
that would be valid inside shell scripts, are not supported. The
|
|
<VAR>variable</VAR> names must consist solely of alphanumeric or underscore
|
|
ASCII characters, not start with a digit and be nonempty; otherwise such
|
|
a variable reference is ignored.
|
|
|
|
</P>
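<P>
The naming rule above can be checked mechanically. The following sketch is
not taken from GNU gettext: <CODE>is_substitutable_name</CODE> is a
hypothetical helper that merely restates the rule (nonempty, only ASCII
alphanumerics or underscore, no leading digit) as a shell pattern match.
</P>
<PRE>
```shell
# is_substitutable_name NAME
# Succeeds if NAME follows the variable naming rule for shell format
# strings. Runs in a subshell so that forcing the C locale (for ASCII-only
# character ranges) does not leak into the caller.
is_substitutable_name() (
  LC_ALL=C
  case $1 in
    '' | [0-9]* | *[!A-Za-z0-9_]*) false ;;  # empty, leading digit, bad char
    *) true ;;
  esac
)
```
</PRE>
<P>
For example, <CODE>is_substitutable_name filecount</CODE> succeeds, while
<CODE>is_substitutable_name 1st</CODE> and
<CODE>is_substitutable_name file-count</CODE> fail.
</P>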
|
|
|
|
|
|
<H3><A NAME="SEC228" HREF="gettext_toc.html#TOC228">13.3.4 Python Format Strings</A></H3>
|
|
|
|
<P>
|
|
Python format strings are described in
|
|
Python Library reference /
|
|
2. Built-in Types, Exceptions and Functions /
|
|
2.2. Built-in Types /
|
|
2.2.6. Sequence Types /
|
|
2.2.6.2. String Formatting Operations.
|
|
<A HREF="http://www.python.org/doc/2.2.1/lib/typesseq-strings.html">http://www.python.org/doc/2.2.1/lib/typesseq-strings.html</A>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC229" HREF="gettext_toc.html#TOC229">13.3.5 Lisp Format Strings</A></H3>
|
|
|
|
<P>
|
|
Lisp format strings are described in the Common Lisp HyperSpec,
|
|
chapter 22.3 Formatted Output,
|
|
<A HREF="http://www.lisp.org/HyperSpec/Body/sec_22-3.html">http://www.lisp.org/HyperSpec/Body/sec_22-3.html</A>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC230" HREF="gettext_toc.html#TOC230">13.3.6 Emacs Lisp Format Strings</A></H3>
|
|
|
|
<P>
|
|
Emacs Lisp format strings are documented in the Emacs Lisp reference,
|
|
section Formatting Strings,
|
|
<A HREF="http://www.gnu.org/manual/elisp-manual-21-2.8/html_chapter/elisp_4.html#SEC75">http://www.gnu.org/manual/elisp-manual-21-2.8/html_chapter/elisp_4.html#SEC75</A>.
|
|
Note that as of version 21, XEmacs supports numbered argument specifications
|
|
in format strings while FSF Emacs doesn't.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC231" HREF="gettext_toc.html#TOC231">13.3.7 librep Format Strings</A></H3>
|
|
|
|
<P>
|
|
librep format strings are documented in the librep manual, section
|
|
Formatted Output,
|
|
<A HREF="http://librep.sourceforge.net/librep-manual.html#Formatted%20Output">http://librep.sourceforge.net/librep-manual.html#Formatted%20Output</A>,
|
|
<A HREF="http://www.gwinnup.org/research/docs/librep.html#SEC122">http://www.gwinnup.org/research/docs/librep.html#SEC122</A>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC232" HREF="gettext_toc.html#TOC232">13.3.8 Scheme Format Strings</A></H3>
|
|
|
|
<P>
|
|
Scheme format strings are documented in the SLIB manual, section
|
|
Format Specification.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC233" HREF="gettext_toc.html#TOC233">13.3.9 Smalltalk Format Strings</A></H3>
|
|
|
|
<P>
|
|
Smalltalk format strings are described in the GNU Smalltalk documentation,
|
|
class <CODE>CharArray</CODE>, methods <SAMP>`bindWith:´</SAMP> and
|
|
<SAMP>`bindWithArguments:´</SAMP>.
|
|
<A HREF="http://www.gnu.org/software/smalltalk/gst-manual/gst_68.html#SEC238">http://www.gnu.org/software/smalltalk/gst-manual/gst_68.html#SEC238</A>.
|
|
In summary, a directive starts with <SAMP>`%´</SAMP> and is followed by <SAMP>`%´</SAMP>
|
|
or a nonzero digit (<SAMP>`1´</SAMP> to <SAMP>`9´</SAMP>).
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC234" HREF="gettext_toc.html#TOC234">13.3.10 Java Format Strings</A></H3>
|
|
|
|
<P>
|
|
Java format strings are described in the JDK documentation for class
|
|
<CODE>java.text.MessageFormat</CODE>,
|
|
<A HREF="http://java.sun.com/j2se/1.4/docs/api/java/text/MessageFormat.html">http://java.sun.com/j2se/1.4/docs/api/java/text/MessageFormat.html</A>.
|
|
See also the ICU documentation
|
|
<A HREF="http://oss.software.ibm.com/icu/apiref/classMessageFormat.html">http://oss.software.ibm.com/icu/apiref/classMessageFormat.html</A>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC235" HREF="gettext_toc.html#TOC235">13.3.11 C# Format Strings</A></H3>
|
|
|
|
<P>
|
|
C# format strings are described in the .NET documentation for class
|
|
<CODE>System.String</CODE> and in
|
|
<A HREF="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpConFormattingOverview.asp">http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpConFormattingOverview.asp</A>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC236" HREF="gettext_toc.html#TOC236">13.3.12 awk Format Strings</A></H3>
|
|
|
|
<P>
|
|
awk format strings are described in the gawk documentation, section
|
|
Printf,
|
|
<A HREF="http://www.gnu.org/manual/gawk/html_node/Printf.html#Printf">http://www.gnu.org/manual/gawk/html_node/Printf.html#Printf</A>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC237" HREF="gettext_toc.html#TOC237">13.3.13 Object Pascal Format Strings</A></H3>
|
|
|
|
<P>
|
|
Where is this documented?
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC238" HREF="gettext_toc.html#TOC238">13.3.14 YCP Format Strings</A></H3>
|
|
|
|
<P>
|
|
YCP sformat strings are described in the libycp documentation
|
|
<A HREF="file:/usr/share/doc/packages/libycp/YCP-builtins.html">file:/usr/share/doc/packages/libycp/YCP-builtins.html</A>.
|
|
In summary, a directive starts with <SAMP>`%´</SAMP> and is followed by <SAMP>`%´</SAMP>
|
|
or a nonzero digit (<SAMP>`1´</SAMP> to <SAMP>`9´</SAMP>).
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC239" HREF="gettext_toc.html#TOC239">13.3.15 Tcl Format Strings</A></H3>
|
|
|
|
<P>
|
|
Tcl format strings are described in the <TT>`format.n´</TT> manual page,
|
|
<A HREF="http://www.scriptics.com/man/tcl8.3/TclCmd/format.htm">http://www.scriptics.com/man/tcl8.3/TclCmd/format.htm</A>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC240" HREF="gettext_toc.html#TOC240">13.3.16 Perl Format Strings</A></H3>
|
|
|
|
<P>
|
|
There are two kinds of format strings in Perl: those acceptable to the
Perl built-in function <CODE>printf</CODE>, labelled as <SAMP>`perl-format´</SAMP>,
and those acceptable to the <CODE>libintl-perl</CODE> function <CODE>__x</CODE>,
labelled as <SAMP>`perl-brace-format´</SAMP>.
|
|
|
|
</P>
|
|
<P>
|
|
Perl <CODE>printf</CODE> format strings are described in the <CODE>sprintf</CODE>
|
|
section of <SAMP>`man perlfunc´</SAMP>.
|
|
|
|
</P>
|
|
<P>
|
|
Perl brace format strings are described in the
<TT>`Locale::TextDomain(3pm)´</TT> manual page of the CPAN package
libintl-perl. In brief, Perl brace format strings use placeholders
enclosed in braces (<SAMP>`{´</SAMP> and <SAMP>`}´</SAMP>). Each placeholder must
have the syntax of a simple identifier.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC241" HREF="gettext_toc.html#TOC241">13.3.17 PHP Format Strings</A></H3>
|
|
|
|
<P>
|
|
PHP format strings are described in the documentation of the PHP function
|
|
<CODE>sprintf</CODE>, in <TT>`phpdoc/manual/function.sprintf.html´</TT> or
|
|
<A HREF="http://www.php.net/manual/en/function.sprintf.php">http://www.php.net/manual/en/function.sprintf.php</A>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC242" HREF="gettext_toc.html#TOC242">13.3.18 GCC internal Format Strings</A></H3>
|
|
|
|
<P>
|
|
These format strings are used inside the GCC sources. In such a format
|
|
string, a directive starts with <SAMP>`%´</SAMP>, is optionally followed by a
|
|
size specifier <SAMP>`l´</SAMP>, an optional flag <SAMP>`+´</SAMP>, another optional flag
|
|
<SAMP>`#´</SAMP>, and is finished by a specifier: <SAMP>`%´</SAMP> denotes a literal
|
|
percent sign, <SAMP>`c´</SAMP> denotes a character, <SAMP>`s´</SAMP> denotes a string,
|
|
<SAMP>`i´</SAMP> and <SAMP>`d´</SAMP> denote an integer, <SAMP>`o´</SAMP>, <SAMP>`u´</SAMP>, <SAMP>`x´</SAMP>
|
|
denote an unsigned integer, <SAMP>`.*s´</SAMP> denotes a string preceded by a
|
|
width specification, <SAMP>`H´</SAMP> denotes a <SAMP>`location_t *´</SAMP> pointer,
|
|
<SAMP>`D´</SAMP> denotes a general declaration, <SAMP>`F´</SAMP> denotes a function
|
|
declaration, <SAMP>`T´</SAMP> denotes a type, <SAMP>`A´</SAMP> denotes a function argument,
|
|
<SAMP>`C´</SAMP> denotes a tree code, <SAMP>`E´</SAMP> denotes an expression, <SAMP>`L´</SAMP>
|
|
denotes a programming language, <SAMP>`O´</SAMP> denotes a binary operator,
|
|
<SAMP>`P´</SAMP> denotes a function parameter, <SAMP>`Q´</SAMP> denotes an assignment
|
|
operator, <SAMP>`V´</SAMP> denotes a const/volatile qualifier.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC243" HREF="gettext_toc.html#TOC243">13.3.19 Qt Format Strings</A></H3>
|
|
|
|
<P>
|
|
Qt format strings are described in the documentation of the QString class
|
|
<A HREF="file:/usr/lib/qt-3.0.5/doc/html/qstring.html">file:/usr/lib/qt-3.0.5/doc/html/qstring.html</A>.
|
|
In summary, a directive consists of a <SAMP>`%´</SAMP> followed by a digit. The same
|
|
directive cannot occur more than once in a format string.
|
|
|
|
</P>
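<P>
The uniqueness rule lends itself to a quick check. A minimal sketch,
assuming a <CODE>grep</CODE> that supports the common (but non-POSIX)
<SAMP>`-o´</SAMP> option; <CODE>qt_directives_unique</CODE> is a hypothetical
helper and not part of Qt or GNU gettext.
</P>
<PRE>
```shell
# qt_directives_unique FORMAT
# Succeeds when no %<digit> directive occurs more than once in FORMAT.
qt_directives_unique() {
  # Put each directive on its own line, then keep only the repeated ones.
  dups=$(printf '%s\n' "$1" | grep -o '%[0-9]' | sort | uniq -d)
  [ -z "$dups" ]
}
```
</PRE>
<P>
A translator's tool could call this on a translated string and reject, say,
a string that mentions <SAMP>`%1´</SAMP> twice.
</P>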
|
|
|
|
|
|
<H2><A NAME="SEC244" HREF="gettext_toc.html#TOC244">13.4 The Maintainer's View</A></H2>
|
|
|
|
<P>
|
|
For the maintainer, the general procedure differs from the C language
|
|
case in two ways.
|
|
|
|
</P>
|
|
|
|
<UL>
|
|
<LI>
|
|
|
|
For those languages that don't use GNU gettext, the <TT>`intl/´</TT> directory
|
|
is not needed and can be omitted. This means that the maintainer calls the
|
|
<CODE>gettextize</CODE> program without the <SAMP>`--intl´</SAMP> option, and that he
|
|
invokes the <CODE>AM_GNU_GETTEXT</CODE> autoconf macro via
|
|
<SAMP>`AM_GNU_GETTEXT([external])´</SAMP>.
|
|
|
|
<LI>
|
|
|
|
If only a single programming language is used, the <CODE>XGETTEXT_OPTIONS</CODE>
|
|
variable in <TT>`po/Makevars´</TT> (see section <A HREF="gettext_12.html#SEC199">12.4.3 <TT>`Makevars´</TT> in <TT>`po/´</TT></A>) should be adjusted to
|
|
match the <CODE>xgettext</CODE> options for that particular programming language.
|
|
If the package uses more than one programming language with <CODE>gettext</CODE>
|
|
support, it becomes necessary to change the POT file construction rule
|
|
in <TT>`po/Makefile.in.in´</TT>. It is recommended to make one <CODE>xgettext</CODE>
|
|
invocation per programming language, each with the options appropriate for
|
|
that language, and to combine the resulting files using <CODE>msgcat</CODE>.
|
|
</UL>
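<P>
The recommended per-language extraction can be sketched as a small shell
function. This is an illustration only, not the rule from
<TT>`po/Makefile.in.in´</TT>: the function name, the output file names and the
assumption of exactly one C source and one shell source are hypothetical.
</P>
<PRE>
```shell
# build_pot OUTDIR C_SOURCE SH_SOURCE
# Hypothetical helper: one xgettext run per language, merged with msgcat.
build_pot() {
  outdir=$1
  c_src=$2
  sh_src=$3
  # C source: strings are marked with the _() shorthand, hence -k_.
  xgettext --language=C -k_ -o "$outdir/c.pot" "$c_src" || return 1
  # Shell source: xgettext's default keywords for the language apply.
  xgettext --language=Shell -o "$outdir/sh.pot" "$sh_src" || return 1
  # Combine the per-language templates into a single file.
  msgcat -o "$outdir/all.pot" "$outdir/c.pot" "$outdir/sh.pot"
}
```
</PRE>
<P>
Keeping the invocations separate means each one carries only the keyword and
language options that make sense for its own sources.
</P>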
|
|
|
|
|
|
|
|
<H2><A NAME="SEC245" HREF="gettext_toc.html#TOC245">13.5 Individual Programming Languages</A></H2>
|
|
|
|
|
|
|
|
<H3><A NAME="SEC246" HREF="gettext_toc.html#TOC246">13.5.1 C, C++, Objective C</A></H3>
|
|
<P>
|
|
<A NAME="IDX1076"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
gcc, gpp, gobjc, glibc, gettext
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
For C: <CODE>c</CODE>, <CODE>h</CODE>.
|
|
<BR>For C++: <CODE>C</CODE>, <CODE>c++</CODE>, <CODE>cc</CODE>, <CODE>cxx</CODE>, <CODE>cpp</CODE>, <CODE>hpp</CODE>.
|
|
<BR>For Objective C: <CODE>m</CODE>.
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>_("abc")</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>, <CODE>ngettext</CODE>,
|
|
<CODE>dngettext</CODE>, <CODE>dcngettext</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>textdomain</CODE> function
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
<CODE>bindtextdomain</CODE> function
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
Programmer must call <CODE>setlocale (LC_ALL, "")</CODE>
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
<CODE>#include <libintl.h></CODE>
|
|
<BR><CODE>#include <locale.h></CODE>
|
|
<BR><CODE>#define _(string) gettext (string)</CODE>
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
Use
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext -k_</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
<CODE>fprintf "%2$d %1$d"</CODE>
|
|
<BR>In C++: <CODE>autosprintf "%2$d %1$d"</CODE>
|
|
(see section `Introduction' in <CITE>GNU autosprintf</CITE>)
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
autoconf (gettext.m4) and #if ENABLE_NLS
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
yes
|
|
</DL>
|
|
|
|
<P>
|
|
The following examples are available in the <TT>`examples´</TT> directory:
|
|
<CODE>hello-c</CODE>, <CODE>hello-c-gnome</CODE>, <CODE>hello-c++</CODE>, <CODE>hello-c++-qt</CODE>,
|
|
<CODE>hello-c++-kde</CODE>, <CODE>hello-c++-gnome</CODE>, <CODE>hello-objc</CODE>,
|
|
<CODE>hello-objc-gnustep</CODE>, <CODE>hello-objc-gnome</CODE>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC247" HREF="gettext_toc.html#TOC247">13.5.2 sh - Shell Script</A></H3>
|
|
<P>
|
|
<A NAME="IDX1077"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
bash, gettext
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>sh</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>, <CODE>'abc'</CODE>, <CODE>abc</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>"`gettext \"abc\"`"</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<A NAME="IDX1078"></A>
|
|
<A NAME="IDX1079"></A>
|
|
<CODE>gettext</CODE>, <CODE>ngettext</CODE> programs
|
|
<BR><CODE>eval_gettext</CODE>, <CODE>eval_ngettext</CODE> shell functions
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<A NAME="IDX1080"></A>
|
|
environment variable <CODE>TEXTDOMAIN</CODE>
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
<A NAME="IDX1081"></A>
|
|
environment variable <CODE>TEXTDOMAINDIR</CODE>
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
automatic
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
<CODE>. gettext.sh</CODE>
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
use
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
---
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
fully portable
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
<P>
|
|
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-sh</CODE>.
|
|
|
|
</P>
|
|
|
|
|
|
|
|
<H4><A NAME="SEC248" HREF="gettext_toc.html#TOC248">13.5.2.1 Preparing Shell Scripts for Internationalization</A></H4>
|
|
<P>
|
|
<A NAME="IDX1082"></A>
|
|
|
|
</P>
|
|
<P>
|
|
Preparing a shell script for internationalization is conceptually similar
|
|
to the steps described in section <A HREF="gettext_3.html#SEC13">3 Preparing Program Sources</A>. The concrete steps for shell
|
|
scripts are as follows.
|
|
|
|
</P>
|
|
|
|
<OL>
|
|
<LI>
|
|
|
|
Insert the line
|
|
|
|
|
|
<PRE>
|
|
. gettext.sh
|
|
</PRE>
|
|
|
|
near the top of the script. <CODE>gettext.sh</CODE> is a shell function library
|
|
that provides the functions
|
|
<CODE>eval_gettext</CODE> (see section <A HREF="gettext_13.html#SEC253">13.5.2.6 Invoking the <CODE>eval_gettext</CODE> function</A>) and
|
|
<CODE>eval_ngettext</CODE> (see section <A HREF="gettext_13.html#SEC254">13.5.2.7 Invoking the <CODE>eval_ngettext</CODE> function</A>).
|
|
You have to ensure that <CODE>gettext.sh</CODE> can be found in the <CODE>PATH</CODE>.
|
|
|
|
<LI>
|
|
|
|
Set and export the <CODE>TEXTDOMAIN</CODE> and <CODE>TEXTDOMAINDIR</CODE> environment
|
|
variables. Usually <CODE>TEXTDOMAIN</CODE> is the package or program name, and
|
|
<CODE>TEXTDOMAINDIR</CODE> is the absolute pathname corresponding to
|
|
<CODE>$prefix/share/locale</CODE>, where <CODE>$prefix</CODE> is the installation location.
|
|
|
|
|
|
<PRE>
|
|
TEXTDOMAIN=@PACKAGE@
|
|
export TEXTDOMAIN
|
|
TEXTDOMAINDIR=@LOCALEDIR@
|
|
export TEXTDOMAINDIR
|
|
</PRE>
|
|
|
|
<LI>
|
|
|
|
Prepare the strings for translation, as described in section <A HREF="gettext_3.html#SEC15">3.2 Preparing Translatable Strings</A>.
|
|
|
|
<LI>
|
|
|
|
Simplify translatable strings so that they don't contain command substitution
|
|
(<CODE>"`...`"</CODE> or <CODE>"$(...)"</CODE>), variable access with defaulting (like
|
|
<CODE>${<VAR>variable</VAR>-<VAR>default</VAR>}</CODE>), access to positional arguments
|
|
(like <CODE>$0</CODE>, <CODE>$1</CODE>, ...) or highly volatile shell variables (like
|
|
<CODE>$?</CODE>). This can always be done through simple local code restructuring.
|
|
For example,
|
|
|
|
|
|
<PRE>
|
|
echo "Usage: $0 [OPTION] FILE..."
|
|
</PRE>
|
|
|
|
becomes
|
|
|
|
|
|
<PRE>
|
|
program_name=$0
|
|
echo "Usage: $program_name [OPTION] FILE..."
|
|
</PRE>
|
|
|
|
Similarly,
|
|
|
|
|
|
<PRE>
|
|
echo "Remaining files: `ls | wc -l`"
|
|
</PRE>
|
|
|
|
becomes
|
|
|
|
|
|
<PRE>
|
|
filecount="`ls | wc -l`"
|
|
echo "Remaining files: $filecount"
|
|
</PRE>
|
|
|
|
<LI>
|
|
|
|
For each translatable string, change the output command <SAMP>`echo´</SAMP> or
|
|
<SAMP>`$echo´</SAMP> to <SAMP>`gettext´</SAMP> (if the string contains no references to
|
|
shell variables) or to <SAMP>`eval_gettext´</SAMP> (if it refers to shell variables),
|
|
followed by a no-argument <SAMP>`echo´</SAMP> command (to account for the terminating
|
|
newline). Similarly, for cases with plural handling, replace a conditional
|
|
<SAMP>`echo´</SAMP> command with an invocation of <SAMP>`ngettext´</SAMP> or
|
|
<SAMP>`eval_ngettext´</SAMP>, followed by a no-argument <SAMP>`echo´</SAMP> command.
|
|
|
|
When doing this, you also need to add an extra backslash before the dollar
|
|
sign in references to shell variables, so that the <SAMP>`eval_gettext´</SAMP>
|
|
function receives the translatable string before the variable values are
|
|
substituted into it. For example,
|
|
|
|
|
|
<PRE>
|
|
echo "Remaining files: $filecount"
|
|
</PRE>
|
|
|
|
becomes
|
|
|
|
|
|
<PRE>
|
|
eval_gettext "Remaining files: \$filecount"; echo
|
|
</PRE>
|
|
|
|
If the output command is not <SAMP>`echo´</SAMP>, you can make it use <SAMP>`echo´</SAMP>
|
|
nevertheless, through the use of backquotes. However, note that inside
|
|
backquotes, backslashes must be doubled to be effective (because the
|
|
backquoting eats one level of backslashes). For example, assuming that
|
|
<SAMP>`error´</SAMP> is a shell function that signals an error,
|
|
|
|
|
|
<PRE>
|
|
error "file not found: $filename"
|
|
</PRE>
|
|
|
|
is first transformed into
|
|
|
|
|
|
<PRE>
|
|
error "`echo \"file not found: \$filename\"`"
|
|
</PRE>
|
|
|
|
which then becomes
|
|
|
|
|
|
<PRE>
|
|
error "`eval_gettext \"file not found: \\\$filename\"`"
|
|
</PRE>
|
|
|
|
</OL>
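<P>
Put together, the steps above might look like the following sketch. The
function names are hypothetical, as are any domain name and catalog
directory passed to them; the functions only wrap the calls described in the
list, and <CODE>i18n_setup</CODE> assumes that <CODE>gettext.sh</CODE> can be
found in the <CODE>PATH</CODE>.
</P>
<PRE>
```shell
# i18n_setup DOMAIN LOCALEDIR
# Export the message domain settings, then load the function library.
i18n_setup() {
  TEXTDOMAIN=$1
  export TEXTDOMAIN
  TEXTDOMAINDIR=$2
  export TEXTDOMAINDIR
  . gettext.sh
}

# report_remaining COUNT
# The value lives in a plain variable, and the dollar sign is escaped so
# that eval_gettext receives the msgid before substitution takes place.
report_remaining() {
  filecount=$1
  eval_gettext "Remaining files: \$filecount"; echo
}
```
</PRE>
<P>
A script would call <CODE>i18n_setup</CODE> once near its top, with its own
package name and locale directory, and <CODE>report_remaining</CODE> wherever
the message is needed.
</P>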
|
|
|
|
|
|
|
|
<H4><A NAME="SEC249" HREF="gettext_toc.html#TOC249">13.5.2.2 Contents of <CODE>gettext.sh</CODE></A></H4>
|
|
|
|
<P>
|
|
<CODE>gettext.sh</CODE>, contained in the run-time package of GNU gettext, provides
|
|
the following:
|
|
|
|
</P>
|
|
|
|
<UL>
|
|
<LI>$echo
|
|
|
|
The variable <CODE>echo</CODE> is set to a command that outputs its first argument
|
|
and a newline, without interpreting backslashes in the argument string.
|
|
|
|
<LI>eval_gettext
|
|
|
|
See section <A HREF="gettext_13.html#SEC253">13.5.2.6 Invoking the <CODE>eval_gettext</CODE> function</A>.
|
|
|
|
<LI>eval_ngettext
|
|
|
|
See section <A HREF="gettext_13.html#SEC254">13.5.2.7 Invoking the <CODE>eval_ngettext</CODE> function</A>.
|
|
</UL>
|
|
|
|
|
|
|
|
<H4><A NAME="SEC250" HREF="gettext_toc.html#TOC250">13.5.2.3 Invoking the <CODE>gettext</CODE> program</A></H4>
|
|
|
|
<P>
|
|
<A NAME="IDX1083"></A>
|
|
<A NAME="IDX1084"></A>
|
|
|
|
<PRE>
|
|
gettext [<VAR>option</VAR>] [[<VAR>textdomain</VAR>] <VAR>msgid</VAR>]
|
|
gettext [<VAR>option</VAR>] -s [<VAR>msgid</VAR>]...
|
|
</PRE>
|
|
|
|
<P>
|
|
<A NAME="IDX1085"></A>
|
|
The <CODE>gettext</CODE> program displays the native language translation of a
|
|
textual message.
|
|
|
|
</P>
|
|
<P>
|
|
<STRONG>Arguments</STRONG>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT><SAMP>`-d <VAR>textdomain</VAR>´</SAMP>
|
|
<DD>
|
|
<DT><SAMP>`--domain=<VAR>textdomain</VAR>´</SAMP>
|
|
<DD>
|
|
<A NAME="IDX1086"></A>
|
|
<A NAME="IDX1087"></A>
|
|
Retrieve translated messages from <VAR>textdomain</VAR>. Usually a <VAR>textdomain</VAR>
|
|
corresponds to a package, a program, or a module of a program.
|
|
|
|
<DT><SAMP>`-e´</SAMP>
|
|
<DD>
|
|
<A NAME="IDX1088"></A>
|
|
Enable expansion of some escape sequences. This option is for compatibility
with the <SAMP>`echo´</SAMP> program or shell built-in. The escape sequences
<SAMP>`\a´</SAMP>, <SAMP>`\b´</SAMP>, <SAMP>`\c´</SAMP>, <SAMP>`\f´</SAMP>, <SAMP>`\n´</SAMP>, <SAMP>`\r´</SAMP>, <SAMP>`\t´</SAMP>,
<SAMP>`\v´</SAMP>, <SAMP>`\\´</SAMP>, and <SAMP>`\´</SAMP> followed by one to three octal digits, are
interpreted as the System V <SAMP>`echo´</SAMP> program interprets them.
|
|
|
|
<DT><SAMP>`-E´</SAMP>
|
|
<DD>
|
|
<A NAME="IDX1089"></A>
|
|
This option is only for compatibility with the <SAMP>`echo´</SAMP> program or shell
|
|
built-in. It has no effect.
|
|
|
|
<DT><SAMP>`-h´</SAMP>
|
|
<DD>
|
|
<DT><SAMP>`--help´</SAMP>
|
|
<DD>
|
|
<A NAME="IDX1090"></A>
|
|
<A NAME="IDX1091"></A>
|
|
Display this help and exit.
|
|
|
|
<DT><SAMP>`-n´</SAMP>
|
|
<DD>
|
|
<A NAME="IDX1092"></A>
|
|
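Suppress the trailing newline. This option matters only together with
<SAMP>`-s´</SAMP>: in that mode <CODE>gettext</CODE> adds a newline to the output by
default, whereas without <SAMP>`-s´</SAMP> no newline is added in the first place.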
|
|
|
|
<DT><SAMP>`-V´</SAMP>
|
|
<DD>
|
|
<DT><SAMP>`--version´</SAMP>
|
|
<DD>
|
|
<A NAME="IDX1093"></A>
|
|
<A NAME="IDX1094"></A>
|
|
Output version information and exit.
|
|
|
|
<DT><SAMP>`[<VAR>textdomain</VAR>] <VAR>msgid</VAR>´</SAMP>
|
|
<DD>
|
|
Retrieve translated message corresponding to <VAR>msgid</VAR> from <VAR>textdomain</VAR>.
|
|
|
|
</DL>
|
|
|
|
<P>
|
|
If the <VAR>textdomain</VAR> parameter is not given, the domain is determined from
|
|
the environment variable <CODE>TEXTDOMAIN</CODE>. If the message catalog is not
|
|
found in the regular directory, another location can be specified with the
|
|
environment variable <CODE>TEXTDOMAINDIR</CODE>.
|
|
|
|
</P>
|
|
<P>
|
|
When used with the <CODE>-s</CODE> option, the program behaves like the <SAMP>`echo´</SAMP>
command, except that it does not simply copy its arguments to standard output:
arguments for which a translation is found in the selected catalog are translated.
|
|
|
|
</P>
|
|
|
|
|
|
<H4><A NAME="SEC251" HREF="gettext_toc.html#TOC251">13.5.2.4 Invoking the <CODE>ngettext</CODE> program</A></H4>
|
|
|
|
<P>
|
|
<A NAME="IDX1095"></A>
|
|
<A NAME="IDX1096"></A>
|
|
|
|
<PRE>
|
|
ngettext [<VAR>option</VAR>] [<VAR>textdomain</VAR>] <VAR>msgid</VAR> <VAR>msgid-plural</VAR> <VAR>count</VAR>
|
|
</PRE>
|
|
|
|
<P>
|
|
<A NAME="IDX1097"></A>
|
|
The <CODE>ngettext</CODE> program displays the native language translation of a
|
|
textual message whose grammatical form depends on a number.
|
|
|
|
</P>
|
|
<P>
|
|
<STRONG>Arguments</STRONG>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT><SAMP>`-d <VAR>textdomain</VAR>´</SAMP>
|
|
<DD>
|
|
<DT><SAMP>`--domain=<VAR>textdomain</VAR>´</SAMP>
|
|
<DD>
|
|
<A NAME="IDX1098"></A>
|
|
<A NAME="IDX1099"></A>
|
|
Retrieve translated messages from <VAR>textdomain</VAR>. Usually a <VAR>textdomain</VAR>
|
|
corresponds to a package, a program, or a module of a program.
|
|
|
|
<DT><SAMP>`-e´</SAMP>
|
|
<DD>
|
|
<A NAME="IDX1100"></A>
|
|
Enable expansion of some escape sequences. This option is for compatibility
|
|
with the <SAMP>`gettext´</SAMP> program. The escape sequences
|
|
<SAMP>`\a´</SAMP>, <SAMP>`\b´</SAMP>, <SAMP>`\c´</SAMP>, <SAMP>`\f´</SAMP>, <SAMP>`\n´</SAMP>, <SAMP>`\r´</SAMP>, <SAMP>`\t´</SAMP>,
|
|
<SAMP>`\v´</SAMP>, <SAMP>`\\´</SAMP>, and <SAMP>`\´</SAMP> followed by one to three octal digits, are
|
|
interpreted as the System V <SAMP>`echo´</SAMP> program interprets them.
|
|
|
|
<DT><SAMP>`-E´</SAMP>
|
|
<DD>
|
|
<A NAME="IDX1101"></A>
|
|
This option is only for compatibility with the <SAMP>`gettext´</SAMP> program. It has
|
|
no effect.
|
|
|
|
<DT><SAMP>`-h´</SAMP>
|
|
<DD>
|
|
<DT><SAMP>`--help´</SAMP>
|
|
<DD>
|
|
<A NAME="IDX1102"></A>
|
|
<A NAME="IDX1103"></A>
|
|
Display this help and exit.
|
|
|
|
<DT><SAMP>`-V´</SAMP>
|
|
<DD>
|
|
<DT><SAMP>`--version´</SAMP>
|
|
<DD>
|
|
<A NAME="IDX1104"></A>
|
|
<A NAME="IDX1105"></A>
|
|
Output version information and exit.
|
|
|
|
<DT><SAMP>`<VAR>textdomain</VAR>´</SAMP>
|
|
<DD>
|
|
Retrieve translated message from <VAR>textdomain</VAR>.
|
|
|
|
<DT><SAMP>`<VAR>msgid</VAR> <VAR>msgid-plural</VAR>´</SAMP>
|
|
<DD>
|
|
Translate <VAR>msgid</VAR> (English singular) / <VAR>msgid-plural</VAR> (English plural).
|
|
|
|
<DT><SAMP>`<VAR>count</VAR>´</SAMP>
|
|
<DD>
|
|
Choose singular/plural form based on this value.
|
|
|
|
</DL>
|
|
|
|
<P>
|
|
If the <VAR>textdomain</VAR> parameter is not given, the domain is determined from
|
|
the environment variable <CODE>TEXTDOMAIN</CODE>. If the message catalog is not
|
|
found in the regular directory, another location can be specified with the
|
|
environment variable <CODE>TEXTDOMAINDIR</CODE>.
|
|
|
|
</P>
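<P>
The singular/plural choice made from <VAR>count</VAR> can be sketched in Python
(see the Python section of this chapter). This is only an illustration, not the
<CODE>ngettext</CODE> program itself: it assumes the standard library class
<CODE>gettext.NullTranslations</CODE>, which carries no catalog and therefore
applies the English rule (singular when the count is 1, plural otherwise).
The helper name <CODE>describe</CODE> and the messages are hypothetical.
</P>

```python
import gettext

# NullTranslations carries no catalog, so ngettext() falls back to the
# English rule: singular when n == 1, plural otherwise.
catalog = gettext.NullTranslations()


def describe(count):
    """Return a message whose grammatical form depends on count."""
    template = catalog.ngettext("%d file removed", "%d files removed", count)
    return template % count


if __name__ == "__main__":
    for n in (0, 1, 2):
        print(describe(n))
```

<P>
With a real message catalog, the plural formula stored in the catalog decides
which form is returned, so languages with more than two forms are handled by
the same call.
</P>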
|
|
|
|
|
|
<H4><A NAME="SEC252" HREF="gettext_toc.html#TOC252">13.5.2.5 Invoking the <CODE>envsubst</CODE> program</A></H4>
|
|
|
|
<P>
|
|
<A NAME="IDX1106"></A>
|
|
<A NAME="IDX1107"></A>
|
|
|
|
<PRE>
|
|
envsubst [<VAR>option</VAR>] [<VAR>shell-format</VAR>]
|
|
</PRE>
|
|
|
|
<P>
|
|
<A NAME="IDX1108"></A>
|
|
<A NAME="IDX1109"></A>
|
|
<A NAME="IDX1110"></A>
|
|
The <CODE>envsubst</CODE> program substitutes the values of environment variables.
|
|
|
|
</P>
|
|
<P>
|
|
<STRONG>Operation mode</STRONG>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT><SAMP>`-v´</SAMP>
|
|
<DD>
|
|
<DT><SAMP>`--variables´</SAMP>
|
|
<DD>
|
|
<A NAME="IDX1111"></A>
|
|
<A NAME="IDX1112"></A>
|
|
Output the variables occurring in <VAR>shell-format</VAR>.
|
|
|
|
</DL>
|
|
|
|
<P>
|
|
<STRONG>Informative output</STRONG>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT><SAMP>`-h´</SAMP>
|
|
<DD>
|
|
<DT><SAMP>`--help´</SAMP>
|
|
<DD>
|
|
<A NAME="IDX1113"></A>
|
|
<A NAME="IDX1114"></A>
|
|
Display this help and exit.
|
|
|
|
<DT><SAMP>`-V´</SAMP>
|
|
<DD>
|
|
<DT><SAMP>`--version´</SAMP>
|
|
<DD>
|
|
<A NAME="IDX1115"></A>
|
|
<A NAME="IDX1116"></A>
|
|
Output version information and exit.
|
|
|
|
</DL>
|
|
|
|
<P>
|
|
In normal operation mode, standard input is copied to standard output,
|
|
with references to environment variables of the form <CODE>$VARIABLE</CODE> or
|
|
<CODE>${VARIABLE}</CODE> being replaced with the corresponding values. If a
|
|
<VAR>shell-format</VAR> is given, only those environment variables that are
|
|
referenced in <VAR>shell-format</VAR> are substituted; otherwise all environment
|
|
variable references occurring in standard input are substituted.
|
|
|
|
</P>
|
|
<P>
|
|
These substitutions are a subset of the substitutions that a shell performs
|
|
on unquoted and double-quoted strings. Other kinds of substitutions done
|
|
by a shell, such as <CODE>${<VAR>variable</VAR>-<VAR>default</VAR>}</CODE> or
|
|
<CODE>$(<VAR>command-list</VAR>)</CODE> or <CODE>`<VAR>command-list</VAR>`</CODE>, are not performed
|
|
by the <CODE>envsubst</CODE> program, for security reasons.
|
|
|
|
</P>
|
|
<P>
|
|
When <CODE>--variables</CODE> is used, standard input is ignored, and the output
|
|
consists of the environment variables that are referenced in
|
|
<VAR>shell-format</VAR>, one per line.
|
|
|
|
</P>
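<P>
The substitution rules described above can be approximated by a short Python
sketch. It is not the <CODE>envsubst</CODE> implementation: the function names
<CODE>variables</CODE> and <CODE>substitute</CODE> are hypothetical, the
environment is passed in as a dictionary, and replacing an unset variable with
the empty string is an assumption of this sketch.
</P>

```python
import re

# Matches $NAME or ${NAME}; other shell forms such as ${NAME-default},
# $(command) or `command` are deliberately not recognized.
_VAR_REF = re.compile(r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))")


def variables(shell_format):
    """List the variable names referenced in shell_format, each once."""
    names = []
    for braced, plain in _VAR_REF.findall(shell_format):
        name = braced or plain
        if name not in names:
            names.append(name)
    return names


def substitute(text, env, shell_format=None):
    """Replace variable references in text with values from env.

    When shell_format is given, only the variables it references are
    replaced. Treating an unset variable as empty is an assumption.
    """
    allowed = None if shell_format is None else set(variables(shell_format))

    def replace(match):
        name = match.group(1) or match.group(2)
        if allowed is not None and name not in allowed:
            return match.group(0)  # leave the reference untouched
        return env.get(name, "")

    return _VAR_REF.sub(replace, text)
```

<P>
Because the pattern only recognizes the two plain reference forms, text such as
<CODE>$(rm -rf /)</CODE> passes through unchanged instead of being executed.
</P>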
|
|
|
|
|
|
<H4><A NAME="SEC253" HREF="gettext_toc.html#TOC253">13.5.2.6 Invoking the <CODE>eval_gettext</CODE> function</A></H4>
|
|
|
|
<P>
|
|
<A NAME="IDX1117"></A>
|
|
|
|
<PRE>
|
|
eval_gettext <VAR>msgid</VAR>
|
|
</PRE>
|
|
|
|
<P>
|
|
<A NAME="IDX1118"></A>
|
|
This function outputs the native language translation of a textual message,
|
|
performing dollar-substitution on the result. Note that only shell variables
|
|
mentioned in <VAR>msgid</VAR> will be dollar-substituted in the result.
|
|
|
|
</P>
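<P>
The restriction to variables mentioned in <VAR>msgid</VAR> is what keeps a
translation from pulling in arbitrary variables. A minimal Python sketch of the
idea follows; it is not the shell function itself, and the <CODE>translate</CODE>
callable and the <CODE>env</CODE> dictionary are hypothetical stand-ins for the
catalog lookup and the shell's variables.
</P>

```python
import re

# Matches $NAME or ${NAME} only.
_VAR_REF = re.compile(r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))")


def eval_gettext(msgid, translate, env):
    """Translate msgid, then expand only the variables that msgid mentions."""
    allowed = {braced or plain for braced, plain in _VAR_REF.findall(msgid)}

    def replace(match):
        name = match.group(1) or match.group(2)
        if name not in allowed:
            return match.group(0)  # introduced by the translation: keep as is
        return env.get(name, "")

    return _VAR_REF.sub(replace, translate(msgid))
```

<P>
If a translation mentions a variable that the original message does not, the
reference is left in the output verbatim rather than expanded.
</P>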
|
|
|
|
|
|
<H4><A NAME="SEC254" HREF="gettext_toc.html#TOC254">13.5.2.7 Invoking the <CODE>eval_ngettext</CODE> function</A></H4>
|
|
|
|
<P>
|
|
<A NAME="IDX1119"></A>
|
|
|
|
<PRE>
|
|
eval_ngettext <VAR>msgid</VAR> <VAR>msgid-plural</VAR> <VAR>count</VAR>
|
|
</PRE>
|
|
|
|
<P>
|
|
<A NAME="IDX1120"></A>
|
|
This function outputs the native language translation of a textual message
|
|
whose grammatical form depends on a number, performing dollar-substitution
|
|
on the result. Note that only shell variables mentioned in <VAR>msgid</VAR> or
|
|
<VAR>msgid-plural</VAR> will be dollar-substituted in the result.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC255" HREF="gettext_toc.html#TOC255">13.5.3 bash - Bourne-Again Shell Script</A></H3>
|
|
<P>
|
|
<A NAME="IDX1121"></A>
|
|
|
|
</P>
|
|
<P>
|
|
GNU <CODE>bash</CODE> 2.0 or newer has a special shorthand for translating a
|
|
string and substituting variable values in it: <CODE>$"msgid"</CODE>. But
|
|
the use of this construct is <STRONG>discouraged</STRONG>, due to the security
|
|
holes it opens and due to its portability problems.
|
|
|
|
</P>
|
|
<P>
|
|
The security holes of <CODE>$"..."</CODE> come from the fact that after looking up
|
|
the translation of the string, <CODE>bash</CODE> processes it like any other
double-quoted string: it performs dollar and backquote processing, as <SAMP>`eval´</SAMP>
does.
|
|
|
|
</P>
|
|
|
|
<OL>
|
|
<LI>
|
|
|
|
In a locale whose encoding is one of BIG5, BIG5-HKSCS, GBK, GB18030, SHIFT_JIS,
|
|
JOHAB, some double-byte characters have a second byte whose value is
|
|
<CODE>0x60</CODE>. For example, the byte sequence <CODE>\xe0\x60</CODE> is a single
|
|
character in these locales. Many versions of <CODE>bash</CODE> (all versions
|
|
up to bash-2.05, and newer versions on platforms without the <CODE>mbsrtowcs()</CODE>
function) don't know about character boundaries and see a backquote character
where there is only a particular Chinese character. Thus <CODE>bash</CODE> can start
|
|
executing part of the translation as a command list. This situation can occur
|
|
even without the translator being aware of it: if the translator provides
|
|
translations in the UTF-8 encoding, it is the <CODE>gettext()</CODE> function which
|
|
will, during its conversion from the translator's encoding to the user's
|
|
locale's encoding, produce the dangerous <CODE>\x60</CODE> bytes.
|
|
|
|
<LI>
|
|
|
|
A translator could - deliberately or inadvertently - use backquotes
|
|
<CODE>"`...`"</CODE> or dollar-parentheses <CODE>"$(...)"</CODE> in her translations.
|
|
The enclosed strings would be executed as command lists by the shell.
|
|
</OL>
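<P>
The first hole can be made concrete with a small Python sketch. It takes the
byte sequence <CODE>\xe0\x60</CODE> mentioned above and assumes that Python's
<CODE>gbk</CODE> codec maps it to a single character; the helper name
<CODE>has_backquote_byte</CODE> is hypothetical. The point is that the UTF-8
form a translator works with is harmless, while the form converted to the
user's locale encoding contains a <CODE>0x60</CODE> byte.
</P>

```python
# A GBK double-byte character whose second byte is 0x60, the ASCII backquote.
raw = b"\xe0\x60"
char = raw.decode("gbk")

# The same character stored in a UTF-8 catalog contains no 0x60 byte...
utf8_bytes = char.encode("utf-8")
# ...but converting it to the user's GBK locale reintroduces one.
gbk_bytes = char.encode("gbk")


def has_backquote_byte(data):
    """Return True if the byte string contains the byte 0x60."""
    return b"`" in data


if __name__ == "__main__":
    print(has_backquote_byte(utf8_bytes), has_backquote_byte(gbk_bytes))
```

<P>
A shell that scans the converted bytes without regard to character boundaries
would treat that second byte as the start of a command substitution.
</P>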
|
|
|
|
<P>
|
|
The portability problem is that <CODE>bash</CODE> must be built with
|
|
internationalization support; this is normally not the case on systems
|
|
that don't have the <CODE>gettext()</CODE> function in libc.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC256" HREF="gettext_toc.html#TOC256">13.5.4 Python</A></H3>
|
|
<P>
|
|
<A NAME="IDX1122"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
python
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>py</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>'abc'</CODE>, <CODE>u'abc'</CODE>, <CODE>r'abc'</CODE>, <CODE>ur'abc'</CODE>,
|
|
<BR><CODE>"abc"</CODE>, <CODE>u"abc"</CODE>, <CODE>r"abc"</CODE>, <CODE>ur"abc"</CODE>,
|
|
<BR><CODE>'''abc'''</CODE>, <CODE>u'''abc'''</CODE>, <CODE>r'''abc'''</CODE>, <CODE>ur'''abc'''</CODE>,
|
|
<BR><CODE>"""abc"""</CODE>, <CODE>u"""abc"""</CODE>, <CODE>r"""abc"""</CODE>, <CODE>ur"""abc"""</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>_('abc')</CODE> etc.
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>gettext.gettext</CODE>, <CODE>gettext.dgettext</CODE>,
|
|
<CODE>gettext.ngettext</CODE>, <CODE>gettext.dngettext</CODE>,
|
|
also <CODE>ugettext</CODE>, <CODE>ungettext</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>gettext.textdomain</CODE> function, or
|
|
<CODE>gettext.install(<VAR>domain</VAR>)</CODE> function
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
<CODE>gettext.bindtextdomain</CODE> function, or
|
|
<CODE>gettext.install(<VAR>domain</VAR>,<VAR>localedir</VAR>)</CODE> function
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
not used by the gettext emulation
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
<CODE>import gettext</CODE>
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
emulate
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
<CODE>'...%(ident)d...' % { 'ident': value }</CODE>
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
fully portable
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
<P>
|
|
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-python</CODE>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC257" HREF="gettext_toc.html#TOC257">13.5.5 GNU clisp - Common Lisp</A></H3>
|
|
<P>
|
|
<A NAME="IDX1123"></A>
|
|
<A NAME="IDX1124"></A>
|
|
<A NAME="IDX1125"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
clisp 2.28 or newer
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>lisp</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>(_ "abc")</CODE>, <CODE>(ENGLISH "abc")</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>i18n:gettext</CODE>, <CODE>i18n:ngettext</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>i18n:textdomain</CODE>
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
<CODE>i18n:textdomaindir</CODE>
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
automatic
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
---
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
use
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext -k_ -kENGLISH</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
<CODE>format "~1@*~D ~0@*~D"</CODE>
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
On platforms without gettext, no translation.
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
<P>
|
|
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-clisp</CODE>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC258" HREF="gettext_toc.html#TOC258">13.5.6 GNU clisp C sources</A></H3>
|
|
<P>
|
|
<A NAME="IDX1126"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
clisp
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>d</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>ENGLISH ? "abc" : ""</CODE>
|
|
<BR><CODE>GETTEXT("abc")</CODE>
|
|
<BR><CODE>GETTEXTL("abc")</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>clgettext</CODE>, <CODE>clgettextl</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
---
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
---
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
automatic
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
<CODE>#include "lispbibl.c"</CODE>
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
use
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>clisp-xgettext</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
<CODE>fprintf "%2$d %1$d"</CODE>
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
On platforms without gettext, no translation.
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
|
|
|
|
<H3><A NAME="SEC259" HREF="gettext_toc.html#TOC259">13.5.7 Emacs Lisp</A></H3>
|
|
<P>
|
|
<A NAME="IDX1127"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
emacs, xemacs
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>el</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>(_"abc")</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>gettext</CODE>, <CODE>dgettext</CODE> (xemacs only)
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>domain</CODE> special form (xemacs only)
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
<CODE>bind-text-domain</CODE> function (xemacs only)
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
automatic
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
---
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
use
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
<CODE>format "%2$d %1$d"</CODE>
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
Only XEmacs. Without <CODE>I18N3</CODE> defined at build time, no translation.
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
|
|
|
|
<H3><A NAME="SEC260" HREF="gettext_toc.html#TOC260">13.5.8 librep</A></H3>
|
|
<P>
|
|
<A NAME="IDX1128"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
librep 0.15.3 or newer
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>jl</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>(_"abc")</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>gettext</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>textdomain</CODE> function
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
<CODE>bindtextdomain</CODE> function
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
---
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
<CODE>(require 'rep.i18n.gettext)</CODE>
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
use
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
<CODE>format "%2$d %1$d"</CODE>
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
On platforms without gettext, no translation.
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
<P>
|
|
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-librep</CODE>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC261" HREF="gettext_toc.html#TOC261">13.5.9 GNU guile - Scheme</A></H3>
|
|
<P>
|
|
<A NAME="IDX1129"></A>
|
|
<A NAME="IDX1130"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
guile
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>scm</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>(_ "abc")</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>gettext</CODE>, <CODE>ngettext</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>textdomain</CODE>
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
<CODE>bindtextdomain</CODE>
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
<CODE>(catch #t (lambda () (setlocale LC_ALL "")) (lambda args #f))</CODE>
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
<CODE>(use-modules (ice-9 format))</CODE>
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
use
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext -k_</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
---
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
On platforms without gettext, no translation.
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
<P>
|
|
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-guile</CODE>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC262" HREF="gettext_toc.html#TOC262">13.5.10 GNU Smalltalk</A></H3>
|
|
<P>
|
|
<A NAME="IDX1131"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
smalltalk
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>st</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>'abc'</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>NLS ? 'abc'</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>LcMessagesDomain>>#at:</CODE>, <CODE>LcMessagesDomain>>#at:plural:with:</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>LcMessages>>#domain:localeDirectory:</CODE> (returns a <CODE>LcMessagesDomain</CODE>
|
|
object).<BR>
|
|
Example: <CODE>I18N Locale default messages domain: 'gettext' localeDirectory: '/usr/local/share/locale'</CODE>
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
<CODE>LcMessages>>#domain:localeDirectory:</CODE>, see above.
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
Automatic if you use <CODE>I18N Locale default</CODE>.
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
<CODE>PackageLoader fileInPackage: 'I18N'!</CODE>
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
emulate
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
<CODE>'%1 %2' bindWith: 'Hello' with: 'world'</CODE>
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
fully portable
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
<P>
|
|
An example is available in the <TT>`examples´</TT> directory:
|
|
<CODE>hello-smalltalk</CODE>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC263" HREF="gettext_toc.html#TOC263">13.5.11 Java</A></H3>
|
|
<P>
|
|
<A NAME="IDX1132"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
java, java2
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>java</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
<CODE>"abc"</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>_("abc")</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>GettextResource.gettext</CODE>, <CODE>GettextResource.ngettext</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
---, use <CODE>ResourceBundle.getBundle</CODE> instead
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
---, use CLASSPATH instead
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
automatic
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
---
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
---, uses a Java specific message catalog format
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext -k_</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
<CODE>MessageFormat.format "{1,number} {0,number}"</CODE>
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
fully portable
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
<P>
|
|
Before marking strings as internationalizable, uses of the string
|
|
concatenation operator need to be converted to <CODE>MessageFormat</CODE>
|
|
applications. For example, <CODE>"file "+filename+" not found"</CODE> becomes
|
|
<CODE>MessageFormat.format("file {0} not found", new Object[] { filename })</CODE>.
|
|
Only after this is done can the strings be marked and extracted.
|
|
|
|
</P>
|
|
<P>
|
|
GNU gettext uses the native Java internationalization mechanism, namely
|
|
<CODE>ResourceBundle</CODE>s. There are two formats of <CODE>ResourceBundle</CODE>s:
|
|
<CODE>.properties</CODE> files and <CODE>.class</CODE> files. The <CODE>.properties</CODE>
|
|
format is a text file which the translators can directly edit, like PO
|
|
files, but which doesn't support plural forms. The <CODE>.class</CODE> format,
by contrast, is compiled from <CODE>.java</CODE> source code and can support plural
forms (provided it is accessed through an appropriate API, see below).
|
|
|
|
</P>
|
|
<P>
|
|
To convert a PO file to a <CODE>.properties</CODE> file, the <CODE>msgcat</CODE>
|
|
program can be used with the option <CODE>--properties-output</CODE>. To convert
|
|
a <CODE>.properties</CODE> file back to a PO file, the <CODE>msgcat</CODE> program
|
|
can be used with the option <CODE>--properties-input</CODE>. All the tools
|
|
that manipulate PO files can work with <CODE>.properties</CODE> files as well,
|
|
if given the <CODE>--properties-input</CODE> and/or <CODE>--properties-output</CODE>
|
|
option.
|
|
|
|
</P>
|
|
<P>
|
|
To convert a PO file to a ResourceBundle class, the <CODE>msgfmt</CODE> program
|
|
can be used with the option <CODE>--java</CODE> or <CODE>--java2</CODE>. To convert a
|
|
ResourceBundle back to a PO file, the <CODE>msgunfmt</CODE> program can be used
|
|
with the option <CODE>--java</CODE>.
|
|
|
|
</P>
|
|
<P>
|
|
Two different programmatic APIs can be used to access ResourceBundles.
|
|
Note that both APIs work with all kinds of ResourceBundles, whether
|
|
GNU gettext generated classes, or other <CODE>.class</CODE> or <CODE>.properties</CODE>
|
|
files.
|
|
|
|
</P>
|
|
|
|
<OL>
|
|
<LI>
|
|
|
|
The <CODE>java.util.ResourceBundle</CODE> API.
|
|
|
|
In particular, its <CODE>getString</CODE> function returns a string translation.
|
|
Note that a missing translation yields a <CODE>MissingResourceException</CODE>.
|
|
|
|
This has the advantage of being the standard API. And it does not require
|
|
any additional libraries, only the <CODE>msgcat</CODE> generated <CODE>.properties</CODE>
|
|
files or the <CODE>msgfmt</CODE> generated <CODE>.class</CODE> files. But it cannot do
|
|
plural handling, even if the resource was generated by <CODE>msgfmt</CODE> from
|
|
a PO file with plural handling.
|
|
|
|
<LI>
|
|
|
|
The <CODE>gnu.gettext.GettextResource</CODE> API.
|
|
|
|
Reference documentation in Javadoc 1.1 style format
|
|
is in the <A HREF="javadoc1/tree.html">javadoc1 directory</A> and
|
|
in Javadoc 2 style format
|
|
in the <A HREF="javadoc2/index.html">javadoc2 directory</A>.
|
|
|
|
Its <CODE>gettext</CODE> function returns a string translation. Note that when
|
|
a translation is missing, the <VAR>msgid</VAR> argument is returned unchanged.
|
|
|
|
This has the advantage of having the <CODE>ngettext</CODE> function for plural
|
|
handling.
|
|
|
|
<A NAME="IDX1133"></A>
|
|
To use this API, one needs the <CODE>libintl.jar</CODE> file which is part of
|
|
the GNU gettext package and distributed under the LGPL.
|
|
</OL>
|
|
|
|
<P>
|
|
Three examples, using the second API, are available in the <TT>`examples´</TT>
|
|
directory: <CODE>hello-java</CODE>, <CODE>hello-java-awt</CODE>, <CODE>hello-java-swing</CODE>.
|
|
|
|
</P>
|
|
<P>
|
|
Now, to make use of the API and define a shorthand for <SAMP>`getString´</SAMP>,
|
|
there are two idioms that you can choose from:
|
|
|
|
</P>
|
|
|
|
<UL>
|
|
<LI>
|
|
|
|
In a unique class of your project, say <SAMP>`Util´</SAMP>, define a static variable
|
|
holding the <CODE>ResourceBundle</CODE> instance:
|
|
|
|
|
|
<PRE>
|
|
public static ResourceBundle myResources =
|
|
ResourceBundle.getBundle("domain-name");
|
|
</PRE>
|
|
|
|
All classes containing internationalized strings then contain
|
|
|
|
|
|
<PRE>
|
|
private static ResourceBundle res = Util.myResources;
|
|
private static String _(String s) { return res.getString(s); }
|
|
</PRE>
|
|
|
|
and the shorthand is used like this:
|
|
|
|
|
|
<PRE>
|
|
System.out.println(_("Operation completed."));
|
|
</PRE>
|
|
|
|
<LI>
|
|
|
|
You add a class with a very short name, say <SAMP>`S´</SAMP>, containing just the
|
|
definition of the resource bundle and of the shorthand:
|
|
|
|
|
|
<PRE>
|
|
public class S {
|
|
public static ResourceBundle myResources =
|
|
ResourceBundle.getBundle("domain-name");
|
|
public static String _(String s) {
|
|
return myResources.getString(s);
|
|
}
|
|
}
|
|
</PRE>
|
|
|
|
and the shorthand is used like this:
|
|
|
|
|
|
<PRE>
|
|
System.out.println(S._("Operation completed."));
|
|
</PRE>
|
|
|
|
</UL>
|
|
|
|
<P>
|
|
Which of the two idioms you choose will depend on whether copying two lines
of code into every class is more acceptable in your project than a class
with a single-letter name.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC264" HREF="gettext_toc.html#TOC264">13.5.12 C#</A></H3>
|
|
<P>
|
|
<A NAME="IDX1134"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
pnet, pnetlib 0.6.2 or newer, or mono 0.29 or newer
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>cs</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>, <CODE>@"abc"</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>_("abc")</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>GettextResourceManager.GetString</CODE>,
|
|
<CODE>GettextResourceManager.GetPluralString</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>new GettextResourceManager(domain)</CODE>
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
---, compiled message catalogs are located in subdirectories of the directory
|
|
containing the executable
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
automatic
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
---
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
---, uses a C# specific message catalog format
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext -k_</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
<CODE>String.Format "{1} {0}"</CODE>
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
fully portable
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
<P>
|
|
Before marking strings as internationalizable, uses of the string
|
|
concatenation operator need to be converted to <CODE>String.Format</CODE>
|
|
invocations. For example, <CODE>"file "+filename+" not found"</CODE> becomes
|
|
<CODE>String.Format("file {0} not found", filename)</CODE>.
|
|
Only after this is done can the strings be marked and extracted.
|
|
|
|
</P>
|
|
<P>
|
|
GNU gettext uses the native C#/.NET internationalization mechanism, namely
|
|
the classes <CODE>ResourceManager</CODE> and <CODE>ResourceSet</CODE>. Applications
|
|
use the <CODE>ResourceManager</CODE> methods to retrieve the native language
|
|
translation of strings. An instance of <CODE>ResourceSet</CODE> is the in-memory
|
|
representation of a message catalog file. The <CODE>ResourceManager</CODE> loads
|
|
and accesses <CODE>ResourceSet</CODE> instances as needed to look up the
|
|
translations.
|
|
|
|
</P>
|
|
<P>
|
|
There are two formats of <CODE>ResourceSet</CODE>s that can be directly loaded by
|
|
the C# runtime: <CODE>.resources</CODE> files and <CODE>.dll</CODE> files.
|
|
|
|
</P>
|
|
|
|
<UL>
|
|
<LI>
|
|
|
|
The <CODE>.resources</CODE> format is a binary format, usually generated through the
<CODE>resgen</CODE> or <CODE>monoresgen</CODE> utility, which doesn't support plural
forms. <CODE>.resources</CODE> files can also be embedded in .NET <CODE>.exe</CODE> files.
|
|
This only affects whether a file system access is performed to load the message
|
|
catalog; it doesn't affect the contents of the message catalog.
|
|
|
|
<LI>
|
|
|
|
On the other hand, the <CODE>.dll</CODE> format is a binary file that is compiled
|
|
from <CODE>.cs</CODE> source code and can support plural forms (provided it is
|
|
accessed through the GNU gettext API, see below).
|
|
</UL>
|
|
|
|
<P>
|
|
Note that these .NET <CODE>.dll</CODE> and <CODE>.exe</CODE> files are not tied to a
|
|
particular platform; their file format and GNU gettext for C# can be used
|
|
on any platform.
|
|
|
|
</P>
|
|
<P>
|
|
To convert a PO file to a <CODE>.resources</CODE> file, the <CODE>msgfmt</CODE> program
|
|
can be used with the option <SAMP>`--csharp-resources´</SAMP>. To convert a
|
|
<CODE>.resources</CODE> file back to a PO file, the <CODE>msgunfmt</CODE> program can be
|
|
used with the option <SAMP>`--csharp-resources´</SAMP>. You can also, in some cases,
|
|
use the <CODE>resgen</CODE> program (from the <CODE>pnet</CODE> package) or the
|
|
<CODE>monoresgen</CODE> program (from the <CODE>mono</CODE>/<CODE>mcs</CODE> package). These
|
|
programs can also convert a <CODE>.resources</CODE> file back to a PO file. But
|
|
beware: as of this writing (January 2004), the <CODE>monoresgen</CODE> converter is
|
|
quite buggy and the <CODE>resgen</CODE> converter ignores the encoding of the PO
|
|
files.
|
|
|
|
</P>
|
|
<P>
|
|
To convert a PO file to a <CODE>.dll</CODE> file, the <CODE>msgfmt</CODE> program can be
|
|
used with the option <CODE>--csharp</CODE>. The result will be a <CODE>.dll</CODE> file
|
|
containing a subclass of <CODE>GettextResourceSet</CODE>, which itself is a subclass
|
|
of <CODE>ResourceSet</CODE>. To convert a <CODE>.dll</CODE> file containing a
|
|
<CODE>GettextResourceSet</CODE> subclass back to a PO file, the <CODE>msgunfmt</CODE>
|
|
program can be used with the option <CODE>--csharp</CODE>.
|
|
|
|
</P>
|
|
<P>
|
|
The advantages of the <CODE>.dll</CODE> format over the <CODE>.resources</CODE> format
|
|
are:
|
|
|
|
</P>
|
|
|
|
<OL>
|
|
<LI>
|
|
|
|
Freedom to localize: Users can add their own translations to an application
|
|
after it has been built and distributed. By contrast, when the programmer uses
|
|
a <CODE>ResourceManager</CODE> constructor provided by the system, the set of
|
|
<CODE>.resources</CODE> files for an application must be specified when the
|
|
application is built and cannot be extended afterwards.
|
|
|
|
<LI>
|
|
|
|
Plural handling: A message catalog in <CODE>.dll</CODE> format supports the plural
|
|
handling function <CODE>GetPluralString</CODE>. <CODE>.resources</CODE> files, by contrast, can
|
|
only contain data and only support lookups that depend on a single string.
|
|
|
|
<LI>
|
|
|
|
The <CODE>GettextResourceManager</CODE> that loads the message catalogs in
|
|
<CODE>.dll</CODE> format also provides for inheritance on a per-message basis.
|
|
For example, in Austrian (<CODE>de_AT</CODE>) locale, translations from the German
|
|
(<CODE>de</CODE>) message catalog will be used for messages not found in the
|
|
Austrian message catalog. This has the consequence that the Austrian
|
|
translators need only translate those few messages for which the translation
|
|
into Austrian differs from the German one. By contrast, when working with
|
|
<CODE>.resources</CODE> files, each message catalog must provide the translations
|
|
of all messages by itself.
|
|
|
|
<LI>
|
|
|
|
The <CODE>GettextResourceManager</CODE> that loads the message catalogs in
|
|
<CODE>.dll</CODE> format also provides for a fallback: The English <VAR>msgid</VAR> is
|
|
returned when no translation can be found. By contrast, when working with
|
|
<CODE>.resources</CODE> files, a language-neutral <CODE>.resources</CODE> file must
|
|
explicitly be provided as a fallback.
|
|
</OL>
|
|
|
|
<P>
|
|
On the side of the programmatic APIs, the programmer can use either the
|
|
standard <CODE>ResourceManager</CODE> API or the GNU <CODE>GettextResourceManager</CODE>
|
|
API. The latter is an extension of the former, because
|
|
<CODE>GettextResourceManager</CODE> is a subclass of <CODE>ResourceManager</CODE>.
|
|
|
|
</P>
|
|
|
|
<OL>
|
|
<LI>
|
|
|
|
The <CODE>System.Resources.ResourceManager</CODE> API.
|
|
|
|
This API works with resources in <CODE>.resources</CODE> format.
|
|
|
|
The creation of the <CODE>ResourceManager</CODE> is done through
|
|
|
|
<PRE>
|
|
new ResourceManager(domainname, Assembly.GetExecutingAssembly())
|
|
</PRE>
|
|
|
|
|
|
The <CODE>GetString</CODE> function returns a string's translation. Note that this
|
|
function returns null when a translation is missing (i.e. not even found in
|
|
the fallback resource file).
|
|
|
|
<LI>
|
|
|
|
The <CODE>GNU.Gettext.GettextResourceManager</CODE> API.
|
|
|
|
This API works with resources in <CODE>.dll</CODE> format.
|
|
|
|
Reference documentation is in the
|
|
<A HREF="csharpdoc/index.html">csharpdoc directory</A>.
|
|
|
|
The creation of the <CODE>ResourceManager</CODE> is done through
|
|
|
|
<PRE>
|
|
new GettextResourceManager(domainname)
|
|
</PRE>
|
|
|
|
The <CODE>GetString</CODE> function returns a string's translation. Note that when
|
|
a translation is missing, the <VAR>msgid</VAR> argument is returned unchanged.
|
|
|
|
The <CODE>GetPluralString</CODE> function returns a string translation with plural
|
|
handling, like the <CODE>ngettext</CODE> function in C.
|
|
|
|
<A NAME="IDX1135"></A>
|
|
To use this API, one needs the <CODE>GNU.Gettext.dll</CODE> file which is part of
|
|
the GNU gettext package and distributed under the LGPL.
|
|
</OL>
|
|
|
|
<P>
|
|
You can also mix both approaches: use the
|
|
<CODE>GNU.Gettext.GettextResourceManager</CODE> constructor, but otherwise use
|
|
only the <CODE>ResourceManager</CODE> type and only the <CODE>GetString</CODE> method.
|
|
This is appropriate when you want to profit from the tools for PO files,
|
|
but don't want to change existing source code that uses
|
|
<CODE>ResourceManager</CODE> and don't (yet) need the <CODE>GetPluralString</CODE> method.
|
|
|
|
</P>
|
|
<P>
|
|
Two examples, using the second API, are available in the <TT>`examples´</TT>
|
|
directory: <CODE>hello-csharp</CODE>, <CODE>hello-csharp-forms</CODE>.
|
|
|
|
</P>
|
|
<P>
|
|
Now, to make use of the API and define a shorthand for <SAMP>`GetString´</SAMP>,
|
|
there are two idioms that you can choose from:
|
|
|
|
</P>
|
|
|
|
<UL>
|
|
<LI>
|
|
|
|
In a single class of your project, say <SAMP>`Util´</SAMP>, define a static variable
|
|
holding the <CODE>ResourceManager</CODE> instance:
|
|
|
|
|
|
<PRE>
|
|
public static GettextResourceManager MyResourceManager =
|
|
new GettextResourceManager("domain-name");
|
|
</PRE>
|
|
|
|
All classes containing internationalized strings then contain
|
|
|
|
|
|
<PRE>
|
|
private static GettextResourceManager Res = Util.MyResourceManager;
|
|
private static String _(String s) { return Res.GetString(s); }
|
|
</PRE>
|
|
|
|
and the shorthand is used like this:
|
|
|
|
|
|
<PRE>
|
|
Console.WriteLine(_("Operation completed."));
|
|
</PRE>
|
|
|
|
<LI>
|
|
|
|
You add a class with a very short name, say <SAMP>`S´</SAMP>, containing just the
|
|
definition of the resource manager and of the shorthand:
|
|
|
|
|
|
<PRE>
|
|
public class S {
|
|
public static GettextResourceManager MyResourceManager =
|
|
new GettextResourceManager("domain-name");
|
|
public static String _(String s) {
|
|
return MyResourceManager.GetString(s);
|
|
}
|
|
}
|
|
</PRE>
|
|
|
|
and the shorthand is used like this:
|
|
|
|
|
|
<PRE>
|
|
Console.WriteLine(S._("Operation completed."));
|
|
</PRE>
|
|
|
|
</UL>
|
|
|
|
<P>
|
|
Which of the two idioms you choose will depend on whether copying two lines
|
|
of code into every class is more acceptable in your project than a class
|
|
with a single-letter name.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC265" HREF="gettext_toc.html#TOC265">13.5.13 GNU awk</A></H3>
|
|
<P>
|
|
<A NAME="IDX1136"></A>
|
|
<A NAME="IDX1137"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
gawk 3.1 or newer
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>awk</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>_"abc"</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>dcgettext</CODE>, missing <CODE>dcngettext</CODE> in gawk-3.1.0
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>TEXTDOMAIN</CODE> variable
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
<CODE>bindtextdomain</CODE> function
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
automatic, but missing <CODE>setlocale (LC_MESSAGES, "")</CODE> in gawk-3.1.0
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
---
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
use
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
<CODE>printf "%2$d %1$d"</CODE> (GNU awk only)
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
On platforms without gettext, no translation. On non-GNU awks, you must
|
|
define <CODE>dcgettext</CODE>, <CODE>dcngettext</CODE> and <CODE>bindtextdomain</CODE>
|
|
yourself.
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
<P>
|
|
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-gawk</CODE>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC266" HREF="gettext_toc.html#TOC266">13.5.14 Pascal - Free Pascal Compiler</A></H3>
|
|
<P>
|
|
<A NAME="IDX1138"></A>
|
|
<A NAME="IDX1139"></A>
|
|
<A NAME="IDX1140"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
fpk
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>pp</CODE>, <CODE>pas</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>'abc'</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
automatic
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
---, use <CODE>ResourceString</CODE> data type instead
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
---, use <CODE>TranslateResourceStrings</CODE> function instead
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
---, use <CODE>TranslateResourceStrings</CODE> function instead
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
automatic, but uses only LANG, not LC_MESSAGES or LC_ALL
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
<CODE>{$mode delphi}</CODE> or <CODE>{$mode objfpc}</CODE><BR><CODE>uses gettext;</CODE>
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
emulate partially
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>ppc386</CODE> followed by <CODE>xgettext</CODE> or <CODE>rstconv</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
<CODE>uses sysutils;</CODE><BR><CODE>format "%1:d %0:d"</CODE>
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
?
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
<P>
|
|
The Pascal compiler has special support for the <CODE>ResourceString</CODE> data
|
|
type. It generates a <CODE>.rst</CODE> file. This is then converted to a
|
|
<CODE>.pot</CODE> file by use of <CODE>xgettext</CODE> or <CODE>rstconv</CODE>. At runtime,
|
|
a <CODE>.mo</CODE> file corresponding to translations of this <CODE>.pot</CODE> file
|
|
can be loaded using the <CODE>TranslateResourceStrings</CODE> function in the
|
|
<CODE>gettext</CODE> unit.
|
|
|
|
</P>
|
|
<P>
|
|
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-pascal</CODE>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC267" HREF="gettext_toc.html#TOC267">13.5.15 wxWindows library</A></H3>
|
|
<P>
|
|
<A NAME="IDX1141"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
wxGTK, gettext
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>cpp</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>_("abc")</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>wxLocale::GetString</CODE>, <CODE>wxGetTranslation</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>wxLocale::AddCatalog</CODE>
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
<CODE>wxLocale::AddCatalogLookupPathPrefix</CODE>
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
<CODE>wxLocale::Init</CODE>, <CODE>wxSetLocale</CODE>
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
<CODE>#include <wx/intl.h></CODE>
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
emulate, see <CODE>include/wx/intl.h</CODE> and <CODE>src/common/intl.cpp</CODE>
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
---
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
fully portable
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
yes
|
|
</DL>
|
|
|
|
|
|
|
|
<H3><A NAME="SEC268" HREF="gettext_toc.html#TOC268">13.5.16 YCP - YaST2 scripting language</A></H3>
|
|
<P>
|
|
<A NAME="IDX1142"></A>
|
|
<A NAME="IDX1143"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
libycp, libycp-devel, yast2-core, yast2-core-devel
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>ycp</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>_("abc")</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>_()</CODE> with 1 or 3 arguments
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>textdomain</CODE> statement
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
---
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
---
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
---
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
use
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
<CODE>sformat "%2 %1"</CODE>
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
fully portable
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
<P>
|
|
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-ycp</CODE>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC269" HREF="gettext_toc.html#TOC269">13.5.17 Tcl - Tk's scripting language</A></H3>
|
|
<P>
|
|
<A NAME="IDX1144"></A>
|
|
<A NAME="IDX1145"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
tcl
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>tcl</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>[_ "abc"]</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>::msgcat::mc</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
---
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
---, use <CODE>::msgcat::mcload</CODE> instead
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
automatic, uses LANG, but ignores LC_MESSAGES and LC_ALL
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
<CODE>package require msgcat</CODE>
|
|
<BR><CODE>proc _ {s} {return [::msgcat::mc $s]}</CODE>
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
---, uses a Tcl specific message catalog format
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext -k_</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
<CODE>format "%2\$d %1\$d"</CODE>
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
fully portable
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
<P>
|
|
Two examples are available in the <TT>`examples´</TT> directory:
|
|
<CODE>hello-tcl</CODE>, <CODE>hello-tcl-tk</CODE>.
|
|
|
|
</P>
|
|
<P>
|
|
Before marking strings as internationalizable, substitutions of variables
|
|
into the string need to be converted to <CODE>format</CODE> applications. For
|
|
example, <CODE>"file $filename not found"</CODE> becomes
|
|
<CODE>[format "file %s not found" $filename]</CODE>.
|
|
Only after this is done, can the strings be marked and extracted.
|
|
After marking, this example becomes
|
|
<CODE>[format [_ "file %s not found"] $filename]</CODE> or
|
|
<CODE>[msgcat::mc "file %s not found" $filename]</CODE>. Note that the
|
|
<CODE>msgcat::mc</CODE> function implicitly calls <CODE>format</CODE> when more than one
|
|
argument is given.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC270" HREF="gettext_toc.html#TOC270">13.5.18 Perl</A></H3>
|
|
<P>
|
|
<A NAME="IDX1146"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
perl
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>pl</CODE>, <CODE>PL</CODE>, <CODE>pm</CODE>, <CODE>cgi</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
|
|
<UL>
|
|
|
|
<LI><CODE>"abc"</CODE>
|
|
|
|
<LI><CODE>'abc'</CODE>
|
|
|
|
<LI><CODE>qq (abc)</CODE>
|
|
|
|
<LI><CODE>q (abc)</CODE>
|
|
|
|
<LI><CODE>qr /abc/</CODE>
|
|
|
|
<LI><CODE>qx (/bin/date)</CODE>
|
|
|
|
<LI><CODE>/pattern match/</CODE>
|
|
|
|
<LI><CODE>?pattern match?</CODE>
|
|
|
|
<LI><CODE>s/substitution/operators/</CODE>
|
|
|
|
<LI><CODE>$tied_hash{"message"}</CODE>
|
|
|
|
<LI><CODE>$tied_hash_reference->{"message"}</CODE>
|
|
|
|
<LI>etc., issue the command <SAMP>`man perlsyn´</SAMP> for details
|
|
|
|
</UL>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>__</CODE> (double underscore)
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>, <CODE>ngettext</CODE>,
|
|
<CODE>dngettext</CODE>, <CODE>dcngettext</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>textdomain</CODE> function
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
<CODE>bindtextdomain</CODE> function
|
|
|
|
<DT>bind_textdomain_codeset
|
|
<DD>
|
|
<CODE>bind_textdomain_codeset</CODE> function
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
Use <CODE>setlocale (LC_ALL, "");</CODE>
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
<CODE>use POSIX;</CODE>
|
|
<BR><CODE>use Locale::TextDomain;</CODE> (included in the package libintl-perl
|
|
which is available on the Comprehensive Perl Archive Network CPAN,
|
|
http://www.cpan.org/).
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
platform dependent: gettext_pp emulates, gettext_xs uses GNU gettext
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext -k__ -k\$__ -k%__ -k__x -k__n:1,2 -k__nx:1,2 -k__xn:1,2 -kN__ -k</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
Both kinds of format strings support formatting with positions.
|
|
<BR><CODE>printf "%2\$d %1\$d", ...</CODE> (requires Perl 5.8.0 or newer)
|
|
<BR><CODE>__expand("[new] replaces [old]", old => $oldvalue, new => $newvalue)</CODE>
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
The <CODE>libintl-perl</CODE> package is platform independent but is not
|
|
part of the Perl core. The programmer is responsible for
|
|
providing a dummy implementation of the required functions if the
|
|
package is not installed on the target system.
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
|
|
<DT>Documentation
|
|
<DD>
|
|
Included in <CODE>libintl-perl</CODE>, available on CPAN
|
|
(http://www.cpan.org/).
|
|
|
|
</DL>
|
|
|
|
<P>
|
|
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-perl</CODE>.
|
|
|
|
</P>
|
|
<P>
|
|
<A NAME="IDX1147"></A>
|
|
|
|
</P>
|
|
<P>
|
|
The <CODE>xgettext</CODE> parser backend for Perl differs significantly from
|
|
the parser backends for other programming languages, just as Perl
|
|
itself differs significantly from other programming languages. The
|
|
Perl parser backend offers many more string marking facilities than
|
|
the other backends but it also has some Perl specific limitations, the
|
|
worst probably being that it is imperfect.
|
|
|
|
</P>
|
|
|
|
|
|
|
|
<H4><A NAME="SEC271" HREF="gettext_toc.html#TOC271">13.5.18.1 General Problems Parsing Perl Code</A></H4>
|
|
|
|
<P>
|
|
It is often heard that only Perl can parse Perl. This is not true.
|
|
Perl cannot be <EM>parsed</EM> at all, it can only be <EM>executed</EM>.
|
|
Perl has various built-in ambiguities that can only be resolved at runtime.
|
|
|
|
</P>
|
|
<P>
|
|
The following example may illustrate one common problem:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
print gettext "Hello World!";
|
|
</PRE>
|
|
|
|
<P>
|
|
Although this example looks like a bullet-proof case of a function
|
|
invocation, it is not:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
open gettext, ">testfile" or die;
|
|
print gettext "Hello world!"
|
|
</PRE>
|
|
|
|
<P>
|
|
In this context, the string <CODE>gettext</CODE> looks more like a
|
|
file handle. But not necessarily:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
use Locale::Messages qw (:libintl_h);
|
|
open gettext ">testfile" or die;
|
|
print gettext "Hello world!";
|
|
</PRE>
|
|
|
|
<P>
|
|
Now, the file is probably syntactically incorrect, provided that the module
|
|
<CODE>Locale::Messages</CODE> found first in the Perl include path exports a
|
|
function <CODE>gettext</CODE>. But what if the module
|
|
<CODE>Locale::Messages</CODE> really looks like this?
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
use vars qw (*gettext);
|
|
|
|
1;
|
|
</PRE>
|
|
|
|
<P>
|
|
In this case, the string <CODE>gettext</CODE> will be interpreted as a file
|
|
handle again, and the above example will create a file <TT>`testfile´</TT>
|
|
and write the string "Hello world!" into it. Even advanced
|
|
control flow analysis will not really help:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
if (0.5 < rand) {
|
|
eval "use Sane";
|
|
} else {
|
|
eval "use InSane";
|
|
}
|
|
print gettext "Hello world!";
|
|
</PRE>
|
|
|
|
<P>
|
|
If the module <CODE>Sane</CODE> exports a function <CODE>gettext</CODE> that does
|
|
what we expect, and the module <CODE>InSane</CODE> opens a file for writing
|
|
and associates the <EM>handle</EM> <CODE>gettext</CODE> with this output
|
|
stream, we are clueless again about what will happen at runtime. It is
|
|
completely unpredictable. The truth is that Perl has so many ways to
|
|
fill its symbol table at runtime that it is impossible to interpret a
|
|
particular piece of code without executing it.
|
|
|
|
</P>
|
|
<P>
|
|
Of course, <CODE>xgettext</CODE> will not execute your Perl sources while
|
|
scanning for translatable strings, but rather use heuristics in order
|
|
to guess what you meant.
|
|
|
|
</P>
|
|
<P>
|
|
Another problem is the ambiguity of the slash and the question mark.
|
|
Their interpretation depends on the context:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
# A pattern match.
|
|
print "OK\n" if /foobar/;
|
|
|
|
# A division.
|
|
print 1 / 2;
|
|
|
|
# Another pattern match.
|
|
print "OK\n" if ?foobar?;
|
|
|
|
# Conditional.
|
|
print $x ? "foo" : "bar";
|
|
</PRE>
|
|
|
|
<P>
|
|
The slash may either act as the division operator or introduce a
|
|
pattern match, whereas the question mark may act as the ternary
|
|
conditional operator or as a pattern match, too. Other programming
|
|
languages like <CODE>awk</CODE> present similar problems, but the consequences of a
|
|
misinterpretation are particularly nasty with Perl sources. In <CODE>awk</CODE>
|
|
for instance, a statement can never exceed one line and the parser
|
|
can recover from a parsing error at the next newline and interpret
|
|
the rest of the input stream correctly. Perl is different, as a
|
|
pattern match is terminated by the next appearance of the delimiter
|
|
(the slash or the question mark) in the input stream, regardless of
|
|
the semantic context. If a slash is really a division sign but
|
|
mis-interpreted as a pattern match, the rest of the input file is most
|
|
probably parsed incorrectly.
|
|
|
|
</P>
|
|
<P>
|
|
If you find that <CODE>xgettext</CODE> fails to extract strings from
|
|
portions of your sources, you should therefore look out for slashes
|
|
and/or question marks preceding these sections. You may have come
|
|
across a bug in <CODE>xgettext</CODE>'s Perl parser (and of course you
|
|
should report that bug). In the meantime you should try to
|
|
reformulate your code in a manner less challenging to <CODE>xgettext</CODE>.
|
|
|
|
</P>
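<P>
As a rough sketch of what "less challenging" code might look like, the
fragment below avoids bare slashes and question marks as pattern delimiters
and passes the translatable string in parentheses. Which spellings the
<CODE>xgettext</CODE> heuristics actually prefer is an assumption here, not
something this manual specifies, and the <CODE>gettext</CODE> sub is a
hypothetical stand-in so that the fragment runs without any gettext binding.
</P>

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real gettext binding: returns the msgid unchanged.
sub gettext { return $_[0] }

my ($total, $count) = (10, 4);

# A division written with whitespace on both sides of the slash.
my $average = $total / $count;

# A pattern match spelled with an explicit "m" and brace delimiters,
# so that no bare slash or question mark opens a pattern.
my $line    = "foobar";
my $matched = $line =~ m{foobar};

# The translatable strings are passed in parentheses.
my $status = gettext("no match");
$status = gettext("OK") if $matched;

print "$status ($average)\n";
```

<P>
The program behaves exactly as it would with the terser spellings. The only
intended difference is that a heuristic scanner has fewer ambiguous
characters to guess about.
</P>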
|
|
|
|
|
|
<H4><A NAME="SEC272" HREF="gettext_toc.html#TOC272">13.5.18.2 Which keywords will xgettext look for?</A></H4>
|
|
<P>
|
|
<A NAME="IDX1148"></A>
|
|
|
|
</P>
|
|
<P>
|
|
Unless you instruct <CODE>xgettext</CODE> otherwise by invoking it with one
|
|
of the options <CODE>--keyword</CODE> or <CODE>-k</CODE>, it will recognize the
|
|
following keywords in your Perl sources:
|
|
|
|
</P>
|
|
|
|
<UL>
|
|
|
|
<LI><CODE>gettext</CODE>
|
|
|
|
<LI><CODE>dgettext</CODE>
|
|
|
|
<LI><CODE>dcgettext</CODE>
|
|
|
|
<LI><CODE>ngettext:1,2</CODE>
|
|
|
|
The first (singular) and the second (plural) argument will be
|
|
extracted.
|
|
|
|
<LI><CODE>dngettext:1,2</CODE>
|
|
|
|
The first (singular) and the second (plural) argument will be
|
|
extracted.
|
|
|
|
<LI><CODE>dcngettext:1,2</CODE>
|
|
|
|
The first (singular) and the second (plural) argument will be
|
|
extracted.
|
|
|
|
<LI><CODE>gettext_noop</CODE>
|
|
|
|
<LI><CODE>%gettext</CODE>
|
|
|
|
The keys of lookups into the hash <CODE>%gettext</CODE> will be extracted.
|
|
|
|
<LI><CODE>$gettext</CODE>
|
|
|
|
The keys of lookups into the hash reference <CODE>$gettext</CODE> will be extracted.
|
|
|
|
</UL>
|
|
|
|
|
|
|
|
<H4><A NAME="SEC273" HREF="gettext_toc.html#TOC273">13.5.18.3 How to Extract Hash Keys</A></H4>
|
|
<P>
|
|
<A NAME="IDX1149"></A>
|
|
|
|
</P>
|
|
<P>
|
|
Translating messages at runtime is normally performed by looking up the
|
|
original string in the translation database and returning the
|
|
translated version. The "natural" Perl implementation is a hash
|
|
lookup, and, of course, <CODE>xgettext</CODE> supports such practice.
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
print __"Hello world!";
|
|
print $__{"Hello world!"};
|
|
print $__->{"Hello world!"};
|
|
print $$__{"Hello world!"};
|
|
</PRE>
|
|
|
|
<P>
|
|
The above four lines all do the same thing. The Perl module
|
|
<CODE>Locale::TextDomain</CODE> exports by default a hash <CODE>%__</CODE> that
|
|
is tied to the function <CODE>__()</CODE>. It also exports a reference
|
|
<CODE>$__</CODE> to <CODE>%__</CODE>.
|
|
|
|
</P>
|
|
<P>
|
|
If an argument to the <CODE>xgettext</CODE> option <CODE>--keyword</CODE>,
|
|
or <CODE>-k</CODE> starts with a percent sign, the rest of the keyword is
|
|
interpreted as the name of a hash. If it starts with a dollar
|
|
sign, the rest of the keyword is interpreted as a reference to a
|
|
hash.
|
|
|
|
</P>
|
|
<P>
|
|
Note that you can omit the quotation marks (single or double) around
|
|
the hash key (almost) whenever Perl itself allows it:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
print $gettext{Error};
|
|
</PRE>
|
|
|
|
<P>
|
|
The exact rule is: You can omit the surrounding quotes, when the hash
|
|
key is a valid C (!) identifier, i.e. when it starts with an
|
|
underscore or an ASCII letter and is followed by an arbitrary number
|
|
of underscores, ASCII letters or digits. Other Unicode characters
|
|
are <EM>not</EM> allowed, regardless of the <CODE>use utf8</CODE> pragma.
|
|
|
|
</P>
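<P>
To make the idea of a hash "tied to a function" more concrete, here is a
minimal sketch using only core Perl. It is not the implementation used by
<CODE>Locale::TextDomain</CODE>. The package name
<CODE>TranslatingHash</CODE>, the <CODE>translate</CODE> sub and the
in-memory catalog are all hypothetical names chosen for illustration.
</P>

```perl
use strict;
use warnings;

# Hypothetical in-memory catalog standing in for a real message catalog.
my %catalog = ('Hello world!' => 'Hallo Welt!');

# Stand-in for gettext: return the translation, or the msgid unchanged.
sub translate {
    my ($msgid) = @_;
    return exists $catalog{$msgid} ? $catalog{$msgid} : $msgid;
}

package TranslatingHash;
require Tie::Hash;
our @ISA = ('Tie::StdHash');

# Every read of $hash{KEY} is routed through main::translate.
sub FETCH {
    my ($self, $key) = @_;
    return main::translate($key);
}

package main;

tie my %gettext, 'TranslatingHash';
my $gettext = \%gettext;

print $gettext{"Hello world!"}, "\n";      # lookup in the hash
print $gettext->{"Hello world!"}, "\n";    # lookup through the reference
print $gettext{Error}, "\n";               # unquoted key, no catalog entry
```

<P>
The first two lookups print the translated string. The third has no entry in
the hypothetical catalog, so the key itself comes back, which mirrors the
usual gettext fallback to the <VAR>msgid</VAR>.
</P>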
|
|
|
|
|
|
<H4><A NAME="SEC274" HREF="gettext_toc.html#TOC274">13.5.18.4 What are Strings And Quote-like Expressions?</A></H4>
|
|
<P>
|
|
<A NAME="IDX1150"></A>
|
|
|
|
</P>
|
|
<P>
|
|
Perl offers a plethora of different string constructs. Those that can
|
|
be used either as arguments to functions or inside braces for hash
|
|
lookups are generally supported by <CODE>xgettext</CODE>.
|
|
|
|
</P>
|
|
|
|
<UL>
|
|
<LI><STRONG>double-quoted strings</STRONG>
|
|
|
|
<BR>
|
|
|
|
<PRE>
|
|
print gettext "Hello World!";
|
|
</PRE>
|
|
|
|
<LI><STRONG>single-quoted strings</STRONG>
|
|
|
|
<BR>
|
|
|
|
<PRE>
|
|
print gettext 'Hello World!';
|
|
</PRE>
|
|
|
|
<LI><STRONG>the operator qq</STRONG>
|
|
|
|
<BR>
|
|
|
|
<PRE>
|
|
print gettext qq |Hello World!|;
|
|
print gettext qq <E-mail: <guido\@imperia.net>>;
|
|
</PRE>
|
|
|
|
The operator <CODE>qq</CODE> is fully supported. You can use arbitrary
|
|
delimiters, including the four bracketing delimiters (round, angle,
|
|
square, curly) that nest.
|
|
|
|
<LI><STRONG>the operator q</STRONG>
|
|
|
|
<BR>
|
|
|
|
<PRE>
|
|
print gettext q |Hello World!|;
|
|
print gettext q <E-mail: <guido@imperia.net>>;
|
|
</PRE>
|
|
|
|
The operator <CODE>q</CODE> is fully supported. You can use arbitrary
|
|
delimiters, including the four bracketing delimiters (round, angle,
|
|
square, curly) that nest.
|
|
|
|
<LI><STRONG>the operator qx</STRONG>
|
|
|
|
<BR>
|
|
|
|
<PRE>
|
|
print gettext qx ;LANGUAGE=C /bin/date;
|
|
print gettext qx [/usr/bin/ls | grep '^[A-Z]*'];
|
|
</PRE>
|
|
|
|
The operator <CODE>qx</CODE> is fully supported. You can use arbitrary
|
|
delimiters, including the four bracketing delimiters (round, angle,
|
|
square, curly) that nest.
|
|
|
|
The example is actually a useless use of <CODE>gettext</CODE>. It will
|
|
invoke the <CODE>gettext</CODE> function on the output of the command
|
|
specified with the <CODE>qx</CODE> operator. The feature was included
|
|
in order to make the interface consistent (the parser will extract
|
|
all strings and quote-like expressions).
|
|
|
|
<LI><STRONG>here documents</STRONG>
|
|
|
|
<BR>
|
|
|
|
<PRE>
|
|
print gettext <<'EOF';
|
|
program not found in $PATH
|
|
EOF
|
|
|
|
print ngettext <<EOF, <<"EOF";
|
|
one file deleted
|
|
EOF
|
|
several files deleted
|
|
EOF
|
|
</PRE>
|
|
|
|
Here-documents are recognized. If the delimiter is enclosed in single
|
|
quotes, the string is not interpolated. If it is enclosed in double
|
|
quotes or has no quotes at all, the string is interpolated.
|
|
|
|
Delimiters that start with a digit are not supported!
|
|
|
|
</UL>
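<P>
The effect of the here-document delimiter's quoting can be checked with a
few lines of plain Perl. The variable <CODE>$PATH</CODE> and its value are
made up for this sketch. Only the first form yields a constant string, which
is what a translatable message needs to be.
</P>

```perl
use strict;
use warnings;

my $PATH = '/usr/local/bin';

# Single-quoted delimiter: the text is taken literally.
my $literal = <<'EOF';
program not found in $PATH
EOF

# Double-quoted delimiter: variables are interpolated.
my $quoted = <<"EOF";
program not found in $PATH
EOF

# Unquoted delimiter: behaves like the double-quoted form.
my $bare = <<EOF;
program not found in $PATH
EOF

print $literal, $quoted, $bare;
```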
|
|
|
|
|
|
|
|
<H4><A NAME="SEC275" HREF="gettext_toc.html#TOC275">13.5.18.5 Invalid Uses Of String Interpolation</A></H4>
|
|
<P>
|
|
<A NAME="IDX1151"></A>
|
|
|
|
</P>
|
|
<P>
|
|
Perl is capable of interpolating variables into strings. This offers
|
|
some nice features in localized programs but can also lead to
|
|
problems.
|
|
|
|
</P>
|
|
<P>
|
|
A common error is a construct like the following:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
print gettext "This is the program $0!\n";
|
|
</PRE>
|
|
|
|
<P>
|
|
Perl will interpolate at runtime the value of the variable <CODE>$0</CODE>
|
|
into the argument of the <CODE>gettext()</CODE> function. Hence, this
|
|
argument is not a string constant but a variable argument (<CODE>$0</CODE>
|
|
is a global variable that holds the name of the Perl script being
|
|
executed). The interpolation is performed by Perl before the string
|
|
argument is passed to <CODE>gettext()</CODE> and will therefore depend on
|
|
the name of the script which can only be determined at runtime.
|
|
Consequently, it is very unlikely that a translation can be looked
|
|
up at runtime (except if, by accident, the interpolated string is found
|
|
in the message catalog).
|
|
|
|
</P>
|
|
<P>
|
|
The <CODE>xgettext</CODE> program will therefore terminate parsing with a fatal
|
|
error if it encounters a variable inside of an extracted string. In
|
|
general, this will happen for all kinds of string interpolations that
|
|
cannot be safely performed at compile time. If you absolutely know
|
|
what you are doing, you can always circumvent this behavior:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
my $know_what_i_am_doing = "This is program $0!\n";
|
|
print gettext $know_what_i_am_doing;
|
|
</PRE>
|
|
|
|
<P>
|
|
Since the parser only recognizes strings and quote-like expressions,
|
|
but not variables or other terms, the above construct will be
|
|
accepted. You will have to find another way, however, to let your
|
|
original string make it into your message catalog.
|
|
|
|
</P>
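<P>
One common way to keep the message constant is to put a placeholder in the
string and fill it in after translation. The sketch below does this with
<CODE>sprintf</CODE>. The catalog, the <CODE>gettext</CODE> stand-in and the
<CODE>banner</CODE> helper are hypothetical and only serve to make the
fragment runnable without a gettext binding.
</P>

```perl
use strict;
use warnings;

# Hypothetical in-memory catalog; a real program would use a gettext
# binding and a compiled message catalog instead.
my %catalog = (
    "This is the program %s!\n" => "Dies ist das Programm %s!\n",
);

# Stand-in for gettext: translation if present, otherwise the msgid itself.
sub gettext {
    my ($msgid) = @_;
    return exists $catalog{$msgid} ? $catalog{$msgid} : $msgid;
}

# The msgid is a constant; the variable part is filled in after translation.
sub banner {
    my ($program_name) = @_;
    return sprintf(gettext("This is the program %s!\n"), $program_name);
}

print banner($0);
```

<P>
Because the argument of <CODE>gettext</CODE> no longer contains a variable,
the same string can appear verbatim in the message catalog and be found again
at runtime.
</P>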
|
|
<P>
|
|
If <CODE>xgettext</CODE> is invoked with the option <CODE>--extract-all</CODE> or <CODE>-a</CODE>,
|
|
variable interpolation will be accepted. Rationale: You will
|
|
generally use this option in order to prepare your sources for
|
|
internationalization.
|
|
|
|
</P>
|
|
<P>
|
|
Please see the manual page <SAMP>`man perlop´</SAMP> for details of strings and
|
|
quote-like expressions that are subject to interpolation and those
|
|
that are not. Safe interpolations (that will not lead to a fatal
|
|
error) are:
|
|
|
|
</P>
|
|
|
|
<UL>
|
|
|
|
<LI>the escape sequences <CODE>\t</CODE> (tab, HT, TAB), <CODE>\n</CODE>
|
|
|
|
(newline, NL), <CODE>\r</CODE> (return, CR), <CODE>\f</CODE> (form feed, FF),
|
|
<CODE>\b</CODE> (backspace, BS), <CODE>\a</CODE> (alarm, bell, BEL), and <CODE>\e</CODE>
|
|
(escape, ESC).
|
|
|
|
<LI>octal chars, like <CODE>\033</CODE>
|
|
|
|
<BR>
|
|
Note that octal escapes in the range of 400-777 are translated into a
|
|
UTF-8 representation, regardless of the presence of the <CODE>use utf8</CODE> pragma.
|
|
|
|
<LI>hex chars, like <CODE>\x1b</CODE>
|
|
|
|
<LI>wide hex chars, like <CODE>\x{263a}</CODE>
|
|
|
|
<BR>
|
|
Note that this escape is translated into a UTF-8 representation,
|
|
regardless of the presence of the <CODE>use utf8</CODE> pragma.
|
|
|
|
<LI>control chars, like <CODE>\c[</CODE> (CTRL-[)
|
|
|
|
<LI>named Unicode chars, like <CODE>\N{LATIN CAPITAL LETTER C WITH CEDILLA}</CODE>
|
|
|
|
<BR>
|
|
Note that this escape is translated into a UTF-8 representation,
|
|
regardless of the presence of the <CODE>use utf8</CODE> pragma.
|
|
</UL>
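<P>
Several of the escapes listed above denote the same character, and the wide
hex escape has a fixed UTF-8 encoding. The following sketch, using only
built-in Perl functions, prints the code points and bytes so that this can be
verified by hand.
</P>

```perl
use strict;
use warnings;

# Four spellings of the same character, ESC (decimal 27).
my @escapes = ("\e", "\033", "\x1b", "\c[");
my $codes = join ' ', map { ord($_) } @escapes;
print "$codes\n";

# A wide hex escape, converted explicitly to its UTF-8 byte sequence.
my $smiley = "\x{263a}";
utf8::encode($smiley);    # in place: characters become UTF-8 bytes
my $bytes = join ' ', map { sprintf('%02X', ord($_)) } split //, $smiley;
print "$bytes\n";
```

<P>
The first line of output is <CODE>27 27 27 27</CODE> and the second is
<CODE>E2 98 BA</CODE>, the three-byte UTF-8 form of U+263A.
</P>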
|
|
|
|
<P>
|
|
The following escapes are considered partially safe:
|
|
|
|
</P>
|
|
|
|
<UL>
|
|
|
|
<LI><CODE>\l</CODE> lowercase next char
|
|
|
|
<LI><CODE>\u</CODE> uppercase next char
|
|
|
|
<LI><CODE>\L</CODE> lowercase till \E
|
|
|
|
<LI><CODE>\U</CODE> uppercase till \E
|
|
|
|
<LI><CODE>\E</CODE> end case modification
|
|
|
|
<LI><CODE>\Q</CODE> quote non-word characters till \E
|
|
|
|
</UL>
|
|
|
|
<P>
|
|
These escapes are only considered safe if the string consists of
|
|
ASCII characters only. Translation of characters outside the range
|
|
defined by ASCII is locale-dependent and can actually only be performed
|
|
at runtime; <CODE>xgettext</CODE> doesn't do these locale-dependent translations
|
|
at extraction time.
|
|
|
|
</P>
|
|
<P>
|
|
Except for the modifier <CODE>\Q</CODE>, these translations, albeit valid,
|
|
are generally useless and only obfuscate your sources. If a
|
|
translation can be safely performed at compile time you can just as
|
|
well write what you mean.
|
|
|
|
</P>
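<P>
For ASCII-only text, the case-modifying escapes produce nothing that could
not be typed directly, as this small comparison shows. The message text is
invented for the example.
</P>

```perl
use strict;
use warnings;

# Case-modifying escapes applied to ASCII-only text ...
my $with_escapes = "\Uwarning:\E disk \ufull";

# ... and the same string written out directly.
my $written_out = "WARNING: disk Full";

print $with_escapes eq $written_out ? "identical\n" : "different\n";
```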
|
|
|
|
|
|
<H4><A NAME="SEC276" HREF="gettext_toc.html#TOC276">13.5.18.6 Valid Uses Of String Interpolation</A></H4>
|
|
<P>
|
|
<A NAME="IDX1152"></A>
|
|
|
|
</P>
|
|
<P>
|
|
Perl is often used to generate sources for other programming languages
|
|
or arbitrary file formats. Web applications that output HTML code
|
|
are a prominent example of such usage.
|
|
|
|
</P>
|
|
<P>
|
|
You will often come across situations where you want to intersperse
|
|
code written in the target (programming) language with translatable
|
|
messages, like in the following HTML example:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
print gettext <<EOF;
|
|
<h1>My Homepage</h1>
|
|
<script language="JavaScript"><!--
|
|
for (i = 0; i < 100; ++i) {
|
|
alert ("Thank you so much for visiting my homepage!");
|
|
}
|
|
//--></script>
|
|
EOF
|
|
</PRE>
|
|
|
|
<P>
|
|
The parser will extract the entire here document, and it will appear
|
|
entirely in the resulting PO file, including the JavaScript snippet
|
|
embedded in the HTML code. If you overuse constructs like
|
|
the above, you will run the risk that the translators of your package
|
|
will look for a less challenging project. You should consider an
|
|
alternative expression here:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
print <<EOF;
|
|
<h1>$gettext{"My Homepage"}</h1>
|
|
<script language="JavaScript"><!--
|
|
for (i = 0; i < 100; ++i) {
|
|
alert ("$gettext{'Thank you so much for visiting my homepage!'}");
|
|
}
|
|
//--></script>
|
|
EOF
|
|
</PRE>
|
|
|
|
<P>
|
|
Only the translatable portions of the code will be extracted here, and
|
|
the resulting PO file will be noticeably more readable.
|
|
|
|
</P>
|
|
<P>
|
|
You can interpolate hash lookups in all strings or quote-like
|
|
expressions that are subject to interpolation (see the manual page
|
|
<SAMP>`man perlop´</SAMP> for details). Double interpolation is invalid, however:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
# TRANSLATORS: Replace "the earth" with the name of your planet.
|
|
print gettext qq{Welcome to $gettext->{"the earth"}};
|
|
</PRE>
|
|
|
|
<P>
|
|
The <CODE>qq</CODE>-quoted string is recognized as an argument to <CODE>gettext</CODE> in
|
|
the first place, and checked for invalid variable interpolation. The
|
|
dollar sign of hash-dereferencing will therefore terminate the parser
|
|
with an "invalid interpolation" error.
|
|
|
|
</P>
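<P>
A possible way around double interpolation is to translate the two strings
separately and combine them afterwards. The sketch below assumes a
hypothetical in-memory catalog and a <CODE>gettext</CODE> stand-in. Whether
composing a sentence from parts suits every target language is a separate
question for the translators.
</P>

```perl
use strict;
use warnings;

# Hypothetical catalog entries, for illustration only.
my %catalog = (
    "Welcome to %s" => "Willkommen auf %s",
    "the earth"     => "der Erde",
);

# Stand-in for gettext: translation if present, otherwise the msgid itself.
sub gettext {
    my ($msgid) = @_;
    return exists $catalog{$msgid} ? $catalog{$msgid} : $msgid;
}

# Both msgids are constant strings; they are combined after translation.
my $greeting = sprintf(gettext("Welcome to %s"), gettext("the earth"));
print "$greeting\n";
```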
|
|
<P>
|
|
It is valid to interpolate hash lookups in regular expressions:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
if ($var =~ /$gettext{"the earth"}/) {
|
|
print gettext "Match!\n";
|
|
}
|
|
s/$gettext{"U. S. A."}/$gettext{"U. S. A."} $gettext{"(dial +0)"}/g;
|
|
</PRE>
|
|
|
|
|
|
|
|
<H4><A NAME="SEC277" HREF="gettext_toc.html#TOC277">13.5.18.7 When To Use Parentheses</A></H4>
|
|
<P>
|
|
<A NAME="IDX1153"></A>
|
|
|
|
</P>
|
|
<P>
|
|
In Perl, parentheses around function arguments are mostly optional.
|
|
<CODE>xgettext</CODE> will always assume that all
|
|
recognized keywords (except for hashes and hash references) are names
|
|
of properly prototyped functions, and will (hopefully) only require
|
|
parentheses where Perl itself requires them. All constructs in the
|
|
following example are therefore ok to use:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
print gettext ("Hello World!\n");
|
|
print gettext "Hello World!\n";
|
|
print dgettext ($package => "Hello World!\n");
|
|
print dgettext $package, "Hello World!\n";
|
|
|
|
# The "fat comma" => turns the left-hand side argument into a
|
|
# single-quoted string!
|
|
print dgettext smellovision => "Hello World!\n";
|
|
|
|
# The following assignment only works with prototyped functions.
|
|
# Otherwise, the functions will act as "greedy" list operators and
|
|
# eat up all following arguments.
|
|
my $anonymous_hash = {
|
|
planet => gettext "earth",
|
|
cakes => ngettext "one cake", "several cakes", $n,
|
|
still => $works,
|
|
};
|
|
# The same without fat comma:
|
|
my $other_hash = {
|
|
'planet', gettext "earth",
|
|
'cakes', ngettext "one cake", "several cakes", $n,
|
|
'still', $works,
|
|
};
|
|
|
|
# Parentheses are only significant for the first argument.
|
|
print dngettext 'package', ("one cake", "several cakes", $n), $discarded;
|
|
</PRE>
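<P>
The effect of a prototype on argument parsing can be checked with a small
sketch. Both functions below are hypothetical stand-ins that simply return
their first argument; they are not the functions of any real module:
</P>

<PRE>
use strict;
use warnings;

# Hypothetical stand-in: the ($) prototype makes gettext a named unary
# operator, so it takes exactly one argument even without parentheses.
sub gettext ($) { return $_[0] }

# Without a prototype the function is a "greedy" list operator.
sub greedy_gettext { return $_[0] }

my @prototyped = (planet => gettext "earth", still => "works");
my @greedy     = (planet => greedy_gettext "earth", still => "works");

print scalar (@prototyped), "\n";   # 4: planet, earth, still, works
print scalar (@greedy), "\n";       # 2: planet, earth
</PRE>

<P>
The unprototyped variant swallows the remaining list elements, which is why
parentheses are the safer choice whenever the prototype is in doubt.
</P>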
|
|
|
|
|
|
|
|
<H4><A NAME="SEC278" HREF="gettext_toc.html#TOC278">13.5.18.8 How To Grok with Long Lines</A></H4>
|
|
<P>
|
|
<A NAME="IDX1154"></A>
|
|
|
|
</P>
|
|
<P>
|
|
The necessity of long messages can often lead to a cumbersome or
|
|
unreadable coding style. Perl has several options that may prevent
|
|
you from writing unreadable code, and
|
|
<CODE>xgettext</CODE> does its best to do likewise. This is where the dot
|
|
operator (the string concatenation operator) may come in handy:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
print gettext ("This is a very long"
|
|
. " message that is still"
|
|
. " readable, because"
|
|
. " it is split into"
|
|
. " multiple lines.\n");
|
|
</PRE>
|
|
|
|
<P>
|
|
Perl is smart enough to concatenate these constant string fragments
|
|
into one long string at compile time, and so is
|
|
<CODE>xgettext</CODE>. You will only find one long message in the resulting
|
|
POT file.
|
|
|
|
</P>
|
|
<P>
|
|
Note that the future Perl 6 will probably use the underscore
|
|
(<SAMP>`_´</SAMP>) as the string concatenation operator, and the dot
|
|
(<SAMP>`.´</SAMP>) for dereferencing. This new syntax is not yet supported by
|
|
<CODE>xgettext</CODE>.
|
|
|
|
</P>
|
|
<P>
|
|
If embedded newline characters are not an issue, or even desired, you
|
|
may also insert newline characters inside quoted strings wherever you
|
|
feel like it:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
print gettext ("&lt;em&gt;In HTML output
embedded newlines are generally no
problem, since adjacent whitespace
is always rendered into a single
space character.&lt;/em&gt;");
|
|
</PRE>
|
|
|
|
<P>
|
|
You may also consider using here documents:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
print gettext &lt;&lt;EOF;
&lt;em&gt;In HTML output
embedded newlines are generally no
problem, since adjacent whitespace
is always rendered into a single
space character.&lt;/em&gt;
|
|
EOF
|
|
</PRE>
|
|
|
|
<P>
|
|
Please do not forget that the line breaks are real, i.e. they
|
|
translate into newline characters that will consequently show up in
|
|
the resulting POT file.
|
|
|
|
</P>
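<P>
The difference between the two styles can be made visible by counting
newline characters. This is a minimal sketch with shortened messages; the
variable names are chosen for illustration only:
</P>

<PRE>
use strict;
use warnings;

# Fragments joined with the dot operator: no newline unless written as \n.
my $joined = "This is a very long"
           . " message that is still"
           . " readable.";

# A quoted string that spans source lines: every line break is kept.
my $wrapped = "In HTML output
embedded newlines are generally no
problem.";

# tr/\n// counts newline characters without modifying the string.
my $joined_newlines  = ($joined  =~ tr/\n//);
my $wrapped_newlines = ($wrapped =~ tr/\n//);
print "joined: $joined_newlines, wrapped: $wrapped_newlines\n";
</PRE>

<P>
The concatenated message contains no newline, while the wrapped one
contains two, and those two would be part of the message that translators
see.
</P>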
|
|
|
|
|
|
<H4><A NAME="SEC279" HREF="gettext_toc.html#TOC279">13.5.18.9 Bugs, Pitfalls, And Things That Do Not Work</A></H4>
|
|
<P>
|
|
<A NAME="IDX1155"></A>
|
|
|
|
</P>
|
|
<P>
|
|
The foregoing sections should have proven that
|
|
<CODE>xgettext</CODE> is quite smart in extracting translatable strings from
|
|
Perl sources. Yet some more or less exotic constructs that could be
expected to work actually do not work.
|
|
|
|
</P>
|
|
<P>
|
|
One of the more relevant limitations can be found in the
|
|
implementation of variable interpolation inside quoted strings. Only
|
|
simple hash lookups can be used there:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
print &lt;&lt;EOF;
|
|
$gettext{"The dot operator"
|
|
. " does not work"
|
|
. "here!"}
|
|
Likewise, you cannot @{[ gettext ("interpolate function calls") ]}
|
|
inside quoted strings or quote-like expressions.
|
|
EOF
|
|
</PRE>
|
|
|
|
<P>
|
|
This is valid Perl code and will actually trigger invocations of the
|
|
<CODE>gettext</CODE> function at runtime. Yet, the Perl parser in
|
|
<CODE>xgettext</CODE> will fail to recognize the strings. A less obvious
|
|
example can be found in the interpolation of regular expressions:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
s/&lt;!--START_OF_WEEK--&gt;/gettext ("Sunday")/e;
|
|
</PRE>
|
|
|
|
<P>
|
|
The modifier <CODE>e</CODE> will cause the substitution to be interpreted as
|
|
an evaluable statement. Consequently, at runtime the function
|
|
<CODE>gettext()</CODE> is called, but again, the parser fails to extract the
|
|
string "Sunday". Use a temporary variable as a simple workaround if
|
|
you really happen to need this feature:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
my $sunday = gettext "Sunday";
|
|
s/&lt;!--START_OF_WEEK--&gt;/$sunday/;
|
|
</PRE>
|
|
|
|
<P>
|
|
Hash slices would also be handy but are not recognized:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
my @weekdays = @gettext{'Sunday', 'Monday', 'Tuesday', 'Wednesday',
|
|
'Thursday', 'Friday', 'Saturday'};
|
|
# Or even:
|
|
@weekdays = @gettext{qw (Sunday Monday Tuesday Wednesday Thursday
|
|
Friday Saturday) };
|
|
</PRE>
|
|
|
|
<P>
|
|
This is perfectly valid usage of the tied hash <CODE>%gettext</CODE> but the
|
|
strings are not recognized and therefore will not be extracted.
|
|
|
|
</P>
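<P>
One way to make do without hash slices is to write one explicit call per
string literal, a form shown as accepted in the section on parentheses.
The <CODE>gettext</CODE> function below is a hypothetical stand-in that
returns its argument unchanged, so that the sketch runs without a message
catalog:
</P>

<PRE>
use strict;
use warnings;

# Hypothetical stand-in for the real gettext function.
sub gettext ($) { return $_[0] }

# One explicit call per string literal instead of a hash slice.
my @weekdays = (
    gettext ("Sunday"),    gettext ("Monday"),   gettext ("Tuesday"),
    gettext ("Wednesday"), gettext ("Thursday"), gettext ("Friday"),
    gettext ("Saturday"),
);

print scalar (@weekdays), " weekdays, starting with $weekdays[0]\n";
</PRE>

<P>
This is more verbose than a slice, but every message appears as a literal
argument of a keyword.
</P>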
|
|
<P>
|
|
Another caveat of the current version is its rudimentary support for
|
|
non-ASCII characters in identifiers. You may encounter serious
|
|
problems if you use identifiers with characters outside the range of
|
|
'A'-'Z', 'a'-'z', '0'-'9' and the underscore '_'.
|
|
|
|
</P>
|
|
<P>
|
|
Maybe some of these missing features will be implemented in future
|
|
versions, but since you can always make do without them at minimal effort,
|
|
these todos have very low priority.
|
|
|
|
</P>
|
|
<P>
|
|
A nasty problem is posed by brace format strings that already contain braces
|
|
as part of the normal text, for example the usage strings typically
|
|
encountered in programs:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
die "usage: $0 {OPTIONS} FILENAME...\n";
|
|
</PRE>
|
|
|
|
<P>
|
|
If you want to internationalize this code with Perl brace format strings,
|
|
you will run into a problem:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
die __x ("usage: {program} {OPTIONS} FILENAME...\n", program => $0);
|
|
</PRE>
|
|
|
|
<P>
|
|
Whereas <SAMP>`{program}´</SAMP> is a placeholder, <SAMP>`{OPTIONS}´</SAMP>
|
|
is not and should probably be translated. Yet, there is no way to teach
|
|
the Perl parser in <CODE>xgettext</CODE> to recognize the first one, and leave
|
|
the other one alone.
|
|
|
|
</P>
|
|
<P>
|
|
There are two possible work-arounds for this problem. If you are
|
|
sure that your program will run under Perl 5.8.0 or newer (these
|
|
Perl versions handle positional parameters in <CODE>printf()</CODE>) or
|
|
if you are sure that the translator will not have to reorder the arguments
|
|
in her translation -- for example if you have only one brace placeholder
|
|
in your string, or if it describes a syntax, like in this one --, you can
|
|
mark the string as <CODE>no-perl-brace-format</CODE> and use <CODE>printf()</CODE>:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
# xgettext: no-perl-brace-format
|
|
die sprintf (gettext ("usage: %s {OPTIONS} FILENAME...\n"), $0);
|
|
</PRE>
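<P>
The positional parameters mentioned above let a translated format string
use the arguments in a different order than the original. A minimal
sketch, in which both format strings are invented for illustration:
</P>

<PRE>
use strict;
use warnings;

# Single quotes keep Perl from interpolating "$s" inside the format.
my $original  = sprintf ('%1$s follows %2$s', "Tuesday", "Monday");

# A translation may refer to the same arguments in reverse order.
my $reordered = sprintf ('%2$s precedes %1$s', "Tuesday", "Monday");

print "$original\n$reordered\n";
</PRE>

<P>
Both calls receive the arguments in the same order; only the format string
decides where each one appears.
</P>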
|
|
|
|
<P>
|
|
If you want to use the more portable Perl brace format, you will have to
put placeholders in place of the literal braces:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
die __x ("usage: {program} {[}OPTIONS{]} FILENAME...\n",
|
|
program => $0, '[' => '{', ']' => '}');
|
|
</PRE>
|
|
|
|
<P>
|
|
Perl brace format strings know no escaping mechanism. No matter what such an
escaping mechanism looked like, it would either give the programmer a
|
|
hard time, make translating Perl brace format strings heavy-going, or
|
|
result in a performance penalty at runtime, when the format directives
|
|
get executed. Most of the time you will happily get along with
|
|
<CODE>printf()</CODE> for this special case.
|
|
|
|
</P>
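<P>
How the placeholder trick for literal braces works out can be sketched
with a small expansion routine. The function <CODE>expand_braces</CODE> is a
hypothetical re-implementation written for this illustration; it is not the
code of <CODE>__x</CODE> or of any real module, and the program name
<CODE>myprog</CODE> is likewise invented:
</P>

<PRE>
use strict;
use warnings;

# Hypothetical brace expansion, for illustration only.
sub expand_braces {
    my ($format, %args) = @_;
    # Replace {name} by its value; leave unknown placeholders untouched.
    $format =~ s/\{([^{}]+)\}/exists $args{$1} ? $args{$1} : '{' . $1 . '}'/ge;
    return $format;
}

my $usage = expand_braces ("usage: {program} {[}OPTIONS{]} FILENAME...\n",
                           program => "myprog", '[' => '{', ']' => '}');
print $usage;
</PRE>

<P>
The placeholders <CODE>{[}</CODE> and <CODE>{]}</CODE> expand to the literal
braces, so the output reads <CODE>usage: myprog {OPTIONS} FILENAME...</CODE>
while the translator can still reorder or translate the surrounding text.
</P>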
|
|
|
|
|
|
<H3><A NAME="SEC280" HREF="gettext_toc.html#TOC280">13.5.19 PHP Hypertext Preprocessor</A></H3>
|
|
<P>
|
|
<A NAME="IDX1156"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
mod_php4, mod_php4-core, phpdoc
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>php</CODE>, <CODE>php3</CODE>, <CODE>php4</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>, <CODE>'abc'</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>_("abc")</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>; starting with PHP 4.2.0
|
|
also <CODE>ngettext</CODE>, <CODE>dngettext</CODE>, <CODE>dcngettext</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>textdomain</CODE> function
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
<CODE>bindtextdomain</CODE> function
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
Programmer must call <CODE>setlocale (LC_ALL, "")</CODE>
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
---
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
use
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
<CODE>printf "%2\$d %1\$d"</CODE>
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
On platforms without gettext, the functions are not available.
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
<P>
|
|
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-php</CODE>.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC281" HREF="gettext_toc.html#TOC281">13.5.20 Pike</A></H3>
|
|
<P>
|
|
<A NAME="IDX1157"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
roxen
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>pike</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
---
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>textdomain</CODE> function
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
<CODE>bindtextdomain</CODE> function
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
<CODE>setlocale</CODE> function
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
<CODE>import Locale.Gettext;</CODE>
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
use
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
---
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
---
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
On platforms without gettext, the functions are not available.
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
---
|
|
</DL>
|
|
|
|
|
|
|
|
<H3><A NAME="SEC282" HREF="gettext_toc.html#TOC282">13.5.21 GNU Compiler Collection sources</A></H3>
|
|
<P>
|
|
<A NAME="IDX1158"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
gcc
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>c</CODE>, <CODE>h</CODE>
|
|
|
|
<DT>String syntax
|
|
<DD>
|
|
<CODE>"abc"</CODE>
|
|
|
|
<DT>gettext shorthand
|
|
<DD>
|
|
<CODE>_("abc")</CODE>
|
|
|
|
<DT>gettext/ngettext functions
|
|
<DD>
|
|
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>, <CODE>ngettext</CODE>,
|
|
<CODE>dngettext</CODE>, <CODE>dcngettext</CODE>
|
|
|
|
<DT>textdomain
|
|
<DD>
|
|
<CODE>textdomain</CODE> function
|
|
|
|
<DT>bindtextdomain
|
|
<DD>
|
|
<CODE>bindtextdomain</CODE> function
|
|
|
|
<DT>setlocale
|
|
<DD>
|
|
Programmer must call <CODE>setlocale (LC_ALL, "")</CODE>
|
|
|
|
<DT>Prerequisite
|
|
<DD>
|
|
<CODE>#include "intl.h"</CODE>
|
|
|
|
<DT>Use or emulate GNU gettext
|
|
<DD>
|
|
use
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext -k_</CODE>
|
|
|
|
<DT>Formatting with positions
|
|
<DD>
|
|
---
|
|
|
|
<DT>Portability
|
|
<DD>
|
|
Uses autoconf macros
|
|
|
|
<DT>po-mode marking
|
|
<DD>
|
|
yes
|
|
</DL>
|
|
|
|
|
|
|
|
<H2><A NAME="SEC283" HREF="gettext_toc.html#TOC283">13.6 Internationalizable Data</A></H2>
|
|
|
|
<P>
|
|
Here is a list of other data formats which can be internationalized
|
|
using GNU gettext.
|
|
|
|
</P>
|
|
|
|
|
|
|
|
<H3><A NAME="SEC284" HREF="gettext_toc.html#TOC284">13.6.1 POT - Portable Object Template</A></H3>
|
|
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
gettext
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>pot</CODE>, <CODE>po</CODE>
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext</CODE>
|
|
</DL>
|
|
|
|
|
|
|
|
<H3><A NAME="SEC285" HREF="gettext_toc.html#TOC285">13.6.2 Resource String Table</A></H3>
|
|
<P>
|
|
<A NAME="IDX1159"></A>
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
fpk
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>rst</CODE>
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext</CODE>, <CODE>rstconv</CODE>
|
|
</DL>
|
|
|
|
|
|
|
|
<H3><A NAME="SEC286" HREF="gettext_toc.html#TOC286">13.6.3 Glade - GNOME user interface description</A></H3>
|
|
|
|
<DL COMPACT>
|
|
|
|
<DT>RPMs
|
|
<DD>
|
|
glade, libglade, glade2, libglade2, intltool
|
|
|
|
<DT>File extension
|
|
<DD>
|
|
<CODE>glade</CODE>, <CODE>glade2</CODE>
|
|
|
|
<DT>Extractor
|
|
<DD>
|
|
<CODE>xgettext</CODE>, <CODE>libglade-xgettext</CODE>, <CODE>xml-i18n-extract</CODE>, <CODE>intltool-extract</CODE>
|
|
</DL>
|
|
|
|
<P><HR><P>
|
|
Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_12.html">previous</A>, <A HREF="gettext_14.html">next</A>, <A HREF="gettext_22.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
|
|
</BODY>
|
|
</HTML>
|