[ds6-devel] Re: i18n

Simone Piunno pioppo@ferrara.linux.it
Fri Jan 3 18:22:28 CET 2003


On Fri, Jan 03, 2003 at 06:29:03PM +0200, Chris Leishman wrote:

> >yes!
> >This is another piece of code we could probably put in "contrib" and
> >use our own copy whenever we're compiling on a non-GNU system.
> 
> Since you're keen on doing this, and have the experience, can you give us 
> an overview of the process and what overheads it adds?  

OK, basically the translation is performed at runtime, by looking up each
string in a table that maps it to its translation in a given language.

Inside the code, you must substitute each translatable string with a 
function call.  Traditionally the function is _().  E.g.

  printf("hello world")  ->  printf(_("hello world"))

At runtime, _() is invoked with the string to translate as a parameter
and returns the translated string after looking it up in the table (or
the original string, if no translation is found).
This table is a binary hash file (for fast access), automatically built
(or better, compiled) from a text file, using the "msgfmt" tool.

This text file is composed of string pairs: for each string there is
the original English and the corresponding translation.  Translators
receive this strings-only file and don't have to understand the C
language to provide a good translation (this is the main point).
There are also visual tools to better handle those files (e.g. kbabel).

The master strings-file can be built automatically: xgettext is a
command-line tool that scans C source files and extracts all strings
marked with _().

> Can we still build non-i18n versions easily (eg. for embedded systems,
> etc)?  I haven't really played with that sort of thing before.

_() is always a macro, so you can easily disable the whole library and
save the overhead, just by using a different compile-time configuration.
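
A minimal sketch of that compile-time switch.  The flag name ENABLE_NLS
and the greeting() helper are my assumptions for illustration, nothing
we have in the tree yet:

```c
#include <stdio.h>

/* ENABLE_NLS is an assumed flag name; any compile-time switch would do. */
#ifdef ENABLE_NLS
#  include <libintl.h>
#  define _(s) gettext(s)   /* runtime lookup in the message catalog */
#else
#  define _(s) (s)          /* no-op: the literal is used as-is, no library needed */
#endif

/* Hypothetical helper, only here to show the macro in use. */
static const char *greeting(void)
{
    return _("hello world");
}
```

Built without ENABLE_NLS, greeting() simply returns the English literal
and nothing from the i18n library is compiled or linked in.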

To summarize, developers have to:
 - initialize the library at program start
 - mark translatable strings
 - arrange Makefile to automatically do msgfmt on translation files
 - periodically rebuild the master strings-file, by running the
   extraction tool
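
The first step in the list above could look roughly like this with GNU
gettext's libintl.  The domain name "ds6", the locale directory and the
helper names are placeholders I made up; on non-GNU systems linking
with -lintl may also be needed:

```c
#include <libintl.h>
#include <locale.h>
#include <stdio.h>

#define _(s) gettext(s)

/* "ds6" and the locale directory are placeholder values for illustration. */
#define PACKAGE   "ds6"
#define LOCALEDIR "/usr/local/share/locale"

/* Call once at program start, before any _() lookup. */
static void i18n_init(void)
{
    setlocale(LC_ALL, "");               /* pick up LANG / LC_* from the environment */
    bindtextdomain(PACKAGE, LOCALEDIR);  /* where the compiled catalogs live */
    textdomain(PACKAGE);                 /* default catalog used by gettext() */
}

/* Hypothetical helper showing a lookup after initialization. */
static const char *greeting(void)
{
    return _("hello world");
}
```

If no catalog is installed for the user's language, gettext() falls
back to returning the original string, so the program keeps working in
English.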

And translators have to:
 - keep track of changes in the master strings-file
 - provide the translated strings-file (in text form)
 
Sometimes, to make translators' lives easier, we'll have to rephrase
some strings or change some code.  A typical example is this:

 printf(_("We've counted %d file%s"), count, ((count == 1) ? "" : "s"));

Note that in this example translators have no way to avoid the "s"
for the plural form, even though many languages build plurals
differently.  Therefore we developers must rewrite it this way:

 if (count == 1) {
     printf(_("We've counted 1 file"));
 } else {
     printf(_("We've counted %d files"), count);
 }
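
The if/else above covers languages with one singular and one plural
form.  For languages with more forms, GNU gettext also provides
ngettext(), which picks the form from the count.  A sketch, assuming we
end up using GNU gettext (format_count() is a made-up helper):

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical helper: formats the message into buf and returns the
   length reported by snprintf(). */
static int format_count(char *buf, size_t size, unsigned long count)
{
    /* ngettext() returns the singular or plural msgid, or the matching
       translated form, according to count. */
    return snprintf(buf, size,
                    ngettext("We've counted %lu file",
                             "We've counted %lu files", count),
                    count);
}
```

Translators then provide as many plural forms as their language needs
in the strings-file, and the code does not change.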

Note that _() must be invoked only on literal strings,
e.g. this is always wrong:

  char s[] = "pippo";
  printf("%s", _(s));
  
because automatic extraction becomes impossible.
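
When a string really has to live in a variable (e.g. in a table), the
convention described in the GNU gettext manual is to mark the literal
with a no-op macro, usually N_(), and call _() on the variable later.
A sketch, where the table and the helper are made up for illustration;
the extraction tool would also have to be told about N_, e.g. with
xgettext's --keyword option:

```c
#include <libintl.h>
#include <stddef.h>

#define _(s)  gettext(s)  /* runtime lookup */
#define N_(s) s           /* no-op marker: only makes the literal visible to extraction */

/* Literals are marked where they are written... */
static const char *const color_names[] = {
    N_("red"),
    N_("green"),
    N_("blue"),
};

/* ...and translated where they are used.  Hypothetical helper. */
static const char *color_label(size_t index)
{
    return _(color_names[index]);
}
```

This way every translatable string still appears as a marked literal
somewhere in the source, which is what the extraction step needs.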

Things could get a little worse when we start to support Asian
languages, because they have very different charsets... in Mailman
this was a can of worms but in our case it should be simple (e.g.
we can support utf-8 without any additional work).

-- 
Adde parvum parvo magnus acervus erit.
Simone Piunno, FerraraLUG - http://members.ferrara.linux.it/pioppo

