On Wed, Aug 08, 2007 at 01:52:47PM +0300, Yavor Doganov wrote:
Yes, this violates the rule of thumb for i18n not to break sentences in
this manner. It becomes even worse if you have things <em>like</em>
<i>this</i> where in some languages you have to insert a preposition or
something else between them.
Oh, that reminds me of another problem: if somewhere else
you write that you <em>like</em> HTML, po4a helpfully
assumes that 'like' in 'like this' and 'like' in 'I like it'
are the same. That rarely produces useful content in other
languages :-)
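To make the collision concrete, here is a toy sketch in Python. It is not po4a's actual code; the extractor class, the function name and the two page names are made up for illustration. It only assumes that every run of text between tags becomes its own msgid, and shows how the two unrelated uses of 'like' end up as a single entry:

```python
from html.parser import HTMLParser


class SegmentExtractor(HTMLParser):
    """Hypothetical extractor: each run of text between tags is one msgid."""

    def __init__(self):
        super().__init__()
        self.segments = []

    def handle_data(self, data):
        text = data.strip()
        if text:  # ignore whitespace-only runs between tags
            self.segments.append(text)


def extract_msgids(pages):
    """Map each msgid to the list of pages it was extracted from."""
    catalog = {}
    for name, html in pages.items():
        parser = SegmentExtractor()
        parser.feed(html)
        parser.close()
        for segment in parser.segments:
            catalog.setdefault(segment, []).append(name)
    return catalog


# Two made-up pages that use 'like' in unrelated senses.
pages = {
    "a.html": "<p>things <em>like</em> <i>this</i></p>",
    "b.html": "<p>I <em>like</em> HTML</p>",
}
catalog = extract_msgids(pages)

# Both pages contribute to the same msgid, so the translator
# gets a single slot for two different meanings.
print(catalog["like"])
```

The translator sees one 'like' and has no way to give it two translations, short of the tool attaching some context to each occurrence.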
OTOH, Html.pm produces markup-free POT files, which is very good for our
purpose: to separate the content entirely so the translator doesn't have
to repeat the error-prone process of duplicating the markup. But if it
turns out that it is not possible to translate the content properly in
some languages, this gain means nothing.
I think you need to assess your exact need: If the need is
to translate the Web pages as they are, then there is little
other choice but to have the translator handle the inline
markup (in my first example, only the translator can decide
to turn "the 4<sup>th</sup> of July" into "Le 4
juillet").
But if the need is to get the content translated as
cheaply/easily as possible, then probably the best would be
to extract all the content to text, translate that, and
generate pages from that. You'd obviously lose text markup,
links and so on, though.
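As a rough illustration of what that second route costs, here is a minimal Python sketch. The helper names and the sample sentence are hypothetical, and a real pipeline would need much more (block boundaries, entities, attributes worth translating). It flattens a fragment to plain text, and the link target and the superscript are simply gone:

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Hypothetical helper: keep character data, drop every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def to_text(html):
    """Return the text content of an HTML fragment, whitespace-normalised."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


# Made-up sample; the URL is a placeholder.
sample = ('<p>See the <a href="https://example.org/">site</a> '
          'on the 4<sup>th</sup> of July.</p>')

plain = to_text(sample)
print(plain)  # See the site on the 4th of July.
```

The translator now has a whole sentence to work with, but the generated page can no longer know where the link or the superscript belonged.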
Y.