Martin Quinson wrote:
[...]
> * Locale::gettext (perl module)
> Needed by po4a for localization.
> Provided by liblocale-gettext-perl on Debian, perl-Locale-gettext on
>Mandrake and Fedora Core(DAG), perl-gettext on SUSE.
>
> Would you be open to a patch that acted as a wrapper around
>Locale::gettext so that po4a would continue to work untranslated if that
>module was missing?
[...]
Ok. Great. It could even be done by default in the Common.pm module, which
would then export the d?gettext functions. Impact on other parts would be
the need to kill the explicit gettext loading.
Good idea. I'll do that.
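To make the idea concrete, a wrapper of this kind could look roughly like
the following. This is only a minimal sketch, not po4a's actual code:
installing the functions directly into main (rather than exporting them
from Common.pm) and the pass-through fallback are assumptions made for
the example.

```perl
use strict;
use warnings;

# Sketch only: try to load Locale::gettext at compile time and fall back
# to pass-through functions if the module is not installed.
BEGIN {
    if (eval { require Locale::gettext; 1 }) {
        # Module available: use the real functions.
        Locale::gettext->import(qw(gettext dgettext));
    } else {
        # Module missing: return the message untranslated.
        *main::gettext  = sub { return $_[0] };   # gettext(msgid)
        *main::dgettext = sub { return $_[1] };   # dgettext(domain, msgid)
    }
}

print gettext("po4a wrapper test message"), "\n";
```

Either way the callers keep using gettext() and dgettext() as before;
they simply get the original English string back when no translation is
available.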
> * Text::WrapI18N (perl module)
> Pure perl (so easy to check in) but depends on Text::CharWidth which
>is not pure perl.
> Provided by libtext-wrapi18n-perl on Debian. Found no RPM packages
>providing it.
>
> Text::WrapI18N was not used in po4a 0.16.2. I initially thought it
>was used to wrap the text being output to the .po and .sgml files but in
>fact it seems to only be used to print messages, warnings and errors.
>Why is it needed? Doesn't a simple print work fine?
This module becomes important when you want to wrap CJK languages
(Japanese, Korean), which don't have any spaces, if I understood correctly.
So finding where you can cut the sentence properly is not as easy as in,
say, French.
So, actually, we *ought* to use it all around the place. Maybe through a
wrapper such as the one you propose for gettext...
Ok. I can understand why it would be important to use it for writing po
files and Sgml files. It's a shame it's used everywhere but there ;-)
(I've seen your other email explaining why it's that way).
What I don't understand is why it is used to print informational and
error messages. Won't the xterm wrap things on its own? It does just
fine when I print a very long line in French or English. Doesn't it do
the same for CJK languages? If not I'd say it is pretty broken.
My next question has to do with the wrapper implementation. I did not
check it out in detail but it seems like it works on the multibyte
string. Wouldn't it be possible to implement a wrapper using the Perl
5.8 Unicode support? It would work like this:
use Encode;
# "Noël Noël" in UTF-8
my $oct="Noël Noël";
Here length($oct) = 11 because ë is encoded as two 8-bit bytes.
my $str=Encode::decode("UTF-8", $oct);
Here length($str) = 9 because ë is now a single Unicode
character. This means we can cut up the string as we want:
my $sub=Encode::encode("UTF-8",substr($str,0,3));
Here $sub contains "Noë" encoded in UTF-8, and has a length of 4.
There's still the issue of word boundaries because, at least for the
Sgml file, we would not want to cut a word in two. But I would expect
that, on Unicode strings, \b, \w and \W would work sensibly even for CJK
languages. The advantage is that this would make it possible to do
wrapping using only standard Perl features. The drawback is it would
require perl 5.8.
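
To illustrate, here is a rough sketch of what such a wrapper could look
like. It is untested against po4a itself: wrap_utf8() is a hypothetical
name, it only breaks on whitespace, and it counts characters rather than
terminal columns, so take it as an illustration of the idea and not as a
replacement for Text::WrapI18N.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: wrap a UTF-8 encoded string so that no line is
# longer than $width characters, cutting only at whitespace.
sub wrap_utf8 {
    my ($octets, $width) = @_;
    my $str = decode("UTF-8", $octets);   # work on characters, not bytes
    my @lines;
    my $line = "";
    foreach my $word (split /\s+/, $str) {
        next if $word eq "";
        if ($line eq "") {
            $line = $word;                # first word on the line
        } elsif (length($line) + 1 + length($word) <= $width) {
            $line .= " $word";            # the word still fits
        } else {
            push @lines, $line;           # start a new line
            $line = $word;
        }
    }
    push @lines, $line if $line ne "";
    return encode("UTF-8", join("\n", @lines));
}

# "Noël Noël Noël" written as explicit UTF-8 octets
my $octets = "No\xC3\xABl No\xC3\xABl No\xC3\xABl";
print wrap_utf8($octets, 9), "\n";
```

With a width of 9 the first two words fit on one line, which would not
be the case if the length was computed on the 11 bytes of the encoded
string.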
--
Francois Gouget
fgouget(a)codeweavers.com