Thanks for your work on encoding issues, dudes, I really suck at that.
On Wed, Aug 04, 2004 at 12:29:54AM +0200, Denis Barbier wrote:
On Tue, Aug 03, 2004 at 11:46:50PM +0200, Jordi Vilalta wrote:
You were talking about po4a-translate and localized file charset, and
now gettextizing master file. In the latter case, if master file
contains only ASCII, no conversion is performed. Otherwise it has to be
recoded into UTF-8, and there is indeed a problem if original charset is
not specified. One could check whether it is UTF-8, and fall back to
ISO-8859-1 otherwise, but unspecified encodings really suck, so let's
be pedantic and force those people to declare their encoding. After
all they know the encoding used in their English documentation, so they
can add the right options to po4a tools.
I'm ok with being pedantic here, too. This approach would suit me:
For the master:
- if no encoding is specified, assume it is UTF8
- if it's not valid UTF8, refuse to process it until told what it is
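
To make the master-side rule concrete, here is a rough sketch. It is in Python purely for illustration (po4a itself is Perl), and the function name and error message are made up for the example, not existing po4a code:

```python
from typing import Optional


def decode_master(raw: bytes, charset: Optional[str] = None) -> str:
    """Decode master-document bytes (hypothetical helper, not po4a API)."""
    if charset is not None:
        # The user declared an encoding: trust it and decode strictly.
        return raw.decode(charset)
    try:
        # No encoding declared: assume UTF8. Pure ASCII passes unchanged.
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # Not valid UTF8: refuse to guess and ask for an explicit setting.
        raise ValueError(
            "master document is not valid UTF-8; please declare its encoding"
        ) from err
```

A Latin-1 master containing accented characters would be rejected until the caller passes something like `decode_master(raw, "iso-8859-1")`.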
For translations:
- if not specified, assume it's the same as the one used for the translated
part of the po file
- could be cool if we could check that the encoding is not broken, but I'm
not sure whether it's even possible.
- during gettextization, assume it's UTF8 if no encoding is provided, whine
for a proper setting if it's not the case
For po files:
- msgid must be in UTF8. No matter what happens.
- msgstr has to be in the encoding specified in the po file headers.
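
The po-file rules could be checked along these lines. Again this is only a Python sketch with hypothetical helper names; it assumes the charset is read from the usual `Content-Type: text/plain; charset=...` header line and that the `CHARSET` placeholder left in untouched templates counts as "not specified":

```python
import re
from typing import Optional, Tuple

# Matches the charset value in a "Content-Type: text/plain; charset=..." line.
_CHARSET_RE = re.compile(r"charset=([A-Za-z0-9_.:-]+)", re.IGNORECASE)


def header_charset(po_header: str) -> Optional[str]:
    """Return the charset declared in the po header, or None if absent."""
    match = _CHARSET_RE.search(po_header)
    if match is None or match.group(1).upper() == "CHARSET":
        # Missing, or still the template placeholder.
        return None
    return match.group(1)


def decode_entry(msgid: bytes, msgstr: bytes, po_header: str) -> Tuple[str, str]:
    """Decode one entry: msgid as UTF8, msgstr in the header charset."""
    charset = header_charset(po_header)
    if charset is None:
        raise ValueError("po header declares no charset")
    # Both decodes are strict, so invalid byte sequences raise an error.
    return msgid.decode("utf-8"), msgstr.decode(charset)
```

Note that a strict decode only catches broken input for encodings that have invalid byte sequences, such as UTF8. With a single-byte charset like ISO-8859-1 every byte sequence decodes, so a mislabelled translation would pass silently.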
And once all this is implemented, we could stop assuming that
master-document = english-document ;)
Again, I have no definitive idea of how all this should work; this is merely
a proposal.
Thanks, Mt.
--
In deepest France, there are mostly cavers.
-- Le Chat