Hi,
On Wed, 16 Feb 2005, Nicolas François wrote:
On Wed, Feb 16, 2005 at 01:00:09AM +0100, Jordi Vilalta wrote:
> I was just gettextizing some man pages and I've noticed a problem when
> trying to mix several po files:
>
> $ msgcat *.po
> file1.po:19:10: invalid multibyte sequence
> msgcat: found 1 fatal error
>
> I've found that there was a strange character in that position, and it
> seems it's the equivalent of man page's "\ ". What's its meaning? Why is
> it handled with this strange byte? It seems we're generating non-compliant
> po files :S
Yes, "\ " are changed to 0xA0. Maybe this should be done only if the
charset used supports this character (at least UTF-8 & latin-1).
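To illustrate the charset dependency, here is a minimal Python sketch (not po4a code; the helper name encode_or_none is made up for this example). It shows how the no-break space, U+00A0, is represented in the three charsets discussed in this thread. Note that only latin-1 stores it as the single byte 0xA0.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE, the character a man page "\ " is mapped to

def encode_or_none(text, charset):
    """Return text encoded in charset, or None if the charset cannot represent it.

    Hypothetical helper for illustration only.
    """
    try:
        return text.encode(charset)
    except UnicodeEncodeError:
        return None

for charset in ("ascii", "iso-8859-1", "utf-8"):
    # ascii      -> None (not representable)
    # iso-8859-1 -> b'\xa0'
    # utf-8      -> b'\xc2\xa0' (two bytes)
    print(charset, encode_or_none(NBSP, charset))
```

So a bare 0xA0 byte is only a valid no-break space when the file's declared charset is latin-1 (or a similar 8-bit charset).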
Is it important to maintain a "\ " instead of converting it to a standard
space? When translators rewrite the message, (I think) they write standard
spaces, so the "\ " loses its possible utility. If it's important to
maintain them, I think it would be better to put "\ " in the po files.
However, I'm surprised it generates an error. I'm only getting warnings
(sometimes annoying):
warning: The following msgid contains non-ASCII characters.
This will cause problems to translators who use a character encoding
different from yours. Consider using a pure ASCII msgid instead.
(There is no warning when the charset is UTF-8)
Can you point me to the man page you gettextized (I will need the original
and translated man page)?
It has happened for example with the ldd man page (along with a lot more).
There's no need to use the translated one. Here's a simple example to
reproduce it:
- create a simple man page that contains this line (typical):
\-V\ \-\-version
- po4a-gettextize -f man -m file.man -p file.po
- edit file.po to put a valid charset
- msgcat file.po: with ASCII and UTF-8 charsets I get this:
file.po:19:10: invalid multibyte sequence
msgcat: found 1 fatal error
If I use ISO-8859-1, for example, I get the warning you mentioned. But msgids
should be valid in ASCII or UTF-8 (culturally neutral).
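For anyone who wants to check a file without running msgcat, here is a rough Python sketch of the same validity test. It is only an approximation under stated assumptions: find_invalid_bytes is a hypothetical helper, the sample bytes are made up to resemble the line above (they are not real po4a output), and the column is a 1-based byte offset, which may not match how msgcat counts columns.

```python
def find_invalid_bytes(data, charset):
    """Return (line, column) of the first byte that is invalid in charset.

    Both values are 1-based; the column is a byte offset within the line.
    Returns None when every line decodes cleanly.
    """
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode(charset)
        except UnicodeDecodeError as err:
            return lineno, err.start + 1
    return None

# Made-up sample: a raw 0xA0 byte where the man page had "\ ".
sample = b'msgid ""\nmsgstr ""\n"\\-V\xa0\\-\\-version"\n'

for charset in ("ascii", "utf-8", "iso-8859-1"):
    print(charset, find_invalid_bytes(sample, charset))
```

With this sample, the ASCII and UTF-8 checks both report the 0xA0 byte, while ISO-8859-1 accepts it, which matches the behaviour described above.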
Regards,
Jordi Vilalta