Sorry for the delay, you know the refrain.
Summary of previous ones
------------------------
In man.pm, we transliterate "\ " to 0xA0 since both mean "non-breaking
space": the first one in groff parlance, the second one in the latin-1
encoding.
There are two conversions:
"\ " -> 0xA0 on string extraction, so that translators see 0xA0 (pre_trans)
0xA0 -> "\ " on translation production, so that groff sees "\ "
(post_trans)
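
To make the round trip concrete, here is a minimal sketch in Python (not the actual Perl code of man.pm). The function names only mirror the pre_trans/post_trans labels used above, and the plain string replacement is an assumption: it ignores corner cases such as an escaped backslash followed by a space.

```python
# Illustrative sketch only; man.pm itself is written in Perl.
NBSP = "\u00a0"      # U+00A0, i.e. byte 0xA0 in latin-1
GROFF_NBSP = "\\ "   # a backslash followed by a space, as groff writes it

def pre_trans(text: str) -> str:
    """On extraction: show translators a real non-breaking space."""
    return text.replace(GROFF_NBSP, NBSP)

def post_trans(text: str) -> str:
    """On production: give groff its own escape back."""
    return text.replace(NBSP, GROFF_NBSP)

source = "see\\ also"
seen_by_translator = pre_trans(source)
back_to_groff = post_trans(seen_by_translator)
```

With this sketch, a string that goes through both steps untouched by the translator comes back identical to the groff source.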
This feature helps French translators, who often have to add non-breaking
spaces in their work (and are used to 0xA0). It also somewhat simplifies the
internals of the man module when it comes to splitting the arguments passed
to a command.
[The same kind of thing is used in TeX to change \'e to é, and in Man to
change \(dq to ". Such beasts make these transliterations really helpful,
and not only to French dudes]
The problem is that this char is not valid in all encodings (obviously, it's
not in ascii).
So, how to stay encoding agnostic?
----------------------------------
We can remove this transliteration (don't do pre_trans). It doesn't break
existing po files, but we lose the feature.
We can do it only when the user passes the right option (which may be
module-specific or system-wide). It's easy to implement and we can point
users to the option when an issue comes up, but it's a kind of BOTH attitude
since whatever default setting we choose, it will bite some people. But it
was my first idea, I must be a BOTH deep inside ;)
We can do the transliteration only if it doesn't break the specified encoding.
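
The last option can be sketched as follows, again in Python rather than in the Perl of man.pm. The helper name charset_can_encode and the out_charset parameter are assumptions made for illustration; the idea is simply to try encoding the character and skip the transliteration when that fails.

```python
# Illustrative sketch only; names are hypothetical, not po4a's API.
NBSP = "\u00a0"  # U+00A0, i.e. byte 0xA0 in latin-1

def charset_can_encode(char: str, charset: str) -> bool:
    """Return True if `char` can be represented in `charset`."""
    try:
        char.encode(charset)
    except UnicodeEncodeError:
        return False
    except LookupError:
        # Unknown charset name: stay on the safe side.
        return False
    return True

def pre_trans(text: str, out_charset: str) -> str:
    """Transliterate groff's "\\ " only when the target charset allows it."""
    if charset_can_encode(NBSP, out_charset):
        return text.replace("\\ ", NBSP)
    return text  # leave the groff escape alone, nothing breaks
```

With an ascii target the string is left alone, while with latin-1 or utf-8 the translator gets the real non-breaking space.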
My opinion
----------
- We have to leave post_trans in place (groff may not like 0xA0).
- Doing pre_trans only if out_charset contains the char seems to be the
  best idea. It won't break anything, and people can specify the encoding
  as an option to explicitly use this (if automatic detection wasn't done
  yet).
- I have no idea which transliterations can be done in which encoding.
- We may want to add a module-specific option to man, since some of those
  transliterations do matter (- and \- are different, even if it's
  impossible to say which is which from the rendered text). It would
  only be for the debatable ones: ( s/\*\(lq/``/ s/\*\(rq/''/ s/\(dq/"/ )
  The post_trans may cause issues for them.
I would also like to have Martin's opinion on this point (I may have
advocated for this feature, but he did the commit).
Sorry for the delay again. I now live in Nancy, a cold but beautiful city
near the German and Belgian borders. I kept my last apartment exactly 5
months, the previous one 2 months, the one before that 4 months, and the
one before that less than a year. I'm getting tired of it and dream of the
stability that this position in Nancy should give me.
Anyway, I won't be online for the weekend, but I'd love to see po4a released
or releasable on Monday ;)
Bye, Mt.