On Wed, Nov 17, 2004 at 01:17:12AM +0100, Denis Barbier wrote:
On Mon, Nov 15, 2004 at 11:55:31PM +0100, Nicolas François wrote:
> Hello,
>
> To solve an issue with the man module, I've implemented a way to specify
> some (multi-line) verbatim blocks.
I did not understand your message at first reading because I thought
that 'verbatim blocks' were unformatted; maybe you should talk about
'untranslated blocks' instead.
I wanted to stress that these blocks are not only untranslated, but also
left untouched by the parser (no rewrapping or any other kind of
reformatting).
> With my first try, I could correctly process 60 additional files
> (without even defining additional macros). 25 different blocks were used
> (each one defined in a file).
>
>
> Not everything is so neat:
> - if the block appears in the middle of a paragraph, the block will be
> copied verbatim in the wrong place: at the beginning of the paragraph
> (this could be fixed by adding a function to "flush" the parser).
> Most of the time, it is not a concern because the macros are defined
> in the header and it is up to the user to specify or not a verbatim
> block, but it should probably be fixed.
IMO such blocks should split paragraphs into 3 parts: before, block
itself, after.
I'm not really worried about notifying the parser. Do you have some ideas
for the user interface?
- The user has to be notified in the po (it will probably be module
specific), but something like a U<There is an untranslated block
here> could be OK for the man module.
- The user has to specify if the block ends a paragraph or not
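
As a rough illustration of the "before, block itself, after" split suggested
above, here is a minimal sketch. It is written in Python rather than po4a's
Perl, it is not the attached implementation, and the function name
split_on_block and the sample lines are hypothetical. It assumes a paragraph
and a block are both plain lists of lines compared exactly.

```python
from typing import List, Optional, Tuple


def split_on_block(
    paragraph: List[str], block: List[str]
) -> Optional[Tuple[List[str], List[str], List[str]]]:
    """Split a paragraph around the first exact occurrence of block.

    Returns (before, block, after), or None if the block does not occur.
    Lines are compared as-is, so the block is never rewrapped or reformatted.
    """
    n = len(block)
    if n == 0:
        return None
    for start in range(len(paragraph) - n + 1):
        if paragraph[start:start + n] == block:
            return (
                paragraph[:start],           # translated as usual
                paragraph[start:start + n],  # copied verbatim
                paragraph[start + n:],       # translated as usual
            )
    return None


# Hypothetical sample data: a macro definition in the middle of a paragraph.
paragraph = [
    "Some text to translate.",
    ".de EX",
    ".nf",
    "..",
    "More text to translate.",
]
block = [".de EX", ".nf", ".."]
before, verbatim, after = split_on_block(paragraph, block)
```

With such a split, the "before" and "after" parts would go through the normal
translation path and only the middle part would be emitted unchanged, which
avoids the block landing at the beginning of the paragraph.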
> - it makes po4a slower (I don't think it's an issue, except for the
> testsuite ;): each line of the input document has to be compared with
> a line of each specified file (and lines have to be shifted/unshifted
> many times).
Do not worry, I have some ideas to optimize regexes. Do you have a
large file that could be useful for benchmarking?
My first idea is to sort man pages by size. Next time I run the
check script, I will record the CPU time used by po4a for each page.
I will try to build an interesting man page.
> - the user can do whatever he wants and it is, like the addendums, quite a
> complicated feature which can be error-prone.
I do not follow you here, my understanding was that blocks were copied
from the original file, so translators have almost no control here.
Forget about it:
I was only thinking of a scenario where the user is bored and adds
something that should be translated to a "multi-line untranslated block".
> Do you think it may be worth having such a mechanism in po4a?
> Is there an interest for the other modules?
Sounds like a very good idea, but maybe you should first explain in more
detail how you want to proceed.
Please find attached my current implementation (don't worry, I don't
intend to commit it as is).
It is not optimized at all (I'm reading all the little files for each
line of the man page). It was done to test whether the man pages really
always use the same blocks.
I've put all "untranslated blocks" in different files in the same directory
(its path is currently hardcoded).
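
One possible way to avoid re-reading the block files for every line is to
load them once and index them by their first line. The sketch below only
illustrates that idea; it is Python rather than po4a's Perl, it is not the
attached implementation, and the names load_blocks and match_block_at, the
directory argument, and the UTF-8 encoding are all assumptions.

```python
import os
from collections import defaultdict
from typing import Dict, List, Optional


def load_blocks(directory: str) -> Dict[str, List[List[str]]]:
    """Read every block file once and index the blocks by their first line."""
    index = defaultdict(list)
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        # Assumption: block files are UTF-8; real man pages may differ.
        with open(path, encoding="utf-8") as handle:
            lines = handle.read().splitlines()
        if lines:
            index[lines[0]].append(lines)
    return dict(index)


def match_block_at(
    lines: List[str], pos: int, index: Dict[str, List[List[str]]]
) -> Optional[List[str]]:
    """Return the longest indexed block starting at lines[pos], or None."""
    candidates = index.get(lines[pos], [])
    matches = [b for b in candidates if lines[pos:pos + len(b)] == b]
    return max(matches, key=len) if matches else None
```

With this layout, a line of the man page is only compared against the blocks
whose first line is identical to it, instead of against every file.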
BTW, is there a way to declare a function in Perl? (I moved initialize to
the end because I needed \&translate_joined and \&untranslated.)
I'm also adding some of the files I used with the man module.
> Note:
> For the man module, I'm also planning to add a verbatim and translate
> option (like in the sgml module), which should allow specifying the
> behaviour of the parser for these additional macros.
>
> Another Note:
> I've not tried this, but it may solve #263298 (Please let -gettextize know
> about addendums and remove them automatically).
This would be really cool, I am converting man pages of manpages-fr by hand,
and it is quite boring ;)
If you need the verbatim and translate options, I can push them quickly in
CVS.
--
Nekral