[Po4a-devel][RFC] Multi-lines verbatim blocks
by Nicolas François
Hello,
To solve an issue with the man module, I've implemented a way to specify
multi-line verbatim blocks.
I'm wondering whether this functionality is of interest to po4a users (it
may be error-prone), and whether it can be useful for other modules (in
which case it should be implemented in TransTractor).
The main cause (50%) of failures of the man module (po4a exits with an
error message) is blocks written in the roff language (using the .de, .if,
.ie or .el requests).
They are usually used to define new macros, or different ways to do the
same thing depending on the parser (e.g. nroff or troff).
Copying these blocks of code from the original to the translation is
particularly useful for the man module because the roff language is quite
complicated, which has pushed authors to just cut & paste definitions found
in other pages (a lot of the defined macros are not even used).
With my first try, I could correctly process 60 additional files (without
even defining additional macros). 25 different blocks were used (each one
defined in a file).
Not everything is so neat:
- if the block appears in the middle of a paragraph, it will be copied
verbatim in the wrong place: at the beginning of the paragraph
(this could be fixed by adding a function to "flush" the parser).
Most of the time this is not a concern, because the macros are defined
in the header and it is up to the user whether to specify a verbatim
block, but it should probably be fixed.
- it makes po4a slower (I don't think it's an issue, except for the
testsuite ;): each line of the input document has to be compared with
a line of each specified file (and lines have to be shifted/unshifted
many times).
- the user can do whatever he wants and, like addenda, it is a fairly
complicated feature which can be error-prone.
- ...
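To make the comparison step above concrete, here is a minimal sketch of
matching multi-line blocks against the input. It is not the actual
implementation: the split_verbatim name, the data layout and the sample
block are assumptions made only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the input lines and, whenever one of the
# user-specified blocks matches at the current position, emit it as a
# verbatim chunk; every other line is left to the normal parser.
sub split_verbatim {
    my ($lines, $blocks) = @_;    # array ref of lines, array ref of blocks
    my @out;
    my $i = 0;
    LINE: while ($i < @$lines) {
        for my $block (@$blocks) {
            my $n = @$block;
            next if $n == 0 || $i + $n > @$lines;
            my $same = 1;
            for my $k (0 .. $n - 1) {
                if ($lines->[$i + $k] ne $block->[$k]) { $same = 0; last }
            }
            if ($same) {
                push @out, [verbatim => @$block];
                $i += $n;
                next LINE;
            }
        }
        push @out, [text => $lines->[$i]];
        $i++;
    }
    return \@out;
}

# Sample data: a macro definition of the kind found in many man pages.
my @block = ('.de Sp', '.if t .sp .5v', '..');
my @input = (@block, 'Hello world');
for my $chunk (@{ split_verbatim(\@input, [\@block]) }) {
    my ($kind, @content) = @$chunk;
    print "$kind: $_\n" for @content;
}
```

Every input line is compared against every block, which is where the
slowdown mentioned above comes from.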
Do you think it may be worth having such a mechanism in po4a?
Is there an interest for the other modules?
Note:
For the man module, I'm also planning to add a verbatim and a translate
option (like in the sgml module), which should make it possible to specify
the behaviour of the parser for these additional macros.
Another Note:
I've not tried this, but it may solve #263298 (Please let -gettextize know
about addendums and remove them automatically).
Kind Regards,
--
Nekral
[Po4a-devel]Re: [Po4a-commits] po4a/lib/Locale/Po4a Man.pm,1.58,1.59
by Martin Quinson
On Fri, Nov 12, 2004 at 12:29:20PM +0000, Nicolas FRANÇOIS wrote:
> - do_paragraph($self,$paragraph,$wrapped_mode);
> - $paragraph="";
> + if ($paragraph) {
> + do_paragraph($self,$paragraph,$wrapped_mode);
> + $paragraph="";
> + }
If a paragraph consists of the string "0", this test will evaluate to
false. Prefer testing the length() of the string.
This used to cause issues in Pod.pm, for example.
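A minimal sketch of the difference, using a hypothetical has_content
helper that is not part of Man.pm:

```perl
use strict;
use warnings;

# Hypothetical helper: true when the paragraph holds any text at all,
# including the single character "0".
sub has_content {
    my ($paragraph) = @_;
    return defined($paragraph) && length($paragraph) > 0;
}

for my $paragraph ("", "0", "text") {
    my $boolean = $paragraph              ? "true" : "false";
    my $length  = has_content($paragraph) ? "true" : "false";
    print "'$paragraph': boolean test is $boolean, length test is $length\n";
}
```

Only the length test treats "0" as a paragraph worth processing.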
Thanks for your time,
Mt.
translate.org.za: a project quite similar to po4a
by Denis Barbier
Hi,
a post by Christian Perrier on debian-i18n made me go to
http://translate.org.za/
] Translate.org.za is a non-profit organisation producing Free and Open
] Source software that enables and empowers South Africans. The
] Translate Project started in 2001 with the vision of providing Free
] Software translated into the 11 official languages of South Africa.
A news item on their main page says:
] September 27, 2004
]
] OOoCon - The OpenOffice.org conference in Berlin
]
] Translate.org.za attended OOoCon 2004 which was held in Berlin from the
] 22 to 24 of September.
]
] Our goal was to raise the issues of localisation and the frustrations
] that we experience working with the OpenOffice.org project. Dwayne
] presented a paper on Translate.org.za - our tools, goals and
] philosophy.
]
] It was very important that we were there - we met with partners in the
] Khmer team and completed work on some of the Translate Toolkit. Most
] people within the OpenOffice community think the OpenOffice.org is
] managing localisation quite well - we were able to give them a
] translators view of that assumption. People are listening and there
] are some changes that will hopefully feature in OpenOffice.org 2.0
] that will make it easier to produce and use translated version of
] OpenOffice.org
They develop tools similar to po4a (see <URL:http://translate.sf.net/>)
and currently support Mozilla and OpenOffice.org formats.
Their website has many very useful tips and documentation, e.g. I liked
http://sf.net/projects/gettextlog/
which is a program to log calls to gettext when programs are run, in
order to determine which msgids are seldom displayed and can be ignored
at first.
Denis
po4a 0.19 tomorrow?
by Denis Barbier
Due to last-minute commits, I have not released 0.19 yet.
I will do so tomorrow night. Please do not change strings, so that
translators can polish their translations. Note that PO files
have to be UTF-8 encoded because msgids contain some non-ASCII
characters.
Danilo, there have been some minor changes, your translation is
currently:
po/bin/it: 104 translated messages, 1 fuzzy translation, 3 untranslated messages.
po/pod/it: 243 translated messages, 6 fuzzy translations, 543 untranslated messages.
I will release 0.19.1 when you update your translation, sorry for the
late notice.
Nicolas, you can improve the French translation if you want; Martin won't
have time to work on it.
Thanks all for your work.
Denis
[Po4a-devel]Administrivia
by Yves Rutschle
Hello po4a-ers,
My better half has talked me into translating our Web site
into French and maintaining it; after some searching
around, I found po4a was probably the best tool, and it does
almost all I need (or so I like to think now).
So, expect some patches to the HTML module, which is
curiously "almost finished", but also "useless as is". :)
Meanwhile, I've got a couple of remarks on the administrative
side:
- There is no mention of where the tools are being
developed. Surely it would make sense to add a link to
the page on alioth somewhere in a README that would
install with the package.
- As it is, before finding alioth, I found savannah. The
savannah project is obviously obsolete and unused, but
there is no way to know that. It wouldn't be bad if there
was no reference to it from google, but there is. Maybe
someone who still has access to Savannah could add a
mention that that site isn't used anymore?
- What does PO stand for?
That's all folks,
Y.
[Po4a-devel]#278365: Found the issue, here is a workaround
by Martin Quinson
Hello,
I think I found what the issue is. There are some comments in the file
prolog, and po4a does not handle them yet. I'm working on implementing this,
but in the meantime just remove them as a workaround.
--------------------------
--- ../cron-apt.sgml 2004-11-07 15:58:36.000000000 +0100
+++ cron-apt.sgml 2004-11-07 16:02:57.000000000 +0100
@@ -1,21 +1,8 @@
<!doctype refentry PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [
-<!-- Process this file with docbook-to-man to generate an nroff manual
- page: `docbook-to-man manpage.sgml > manpage.1'. You may view
- the manual page with: `docbook-to-man manpage.sgml | nroff -man |
- less'. A typical entry in a Makefile or Makefile.am is:
-
-manpage.8: manpage.sgml
- docbook-to-man $< > $@
- -->
-
- <!-- Fill in your name for FIRSTNAME and SURNAME. -->
<!ENTITY dhfirstname "<firstname>Ola</firstname>">
<!ENTITY dhsurname "<surname>Lundqvist</surname>">
- <!-- Please adjust the date whenever revising the manpage. -->
<!ENTITY dhdate "<date>mars 5, 2002</date>">
- <!-- SECTION should be 1-8, maybe w/ subsection other parameters are
- allowed: see man(7), man(1). -->
<!ENTITY dhsection "<manvolnum>8</manvolnum>">
<!ENTITY dhemail "<email>opal(a)debian.org</email>">
<!ENTITY dhusername "Ola Lundqvist">
--------------------------
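For files with many such comments, the workaround could also be scripted.
The following is only a sketch under assumptions: strip_prolog_comments is
a hypothetical helper (not part of po4a), and it assumes that the prolog is
the part between "<!doctype ... [" and the first "]>" and that comments
use the usual <!-- ... --> form.

```perl
use strict;
use warnings;

# Hypothetical helper: remove <!-- ... --> comments from the prolog only,
# leaving comments in the body of the document untouched.
sub strip_prolog_comments {
    my ($sgml) = @_;
    $sgml =~ s{(<!doctype\b[^\[>]*\[)(.*?)(\]>)}{
        my ($open, $prolog, $close) = ($1, $2, $3);
        $prolog =~ s/<!--.*?-->//gs;
        $open . $prolog . $close;
    }sei;
    return $sgml;
}

# Sample input, shortened from the kind of prolog shown in the diff.
my $sample = <<'EOF';
<!doctype refentry PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [
  <!-- Fill in your name for FIRSTNAME and SURNAME. -->
  <!ENTITY dhsection "<manvolnum>8</manvolnum>">
]>
<refentry><!-- body comment --></refentry>
EOF
print strip_prolog_comments($sample);
```

The entity declarations are left as they are; only the comments between
them disappear.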
Thanks for your interest in po4a.
Bye, Mt.
[Po4a-devel]Some patches for the Man module
by Nicolas François
Hello,
I should have released these patches some time ago, but Alioth was having
trouble. Sorry if this mail is a little long.
Some are still not "ready for production", and are provided to let you know
I'm working on those subjects (and also to gather some ideas).
I also had to work on the testsuite (the check script) and added a
stats.sh script for regression tests. Here is how this script
formats regression test statistics:
$ ./stats.sh orig work
         IGN    OK   OK2  WOK1  WOK2  WOK3   PBS WDIFF
IGN     1698     0     0     0     0     0     0     0
OK         0   125     0     0     0     0     0     0
OK2        0     0  1564     4     2     0     0     0
WOK1       0     8     0    89     0     1     0     0
WOK2       0     0     0     0   208     0     0     1
WOK3       0     2    11     2     4   301     3     0
PBS        0    35   150    15    48    43   585    14
WDIFF      0     1     9     4     5     8     0    39
total: 4979 | 4979
(It takes two directories as arguments: the two directories containing the
results of the check script, i.e. the LISTE files. It creates a stats_work
directory.)
You can read this table like this:
11 man pages which were in the WOK3 category in orig are now in OK2.
Those pages can be found in stats_work/WOK3_OK2
The different categories are:
  IGN    man pages po4a refused to operate on (e.g. the page was generated
         by Pod::Man)
  OK     diff -uBb didn't see any difference.
         This can contain very rare misformatting.
  OK2    diff -uBb didn't see any difference after converting hyphens to
         minus signs, `` to ", and '' to " in both man pages.
         This contains a little more misformatting; for example, a man
         page referring to an empty argument ('') should not display just
         a single ".
  WOK1   wdiff doesn't see any difference after the same modifications.
  WOK2   This tries to detect changes in the hyphenation of words (but has
         more false negatives).
  WOK3   This removes minus signs, and thus detects more changes in
         hyphenation.
  PBS    po4a preferred to stop processing the man page (unsupported
         macro, ...).
  WDIFF  These are probably bugs in po4a or in the man page (I started
         reporting some of them in the BTS, which is another way of
         improving po4a statistics).
In the table above, it is usually an improvement to have big numbers in
the bottom-left corner (with the exception of the IGN column).
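For readers who want to build this kind of table from their own results,
here is a minimal sketch of the counting. It is not the stats.sh script
itself: the transition_table and print_table names, the hash layout and the
sample data are all assumptions made for illustration.

```perl
use strict;
use warnings;

my @categories = qw(IGN OK OK2 WOK1 WOK2 WOK3 PBS WDIFF);

# Count how many pages moved from each category in the first run (rows)
# to each category in the second run (columns).
sub transition_table {
    my ($orig, $work) = @_;    # hash refs: page name => category
    my %count;
    for my $page (keys %$orig) {
        next unless exists $work->{$page};
        $count{ $orig->{$page} }{ $work->{$page} }++;
    }
    return \%count;
}

# Print the counts with one row per original category.
sub print_table {
    my ($count) = @_;
    printf "%-6s" . ("%6s" x @categories) . "\n", "", @categories;
    for my $from (@categories) {
        my @row = map {
            my $n = $count->{$from}{$_};
            defined $n ? $n : 0
        } @categories;
        printf "%-6s" . ("%6d" x @categories) . "\n", $from, @row;
    }
}

# Made-up sample data, only to show the shape of the output.
my %orig = ('a.1' => 'WOK3', 'b.1' => 'PBS', 'c.1' => 'OK');
my %work = ('a.1' => 'OK2',  'b.1' => 'OK',  'c.1' => 'OK');
print_table(transition_table(\%orig, \%work));
```

A page that stays in its category lands on the diagonal; a page that moves
to a better category lands below it.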
Here are the patches for the Man module:
+ comments
It recognizes some (probably incorrect, but usual) comment lines.
Here are the results of the regression tests for this patch:
         IGN    OK   OK2  WOK1  WOK2  WOK3   PBS WDIFF
IGN     1698     0     0     0     0     0     0     0
OK         0   125     0     0     0     0     0     0
OK2        0     0  1570     0     0     0     0     0
WOK1       0     0     0    98     0     0     0     0
WOK2       0     0     0     0   209     0     0     0
WOK3       0     0     0     0     0   323     0     0
PBS        0     0     3     0     1     1   885     0
WDIFF      0     0     1     1     0     1     0    63
+ nested_fonts
It deals with the nested font issue.
I have an idea on how to simplify it a lot, but I think it could be
applied, because it is doing a good job.
The only remaining issue is with "un-terminated" fonts, as in:
Hello, my name is \fINicolas \fBFRANÇOIS
IMHO, in groff, there are no nested fonts (with some exceptions, like
SB, and some italic and bold faces, or by using exotic tmac).
\fIfoo\fBbar\fR is equivalent to \fIfoo\fR\fBbar\fR (with the
exception of the \fP).
Here are the results of the regression tests for this patch:
         IGN    OK   OK2  WOK1  WOK2  WOK3   PBS WDIFF
IGN     1698     0     0     0     0     0     0     0
OK         0   125     0     0     0     0     0     0
OK2        0     0  1562     0     0     0     8     0
WOK1       0     0     0    98     0     0     0     0
WOK2       0     0     0     0   209     0     0     0
WOK3       0     1     6     2     1   307     6     0
PBS        0     1    81     5    27    31   738     7
WDIFF      0     0     0     0     0     0     0    66
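The equivalence stated in the nested_fonts item can be illustrated with a
small sketch. This is not the patch itself: flatten_fonts is a hypothetical
helper that only knows \fB, \fI and \fR, and it deliberately ignores \fP
and un-terminated fonts.

```perl
use strict;
use warnings;

# Hypothetical helper: close the current font with \fR before switching
# to another non-regular font, so that no font change is "nested".
sub flatten_fonts {
    my ($text) = @_;
    my $current = 'R';
    my $out     = '';
    # Split on the font escapes, keeping them as separate tokens.
    for my $tok (split /(\\f[BIR])/, $text) {
        if ($tok =~ /^\\f([BIR])$/) {
            my $new = $1;
            $out .= '\\fR' if $current ne 'R' && $new ne 'R';
            $out .= $tok;
            $current = $new;
        }
        else {
            $out .= $tok;
        }
    }
    return $out;
}

print flatten_fonts('\fIfoo\fBbar\fR'), "\n";
```

Under those assumptions, the input \fIfoo\fBbar\fR is rewritten to
\fIfoo\fR\fBbar\fR.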
+ arg_next_line
It allows arguments to be provided on the next line for some macros
(.SH, .I, ..., .BR, ...)
It works fine, but would need some cleanup (lots of redundant
code).
It can be applied cleanly on CVS, but requires the 'nested_fonts' patch
to operate cleanly.
Here are the results of the regression tests for this patch (with the
previous patch also applied):
         IGN    OK   OK2  WOK1  WOK2  WOK3   PBS WDIFF
IGN     1698     0     0     0     0     0     0     0
OK         0   125     0     0     0     0     0     0
OK2        0     0  1567     1     0     0     0     2
WOK1       0     0     0    97     0     1     0     0
WOK2       0     0     0     0   209     0     0     0
WOK3       0     2    11     2     4   301     3     0
PBS        0     1   109     9    34    32   692    13
WDIFF      0     0     0     0     0     0     0    66
(here most of the new 'WDIFF' man pages are due to bugs in the man pages
themselves, and bugs were filed in the BTS)
+ dot_lines
po4a generated some lines starting with a dot. In those cases, a \&
should be added to allow the line to be displayed. For example:
.I ../file
is displayed in groff, but
\fI../file\fR won't be displayed.
It also fixes the same issue for lines starting with a "'".
         IGN    OK   OK2  WOK1  WOK2  WOK3   PBS WDIFF
IGN     1698     0     0     0     0     0     0     0
OK         0   125     0     0     0     0     0     0
OK2        0     0  1570     0     0     0     0     0
WOK1       0     8     0    90     0     0     0     0
WOK2       0     0     0     0   209     0     0     0
WOK3       0     0     0     0     0   323     0     0
PBS        0     0     0     0     0     0   890     0
WDIFF      0     1     5     0     0     0     0    60
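The protection described in the dot_lines item could look roughly like
this. It is a sketch, not the patch: protect_text_lines is a hypothetical
helper, it assumes it is only given text lines (never real macro lines),
and placing the \& at the very start of the line is my own choice.

```perl
use strict;
use warnings;

# Hypothetical helper: prefix \& to every text line that starts with a
# dot or a single quote, possibly preceded by font escapes such as \fI.
sub protect_text_lines {
    my ($text) = @_;
    $text =~ s/^(?=(?:\\f[A-Z])*[.'])/\\&/mg;
    return $text;
}

print protect_text_lines('\fI../file\fR'), "\n";
print protect_text_lines("'quoted' word"), "\n";
print protect_text_lines('nothing to protect'), "\n";
```

Lines that do not start with one of the two control characters are left
untouched.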
+ hyphen
I felt obliged to fix this because I had told Martin that replacing
hyphens with minus signs was always allowed.
In fact, hyphens should not be modified in
- .so/.mso arguments
- after a \s (font size modifier, e.g. \s-2)
I also added a comment on why I *hate* hyphens.
Here are the results of the regression tests for this patch:
         IGN    OK   OK2  WOK1  WOK2  WOK3   PBS WDIFF
IGN     1698     0     0     0     0     0     0     0
OK         0   125     0     0     0     0     0     0
OK2        0     0  1570     0     0     0     0     0
WOK1       0     0     0    98     0     0     0     0
WOK2       0     0     0     0   209     0     0     0
WOK3       0     0     0     0     0   323     0     0
PBS        0     0     0     0     0     0   890     0
WDIFF      0     0     2     1     3     5     0    55
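The two exceptions listed in the hyphen item can be sketched as follows.
This is an illustration only, assuming that "minus sign" means the roff
escape \- and that lines are processed one at a time; hyphens_to_minus is
a hypothetical name, not a function of the Man module.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "-" into "\-", except on .so/.mso lines and
# directly after \s (font size changes such as \s-2).
sub hyphens_to_minus {
    my ($line) = @_;
    return $line if $line =~ /^\.m?so\b/;    # file names stay untouched
    # Skip hyphens preceded by \s, and hyphens that are already escaped.
    $line =~ s/(?<!\\s)(?<!\\)-/\\-/g;
    return $line;
}

print hyphens_to_minus('\s-2small\s+2 a-b'), "\n";
print hyphens_to_minus('.so man1/foo-bar.1'), "\n";
```

The first line keeps \s-2 intact while a-b becomes a\-b; the second line
is returned unchanged.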
+ new_macros
Some new macros:
.R
.EX and .EE
.so and .mso
.cs
minimal support (when no argument is given) for:
.ce
.ul
.cu
Here are the results of the regression tests for this patch:
         IGN    OK   OK2  WOK1  WOK2  WOK3   PBS WDIFF
IGN     1698     0     0     0     0     0     0     0
OK         0   125     0     0     0     0     0     0
OK2        0     0  1570     0     0     0     0     0
WOK1       0     0     0    98     0     0     0     0
WOK2       0     0     0     0   209     0     0     0
WOK3       0     0     0     0     0   323     0     0
PBS        0    27    11     0     0     1   843     8
WDIFF      0     0     0     0     0     0     0    66
+ escape
It tries to deal with the \c escape.
It still needs some work.
+ others
some other minor points that I could isolate from my working directory
+ split_args
This fixes an issue for the limits.conf man page.
It was also reported in #268904
It adds one string for the translation.
+ all
all the above patches, and more.
It also contains some comments that should be removed.
The results are presented in the first table.
Comments (and commits;) are welcome.
Thanks to those who read this mail to the end,
--
Nekral
[Po4a-devel]Line counting
by Yves Rutschle
Hello all,
Looks like Martin is busy, so I'll post here the patch to
fix bug 278428, which I reported on Debian's BTS:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=278428
Here's the (big) patch:
diff -c -r1.28 Po.pm
*** lib/Locale/Po4a/Po.pm 25 Aug 2004 00:42:50 -0000 1.28
--- lib/Locale/Po4a/Po.pm 5 Nov 2004 15:33:34 -0000
***************
*** 195,200 ****
--- 195,201 ----
$linenum,$line)."\n";
}
}
+ $linenum++;
$msgstr=$buffer;
$msgid = unquote_text($msgid) if (defined($msgid));
$msgstr = unquote_text($msgstr) if (defined($msgstr));
[Po4a-devel]Re: #273736: Italian man pages are UTF-8 encoded
by Martin Quinson
package po4a
tag 273736 fixed-upstream
thanks
Hello,
This bug is now fixed-upstream. The -L option was implemented to allow
setting the encoding of the localized file. This option is also used when
generating the man pages.
Thanks to all who helped for this.
Mt.
PS: if you answer this mail, don't forget to remove control@bugs from the
recipients.