Hello,
On Mon, 24 May 2004, Martin Quinson wrote:
[...]
On Fri, May 07, 2004 at 03:43:38PM +0200, Jordi Vilalta wrote:
> > po4a skips the generation of msgid containing an entity only (or tags only).
> > It will now issue a warning when such optimizations are done. Thanks for the
> > report. [At least this is what I planned, but the msgids containing spaces
> > along with entities were not detected. This is also fixed]
>
> Now it seems to skip this kind of msgids (the version I tried some days
> ago didn't), but it has an irregular behavior. I've done the following
> (meaningless) test:
When I redo the test, I get something corresponding to what I expect:
====[/tmp/a]====
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [
<!ENTITY chap SYSTEM "chapter1.xml">
<!ENTITY chap2 SYSTEM "chapter2.xml">
<!ENTITY aaa "contens of aaa">
<!ENTITY bbb "contens of bbb">
<!ENTITY ccc "contens of ccc">
]>
<book>
&chap0;
&chap;
&chap2;
&aaa;
&chap3;
&bbb;
&chap;
&ccc;
&aaa;
</book>
====[/tmp/chapter1.xml]====
[content of chapt1]
====[/tmp/chapter2.xml]====
[content of chapt2]
====[generated po file]====
# SOME DESCRIPTIVE TITLE
# Copyright (C) YEAR Free Software Foundation, Inc.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"POT-Creation-Date: 2004-05-24 14:10-0700\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: ENCODING"
# type: definition of entity &aaa;
#, no-wrap
msgid "contens of aaa"
msgstr ""
# type: definition of entity &bbb;
#, no-wrap
msgid "contens of bbb"
msgstr ""
# type: definition of entity &ccc;
#, no-wrap
msgid "contens of ccc"
msgstr ""
# type: <book></book>
msgid ""
"&chap0; [content of chapt1] [content of chapt2] &aaa; &chap3; &bbb; [content "
"of chapt1] &ccc; &aaa;"
msgstr ""
====[end of files]====
The type line looks OK to me, and there is no reference line for the entity
definitions. That way, it is not broken ;)
Well, the problem here was with the chapter?.xml files. With your files I
get the same result as you, but when changing their content to:
<chapter><title>ch.1</title>
<para>content 1</para>
</chapter>
I get this (mad) output po file:
...
# type: <title></title>
#: a.xml:12 chapter2.xml:1
msgid "ch.1"
msgstr ""
# type: <para></para>
#: a.xml:12 chapter2.xml:1
msgid "content 1"
msgstr ""
# type: <title></title>
#: chapter1.xml:1
msgid "ch.2"
msgstr ""
# type: <para></para>
#: chapter1.xml:1
msgid "content 2"
msgstr ""
# type: </chapter><chapter>
#: chapter2.xml:1
msgid "&aaa; &chap3; &bbb;"
msgstr ""
# type: </chapter></book>
msgid "&ccc; &aaa;"
msgstr ""
It seems that the content of an included file is inserted into the main file
and parsed there, which causes this behavior (and the wrong type lines).
Also, I don't like the substitution of the content here:
"&chap0; [content of chapt1] [content of chapt2] &aaa; &chap3; &bbb; [content "
"of chapt1] &ccc; &aaa;"
As you see, the content of chapter1 appears twice (must be translated
twice). Instead of this, I think that inclusion entities should be treated
like the substitution entities (the content is translated once, and their
appearances should be left as they are): &aaa; appears twice in this
msgid, and its content is only translated once.
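A minimal sketch of that proposal in Python, for illustration only. It is not
po4a's actual code: `extract_msgids` and `ENTITY_REF` are hypothetical names,
and the entity contents are taken from the test files above. Each known
entity's content is emitted once as its own msgid, and the references in the
body are left untouched.

```python
import re

# Hypothetical helper, not part of po4a: matches references such as &aaa;
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.-]*);")

def extract_msgids(entities, body):
    """Return msgids so that each entity's content is translated only once.

    entities: mapping of entity name -> content (inclusion or substitution)
    body: text of the main document, with entity references left unexpanded
    """
    msgids = []
    seen = set()
    # Emit the content of each referenced, known entity exactly once,
    # in order of first appearance.
    for name in ENTITY_REF.findall(body):
        if name in entities and name not in seen:
            seen.add(name)
            msgids.append(entities[name])
    # The body keeps its references as they are, so the same content is
    # never translated twice; whitespace is collapsed to single spaces.
    msgids.append(" ".join(body.split()))
    return msgids

entities = {
    "chap": "[content of chapt1]",
    "chap2": "[content of chapt2]",
    "aaa": "contens of aaa",
}
body = "&chap0; &chap; &chap2; &aaa; &chap; &aaa;"
for msgid in extract_msgids(entities, body):
    print(msgid)
```

With this approach the unknown &chap0; simply stays in the body msgid, and
&chap; can appear any number of times without duplicating its content.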
I've also tried to complicate it a little more: I put some tags into a
substitution entity (as I do in real documents), and then the entity
disappears from the generated po.
> Looking at the contents of the msgids, it seems that it skips only the
> inclusion entities that it knows, and gives the "substitution" entities
> up:
No, we substitute only inclusion entities, and never the substitution ones.
This is exactly what I wanted, since expanding them would force the
translator to update his work each time the &version; entity is updated,
which is exactly contrary to the philosophy of this mechanism.
> I think there are 2 alternative ways to treat these cases better:
> 1) Exclude all entities-only messages (any number, known or unknown)
> 2) Include the whole messages that have more than 1 entity (known or
> unknown), because in some languages it may be interesting to change
> the order of some of them.
As reflected by the source code, the second option is the one selected,
for the very reason you give ;)
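To make that rule concrete, here is a rough Python sketch. It is an
illustration only, not the code of the po4a module: `is_skipped` and
`SINGLE_ENTITY` are made-up names, and it assumes that the only messages
skipped are those holding one entity reference plus optional whitespace.
Anything with several entities, or with other text, is kept whole.

```python
import re

# One entity reference such as &aaa; surrounded by optional whitespace.
SINGLE_ENTITY = re.compile(r"\s*&[A-Za-z_][\w.-]*;\s*")

def is_skipped(msgid):
    """Return True when the message holds a single entity and nothing else."""
    return SINGLE_ENTITY.fullmatch(msgid) is not None

print(is_skipped(" &chap; "))             # single entity: skipped
print(is_skipped("&aaa; &chap3; &bbb;"))  # several entities: kept whole
print(is_skipped("see &aaa;"))            # entity plus text: kept
```

Keeping the multi-entity case whole is what lets a translator reorder the
entities when the target language needs it.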
> hmmm, now I was thinking about the standard entities that define special
> characters, such as ´, and I've seen that they're also excluded if there's
> something like <title>&Acute;</title>. Seeing this, I prefer not to
> exclude any entities. In some cases it can be a little annoying for the
> translators, but otherwise there could be some untranslatable strings.
hmm. This example looks a bit artificial, doesn't it? Anyway, I added an
'include-all' option to the module to disable those optimisations.
Passing options to modules is one of the novelties introduced in the CVS
version. For example, it would be:
po4a-gettextize -t sgml -o include-all -m bla.sgml -p bla.pot
Interesting :)
[...]
Regards,
Jordi Vilalta