Hello,
I just discovered that the GNOME guys have come up with their own solution
to convert XML files to and from po files. Of course, they couldn't use the
KDE one, "poxml", nor either of the existing "translate" and
"po4a".
Erm. Enough grumbling.
It's called xml2po, and it's shipped in a sub-directory of the gnome-doc-utils
Debian package. It's Python-based (grumble grumble).
From what I can see, the entities are expanded before the content is
extracted to po files. This is bad because it will unnecessarily make the
strings fuzzy whenever the entity content changes. In po4a/sgml, entity
contents are translated separately, and the entity references are preserved
in the msgids. I'm not sure about po4a/xml.
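To illustrate the concern, here is a minimal Python sketch. It is not code from xml2po or po4a; the entity name, the two entity tables and the sample sentence are all made up, and the two helper functions only mimic the two strategies as I understand them.

```python
import re

# Hypothetical entity tables for two revisions of the same document.
ENTITIES_V1 = {"version": "1.0"}
ENTITIES_V2 = {"version": "1.1"}

# Hypothetical paragraph that would end up as a msgid.
SOURCE = "Welcome to Foo &version;, the friendly editor."

ENTITY_REF = re.compile(r"&([\w.-]+);")


def msgid_expanded(text, entities):
    """Strategy A: expand entities first, then use the result as the msgid."""
    return ENTITY_REF.sub(lambda m: entities[m.group(1)], text)


def msgids_preserved(text, entities):
    """Strategy B: keep entity references in the msgid and emit each
    entity value as a separate msgid."""
    return [text] + [entities[name] for name in ENTITY_REF.findall(text)]


# Strategy A: the paragraph msgid changes with the entity value,
# so its existing translation would be marked fuzzy.
old_expanded = msgid_expanded(SOURCE, ENTITIES_V1)
new_expanded = msgid_expanded(SOURCE, ENTITIES_V2)
print(old_expanded != new_expanded)

# Strategy B: the paragraph msgid is stable; only the small
# entity msgid changes.
old_preserved = msgids_preserved(SOURCE, ENTITIES_V1)
new_preserved = msgids_preserved(SOURCE, ENTITIES_V2)
print(old_preserved[0] == new_preserved[0])
```

With strategy B, only the short entity string needs to be looked at again by the translator when the entity changes, not every paragraph that uses it.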
The most interesting point is that they provide a heuristic for automatic
tag classification. Maybe it is derived from the DTD; I only glanced at the
code. But at the same time, they provide specific "modes" for the DTDs they
use (docbook and gnomesummary). If it works, it's exactly what I've been
dreaming of for years.
I think it could be interesting to:
- see whether we can reuse their automatic classification heuristic
- build a gnomesummary module from theirs (it looks really easy to do)
- contact the authors to see whether they would agree to merge our efforts.
  Since it's also Python-based, we may well get the same result as with
  the translate project (both parties willing to merge, neither willing to
  switch programming language, and no change as a result), but it's worth
  trying...
Your advice?
Bye, Mt.
PS: I dream of Perl6, which will make it possible to share code between
Perl, Python and even Java ;)