Hi,
On Fri, 16 Jul 2004, Martin Quinson wrote:
...
The point is that the current design of the Sgml module makes it very
difficult to simply ignore some parts of the document because I use an
external parser. I can change them into text before launching the parser, and
then turn them back to their values. That's the trick I use to protect the
entities, for example. &version; is rewritten to PO4A-ampversion; before
launching nsgml, and then back while generating the document/po files. But if
I do so, I'm afraid that nsgml begins whining about CDATA being placed where
it shouldn't.
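To make the entity trick concrete, here is a minimal sketch of the round trip in Python (po4a itself is Perl, so this is an illustration, not the actual Sgml module code). The function names and the exact entity-name pattern are assumptions; only the PO4A-amp placeholder comes from the example above.

```python
import re

# Placeholder prefix taken from the example in the message; everything
# else here (names, regexes) is a hypothetical illustration.
PREFIX = "PO4A-amp"

# Named entity references such as &version; (numeric ones are left alone).
_ENTITY = re.compile(r"&([A-Za-z_][\w.-]*);")
_MASKED = re.compile(re.escape(PREFIX) + r"([A-Za-z_][\w.-]*);")


def mask_entities(text):
    """Rewrite &name; to PO4A-ampname; so the external parser sees plain text."""
    return _ENTITY.sub(lambda m: PREFIX + m.group(1) + ";", text)


def unmask_entities(text):
    """Turn PO4A-ampname; back into &name; once parsing is done."""
    return _MASKED.sub(lambda m: "&" + m.group(1) + ";", text)
```

Calling mask_entities() before handing the text to the parser and unmask_entities() on the extracted strings should give back the original entity references.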
I am near the point where I decide that nsgml creates more problems than it
solves. Writing a ?ML parser in Perl from scratch shouldn't be that
difficult after all.
My current implementation of the XML parser is quite generic and
customizable (I think). Maybe it could replace the current sgml one (once
it matures).
The only reason why I don't go further and write my own parser is the
complexity of the code. As I said recently, I don't even remember the
differences between translate and indent (I guess that's a good argument for
reengineering the code ;). And there are some weird parts to support sgml
specificities such as conditional compilation (ok, that would also be easier
without nsgml). There is the file inclusion mechanism, too (but it should be
generalized and put in the core of po4a so that other modules can use it).
I think that file inclusion and file encodings are the two main gaps in
the po4a core. The rest is only format modules, and these can always be
extended.
Ok, the real reason is my chronic lack of time, and the fear of introducing
new bugs which I would have to fix ASAP since some people are beginning to use
po4a...
It's open source, there are lots of hands out there that can help ;)
I'm gonna try masking the <?bla?> as well as CDATA from nsgml, to see if it
does the trick.
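A rough sketch of what such masking could look like, in Python for illustration only (po4a is Perl, and this is not the module's code). The PO4A-maskN; token format and the function names are assumptions: each <?...?> or CDATA section is stashed in a list and replaced by a numbered token, then restored after parsing. Whether the parser accepts the token where the section used to be is exactly the open question.

```python
import re

# Processing instructions (<?...?>) and CDATA sections, possibly multi-line.
_SECTION = re.compile(r"<\?.*?\?>|<!\[CDATA\[.*?\]\]>", re.DOTALL)
# Hypothetical token format used to stand in for a hidden section.
_TOKEN = re.compile(r"PO4A-mask(\d+);")


def mask_sections(text):
    """Replace each matched section with a numbered token; return (text, stash)."""
    stash = []

    def _stash(match):
        stash.append(match.group(0))
        return "PO4A-mask%d;" % (len(stash) - 1)

    return _SECTION.sub(_stash, text), stash


def unmask_sections(text, stash):
    """Put the stashed sections back in place of their tokens."""
    return _TOKEN.sub(lambda m: stash[int(m.group(1))], text)
```

The stash has to be kept alongside the document between the two calls, since the tokens only carry an index into it.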
Regards,
Jordi Vilalta