Hi all,
It looks like my spare time has shrunk further, hence my
long silence. Martin's last comments, and my spending some
time reading Sgml.pm and Xml.pm, along with running into
harder files from my site, have me almost convinced that
Martin is right and Html.pm is going the wrong way.
Here is the patch to it that I currently use. This brings it
to a state in which it is useful for "simple" files (i.e.
files with simple paragraphs and little in-line formatting),
which I think is still useful for sites that contain a lot
of simple text (how-tos, for example, would be good
candidates if they used HTML as their primary format).
It doesn't change the fundamentals of how it works, so all of
Martin's objections still hold true. Rather, it fixes two of
the module's shortcomings:
* Text is now split along paragraph boundaries, instead of
random 512-byte-aligned boundaries,
* title and alt attribute contents now create msgids.
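
To illustrate the two points above, here is a rough sketch in
Python (not the Perl from the patch, and not how Html.pm is
actually written). The names MsgidExtractor, extract_msgids
and BLOCK_TAGS are made up for this example, and the set of
block-level tags is an assumption. In-line markup is simply
dropped here to keep it short.

```python
from html.parser import HTMLParser

# Hypothetical illustration only; this is not the Html.pm code.
# Assumed set of tags that start/end a translatable paragraph.
BLOCK_TAGS = {"p", "h1", "h2", "h3", "li", "td", "div"}


class MsgidExtractor(HTMLParser):
    """Collect one msgid per paragraph, plus title/alt attribute values."""

    def __init__(self):
        super().__init__()
        self.msgids = []
        self._buf = []

    def _flush(self):
        # Collapse whitespace and emit the buffered text as one msgid.
        text = " ".join("".join(self._buf).split())
        if text:
            self.msgids.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()
        # title and alt attribute contents become msgids of their own.
        for name, value in attrs:
            if name in ("title", "alt") and value:
                self.msgids.append(value)

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        self._buf.append(data)


def extract_msgids(html):
    parser = MsgidExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()  # emit any trailing text outside a block tag
    return parser.msgids
```

The point is only that the split follows the document
structure rather than a fixed read-buffer size, so a msgid
never ends in the middle of a sentence.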
I hope to get some time to play with Sgml.pm and Xml.pm soon
(because, let's face it, it's much more fun than actually
doing translations).
Y.