On Wed, Dec 15, 2004 at 11:27:22PM +0100, Nicolas François wrote:
Hello,
Here is the progress of the LaTeX module (attached; I plan to clean it and
commit it this week-end):
- derivation of the TeX module is possible
(a LaTeX module is attached. It only contains definitions of new
commands)
- the class file is read in order to parse "% po4a:" lines
(I only support "command1 alias command2" at this time, i.e.
specifying that a command should be handled the same way as another
one)
Only the class file? What if I define new macros in the file ? o:-)
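To make the "% po4a:" idea concrete, here is a minimal sketch in Python (not the actual module code, which I have not read). The regular expression and the name read_aliases are assumptions made for illustration. It works on any list of lines, so the same scan could be run over the class file or over the document itself.

```python
import re

# Hypothetical pattern for a "% po4a: command1 alias command2" line.
ALIAS_RE = re.compile(r"^\s*%\s*po4a:\s*(\S+)\s+alias\s+(\S+)\s*$")


def read_aliases(lines):
    """Return {command1: command2} for every alias directive found."""
    aliases = {}
    for line in lines:
        match = ALIAS_RE.match(line)
        if match:
            # command1 should be handled the same way as command2.
            aliases[match.group(1)] = match.group(2)
    return aliases
```

Lines that are not directives are simply ignored, so feeding it a whole .tex file is harmless.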
- file inclusion
I've been able to get po4a normalizing all the files of the book with
only this command:
po4a-normalize -f LaTeX data/bk2/bk2.tex
(only one chapter - ch05 - is ignored from the \include command,
because of a bug - see below)
I've done it in a quite different way than Sgml.pm because I wanted to
keep the line references, and I could not remove end of lines as
in Sgml.
In read, I'm calling parse_file. And in parse_file, I'm calling
parse_file again when an \include is encountered, so that files are
included at the right place. parse_file is roughly equivalent to
TransTractor's parse (apart from the \include handling).
If this is not the right way to include files, please stop me
before this week-end.
This *is* the right way to go. It is so clumsy in sgml because we do not
really do the parsing ourselves there. nsgml does the parsing and then runs
some callbacks of ours. This solution seemed simple at first glance when
I began working on Sgml.pm, but it brings a bunch of issues, such as this
one. Please keep avoiding Sgml.pm's braindead design while working on TeX.pm :)
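As an illustration of the recursive approach described above, here is a minimal Python sketch, not the module's real code. The read_text callback, the ".tex" suffix and resolving the included file next to the including one are all assumptions. Each line keeps its own "file:line" reference, and included content lands exactly where the \include was.

```python
import os
import re

# Assumed form of the inclusion command: \include{name}
INCLUDE_RE = re.compile(r"\\include\{([^}]+)\}")


def parse_file(path, read_text, out=None):
    """Collect (line, "file:lineno") pairs, descending into includes in place.

    read_text is a hypothetical callback returning the content of a file.
    """
    if out is None:
        out = []
    for number, line in enumerate(read_text(path).splitlines(), start=1):
        match = INCLUDE_RE.search(line)
        if match:
            # Assumption: the included file sits next to the including one.
            child = os.path.join(os.path.dirname(path), match.group(1) + ".tex")
            parse_file(child, read_text, out)
        else:
            out.append((line, f"{path}:{number}"))
    return out
```

The sketch has no protection against a file including itself; a real parser would want one.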
Identified bugs
- a command argument can contain an empty line (chapter 5); the parse
function should make sure that, after the separation into paragraphs, it
didn't break inside a command argument.
This is quite problematic. Did you find a nice way to locate the '}' matching
a given '{'? (Yeah, I have to confess I didn't find the time to read your
code.) It would solve such issues, wouldn't it?
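For what it's worth, here is a rough Python sketch of the brace matching idea, under simplifying assumptions: escaped braces (\{ and \}) are skipped, while comments and verbatim environments are ignored entirely. The function names are made up for illustration. The second helper rejoins paragraphs that were split inside a still-open '{'.

```python
def find_matching_brace(text, open_pos):
    """Return the index of the '}' closing the '{' at open_pos, or -1."""
    if text[open_pos] != "{":
        raise ValueError("open_pos must point at '{'")
    depth = 0
    i = open_pos
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    return -1


def brace_depth(text):
    """Return the number of '{' still open at the end of text."""
    depth = 0
    i = 0
    while i < len(text):
        if text[i] == "\\":
            i += 2
            continue
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
        i += 1
    return depth


def merge_broken_paragraphs(paragraphs):
    """Rejoin paragraphs that were split inside an unclosed command argument."""
    merged = []
    pending = ""
    for para in paragraphs:
        pending = para if not pending else pending + "\n\n" + para
        if brace_depth(pending) <= 0:
            merged.append(pending)
            pending = ""
    if pending:
        merged.append(pending)  # unbalanced input: keep what we have
    return merged
```

Run after the split into paragraphs, merge_broken_paragraphs would put the empty line back inside the argument instead of treating it as a paragraph boundary.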
- I'm not really happy with the way I'm dealing with spaces (or tabs or
newlines) between commands, or between commands and text. But it seems
necessary.
- I'm assuming the class file and included file will be found at the
right place
This is related to #300874 (Included files should be searched in the path of
the master)
https://alioth.debian.org/tracker/index.php?func=detail&aid=300874&am...
I should rewrite the sgml module a bit to get rid of this clumsy nsgml, and
then we could work on a generic inclusion mechanism solving such issues.
Jordi was interested in such a mechanism for Xml, too, but we failed to find
a nice interface last time we spoke about it.
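Just to fix ideas about what such a lookup could do (this is not a proposal for the interface), here is a small Python sketch. The name resolve_include and the search order, master's directory first and then the name as given, are assumptions.

```python
import os


def resolve_include(name, master_path, exists=os.path.exists):
    """Return the first existing candidate for an included file, or None.

    The exists parameter is only there so the lookup can be tried
    without touching the disk.
    """
    master_dir = os.path.dirname(os.path.abspath(master_path))
    # Assumed search order: next to the master file, then as given.
    candidates = [os.path.join(master_dir, name), name]
    for candidate in candidates:
        if exists(candidate):
            return candidate
    return None
```

With the master at data/bk2/bk2.tex, an included ch05.tex would be looked up in data/bk2/ first, whatever the current directory is.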
- some additional empty lines are added (this should not be an issue,
but I would like to understand where it happens)
You mean that it sometimes adds a second newline after the first one?
- There's a small difference in the table of contents (I need to analyze
this)
It may be fixed by recompiling the document once again. Welcome to LaTeX's
cumbersome world :)
- Plenty of others are lurking ;)
Missing features
- a category for commands that can be separated from a paragraph when
located at its beginning or its end (at this time all are separated)
- more "% po4a:" stuff
- the class file will need to be translated
- many others
The only regression test I tried after the po4a-normalization is to display
both PDFs superimposed and switch from one to the other. Eyes are
usually very sensitive to small changes (even spacing changes). At
this time I only detected one small difference (in the table of contents) up
to the 26th page.
Does diffing the PDFs reveal changes? What about the PS?
All your work is very impressive. I was already more than impressed when you
played with Man.pm, but this time, I'm thrilled. Thanks for your time and
sorry for not being able to help you more.
Mt.