On Fri, Nov 26, 2004 at 06:29:24AM +0100, Martin Quinson wrote:
For [the rare] documents not defining any new macros and sticking to
unadulterated LaTeX, it should be rather easy to build a first prototype
simply splitting on limits between TeX's vertical and horizontal modes.
I think this is working for simple documents:
paragraphs are separated.
If a line starts with a known command, the corresponding macro handler is
called and the current paragraph is flushed. Otherwise, the line continues
the current paragraph (or starts a new one).
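
For readers following along, the line-based rule described above could be
sketched roughly as follows. This is only an illustration in Python, not the
code of the module under discussion: the KNOWN_COMMANDS set and the
split_paragraphs name are assumptions made up for the example.

```python
import re

# Hypothetical set of commands that interrupt a paragraph (example only).
KNOWN_COMMANDS = {"section", "subsection", "subsubsection"}

# A command name at the very start of a (stripped) line.
COMMAND_AT_START = re.compile(r"\\([A-Za-z]+)")


def split_paragraphs(lines):
    """Return a list of ("para", text) and ("command", line) chunks."""
    chunks = []
    current = []

    def flush():
        # Emit the pending paragraph, if any, and start a new one.
        if current:
            chunks.append(("para", " ".join(current)))
            current.clear()

    for raw in lines:
        line = raw.strip()
        if not line:
            flush()  # an empty line ends the paragraph
            continue
        m = COMMAND_AT_START.match(line)
        if m and m.group(1) in KNOWN_COMMANDS:
            flush()  # a known command flushes the paragraph first
            chunks.append(("command", line))
        else:
            current.append(line)  # continue (or start) a paragraph
    flush()
    return chunks
```

Lines starting with a command that is not in the set (for instance \textit)
simply stay inside the paragraph in this sketch.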
- As usual (hello Yves), you need to distinguish between inline tags (oops,
macros), which you ignore (such as \textit or \footnotesize or $bla$), and
formatting ones, for which you translate the argument (such as \section,
\subsubsection or $$bla$$).
My implementation may have an issue (but I don't know if it is valid
LaTeX) with:
Hello there
\section{foo}
End of a paragraph
- Translate separately the content of each environment.
I don't know what an environment is.
Does it mean that if a command takes two arguments, they have to be
translated separately?
I think it is up to the command subroutine.
- Some macros need a more complex handling, I'm sure.
Sure!
At the moment, I handle \begin{foo}, which calls the foo subroutine.
- Translate separately each item (of an itemize and the like).
Done in some places. Probably command-dependent.
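
To make the "\begin{foo} calls the foo subroutine" idea concrete, here is a
small dispatch sketch in Python. It is not the actual implementation: the
ENV_HANDLERS registry, the handler names and the environment names are all
assumptions chosen for illustration.

```python
import re

# Matches a \begin{name} line and captures the environment name.
BEGIN_ENV = re.compile(r"\\begin\{([A-Za-z]+\*?)\}")


def translate_whole(body):
    """Default handler: treat the whole body as one translatable unit."""
    return [body.strip()]


def translate_items(body):
    """Handler for list-like environments: one unit per \\item.

    Naive on purpose: it would also split on longer command names that
    merely start with \\item.
    """
    return [part.strip() for part in body.split("\\item") if part.strip()]


# Hypothetical registry mapping environment names to handlers.
ENV_HANDLERS = {"itemize": translate_items, "enumerate": translate_items}


def dispatch_environment(begin_line, body):
    """Pick the handler named by the \\begin line and apply it to body."""
    m = BEGIN_ENV.match(begin_line.strip())
    if m is None:
        raise ValueError("not a \\begin line")
    handler = ENV_HANDLERS.get(m.group(1), translate_whole)
    return handler(body)
```

Whether items are split is then decided by the handler of each environment,
which matches the "command-dependent" remark above.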
- Naturally translate separately each paragraph separated by empty
lines.
Done
- Ignore stuff like \medskip, since they are formatting only.
Hint: it's used in vertical mode. (if there is some \newpage, I guess
you're dead)
I don't know what \medskip is.
This is still to do, as is \noindent on the line preceding a paragraph.
And so on and so forth. I believe in this approach for simple documents.
There are two main jobs here:
- write a proper parser, which can detect macros, separate their arguments,
etc. This may be the most difficult part. TeX is full of \ and { all
around the place. You'll have to protect them, and to come up with a
usable way to determine the } corresponding to a given { (so that the
inbetween can be treated as a macro argument).
Still needs to be done.
Classical constructions (\item) should be dealt with in there. All the
rest should be passed to macro handlers, just as in the man module.
- read a LaTeX definition and write the right handlers for the right macros.
There will be a bunch of duplicated work if you don't do as in the man
module (or come up with a better idea, of course).
The code for the current commands will need to be moved into separate
subroutines.
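
As an illustration of the parser job mentioned above (finding the } that
corresponds to a given {, so that what lies in between can be treated as a
macro argument), here is a minimal Python sketch. The function names are
made up for the example, and it deliberately ignores % comments and verbatim
material, which a real parser would also have to protect.

```python
def find_matching_brace(text, open_pos):
    """Return the index of the '}' matching the '{' at open_pos.

    Backslash-escaped characters (including escaped braces) are skipped.
    Comments and verbatim material are NOT handled in this sketch.
    """
    if text[open_pos] != "{":
        raise ValueError("open_pos must point at '{'")
    depth = 0
    i = open_pos
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character after it
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    raise ValueError("unbalanced braces")


def macro_arguments(text, pos):
    """Collect consecutive {...} arguments starting at pos.

    Returns the list of argument bodies and the position just after the
    last argument.
    """
    args = []
    while pos < len(text) and text[pos] == "{":
        end = find_matching_brace(text, pos)
        args.append(text[pos + 1:end])
        pos = end + 1
    return args, pos
```

For example, given the text `\foo{a {nested} \} b}{second} tail` and the
position of the first `{`, macro_arguments returns the two argument bodies
`a {nested} \} b` and `second`, leaving ` tail` unparsed.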
Once this is done, you'll be able to deal with documents with no
\newcommand. For new definitions, I guess that the only viable idea is to
go for specifically formatted comments in the document (lines beginning with
'%po4a:' ?) to explain which category each macro belongs to. You may even
This seems reasonable.
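
Purely to illustrate the '%po4a:' comment idea: the thread does not define
any syntax for these lines, so the "%po4a: <category> \<macro>" form, the
category names and the read_macro_categories function below are all
assumptions, not an existing format.

```python
import re

# Assumed syntax (not defined in this thread): "%po4a: <category> \<macro>"
DIRECTIVE = re.compile(r"^%po4a:\s*(\w+)\s+\\?([A-Za-z]+)\s*$")


def read_macro_categories(lines):
    """Map each declared macro name to the category given in its comment."""
    categories = {}
    for line in lines:
        m = DIRECTIVE.match(line.strip())
        if m:
            category, macro = m.groups()
            categories[macro] = category
    return categories
```

Ordinary comments and normal LaTeX lines are ignored, so the declarations
can sit next to the \newcommand they describe.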
Do you think footnotes should be treated separately?
In that case, how should their location be indicated in the PO file?
I'm having the same issue with \index entries, which I separate from the
paragraph only when they occur at its beginning or end.
Regards,
--
Nekral