Hello,
I am starting this thread to move the discussion away from
https://github.com/mquinson/po4a/issues/239 (now closed).
First some context from the logs, then my latest answer below.
----------[ Me: ]----------
I agree that we cannot (and should not) try to enforce any kind of
conformance to whatever. If the input is broken and the translated doc
is equally broken (as much as the input, preferably not more than the
input), that's perfectly fine. po4a is not an asciidoc linter. That's
not our purpose.
What I mean is that po4a should use defensive programming and be
cautious about its input. The difficulty is that po4a is a technical
attempt to solve a social issue, i.e. the collaboration between doc
writers, translators, and end users who don't speak English.
If po4a is too picky about the input format, doc writers will
disregard po4a's error messages, which is never good. The same happens
if the error messages are not specific enough, and that's why I like
the new compat option very much. It gives us the ability to produce
highly specialized feedback.
But if it produces invalid translated files, the whole l10n process
will be disabled from the build chain. So there is another source of
concern: po4a should try to prevent the translators from introducing
formatting errors. This is why po4a introduces the markdown-text tag,
so that we can benefit from the weblate checks that are specific to
that language (patches to integrate them [partially] into po4a
directly would be gladly accepted). For the groff plugin (man pages),
similar checks are integrated into po4a directly.
I don't think we have any check of the translation in asciidoc. #242
is somewhat related to this problem, but not completely.
----------[ Jean-Noël: ]----------
That is another story and I fully agree with you, from experience.
Even when segmentation has been done to free translators from
block-level markup errors, we still face a risk that inline markup,
such as styles and anchors, gets broken by translators. To be able to
interact with professional translators, we had to add some rules in
their translation framework in order to correctly tag, preserve and
check inline formatting. IMAO, this is clearly a missing feature of
PO-oriented tools: they do not provide such customizations the way they
provide checks for C-style placeholders and so on.
Maybe po4a could provide some sanity checks on this, such as checking
that links are correctly preserved. Right now, I can perform the check
after generating an HTML file (using
https://github.com/gjtorikian/html-proofer), but this is still a late
and loose check.
----------[ (end of logs) ]----------
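To make the link check suggested above a bit more concrete, here is a
rough sketch in Python. It is not po4a code: the function name and the
URL pattern are made up for illustration, and it only handles bare
http(s) URLs, not every asciidoc link form.

```python
import re
from collections import Counter

# Hypothetical pattern: bare http(s) URLs, stopping at whitespace and at
# common delimiters such as the "[" that opens an asciidoc link text.
URL_RE = re.compile(r"https?://[^\s\[\]<>()\"']+")


def _urls(text):
    # Drop trailing sentence punctuation so "…/doc." and "…/doc" compare equal.
    return Counter(u.rstrip(".,;:!?") for u in URL_RE.findall(text))


def missing_links(msgid, msgstr):
    """Return the URLs of msgid that are missing (or rarer) in msgstr."""
    if not msgstr:
        # Untranslated entry: nothing to check yet.
        return []
    return sorted((_urls(msgid) - _urls(msgstr)).elements())
```

Called on each (msgid, msgstr) pair of a PO file, a non-empty result
would be worth a warning before any HTML is generated.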
This kind of consideration has to be implemented in each and every
po4a module. For example, the XML module implements placeholders, which
are replaced by a specific tag in the msgid where they were found, and
translated separately. Docbook XML uses them quite heavily. But I
don't think they are tested in our test suite :(
Even if this feature were implemented in the asciidoc module, we
would still have to check that the translator did not break the
placeholder :(
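
For what it's worth, such a check could look roughly like the Python
sketch below. This is only an illustration, not existing po4a code: the
function name is hypothetical, and the placeholder shape
(<placeholder type="..." id="N"/>) is my assumption of what ends up in
the msgid, so the regex would need adjusting to the real tag.

```python
import re

# Assumed placeholder shape, e.g. <placeholder type="footnote" id="0"/>.
PLACEHOLDER_RE = re.compile(r'<placeholder\s+type="([^"]*)"\s+id="(\d+)"\s*/>')


def check_placeholders(msgid, msgstr):
    """Compare the placeholders of a msgid with those of its translation.

    Returns (missing, unexpected): two sorted lists of (type, id) pairs.
    """
    if not msgstr:
        # Untranslated entry: nothing to check yet.
        return [], []
    src = set(PLACEHOLDER_RE.findall(msgid))
    dst = set(PLACEHOLDER_RE.findall(msgstr))
    return sorted(src - dst), sorted(dst - src)
```

A placeholder that the translator mistyped no longer matches the
pattern, so it simply shows up in the "missing" list.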
Mt
--
You can't solve social problems with software. -- Marcus Ranum's Law