When trying to convert org.texi to PO I get the following error:
Complex regular subexpression recursion limit (32766) exceeded at
/opt/local/lib/perl5/5.26/Locale/Po4a/TeX.pm line 697.
That happens between lines 11180 and 11200 of the file, and nothing unusual in that
part of the file seems to trigger it.
Searching the web for the error, I found that it may be related to the * and + repeat
operators in a regex. From the BUGS section of
https://metacpan.org/pod/XML::Easy::Syntax:
Many of these regular expressions are liable to tickle a serious bug in perl's regexp
engine. The bug is that the * and + repeat operators don't always match an unlimited
number of repeats: in some cases they are limited to 32767 iterations. Whether this bogus
limit applies depends on the complexity of the expression being repeated, whether the
string being examined is internally encoded in UTF-8, and the version of perl. In some
cases, but not all, a false match failure is preceded by a warning "Complex regular
subexpression recursion limit (32766) exceeded".
This bug is present, in various forms, in all perl versions up to at least 5.8.9 and
5.10.0. Pre-5.10 perls may also overflow their stack space, in similar circumstances, if a
resource limit is imposed.
There is no known feasible workaround for this perl bug. The regular expressions supplied
by this module will therefore, unavoidably, fail to accept some lengthy valid inputs.
Where this occurs, though, it is likely that other regular expressions being applied to
the same or related input will also suffer the same problem. It is pervasive. Do not rely
on this module (or perl) to process long inputs on affected perl versions.
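To check whether a given perl is affected, something like the following probe might help. It is only a sketch: lazy_group_matches is a hypothetical helper, not part of po4a, the pattern is a cut-down version of the one in TeX.pm (without the leading (?:.*?\n)? and with a hard-coded backslash instead of ${RE_ESCAPE}), and the repeat counts are arbitrary. The limit differs between perl versions, so the larger count may need adjusting, and I am not stating what it prints on any particular perl.

```perl
use strict;
use warnings;

# Hypothetical probe, not part of po4a: does a lazily repeated group
# still match when it has to iterate $n times before \begin{ appears?
sub lazy_group_matches {
    my ($n) = @_;
    my $buffer = ('a' x $n) . "\\begin{x}\n";
    return $buffer =~ /^((?:[^%]|(?<!\\)(?:\\\\)*\\%)*?)
                        (\\(?:begin|end)\{.*)$/sx ? 1 : 0;
}

# Arbitrary sizes: one well below the reported limit, one above it.
for my $n (1_000, 40_000) {
    printf "%6d repeats: %s\n", $n,
        lazy_group_matches($n) ? "match" : "no match";
}
```

If the larger count reports "no match", possibly preceded by the warning quoted above, the perl in use has the limit at or below that count.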
Line 697 of TeX.pm is:
    # detect \begin and \end (if they are not commented)
→   if ($buffer =~ /^((?:.*?\n)?                # $1 is
                      (?:[^%]                   # either not a %
                        |                       #   or
                        (?<!\\)(?:\\\\)*\\%)*?  # a % preceded by an odd nb of \
                     )                          # $2 is a \begin{ with the end of the line
                      (${RE_ESCAPE}(?:begin|end)\{.*)$/sx
If this regex could be simplified, maybe the issue would go away...
org.texi is the only file in the Emacs distribution that chokes on this regex.
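One possible direction, as a sketch only: instead of one regex that repeats a complex group once per character over the whole buffer, walk the buffer line by line, cut each line at its first unescaped %, and search what remains with a simple pattern. first_uncommented_env and its $escape parameter are names I made up for illustration; this is not the po4a code, and I have not checked that it agrees with the original regex in every case (for instance, the original can let $1 span several lines).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of po4a: return the offset of the first
# <escape>begin{ or <escape>end{ in $buffer that is not inside a % comment,
# or -1 if there is none. $escape stands in for ${RE_ESCAPE} (assumption).
sub first_uncommented_env {
    my ($buffer, $escape) = @_;
    $escape //= '\\';
    my $offset = 0;
    for my $line (split /^/m, $buffer) {    # lines keep their newline
        # Code part of the line: runs of ordinary characters or
        # backslash-escaped characters, up to the first unescaped %.
        # The possessive ++ makes the outer loop iterate once per run,
        # not once per character.
        my ($code) = $line =~ /\A((?:[^%\\]++|\\.)*)/s;
        if ($code =~ /\Q$escape\E(?:begin|end)\{/) {
            return $offset + $-[0];
        }
        $offset += length $line;
    }
    return -1;
}

my $pos = first_uncommented_env("text\n% \\begin{a}\n\\end{b}\n");
print "first uncommented marker at offset $pos\n";
```

With a non-negative offset $pos, substr($buffer, 0, $pos) and substr($buffer, $pos) would roughly play the roles of $1 and $2 in the original test.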
--
Jean-Christophe Helary @brandelune
https://mac4translators.blogspot.com
https://sr.ht/~brandelune/omegat-as-a-book/