Complex regular subexpression recursion limit (32766) exceeded
by Jean-Christophe Helary
When trying to convert org.texi to PO, I get the following error:
Complex regular subexpression recursion limit (32766) exceeded at /opt/local/lib/perl5/5.26/Locale/Po4a/TeX.pm line 697.
That happens between lines 11180 and 11200 of the file, and it does not seem to be triggered by any unusual markup there.
Checking the error on the web, I found that it is possibly related to using * in a regex:
https://metacpan.org/pod/XML::Easy::Syntax
> BUGS
>
> Many of these regular expressions are liable to tickle a serious bug in perl's regexp engine. The bug is that the * and + repeat operators don't always match an unlimited number of repeats: in some cases they are limited to 32767 iterations. Whether this bogus limit applies depends on the complexity of the expression being repeated, whether the string being examined is internally encoded in UTF-8, and the version of perl. In some cases, but not all, a false match failure is preceded by a warning "Complex regular subexpression recursion limit (32766) exceeded".
>
> This bug is present, in various forms, in all perl versions up to at least 5.8.9 and 5.10.0. Pre-5.10 perls may also overflow their stack space, in similar circumstances, if a resource limit is imposed.
>
> There is no known feasible workaround for this perl bug. The regular expressions supplied by this module will therefore, unavoidably, fail to accept some lengthy valid inputs. Where this occurs, though, it is likely that other regular expressions being applied to the same or related input will also suffer the same problem. It is pervasive. Do not rely on this module (or perl) to process long inputs on affected perl versions.
Line 697 of TeX.pm is:
# detect \begin and \end (if they are not commented)
→ if ($buffer =~ /^((?:.*?\n)? # $1 is
(?:[^%] # either not a %
| # or
(?<!\\)(?:\\\\)*\\%)*? # a % preceded by an odd nb of \
) # $2 is a \begin{ with the end of the line
(${RE_ESCAPE}(?:begin|end)\{.*)$/sx
If there could be a way to simplify this regex, maybe the issue would go away...
org.texi is the only file in the emacs distribution that chokes on this regex.
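One possible way to sidestep the nested quantifier, sketched under assumptions: rather than a single regex that walks the whole prefix character by character, locate each \begin{ or \end{ candidate first and then check only the text on the same line before it for an unescaped %. The helper name find_uncommented_env is hypothetical and not part of po4a, the sketch assumes ${RE_ESCAPE} matches a single backslash, and it has not been tried against TeX.pm or org.texi.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of po4a. Returns the offset of the first
# \begin{ or \end{ that is not preceded on its own line by an unescaped %,
# or -1 if there is none. Assumes the escape character is a backslash.
sub find_uncommented_env {
    my ($buffer) = @_;
    while ($buffer =~ /\\(?:begin|end)\{/g) {
        my $start = $-[0];    # offset where this candidate begins
        # Start of the line that contains the candidate.
        my $line_start = $start > 0 ? rindex($buffer, "\n", $start - 1) + 1 : 0;
        my $prefix = substr($buffer, $line_start, $start - $line_start);
        # A % preceded by an even number of backslashes (including zero)
        # starts a comment, so this candidate is commented out.
        next if $prefix =~ /(?<!\\)(?:\\\\)*%/;
        return $start;
    }
    return -1;
}

my $text = "intro\n% \\begin{skipped}\n50\\% done \\begin{itemize}\n";
my $pos  = find_uncommented_env($text);
print substr($text, $pos) if $pos >= 0;    # prints from \begin{itemize} onward
```

The split and the remainder of the buffer (the original $1 and $2) could then be taken with substr at the returned offset. Whether this matches the original regex in every corner case would need checking against the po4a test suite.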
--
Jean-Christophe Helary @brandelune
https://mac4translators.blogspot.com
https://sr.ht/~brandelune/omegat-as-a-book/
3 years, 3 months