Hello Jean-Christophe,
this is https://github.com/mquinson/po4a/issues/37, which was "fixed" in the
past by adding a new line to org.texi.
I still don't understand how to fix it properly, so a patch would be more than
welcome here.
Thanks, Mt
----- On 18 Jul 21, at 17:11, Jean-Christophe Helary lists(a)traduction-libre.org wrote:
> When trying to convert org.texi to PO I get the following error:
>
> Complex regular subexpression recursion limit (32766) exceeded at
> /opt/local/lib/perl5/5.26/Locale/Po4a/TeX.pm line 697.
>
> That happens between line 11180 and 11200 of the file and it does not seem to be
> triggered by any weird code.
>
> Checking the error on the web, I found that it is possibly related to using * in
> a regex:
>
> https://metacpan.org/pod/XML::Easy::Syntax
>
>> BUGS
>>
>> Many of these regular expressions are liable to tickle a serious bug in perl's
>> regexp engine. The bug is that the * and + repeat operators don't always match
>> an unlimited number of repeats: in some cases they are limited to 32767
>> iterations. Whether this bogus limit applies depends on the complexity of the
>> expression being repeated, whether the string being examined is internally
>> encoded in UTF-8, and the version of perl. In some cases, but not all, a false
>> match failure is preceded by a warning "Complex regular subexpression recursion
>> limit (32766) exceeded".
>>
>> This bug is present, in various forms, in all perl versions up to at least 5.8.9
>> and 5.10.0. Pre-5.10 perls may also overflow their stack space, in similar
>> circumstances, if a resource limit is imposed.
>>
>> There is no known feasible workaround for this perl bug. The regular expressions
>> supplied by this module will therefore, unavoidably, fail to accept some
>> lengthy valid inputs. Where this occurs, though, it is likely that other
>> regular expressions being applied to the same or related input will also suffer
>> the same problem. It is pervasive. Do not rely on this module (or perl) to
>> process long inputs on affected perl versions.
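
To check whether a given perl is affected, a small probe along these lines might help. It is only a sketch: probe_repeat is a hypothetical helper, not part of po4a, the escape character is assumed to be a backslash, and the group is merely shaped like the one in TeX.pm. Whether the larger input warns or fails to match depends on the perl version, so no outcome is claimed here.

```perl
use strict;
use warnings;

# Hypothetical probe, not part of po4a: run a group shaped like the one in
# TeX.pm over $n filler characters and record any warnings perl emits.
sub probe_repeat {
    my ($n) = @_;
    my $buffer = ("x" x $n) . "\\begin{a}";
    my @warnings;
    # Collect warnings instead of printing them to STDERR.
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    my $matched = $buffer =~ /^((?:[^%]                    # either not a %
                                  |                        # or
                                  (?<!\\)(?:\\\\)*\\%)*?   # an escaped %
                               )
                               (\\(?:begin|end)\{.*)$/sx ? 1 : 0;
    return { matched => $matched, warnings => \@warnings };
}

# Compare an input well below the 32767 limit with one above it.
for my $n (1_000, 40_000) {
    my $r = probe_repeat($n);
    print "n=$n matched=$r->{matched} warnings=", scalar @{ $r->{warnings} }, "\n";
}
```

If the second line reports matched=0 or a non-zero warning count, the perl in use still applies the limit to this kind of group.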
>
> Line 697 of TeX.pm is:
>
> # detect \begin and \end (if they are not commented)
> → if ($buffer =~ /^((?:.*?\n)? # $1 is
> (?:[^%] # either not a %
> | # or
> (?<!\\)(?:\\\\)*\\%)*? # a % preceded by an odd nb of \
> ) # $2 is a \begin{ with the end of the line
> (${RE_ESCAPE}(?:begin|end)\{.*)$/sx
>
>
> If there could be a way to simplify this regex, maybe the issue would go away...
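
One possible direction, as a rough sketch rather than a tested patch: drop the quantified alternation group and instead scan for each \begin{ or \end{ candidate, then check only the text on the same line before it for an unescaped %. find_env_marker is a hypothetical name, not a po4a function, and the default escape is assumed to be a backslash. It returns the leftmost uncommented marker, which may differ from what the original regex captures in some multi-line cases, so it is not a drop-in replacement.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of po4a: find the leftmost \begin{ or \end{
# that is not inside a % comment, without repeating a complex group.
# $re_escape is regex source for the escape character (assumed: backslash).
sub find_env_marker {
    my ($buffer, $re_escape) = @_;
    $re_escape = '\\\\' unless defined $re_escape;
    while ($buffer =~ /${re_escape}(?:begin|end)\{/g) {
        my $start = $-[0];    # offset of the candidate marker
        # Offset of the beginning of the line holding the candidate.
        my $bol = $start ? rindex($buffer, "\n", $start - 1) + 1 : 0;
        my $prefix = substr($buffer, $bol, $start - $bol);
        # A % preceded by an even number of backslashes starts a comment,
        # so this candidate is commented out: try the next one.
        next if $prefix =~ /(?<!\\)(?:\\\\)*%/;
        # Same roles as $1 and $2 in the original regex.
        return (substr($buffer, 0, $start), substr($buffer, $start));
    }
    return;    # no uncommented marker found
}

my ($before, $rest) =
    find_env_marker("text % \\begin{a}\nmore \\% \\end{b} tail");
print "rest: $rest\n" if defined $rest;
```

The only quantifier left runs over a fixed two-character string within a single line, so the work per candidate no longer grows with the size of the whole buffer.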
>
> org.texi is the only file in the emacs distribution that chokes on this regex.
>
> --
> Jean-Christophe Helary @brandelune
> https://mac4translators.blogspot.com
> https://sr.ht/~brandelune/omegat-as-a-book/