Hi!
On Tue, Nov 11, 2008 at 01:23:21PM +0100, intrigeri(a)boum.org wrote:
Hi.
Here are the first results of the zzuf[1] vs po4a contest.
Some probably have security-related consequences, but as it seems
nobody uses po4a against untrusted content yet, I guess there is no
problem disclosing these results without delay.
I don't think the mentioned errors can have security-related consequences.
(The worst I can think of would be a DoS.)
If somebody is able to introduce contents in the original document, I
would not be surprised that this content is used later on, and the
infrastructure tries to translate it.
If somebody can introduce content in the PO file, then I'm not really
surprised that it could stop the production of the translation or cause
the production of unexpected content.
If you can produce an infinite loop, can you ask zzuf to reproduce this
test vector?
,----
| po4a-gettextize
`----
Without specifying the input charset, zzuf'ed po4a-gettextize quickly
errors out, complaining it was not able to detect the input charset;
no incomplete file is left on disk.
So I had to pretend the input was in UTF-8, as does ikiwiki's po plugin.
Two ways of crashing were revealed by this command-line:
zzuf -vc -s 0:100 -r 0.1:0.5 \
po4a-gettextize -f text -o markdown -M utf-8 -L utf-8 \
-m LICENSES >/dev/null
They are:
Malformed UTF-8 character (UTF-16 surrogate 0xdcc9) in substitution iterator at
/usr/share/perl5/Locale/Po4a/Po.pm line 1443.
Malformed UTF-8 character (fatal) at /usr/share/perl5/Locale/Po4a/Po.pm line
1443.
and
Malformed UTF-8 character (UTF-16 surrogate 0xdcec) in substitution (s///) at
/usr/share/perl5/Locale/Po4a/Po.pm line 1443.
Malformed UTF-8 character (fatal) at /usr/share/perl5/Locale/Po4a/Po.pm line
1443.
Perl seems to exit cleanly, and an incomplete PO file is written on
disk. I am not sure if this is a bug in Perl or in Po.pm.
I'm not sure I can catch this one. A proper error message indicating the
line number in the input document would be preferable.
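As a rough illustration of the kind of check that would give such a
message (a sketch in Python, not po4a's actual Perl code; the function
name first_invalid_utf8_line is hypothetical), the input can be decoded
line by line so that the first offending line number is known before
any substitution runs:

```python
def first_invalid_utf8_line(data: bytes):
    """Return (line_number, error_message) for the first line that is not
    valid UTF-8, or None if every line decodes cleanly.

    Hypothetical helper for illustration only; not part of po4a.
    """
    # Splitting on b"\n" is safe: the byte 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for lineno, raw in enumerate(data.split(b"\n"), start=1):
        try:
            # Strict decoding also rejects UTF-8-encoded UTF-16 surrogates,
            # the case reported in the Perl messages above.
            raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            return lineno, str(exc)
    return None


if __name__ == "__main__":
    # b"\xed\xb3\x89" is the byte sequence for the surrogate 0xdcc9.
    sample = b"first line\nsecond line\nbroken \xed\xb3\x89 here\n"
    result = first_invalid_utf8_line(sample)
    if result is not None:
        print("invalid UTF-8 at input line %d: %s" % result)
```

Checking up front like this trades one extra pass over the input for an
error that points at the document rather than at a line in Po.pm.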
,----
| po4a-translate
`----
Without specifying an input charset, same behaviour as
po4a-gettextize, so let's specify UTF-8 as input charset for now.
The command:
zzuf -cv \
po4a-translate -d -f text -o markdown -M utf-8 -L utf-8 \
-k 0 -m LICENSES -p LICENSES.fr.po -l test.fr
... prints tons of occurrences of the following error, but a complete
translated document is written (obviously with some weird chars
inside):
Use of uninitialized value in string ne at
/usr/share/perl5/Locale/Po4a/TransTractor.pm line 854.
Use of uninitialized value in string ne at
/usr/share/perl5/Locale/Po4a/TransTractor.pm line 840.
Use of uninitialized value in pattern match (m//) at
/usr/share/perl5/Locale/Po4a/Po.pm line 1002.
I fixed these ones.
While:
zzuf -cv -s 0:10 -r 0.001:0.3 \
po4a-translate -d -f text -o markdown -M utf-8 -L utf-8 \
-k 0 -m LICENSES -p LICENSES.fr.po -l test.fr
... seems to lose the fight, at the readpo(LICENSES.fr.po) step,
against some kind of infinite loop, deadlock, or any similar beast.
Seems like it could go on using CPU power forever, but memory use does
not increase.
Whatever format module is used does not change anything. This is thus
probably a bug in po4a's core or in a lib it depends on.
This looks better now, but po4a reports that errors were found in the PO
file (not really surprising).
The current po4a behavior is to continue but report any errors it finds.
I could also count the errors and die after a certain number.
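The "count and die" option could look roughly like this (a Python
sketch only, not po4a code; ErrorBudget, TooManyErrors and the limit
value are hypothetical names chosen for the example):

```python
class TooManyErrors(Exception):
    """Raised once more errors were reported than the budget allows."""


class ErrorBudget:
    """Collect error messages and abort after `limit` of them.

    Hypothetical helper for illustration; po4a has no such class.
    """

    def __init__(self, limit):
        self.limit = limit
        self.messages = []

    def report(self, message):
        # Keep every message so that all of them can be shown to the user.
        self.messages.append(message)
        if len(self.messages) > self.limit:
            raise TooManyErrors(
                "giving up after %d errors" % len(self.messages)
            )


if __name__ == "__main__":
    budget = ErrorBudget(limit=2)
    try:
        for lineno in (12, 40, 41):
            budget.report("line %d: malformed entry" % lineno)
    except TooManyErrors as exc:
        print(exc)
```

With a limit of 0 this degenerates to dying on the first error, and a
very large limit gives the current continue-and-report behavior.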
I did not experience infinite loops or deadlocks.
Cheers,
--
Nekral