Hello,
On Sat, Nov 08, 2008 at 02:43:15AM +0100, intrigeri(a)boum.org wrote:
Hello,
in the process of writing a translation plugin[1] for ikiwiki[2],
using po4a, we wondered how safe it was to run po4a on
untrusted content. Hence the following questions.
(You might need to know, in order to provide an accurate answer, that
we actually don't use /usr/bin/po4a* at all, but rather the
Locale::Po4a Perl module.)
Was po4a designed with "processing safely on untrusted content" as
a goal? If not, do you consider it is now achieved as a side effect?
"processing safely on untrusted content" was not an original goal.
However, I would like to distinguish between the parts of po4a:
* The po4a core (Po.pm, TransTractor.pm)
I really do not expect any security issues in the core
functionality. It should only use simple, static regular
expressions and standard Perl features.
I don't think there are any implicit requirements on the inputs
provided to the core by the modules or by the PO file parser.
* The po4a modules
The modules parse the "untrusted content". Their behavior
might be changed by commands included in the content (LaTeX
module only?), and they might later use some untrusted content in a
regular expression.
The modules usually do not have any interface to the system (like
reading or writing files, or executing commands), but use the
TransTractor interface for this.
* The po4a commands
They also parse untrusted content (command line arguments, config
files) and have more interfaces to the system.
* The po4a testsuite
Not developed with security in mind.
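To illustrate the regular expression concern mentioned for the modules, here is a minimal sketch in Python (not po4a code; the function names are hypothetical). It shows how a string taken from untrusted content changes meaning when it is compiled as a pattern, and how escaping it first restores literal matching. In Perl, the equivalent of `re.escape` would be `quotemeta` or `\Q...\E`.

```python
import re


def find_marker_unsafe(marker: str, text: str) -> bool:
    # Hazard: `marker` comes from the document and is compiled as a pattern,
    # so its metacharacters are interpreted instead of matched literally.
    return re.search(marker, text) is not None


def find_marker_safe(marker: str, text: str) -> bool:
    # re.escape makes every character of `marker` match literally.
    return re.search(re.escape(marker), text) is not None


untrusted = "a.c"  # "." is a regex metacharacter
print(find_marker_unsafe(untrusted, "abc"))  # True: "." matched "b"
print(find_marker_safe(untrusted, "abc"))    # False: only a literal "a.c" matches
print(find_marker_safe(untrusted, "a.c"))    # True
```

Escaping also prevents crafted content from injecting patterns that fail to compile or that backtrack for a very long time.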
About the external dependencies:
- I could not find any command execution in Locale::Po4a, did I miss
some?
I'm not sure what you mean by "command execution".
If you mean external system programs, then there are some (look for
system, qx, open, or `):
diff might be used by Po.pm
nsgmls is used by Sgml.pm
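The search suggested above (system, qx, open, or a backtick) can be roughly automated. The following is only a sketch in Python with hypothetical helper names. It is a line-based heuristic, not a Perl parser, so it will over-report (for example, every `open` of a plain file) and each hit still has to be reviewed by hand.

```python
import re
from pathlib import Path

# Rough patterns for the constructs named above; heuristic only.
PATTERNS = {
    "system": re.compile(r"\bsystem\b"),
    "qx": re.compile(r"\bqx\b"),
    "open": re.compile(r"\bopen\b"),
    "backtick": re.compile(r"`"),
}


def scan_source(text):
    """Return (line_number, construct, line) for every matching line."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name, line.strip()))
    return hits


def scan_tree(root):
    """Print the hits for every .pm file found below `root`."""
    for path in sorted(Path(root).rglob("*.pm")):
        for number, name, line in scan_source(path.read_text(errors="replace")):
            print(f"{path}:{number}: {name}: {line}")
```

It could be run as `scan_tree("lib/Locale/Po4a")`, where the path is an assumption about where the modules live in a source checkout.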
- The first glance makes me think that Locale::gettext is used only to
display translated messages; can you please confirm this?
That's right.
- Amongst the dependencies (I could quickly list DynaLoader, Encode,
Encode::Guess, Text::WrapI18N, Locale::gettext), is there one (or
more) that you know to be unsafe to process untrusted content?
I had some failures with Text::WrapI18N (endless loops), which might allow a DoS:
http://bugs.debian.org/470250
It is only used to get better formatting of the error/warning
messages.
You probably do not need this feature.
I hope DynaLoader is safe.
I have no reason to think that Encode::Guess is not safe. It can also be
avoided if the encoding is always specified. (This might need some
adaptation in po4a to only load it if needed)
I have no reason to think that Encode is not safe.
Other non-required dependencies:
Term::ReadKey
SGMLS
They are not dependencies for your use case.
- What about the msgmerge command, that po4a command-line programs
use, as well as this ikiwiki plugin?
I never checked it. I've never heard of any security issues with
msgmerge. It is a well-known and widely used command.
It uses the gettext library which is already checking the format
of its content.
I do not expect any security issues from it.
It is not used by Locale::Po4a, but by the po4a command-line programs. However,
I expect that you will have to use it.
Was the full code checked for symlink attacks when CVE-2007-4462
was fixed?
Yes, except the testsuite.
Was po4a tested with a fuzzing program? Would you be interested in the
results if I did this?
It has not been tested with a fuzzer, and the results would be really appreciated.
A first step, and the most interesting one for you, would be to test the
core functionality.
I expect more trouble with the modules (endless loops).
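Since endless loops are the expected failure mode, a fuzzing run should put a time limit on every case. Here is a minimal sketch in Python, with hypothetical function names, of a byte-flipping mutator and a runner that classifies each run as ok, error, or hang. The command under test is left as a parameter; in practice it would be some small wrapper script around Locale::Po4a that reads a document on stdin, which is an assumption and not something po4a provides.

```python
import random
import subprocess
import sys


def mutate(data: bytes, rng: random.Random, flips: int = 8) -> bytes:
    """Return a copy of `data` with up to `flips` random bytes overwritten."""
    buf = bytearray(data)
    for _ in range(flips):
        if not buf:
            break
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def run_case(command, data: bytes, timeout: float = 5.0) -> str:
    """Feed `data` to `command` on stdin and classify the outcome."""
    try:
        result = subprocess.run(
            command, input=data, capture_output=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return "hang"  # likely an endless loop; the child has been killed
    return "ok" if result.returncode == 0 else "error"


if __name__ == "__main__":
    # Stand-in target that just reads stdin; replace with the real wrapper.
    target = [sys.executable, "-c", "import sys; sys.stdin.buffer.read()"]
    rng = random.Random(0)
    seed = b"\\section{Title}\nSome text.\n"
    for _ in range(5):
        print(run_case(target, mutate(seed, rng)))
```

Inputs reported as "hang" are the ones worth saving and reducing into bug reports.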
Thanks for your interest,
--
Nekral