On 2012/9/27 David Prévot wrote:
Hi,
On 27/09/2012 07:55, D. Barbier wrote:
> Indeed, this is due to accented characters.
> It seems that length() returns the number of bytes and not characters.
> I looked at Unicode issues with Perl a very long time ago and do not
> remember its quirks; if anyone has a clue, please tell me ;-)
Thomas (CCed) helped us a lot with the DPNhtml2mail script [0] and
managed to make that work.
[0] http://anonscm.debian.org/viewvc/publicity/dpn/scripts/DPNhtml2mail.pl?vi...
I guess the magic happens at the end of the following code:
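use Encode qw(decode_utf8);
use Unicode::GCString;

# Number of display columns occupied by a string
sub _columns {
    my $str = shift;
    return 0 if ( !defined $str || $str eq '' );
    # Decode UTF-8 bytes into characters unless already decoded
    $str = decode_utf8($str) unless utf8::is_utf8($str);
    return Unicode::GCString->new($str)->columns();
}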
Thanks David,
This seems to be different: you are computing the string width, whereas
I need the number of characters.
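
To make the distinction concrete, here is a minimal sketch (not taken from
either script) showing that length() counts octets on an undecoded UTF-8
string and characters once the string has been decoded. The sample string
and the variable names are only illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "Prévot" written with an explicit code point, so the encoding of this
# source file does not matter.
my $chars = "Pr\x{e9}vot";             # character string, 6 characters
my $bytes = encode('UTF-8', $chars);   # the same text as raw UTF-8 octets

my $byte_count = length($bytes);                    # counts octets
my $char_count = length(decode('UTF-8', $bytes));   # counts characters

print "bytes: $byte_count, characters: $char_count\n";
```

The display width is yet another quantity: a wide (for example CJK)
character is one character but occupies two columns.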
I believe all we need is to add an ":encoding(foo)" layer when
opening the file for reading; the encoding must be specified and is
thus known.
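
A minimal sketch of that idea, assuming the file name and its encoding are
passed in by the caller. The helper name line_lengths is hypothetical and
is not part of any existing script.

```perl
use strict;
use warnings;

# Hypothetical helper: return the length in characters of each line of a
# file whose encoding is already known.
sub line_lengths {
    my ($path, $encoding) = @_;

    # The :encoding() layer decodes octets into characters while reading.
    open my $fh, "<:encoding($encoding)", $path
        or die "Cannot open $path: $!";

    my @lengths;
    while (my $line = <$fh>) {
        chomp $line;
        push @lengths, length($line);   # characters, not bytes
    }
    close $fh;
    return @lengths;
}
```

With the layer in place, length() needs no further change in the rest of
the code, since every string read from the handle is already decoded.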
Denis