Unicode string mess in Perl
I have an external module that returns some strings to me. I am not sure exactly how the strings are returned, and I don't really know how Unicode strings work, or why.
The module should return, for example, the Czech word "být", meaning "to be". (If you cannot see the second letter, it is a y with an acute accent.) If I display the string returned by the module with Data::Dumper, I see it as b\x{fd}t.
However, if I try to print it with print $s, I get a "Wide character in print" warning, and ? instead of ý.
If I try Encode::decode(whatever, $s);, the resulting string cannot be printed anyway (always with the "Wide character" warning, sometimes with mangled characters, sometimes right), no matter what I put in whatever.
If I try Encode::encode("utf-8", $s);, the resulting string can be printed without problems or error messages.
If I use use encoding 'utf8';, printing works without any need for encoding or decoding. However, if I use the IO::CaptureOutput or Capture::Tiny module, it starts shouting "Wide character" again.
I have a few questions, mostly about what exactly happens. (I tried to read the perldocs, but I did not get much wiser from them.)
- Why can't I print the string right after getting it from the module?
- Why can't I print the string decoded by "decode"? What exactly did "decode" do?
- What exactly did "encode" do, and why was there no problem printing the string after encoding?
- What exactly does use encoding do? Why is the default encoding different from utf-8?
- What do I have to do if I want to print the scalars without any problems, even when I want to use one of the capturing modules?
Edit: Some people tell me to use -C or binmode or PERL_UNICODE. That is great advice. However, somehow, both capturing modules magically destroy the UTF-8-ness of STDOUT. That seems to be more of a bug in the modules, but I am not sure.
Edit 2: OK, the best solution was to dump the modules and write the "capturing" myself (with much less flexibility).
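For readers wondering what such hand-written capturing might look like, here is a minimal sketch. It is not the asker's actual code: the helper name capture_stdout_utf8 is hypothetical, and it assumes that only print calls going to the default (selected) handle need to be captured. Output written explicitly to STDOUT, or by child processes, is not caught.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code while the default output handle is an
# in-memory file with a UTF-8 layer, then return what was printed
# as a character string.
sub capture_stdout_utf8 {
    my ($code) = @_;
    my $buffer = '';
    open(my $fh, '>:encoding(UTF-8)', \$buffer) or die "open: $!";

    my $old = select($fh);            # plain print now goes to $fh
    my $ok  = eval { $code->(); 1 };
    my $err = $@;
    select($old);                     # always restore the old handle
    close $fh;
    die $err unless $ok;

    utf8::decode($buffer);            # captured UTF-8 bytes back to characters
    return $buffer;
}

my $captured = capture_stdout_utf8(sub { print "b\x{fd}t" });
printf "captured %d characters\n", length $captured;
```

Because the temporary handle carries its own UTF-8 layer, the sketch does not depend on whatever layers STDOUT has.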
- Because you output a string in Perl's internal form (utf8) to a non-Unicode filehandle.
- The decode function decodes a sequence of bytes, assumed to be in the given encoding, into Perl's internal form (utf8). Your input seems to be already decoded.
- The encode() function encodes a string from Perl's internal form into the given encoding.
- The encoding pragma allows you to write your script in any encoding you like. String literals are automatically converted to Perl's internal form.
- Make sure Perl knows which encoding your data comes in and which encoding it should go out in.
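The difference between the two functions can be seen in a small sketch. It assumes the module's string really is the three-character string b\x{fd}t shown by Data::Dumper. The variable names are only for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Character string: three characters, the middle one is U+00FD.
my $chars = "b\x{fd}t";

# encode(): characters in, bytes in the requested encoding out.
my $bytes = encode('UTF-8', $chars);

# decode(): bytes in, characters out. This is the reverse direction.
my $roundtrip = decode('UTF-8', $bytes);

printf "characters: %d\n", length $chars;
printf "bytes:      %d\n", length $bytes;
printf "hex bytes:  %s\n",
    join ' ', map { sprintf '%02x', ord } split //, $bytes;
```

The encoded form is one element longer than the character string, because ý takes two bytes in UTF-8. Calling decode() on a string that already holds characters treats those characters as if they were bytes, which is one plausible reason the asker's decoded strings came out mangled.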
See also perluniintro, perlunicode, the Encode module, and the binmode() function.
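As a minimal sketch of telling Perl which encoding the output should use, the layer can be declared once on the handle, after which print encodes automatically. The in-memory handle below is only there to make the resulting bytes visible. It assumes the terminal or consumer of STDOUT expects UTF-8.

```perl
use strict;
use warnings;

my $s = "b\x{fd}t";    # character string, as the module returns it

# Declare the encoding on STDOUT once; later prints are encoded for you.
binmode(STDOUT, ':encoding(UTF-8)');
print $s, "\n";

# The same idea on another handle, here an in-memory file for illustration.
my $buffer = '';
open(my $out, '>:encoding(UTF-8)', \$buffer) or die "open: $!";
print {$out} $s;
close $out;

# $buffer now holds the UTF-8 bytes, not the characters.
printf "%s\n", join ' ', map { sprintf '%02x', ord } split //, $buffer;
```

Declaring layers explicitly like this is also the safer route on newer Perls, since the encoding pragma has been deprecated since Perl 5.18.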