Unicode string mess in Perl
I have an external module that returns some strings to me. I am not sure exactly how the strings are returned, and I don't really know how Unicode strings work, or why.
The module should return, for example, the Czech word "být", meaning "to be". (If you cannot see the second letter, it should be ý, a y with an acute accent.) If I display the string returned by the module with Data::Dumper, I see b\x{fd}t.
However, if I try to print it with print $s, I get a "Wide character in print" warning, and ? instead of ý.
If I try Encode::decode(whatever, $s);, the resulting string cannot be printed anyway (I always get the "Wide character" warning, sometimes with mangled characters and sometimes with the right ones), no matter what I put in whatever.
If I try Encode::encode("utf-8", $s);, the resulting string can be printed without any problems or error messages.
If I use use encoding 'utf8';, printing works without any need for encoding or decoding. However, if I use the IO::CaptureOutput or Capture::Tiny module, it starts shouting "Wide character" again.
I have a few questions about what is happening here. (I tried to read the perldocs, but I came away none the wiser.)
- Why can't I print the string right after getting it from the module?
- Why can't I print the string after decoding it with "decode"? What exactly did "decode" do?
- What exactly did "encode" do, and why was there no problem printing the string after encoding it?
- What exactly does use encoding do? Why is the default encoding different from utf-8?
- What do I have to do if I want to print the scalars without any problems, even when I want to use one of the capturing modules?
Edit: Some people tell me to use -C, binmode, or PERL_UNICODE. That is great advice. However, both capturing modules somehow destroy the UTF8-ness of STDOUT. That looks more like a bug in the modules, but I am not sure.
Edit 2: OK, the best solution was to dump the modules and write the "capturing" myself (with much less flexibility).
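A hand-rolled capture along the lines of Edit 2 might look like the following. This is only a sketch, not the code the asker actually wrote: capture_stdout is a hypothetical helper name, and it only catches print calls that go to the currently selected default handle (not explicit print STDOUT, and not output from child processes). It assumes UTF-8 is an acceptable intermediate encoding for the captured text.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: run a code reference and return whatever it
# printed to the default output handle, as a character string.
sub capture_stdout {
    my ($code) = @_;
    my $buffer = '';

    # In-memory filehandle with an encoding layer, so characters above
    # 0xFF are encoded to UTF-8 bytes instead of triggering
    # "Wide character" warnings.
    open(my $mem, '>:encoding(UTF-8)', \$buffer)
        or die "Cannot open in-memory handle: $!";

    # Make it the default handle for print, run the code, then restore
    # the previous handle even if the code dies.
    my $old = select($mem);
    my $ok  = eval { $code->(); 1 };
    my $err = $@;
    select($old);
    close($mem);
    die $err unless $ok;

    # The buffer now holds UTF-8 bytes; decode them back to characters.
    return decode('UTF-8', $buffer);
}

my $captured = capture_stdout(sub { print "b\x{fd}t \x{159}\n" });
```

The returned value is a character string again, so it can be printed later to any handle that has its own encoding layer, for example after binmode(STDOUT, ':encoding(UTF-8)').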
- Because you output a string in Perl's internal form (utf8) to a non-Unicode filehandle.
- The decode function decodes a sequence of bytes assumed to be in ENCODING into Perl's internal form (utf8). Your input seems to be already decoded, so decoding it again does not help.
- The encode() function encodes a string from Perl's internal form into ENCODING.
- The encoding pragma allows you to write your script in any encoding you like. String literals are automatically converted to Perl's internal form.
- Make sure Perl knows which encoding your data comes in and which encoding it should go out in.
See perluniintro, perlunicode, the Encode module, and the binmode() function.
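The points above can be sketched as a small round trip: decode bytes on the way in, work with characters, and encode on the way out. This is a minimal illustration, assuming the incoming data is UTF-8 encoded bytes and that the terminal expects UTF-8; the variable names are made up for the example. Because the encoding pragma is deprecated in newer Perl releases, the sketch relies on Encode and binmode instead.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: this is what arrives from outside (file, socket, ...),
# i.e. the UTF-8 encoding of "být" as raw bytes.
my $bytes = "b\xC3\xBDt";

# decode: bytes in a known encoding -> Perl character string.
my $chars = decode('UTF-8', $bytes);
printf "characters: %d\n", length $chars;

# encode: Perl character string -> bytes in a chosen encoding.
my $out = encode('UTF-8', $chars);
printf "bytes: %d\n", length $out;

# Alternatively, declare the output encoding on the filehandle once
# and then print character strings directly.
binmode(STDOUT, ':encoding(UTF-8)');
print "$chars\n";
```

Here $chars has three characters and $out has four bytes, which is the difference between the two forms. Printing $out to a handle that already has an :encoding layer would encode it a second time, so pick one of the two approaches per filehandle.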