Unicode string mess in Perl


I have an external module that returns some strings to me. I am not sure how exactly the strings are returned, and I don't know how Unicode strings work or why.

The module should return, for example, the Czech word "být", meaning "to be". (If you cannot see the second letter, it is ý, a "y" with an acute accent.) If I display the string returned by the module with Data::Dumper, I see it as b\x{fd}t.

However, if I try to print it with print $s, I get a "Wide character in print" warning, and ? instead of ý.

If I try Encode::decode(whatever, $s);, the resulting string cannot be printed anyway (always with the "Wide character" warning, sometimes with mangled characters, sometimes right), no matter what I put in whatever.

If I try Encode::encode("utf-8", $s);, the resulting string can be printed without problems or error messages.
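To make the difference visible, here is a minimal sketch of what encoding does to such a string. The variable $s is only an assumed stand-in for the value the module returns; the real module is not shown in this post.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Assumption: $s stands in for the character string returned by the module.
my $s = "b\x{fd}t";                 # 3 characters: b, U+00FD, t

# encode() turns characters into UTF-8 bytes.
my $bytes = encode('UTF-8', $s);    # 4 bytes: 0x62 0xC3 0xBD 0x74

printf "characters: %d\n", length $s;
printf "bytes:      %d\n", length $bytes;
printf "hex:        %s\n", join ' ', map { sprintf '%02x', ord } split //, $bytes;

# Every element of $bytes is a plain byte, so print has nothing "wide" to warn about.
print $bytes, "\n";
```

The encoded value is one element longer because U+00FD becomes the two bytes C3 BD in UTF-8.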

If I use use encoding 'utf8';, printing works without any need for encoding/decoding. However, if I use the IO::CaptureOutput or Capture::Tiny module, it starts shouting "Wide character" again.

I have a few questions, mostly about what exactly happens. (I tried to read the perldocs, but I did not come away much wiser.)

  1. Why can't I print the string right after getting it from the module?
  2. Why can't I print the string after it has been decoded by "decode"? What exactly did "decode" do?
  3. What exactly did "encode" do, and why was there no problem in printing after encoding?
  4. What does use encoding do? Why is the default encoding different from utf-8?
  5. What do I have to do if I want to print the scalars without any problems, even when I want to use one of the capturing modules?

Edit: As people tell me, I can use -C or binmode or PERL_UNICODE. That is great advice. However, somehow, both capturing modules magically destroy the utf8-ness of STDOUT. That seems to be more a bug in the modules, but I am not sure.
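For reference, the binmode variant of that advice might look like the sketch below. The string is again only an assumed stand-in for the module's return value.

```perl
use strict;
use warnings;

# Tell Perl that STDOUT expects UTF-8; character strings are then
# encoded automatically whenever they are printed to it.
binmode(STDOUT, ':encoding(UTF-8)');

# Assumption: stand-in for the string returned by the module.
my $s = "b\x{fd}t";
print $s, "\n";
```

Running the script as perl -CS script.pl, or setting PERL_UNICODE=S in the environment, has a similar effect: both apply a UTF-8 layer to STDIN, STDOUT and STDERR without touching the code.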

Edit2: OK, the best solution was to dump the modules and write the "capturing" myself (with much less flexibility).
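A hand-rolled capture along those lines could be sketched as follows. This is not the code actually used; capture_stdout_utf8 is a hypothetical helper name, and the approach assumes the captured code only uses print without an explicit filehandle.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: run $code with the default output handle pointed at an
# in-memory buffer and return what was printed as a character string.
sub capture_stdout_utf8 {
    my ($code) = @_;

    my $buffer = '';
    open(my $capture, '>:encoding(UTF-8)', \$buffer)
        or die "Cannot open in-memory filehandle: $!";

    my $previous = select($capture);   # bare print now goes to $capture
    my $ok       = eval { $code->(); 1 };
    my $error    = $@;
    select($previous);                 # always restore the old handle
    close($capture);

    die $error unless $ok;

    # The buffer holds UTF-8 bytes; decode them back into characters.
    return decode('UTF-8', $buffer);
}

my $out = capture_stdout_utf8(sub { print "b\x{fd}t\n" });
printf "captured %d characters\n", length $out;
```

The loss of flexibility is real: select() only redirects print calls that name no filehandle, so print STDOUT ..., STDERR and the output of child processes are not captured.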

  1. Because you are outputting a string in Perl's internal form (utf8) to a non-Unicode filehandle.
  2. The decode function decodes a sequence of bytes, assumed to be in ENCODING, into Perl's internal form (utf8). Your input seems to be already decoded.
  3. The encode() function encodes a string from Perl's internal form into ENCODING.
  4. The encoding pragma allows you to write your script in any encoding you like. String literals are automatically converted to Perl's internal form.
  5. Make sure Perl knows which encoding your data comes in and which encoding it should go out in.
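Taken together, the points above amount to: decode bytes once on the way in, work with characters inside the program, and encode once on the way out. A minimal sketch, where the input bytes are an assumed example rather than anything taken from the module in question:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: raw UTF-8 bytes as they might arrive from a file or a socket.
my $input_bytes = "b\xc3\xbdt";

# Decode once, at the point where data enters the program.
my $chars = decode('UTF-8', $input_bytes);     # "b\x{fd}t", 3 characters

# Inside the program, operate on characters, not bytes.
my $upper = uc $chars;                         # "B\x{dd}T"

# Encode once, at the point where data leaves the program.
my $output_bytes = encode('UTF-8', $upper);
print $output_bytes, "\n";
```

A string that is already in character form, like the one returned by the module, should skip the decode step; only the output side needs attention, either with encode() or with an encoding layer on the filehandle.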

See perluniintro, perlunicode, the Encode module, and the binmode() function.

