Preferred bug tracker

Please email the preferred bug tracker to report your issue.

This queue is for tickets about the Mac-Pasteboard CPAN distribution.

Report information
The Basics
Id: 84646
Status: resolved
Priority: 0/
Queue: Mac-Pasteboard

People
Owner: Nobody in particular
Requestors: OGATA [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 0.004
Fixed in: (no value)



Subject: Mac::Pasteboard cannot handle multibyte UTF-8 characters.
When the Mac pasteboard contains multibyte UTF-8 characters, Mac::Pasteboard#pbpaste breaks them and returns garbled characters. Likewise, Mac::Pasteboard#pbcopy copies characters in a form that the builtin `pbpaste` and Cmd-V output as unrecognizable characters. I wrote proof-of-concept code here: https://gist.github.com/xtetsuji/5389839
I am not sure what is going on here, but I suspect something around the handling of the default flavor (com.apple.traditional-mac-plain-text) when extended characters such as SNOWMAN are present.

It appears that a workaround is to use a specific encoding, and a flavor that corresponds to that encoding. Something like the following:

    #!/usr/bin/env perl

    use 5.010;

    use charnames qw{ :full };
    use Encode;
    use Mac::Pasteboard;

    binmode STDOUT, ':encoding(UTF-8)';

    $pb = Mac::Pasteboard->new();

    # First part of gist
    say 'First part of gist: pbcopy, then $pb->paste';
    system qq{ echo "\N{SNOWMAN}" | pbcopy };
    say decode( 'UTF-8', $pb->paste( 'public.utf8-plain-text' ) );

    # Second part of gist
    say 'Second part of gist: $pb->copy, then pbpaste';
    $pb->clear();
    $pb->copy( encode( 'UTF-8', "\N{SNOWMAN}" ), 'public.utf8-plain-text' );
    system 'pbpaste';
    say '';

    __END__

At my current level of understanding, I am not sure how to proceed, since what flavors are guaranteed available is unclear to me, and what encodings they correspond to may not be straightforward. As an example of the latter point, public.utf16-plain-text seems to be UTF-16LE (i.e. no BOM) not UTF-16 (i.e. with leading BOM). Hopefully more research will clarify this.

As a possible complication: my testing was done with Mac OS 10.8.3 Mountain Lion, but I believe the interface works back to 10.3 Panther, and I feel the need not to break anything under Panther -- at least not without going through a deprecation cycle.
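The BOM distinction mentioned above can be seen without a pasteboard at all. The following is a minimal sketch using only the core Encode module; the helper name hex_bytes is hypothetical and exists just for this illustration. It shows how Perl's 'UTF-16' and 'UTF-16LE' encodings differ on output, and says nothing about what the pasteboard itself stores.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use Encode qw{ encode };

# Hypothetical helper for this illustration: render a byte string as
# space-separated hexadecimal.
sub hex_bytes {
    my ( $bytes ) = @_;
    return join ' ', map { sprintf( '%02X', ord $_ ) } split //, $bytes;
}

my $snowman = "\x{2603}";    # U+2603 SNOWMAN

# Encode's plain 'UTF-16' writes a byte-order mark followed by
# big-endian code units.
my $utf16 = encode( 'UTF-16', $snowman );

# 'UTF-16LE' writes little-endian code units and no byte-order mark.
my $utf16le = encode( 'UTF-16LE', $snowman );

print 'UTF-16:   ', hex_bytes( $utf16 ), "\n";      # FE FF 26 03
print 'UTF-16LE: ', hex_bytes( $utf16le ), "\n";    # 03 26
```

If data from a flavor decodes cleanly as 'UTF-16LE' but picks up a stray leading character or the wrong byte order under plain 'UTF-16', that would be consistent with the observation that the flavor carries no BOM.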
Version 0.004_01 of Mac::Pasteboard went to PAUSE not too long ago, and should appear on your favorite CPAN mirror Real Soon Now.

I have not been able to make the encoding Just Work, because I have not come up with a way to tell what the encoding of flavor com.apple.traditional-mac-plain-text is, nor of figuring out exactly how pbpaste decides what to return so I can make my code compatible with it.

The solution I have come up with is to allow the user to select a default flavor other than com.apple.traditional-mac-plain-text, and to allow the user to turn on encoding/decoding of known flavors (public.utf8-plain-text, public.utf16-plain-text, and public.utf16-external-plain-text).

I have attached a modification of your original gist that uses the new functionality. The central change is the addition of the calls to pbflavor() and pbencode().

There are a few other changes to the attached file, which I was unable to prevent myself from making. The POD for 'use bytes' says it is deprecated for production code, so I substituted 'use utf8'. I also put STDOUT into UTF-8 mode to suppress the 'Wide character in output' warning; this (logically, but to my surprise anyway) required the decoding of the captured output of `pbpaste`; otherwise the results of `pbpaste` ended up double-encoded, and therefore gibberish.

If I do not hear from you, I will do a production release in a week or so, as time permits.
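The double-encoding effect described above can be reproduced without a pasteboard. This is a minimal sketch using only the core Encode module; the variable $captured is a stand-in (an assumption for illustration) for the raw bytes that a backtick capture of `pbpaste` would return, not actual output from it.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use Encode qw{ decode encode };

# Stand-in for captured command output: the three UTF-8 bytes of
# U+2603 SNOWMAN (E2 98 83), held in an undecoded byte string.
my $captured = encode( 'UTF-8', "\x{2603}" );

# Sending undecoded bytes through a UTF-8 output layer treats each
# byte as a separate character and encodes it again. Encoding the
# byte string a second time shows the same effect.
my $double = encode( 'UTF-8', $captured );

# Decoding first recovers the single character, which a UTF-8 output
# layer can then encode correctly, exactly once.
my $decoded = decode( 'UTF-8', $captured );

printf "captured:       %d bytes\n", length $captured;         # 3
printf "double-encoded: %d bytes\n", length $double;           # 6
printf "decoded:        %d character (U+%04X)\n",
    length $decoded, ord $decoded;                             # 1, U+2603
```

Each of the three original bytes is above 0x7F, so each becomes two bytes on the second pass, which is why the double-encoded form is six bytes of gibberish rather than one snowman.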
Attachment: ogata (application/octet-stream, 1.1k)

Version 0.005 having been out for a week with no further traffic, I am going to mark this ticket resolved. Please let me know if there are any further issues.