Hi Russ,
It is very confusing.
1) Pod::Simple::Text
I tried with Pod::Simple::Text, with the following result:
IN :
=pod
brackets: []
=cut
Use of uninitialized value $Pod::Simple::nbsp in regexp compilation
at /u/a21g098/huzeh00/NB/EI/adk_root/os390/perl/lib/site_perl/5.22.0/Pod/Simple/Text.pm
line 100, <$textin_fh> line 5.
Use of uninitialized value $Pod::Simple::shy in regexp compilation
at /u/a21g098/huzeh00/NB/EI/adk_root/os390/perl/lib/site_perl/5.22.0/Pod/Simple/Text.pm
line 101, <$textin_fh> line 5.
OUT: brackets: []
So apart from the uninitialized-value warnings, the brackets come out OK.
2) tr command in Pod::Text
I found that in Pod::Text.pm, at about line 280, there is the line:
$text =~ tr/\240\255/ /d;
I guess it is to remove characters 0xA0 and 0xFF from the $text string.
But when I put in a hexdump just before and just after that line, I see that
the 0xAD character (which is the '[' in codepage 1047) is removed:
IN : 40 40 40 40 40 82 99 81 83 92 85 a3 a2 7a 40 ad bd 15 15
IN : brackets: []
OUT: 40 40 40 40 40 82 99 81 83 92 85 a3 a2 7a 40 bd 15 15
OUT: brackets: ]
I don't know the purpose of that line, so I cannot say whether it is correct.
But it is tricky, since it specifies 2 'oldchars' and only 1 'newchar',
plus the /d (delete) modifier.
If I understand the comment in
http://stackoverflow.com/questions/30710164/need-help-in-understanding-perl-tr-command-with-d
correctly, the \240 character will be replaced with the space and the \255
will be deleted.
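
A small sketch of that reading of tr with /d, using plain letters so that it
does not depend on the codepage. The helper name and the sample string are
only illustrative and are not taken from Pod::Text:

```perl
use strict;
use warnings;

# Hypothetical helper: same shape as the Pod::Text line (two search
# characters, one replacement character, /d), but with 'a' and 'b'
# standing in for \240 and \255.
sub squash {
    my ($s) = @_;
    $s =~ tr/ab/ /d;    # 'a' becomes a space; 'b' has no partner, so /d deletes it
    return $s;
}

print "[", squash("xaybz"), "]\n";    # prints "[x yz]"
```
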
Furthermore, this action is codepage-unaware, which is a risk in itself.
But all of this does not explain why the 0xAD character is removed.
I find that it is the \255 in the tr command that matches my 0xAD, which I
don't understand.
So (apart from the obscurity of the tr line) this narrows it down to the
question of why the \255 in the tr command matches my 0xAD.
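
One check that may help here is to print the numeric values behind the two
escapes. The sketch below assumes that Perl reads a backslash followed by
digits as an octal escape, in which case \255 would be 0xAD rather than 0xFF.
The helper name is only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: hex value of the first character of a string.
sub hex_of {
    my ($c) = @_;
    return sprintf "%02x", ord($c);
}

print '\240 = 0x', hex_of("\240"), "\n";    # octal 240 = 0xa0
print '\255 = 0x', hex_of("\255"), "\n";    # octal 255 = 0xad
print '\377 = 0x', hex_of("\377"), "\n";    # octal 377 = 0xff
```

If this prints ad for \255, then the tr line asks for exactly the byte that
is the '[' in codepage 1047.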
3) buggy tr ?
# cat tt.pl
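use strict;
use warnings;

# Print a label followed by each byte of the string in hex, then the string itself.
sub dumpData
{
    my ($pkg, $l, $t) = @_;    # $pkg is unused; the callers pass undef
    print "$t: ";
    printf "%02x ", $_ for unpack("C*", $l);
    print "\n";
    print "$t: $l\n";
}

my $text = "abc[]";
dumpData(undef, $text, "IN ");
$text =~ tr/\255//d;    # delete every \255 character
dumpData(undef, $text, "OUT");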
#HUZEH00@APMVST1 /u/a21g098/huzeh00
# perl tt.pl
IN : 81 82 83 ad bd
IN : abc[]
OUT: 81 82 83 bd
OUT: abc]
I am now working on a newly installed Perl 5.22 distribution, so I am
starting to suspect that tr might be buggy here.
I hope this information helps. What is your opinion?
Kind regards
Harrie Huzen
Development Specialist
........................................................................
Belastingdienst
Centrum voor Applicatieontwikkeling en -onderhoud
Service Delivery – FAD Ondersteuning Ontwikkel Services (OOS)
Competence Center Gen/GuardIEn
John F. Kennedylaan 8 | 7314 PS | Apeldoorn | G2 Flex
Postbus 9050 | 7300 GM | Apeldoorn
........................................................................
M 06 - 55 42 08 50
h.huzen@belastingdienst.nl
........................................................................
= may the source be with you =
From: "Russ Allbery via RT" <bug-podlators@rt.cpan.org>
To: h.huzen@belastingdienst.nl
Date: 03-10-2016 18:44
Subject: Re: [rt.cpan.org #118240] In EBCDIC context a '[' gets deleted
<URL: https://rt.cpan.org/Ticket/Display.html?id=118240 >
"h.huzen@belastingdienst.nl via RT" <bug-podlators@rt.cpan.org> writes:
> Using podlators-4.08 on z/OS (IBM mainframe, perl 5.22, default codepage is
> EBCDIC 1047) a '[' gets deleted:
Could you try this with Pod::Simple::Text and see if you get the same
behavior? I'm trying to narrow this down to see if it's something that
Pod::Text is doing or if it's in Pod::Simple, which does the codepage
handling.
--
#!/usr/bin/perl -- Russ Allbery, Just Another Perl Hacker
$^=q;@!>~|{>krw>yn{u<$$<[~||<Juukn{=,<S~|}<Jwx}qn{<Yn{u<Qjltn{ > 0gFzD gD,
00Fz, 0,,( 0hF 0g)F/=, 0> "L$/GEIFewe{,$/ 0C$~> "@=,m,|,(e 0.), 01,pnn,y{
rw} >;,$0=q,$,,($_=$^)=~y,$/ C-~><@=\n\r,-~$:-u/ #y,d,s,(\$.),$1,gee,print