Subject: | Can get unicode text? |
Hi Chris,
I have an issue for you :) if you have some time to look at it.
Basically, I'm trying to extract Thai text from a PDF file.
So far I have only tried the 'getPageText' method, but it doesn't return any recognizable Thai text from the PDF.
(I'm not sure whether the same problem occurs with other languages such as Chinese, Japanese, etc., or whether it is just a problem with my font; I don't know much about PDF.)
Anyway, I have created and attached a bug_unicode.t test file along with sample PDF files for you to check out.
Could you see if anything is wrong with the test?
⮀ CAM-PDF-1.59 prove -vwl t/bug_unicode.t
t/bug_unicode.t ..
1..1
not ok 1 - Should get expected text
# Failed test 'Should get expected text'
# at t/bug_unicode.t line 11.
Wide character in print at /Users/zdk/perl5/perlbrew/perls/perl-5.14.2/lib/5.14.4/Test/Builder.pm line 1759.
# got: '!"#$%&'(')!*
# '
# expected: 'ภาษาไทย'
# Looks like you failed 1 test of 1.
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/1 subtests
Test Summary Report
-------------------
t/bug_unicode.t (Wstat: 256 Tests: 1 Failed: 1)
Failed test: 1
Non-zero exit status: 1
Files=1, Tests=1, 0 wallclock secs ( 0.02 usr 0.00 sys + 0.07 cusr 0.00 csys = 0.09 CPU)
Result: FAIL
Thanks
Cheers,
zdk
Subject: | CH3.pdf |
Message body not shown because it is not plain text.