Subject: | Support new API in utf8n_to_uvuni() |
The Perl 5.14 core is changing to handle the Unicode non-characters
properly. These are legal everywhere except in interchange between
applications, but Perl has treated them as illegal. Because of this,
and because Perl is wrong about which code points are non-characters,
the API is changing. As a result, Normalize needs to change to use the
new API. The attached patch does that while preserving compatibility
with older Perls. Note that the behavior of Normalize doesn't change,
as it has always explicitly allowed these characters.
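To illustrate the compatibility approach, here is a minimal standalone
sketch of the same preprocessor trick. All DEMO_* names and their
numeric values are made up for the illustration and are not the real
definitions from Perl's utf8.h. The sketch also assumes that a "new"
header defines only the DISALLOW name and no longer defines the old
ALLOW name.

```c
/* Stand-in flag for illustration only; the value is hypothetical. */
#define DEMO_ALLOW_SURROGATE 0x0020

/* Simulate the two generations of header.  Build with -DDEMO_OLD_PERL
 * to mimic a header that still provides the old "allow" flag. */
#ifdef DEMO_OLD_PERL
#define DEMO_ALLOW_FFFF 0x0040          /* old API: opt in to allow */
#else
#define DEMO_DISALLOW_NONCHAR 0x0400    /* new API: opt in to disallow */
#endif

/* The shim: when the new name exists, non-characters are already
 * allowed by default, so the old flag becomes 0 and or-ing it in
 * changes nothing. */
#ifdef DEMO_DISALLOW_NONCHAR
#define DEMO_ALLOW_FFFF 0
#endif

/* Code written against the old flag compiles unchanged either way. */
#define DEMO_ALLOW_ANY (DEMO_ALLOW_SURROGATE | DEMO_ALLOW_FFFF)

unsigned demo_allow_any_mask(void)
{
    return DEMO_ALLOW_ANY;
}
```

Built as is, demo_allow_any_mask() returns only the surrogate bit;
built with -DDEMO_OLD_PERL, it also includes the old allow bit.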
Also, FYI: utf8n_to_uvuni() has always allowed code points that are
higher than the Unicode maximum of U+10FFFF. The new API will add a
flag to disallow them if desired.
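For reference, the two classes of code point discussed here can be
sketched as follows. This is not Perl's implementation, and the demo_*
helper names are hypothetical. It only encodes the Unicode definitions:
the non-characters are U+FDD0..U+FDEF plus the last two code points of
each of the 17 planes, and anything above U+10FFFF is outside Unicode.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_UNICODE_MAX 0x10FFFFu

/* True for code points beyond the Unicode range. */
bool demo_is_above_unicode(uint32_t cp)
{
    return cp > DEMO_UNICODE_MAX;
}

/* True for the 66 Unicode non-characters. */
bool demo_is_nonchar(uint32_t cp)
{
    if (cp > DEMO_UNICODE_MAX)
        return false;                   /* not a Unicode code point at all */
    if (cp >= 0xFDD0u && cp <= 0xFDEFu)
        return true;                    /* the contiguous block of 32 */
    return (cp & 0xFFFEu) == 0xFFFEu;   /* U+xxFFFE and U+xxFFFF per plane */
}
```

Under these definitions U+FFFE, U+FDD0 and U+10FFFF are non-characters,
U+FFFD is not, and 0x110000 is above the Unicode maximum.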
So as not to break anything, I am waiting until this patch is applied
and pushed to blead before continuing with the core Perl changes.
Thank you for applying this, or using it as a basis for your own patch,
and for supporting Normalize in general.
Karl Williamson
Subject: | 0001-Normalize.xs-Support-new-utf8n_to_uvuni-API.patch |
From 32655e71829481e8d0a2665015a313e88371bd12 Mon Sep 17 00:00:00 2001
From: Karl Williamson <public@khwilliamson.com>
Date: Sat, 27 Nov 2010 19:55:26 -0700
Subject: [PATCH] Normalize.xs: Support new utf8n_to_uvuni() API
---
cpan/Unicode-Normalize/Normalize.xs | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/cpan/Unicode-Normalize/Normalize.xs b/cpan/Unicode-Normalize/Normalize.xs
index f4bbca7..2115095 100644
--- a/cpan/Unicode-Normalize/Normalize.xs
+++ b/cpan/Unicode-Normalize/Normalize.xs
@@ -20,6 +20,13 @@
#define utf8n_to_uvuni utf8_to_uv
#endif /* utf8n_to_uvuni */
+/* Starting in Perl 5.14, non-character code points are changed from disallow
+ * to allow by default, and the #define name is changed. So this turns or-ing
+ * it into a no-op */
+#ifdef UTF8_DISALLOW_NONCHAR
+#define UTF8_ALLOW_FFFF 0
+#endif
+
/* UTF8_ALLOW_BOM is used before Perl 5.8.0 */
#ifdef UTF8_ALLOW_BOM
#define AllowAnyUTF (UTF8_ALLOW_SURROGATE|UTF8_ALLOW_BOM|UTF8_ALLOW_FFFF)
--
1.5.6.3