Subject: strip out random U+200B zero-width spaces from titles
For some unknown reason, Amazon is inserting U+200B (ZERO WIDTH SPACE) characters into the
titles of a few of my wishlist items. They do not appear in the title on the item's own page;
they are present only in the HTML source of the wishlist page. U+200B is a non-printing
character, so the browser doesn't display it, but it still ends up in the titles returned by
get_list(). This patch strips it from those titles.
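
To illustrate the substitution outside the module, here is a minimal standalone sketch. The helper name strip_zero_width_spaces and the sample title are hypothetical and not part of WWW::Amazon::Wishlist. It assumes the title is already a decoded character string; if it were still raw UTF-8 bytes, \x{200b} would presumably not match until the data had been decoded (for example with Encode::decode).

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; the patch applies the same
# substitution inline to $sTitle inside _extract.
# Assumes $title holds decoded characters, not raw UTF-8 bytes.
sub strip_zero_width_spaces {
    my ($title) = @_;
    $title =~ s/\x{200b}//g;    # remove every U+200B ZERO WIDTH SPACE
    return $title;
}

# Made-up sample title with two embedded zero-width spaces.
my $raw   = "Some\x{200b} Book\x{200b} Title";
my $clean = strip_zero_width_spaces($raw);

# The two strings look identical when displayed but differ in length.
printf "before: %d characters\n", length $raw;
printf "after:  %d characters\n", length $clean;
```

Comparing lengths is a quick way to confirm that an invisible character was present, since the two strings look the same on screen.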
Subject: zero-width-spaces.txt
diff --git a/lib/WWW/Amazon/Wishlist.pm b/lib/WWW/Amazon/Wishlist.pm
index ecbbf5d..e885725 100755
--- a/lib/WWW/Amazon/Wishlist.pm
+++ b/lib/WWW/Amazon/Wishlist.pm
@@ -467,6 +467,8 @@ sub _extract
} # if
DEBUG_HTML && print STDERR " DDD price=$sPrice=\n";
} # else
+ # Strip out zero-width spaces scattered about randomly in item titles
+ $sTitle =~ s/\x{200b}//g;
# Add this item to the result set:
my %hsItem = (
asin => $sASIN,