
This queue is for tickets about the Encoding-FixLatin CPAN distribution.

Report information
The Basics
Id: 122171
Status: rejected
Priority: 0/
Queue: Encoding-FixLatin

People
Owner: Nobody in particular
Requestors: tom [...] thomasrutter.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: UTF8 fixer note
Date: Thu, 22 Jun 2017 20:38:07 +1000
To: bug-Encoding-FixLatin [...] rt.cpan.org
From: Thomas Rutter <tom [...] thomasrutter.com>
Not sure if this is still active, but I thought I'd write a quick note: converting overlong UTF-8 sequences to their equivalent short encoding introduces a potential security flaw in some software. It allows any character to pass through certain filtering/parsing by disguising it in its overlong form, knowing it'll be converted back to the illegal payload later. It would be better to replace them with the Unicode replacement character (U+FFFD). Simply removing them can introduce a similar security flaw: inserting an invalid UTF-8 sequence in the middle of an illegal payload can mask it from filters, with the invalid sequence removed from the middle later.
Cheers,
Thomas
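To make the concern concrete, here is a minimal Python sketch of the bypass being described. It does not use Encoding::FixLatin; `lenient_decode_2byte` is a hypothetical stand-in for any decoder that shortens overlong two-byte sequences, and the payload bytes are illustrative only.

```python
def lenient_decode_2byte(data: bytes) -> str:
    """Hypothetical lenient decoder: accepts overlong 2-byte UTF-8 sequences."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        # Any 2-byte lead (0xC0-0xDF) plus a continuation byte is decoded,
        # including the overlong leads 0xC0 and 0xC1.
        if 0xC0 <= b <= 0xDF and i + 1 < len(data) and 0x80 <= data[i + 1] <= 0xBF:
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            out.append(chr(b))  # everything else is passed through byte-for-byte
            i += 1
    return "".join(out)


# 0xC0 0xBC is an overlong two-byte encoding of "<" (U+003C).
payload = b"\xc0\xbcscript>"

# A byte-level filter looking for "<" sees nothing suspicious.
naive_filter_passes = b"<" not in payload

# A lenient decoder run afterwards restores the character.
fixed = lenient_decode_2byte(payload)

# A strict decoder substitutes U+FFFD instead of restoring "<".
strict = payload.decode("utf-8", errors="replace")
```

The problem only arises when the filter runs before the sequences are shortened; the strict decode shows the replacement-character behaviour suggested in the note.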
Far from being a security flaw, fixing over-long UTF-8 sequences is a deliberate, documented feature of Encoding::FixLatin that can be used to enhance security. If you want over-long sequences dropped, you should perform that filtering before the data goes through FixLatin. If you want to filter out specific characters, you should do that after FixLatin.
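The ordering described in this reply could be sketched roughly as follows, again in Python rather than Perl. The names `contains_overlong` and `sanitize` are hypothetical, and the `fix` argument is a placeholder for whatever encoding-repair step is used (it is not the module's real interface).

```python
def _is_cont(b: int) -> bool:
    """True for a UTF-8 continuation byte (0x80-0xBF)."""
    return 0x80 <= b <= 0xBF


def contains_overlong(data: bytes) -> bool:
    """Detect overlong 2-, 3- and 4-byte UTF-8 sequences in raw bytes."""
    for i, b in enumerate(data):
        rest = data[i + 1:i + 4]
        # 0xC0/0xC1 leads only ever encode code points below U+0080.
        if b in (0xC0, 0xC1) and len(rest) >= 1 and _is_cont(rest[0]):
            return True
        # 0xE0 followed by 0x80-0x9F encodes a code point below U+0800.
        if (b == 0xE0 and len(rest) >= 2
                and 0x80 <= rest[0] <= 0x9F and _is_cont(rest[1])):
            return True
        # 0xF0 followed by 0x80-0x8F encodes a code point below U+10000.
        if (b == 0xF0 and len(rest) >= 3 and 0x80 <= rest[0] <= 0x8F
                and _is_cont(rest[1]) and _is_cont(rest[2])):
            return True
    return False


def sanitize(data: bytes, fix, forbidden: str = "<>") -> str:
    """Reject overlong input first, repair the encoding, then filter characters."""
    if contains_overlong(data):
        raise ValueError("overlong UTF-8 sequence rejected")
    text = fix(data)  # placeholder for the encoding-repair step
    return "".join(ch for ch in text if ch not in forbidden)
```

With this ordering the character filter always sees fully decoded text, so a disguised character cannot reappear after filtering.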