Subject: HTTP::Headers::Util::split_header_words should convert Internet Media Type parameter attribute names to lowercase
According to the HTTP/1.1 specification (RFC 2616):
[[
3.7 Media Types
HTTP uses Internet Media Types in the Content-Type (section
14.17) and Accept (section 14.1) header fields in order to provide
open and extensible data typing and type negotiation.
media-type = type "/" subtype *( ";" parameter )
type = token
subtype = token
Parameters MAY follow the type/subtype in the form of attribute/value
pairs (as defined in section 3.6).
The type, subtype, and parameter attribute names are case-
insensitive. Parameter values might or might not be case-sensitive,
depending on the semantics of the parameter name. (...)
]]
Emphasis on _parameter attribute names are case-insensitive_.
This is problematic when using HTTP::Headers::Util::split_header_words
on such constructs as:
Content-Type: text/html; CHARSET=ISO-8859-1
which, per the specification, is equivalent to
Content-Type: text/html; charset=ISO-8859-1
If one runs
@values = split_header_words($h->header("Content-Type"));
%data = @{ $values[0] };
then $data{charset} does not exist in the first case, but is defined and
equal to 'ISO-8859-1' in the second.
If the spec considers all parameter attribute names to be case-insensitive,
split_header_words should systematically lc() the parameter name. If not,
maybe split_header_words needs an option to lc() on demand?
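
In the meantime, a caller can fold the names itself. The following is only a minimal sketch of that workaround, not a patch to the module: lc_param_names is a hypothetical helper name, and the literal array reference stands in for one element of the list that split_header_words is documented to return (an array reference of key/value pairs).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTTP::Headers::Util): takes one array
# reference of key/value pairs and returns a hash reference whose keys
# are folded to lowercase. Values are left untouched, since parameter
# values may be case-sensitive.
sub lc_param_names {
    my ($pairs) = @_;
    my %data;
    for (my $i = 0; $i < $#$pairs; $i += 2) {
        $data{ lc($pairs->[$i]) } = $pairs->[$i + 1];
    }
    return \%data;
}

# Literal stand-in for the parsed form of
#   Content-Type: text/html; CHARSET=ISO-8859-1
my $parsed = [ 'text/html' => undef, 'CHARSET' => 'ISO-8859-1' ];
my $data   = lc_param_names($parsed);
print "$data->{charset}\n";    # prints ISO-8859-1
```

With the real module, the same helper would presumably be applied to each array reference returned by split_header_words($h->header("Content-Type")). Note that this also lowercases the type/subtype token, which the quoted section says is case-insensitive as well.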
This issue is being discussed in:
http://lists.w3.org/Archives/Public/www-validator/2007Aug/0067.html
and surrounding thread.
Many thanks,
--
olivier