> Thanks for the report.
>
> I agree the current situation is not ideal. What do you propose as a
> fix? The best solution would take into account several things:
>
> - The historical behavior, where people are loose with the case of their
> headers.
> - The best practices you point out, which generally uppercase just the
> first letter after a hyphen, but have some exceptions, like "SubOK",
> "DAV", "C-PEP", and "SoapAction"
> - The existing documentation for this behavior, which reads as follows:
>
> "Any other named parameters will be stripped of their initial hyphens
> and turned into header fields, allowing you to specify any HTTP header
> you desire. Internal underscores will be turned into hyphens"
>
> The documentation is helpfully vague on the implementation here,
> allowing us to change the implementation while still remaining true to
> the documentation.
>
> Thanks for your help with this!
There are 133 total headers in the spec (116 permanent and 17
provisional).
The options are:
(1) Essentially a dictionary of regexps mapping to canonical forms. It
would be undesirable to check as many as 133 regexps for each header.
The number could be reduced by batching things like the Accept-* headers
together, but that still seems too burdensome.
(2) Push the task to a specialized module like HTTP::Headers. This
solves the problem for some headers, but FAILs on others and does not
implement our requirement regarding leading hyphens.
Example:
perl -we 'use HTTP::Headers; my $h
=HTTP::Headers->new("Content-Type"=>"text/html; charset=UTF-8",
"Content-style-type"=>"css", "c-pep"=>"bar", -something_new=>"foo");
print $h->as_string, "\n";'
Outputs:
Content-Type: text/html; charset=UTF-8
-Something-New: foo
C-Pep: bar
Content-Style-Type: css
We would want C-PEP and Something-New in the output.
(3) Screen special cases and push the rest to HTTP::Headers. This seems
like a waste since we would be doing much of the processing ourselves.
(4) Screen special cases and pass the rest for rule-bound
transformation. I think this makes the most sense.
29 of 133 headers have uppercase letters that appear "out of place",
i.e. not following a hyphen.
A-IM
C-PEP
C-PEP-Info
Content-ID
Content-MD5
DAV
Differential-ID
ETag
GetProfile
IM
MIME-Version
P3P
PEP
PICS-Label
ProfileObject
SetProfile
SoapAction
Status-URI
TCN
TE
URI
WWW-Authenticate
Message-ID
SubOK
UA-Color
UA-Media
UA-Pixels
UA-Resolution
UA-Windowpixels
We can catch 11 of those 29 w/ 3 regexps like:
s/^UA-/UA-/i;
s/\bID$/ID/i;
s/\bPEP\b/PEP/i;
Or a combined 23 of 29 with:
s/\b(UA|ID|IM|PEP|P3P|WWW|URI|MD5|DAV|PICS|MIME|TE|TCN)\b/\U$1/i;
That leaves 6. Note that the regexp could be tuned a bit for performance,
essentially by branching with stuff like ...|P(ICS|[E3]P)|... The matching
part of the regexp is also eligible for the compile-once flag.
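A minimal sketch of the combined substitution, assuming the intent is to uppercase the whole matched token. The pattern is precompiled with qr// so it is built only once; the helper name upcase_acronyms is hypothetical. It expects underscores to have been turned into hyphens already, since \b does not see a boundary next to "_".

```perl
use strict;
use warnings;

# Tokens that should be fully uppercased when they appear as a whole
# hyphen-delimited word. Compiled once and reused on every call.
my $ACRONYM = qr/\b(UA|ID|IM|PEP|P3P|WWW|URI|MD5|DAV|PICS|MIME|TE|TCN)\b/i;

# Hypothetical helper, for illustration only.
sub upcase_acronyms {
    my ($name) = @_;
    $name =~ s/$ACRONYM/\U$1/g;   # \U uppercases the entire captured token
    return $name;
}

print upcase_acronyms($_), "\n"
    for qw(ua-color content-md5 www-authenticate c-pep-info mime-version te);
```

Because of the \b anchors, the "te" inside "content" or at the end of "authenticate" is left alone; only standalone tokens are touched.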
Those six headers, I think, we will just need to check for explicitly:
ETag
GetProfile
ProfileObject
SetProfile
SoapAction
SubOK
Then everything else can follow the ucfirst after hyphen rule.
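To make option (4) concrete, here is a rough sketch of the whole transformation under the assumptions above. It is not a patch: the function name canonical_header is hypothetical, and it lowercases the input first so that the historical loose casing is tolerated.

```perl
use strict;
use warnings;

# Headers whose capitalization follows no rule, keyed by lowercase name.
my %SPECIAL = map { lc($_) => $_ }
    qw(ETag GetProfile ProfileObject SetProfile SoapAction SubOK);

my $ACRONYM = qr/\b(UA|ID|IM|PEP|P3P|WWW|URI|MD5|DAV|PICS|MIME|TE|TCN)\b/i;

# Hypothetical function name; not part of any existing module.
sub canonical_header {
    my ($name) = @_;
    $name =~ s/^-+//;     # strip initial hyphens, per the documentation
    $name =~ tr/_/-/;     # internal underscores become hyphens
    $name = lc $name;     # normalize whatever case the caller used

    return $SPECIAL{$name} if exists $SPECIAL{$name};   # explicit cases

    $name =~ s/(^|-)([a-z])/$1\u$2/g;   # ucfirst at start and after hyphens
    $name =~ s/$ACRONYM/\U$1/g;         # whole-token acronyms
    return $name;
}

print canonical_header($_), "\n"
    for qw(-something_new c-pep Content-style-type soapaction ua_windowpixels);
```

One side effect worth deciding on: headers outside the spec that contain one of the acronym tokens get the same treatment, so "x-request-id" would come out as "X-Request-ID".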
If that sounds feasible, I can attempt a patch.
--Joe