
This queue is for tickets about the MIME-Types CPAN distribution.

Report information
The Basics
Id: 58467
Status: resolved
Priority: 0/
Queue: MIME-Types

People
Owner: Nobody in particular
Requestors: steve [...] deefs.net
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 1.29
Fixed in: (no value)



Subject: Reading from DATA isn't working with mod_perl
Using Apache2 (prefork), MIME::Types 1.27 works properly, but version 1.29 isn't consistently loading the entire list of MIME types. When I have several children loading the module and doing lookups more or less simultaneously, some of the children only get part of the list.

It should be possible to reproduce this by putting the following code in a module that gets read as part of every request:

    my $class = MIME::Types->new();
    my $mime = $class->mimeTypeOf($some_filename_here);
    my $type;
    if (defined $mime) {
        $type = $mime->type();
        print STDERR "Found $type (" . scalar($class->types()) . " total)\n";
    } else {
        print STDERR "Not Found (" . scalar($class->types()) . " total)\n";
    }

When I try it, I'm getting a list of 727 types some of the time, but not always. Other times, it's only returning between 250 and 500 types.

The attached patch file reverts just the change from $mime_type_definitions to __DATA__, which fixes the problem.
Subject: mime-types.bug.patch
--- Types.pm.orig 2010-06-16 17:35:57.871783255 -0400
+++ Types.pm.new 2010-06-16 17:39:18.315780873 -0400
@@ -17,6 +17,8 @@
 my %list;
 sub new(@) { (bless {}, shift)->init( {@_} ) }
 
+my $mime_type_definitions; # see bottom file
+
 sub init($)
 {   my ($self, $args) = @_;
 
@@ -24,7 +26,7 @@
     {   local $_;
         local $/ = "\n";
 
-        while(<DATA>)
+        foreach (split /^/, $mime_type_definitions)
         {   chomp;
             next if !length $_ or substr($_, 0, 1) eq '#';
 
@@ -194,14 +196,12 @@
 CROAK
 }
 
-1;
-
 #-------------------------------------------
 # Internet media type registry is at
 # http://www.iana.org/assignments/media-types/
 # Another list can be found at: http://ftyps.com
 
-__DATA__
+$mime_type_definitions = <<__MIMETYPES__;
 application/activemessage
 application/andrew-inset;ez
 application/annodex;anx
@@ -1200,3 +1200,7 @@
 
 # IE6 bug
 image/pjpeg;;base64
+
+__MIMETYPES__
+
+1;
Subject: Re: [rt.cpan.org #58467] Reading from DATA isn't working with mod_perl
Date: Thu, 17 Jun 2010 00:36:36 +0200
To: Steve Simms via RT <bug-MIME-Types [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
* Steve Simms via RT (bug-MIME-Types@rt.cpan.org) [100616 22:20]:
> Wed Jun 16 18:20:18 2010: Request 58467 was acted upon.
> Transaction: Ticket created by SSIMMS
> Queue: MIME-Types
> Subject: Reading from DATA isn't working with mod_perl
> Broken in: 1.29
>
> Using Apache2 (prefork), MIME::Types 1.27 works properly, but version
> 1.29 isn't consistently loading the entire list of MIME types. When I
> have several children loading the module and doing lookups more or less
> simultaneously, some of the children only get part of the list.

It is unhealthy to read the whole list in each child: better read it before the fork.
--
Regards,
MarkOv

------------------------------------------------------------------------
Mark Overmeer MSc                   MARKOV Solutions
Mark@Overmeer.net                   solutions@overmeer.net
http://Mark.Overmeer.net            http://solutions.overmeer.net
On Wed Jun 16 18:36:50 2010, solutions@overmeer.net wrote:
> It is unhealthy to read the whole list in each child: better read it
> before the fork.

That works, thanks. It's definitely a "gotcha" though.

Another solution that seems to work is to add the following to MIME/Types.pm:

    __PACKAGE__->init();

It may just be playing with the timing, but it's working consistently on my server, without needing to add Perl code to the Apache config.

Would it be possible to add that to the distribution, and/or could this issue be noted in the documentation?
Subject: Re: [rt.cpan.org #58467] Reading from DATA isn't working with mod_perl
Date: Thu, 17 Jun 2010 09:21:43 +0200
To: Steve Simms via RT <bug-MIME-Types [...] rt.cpan.org>
From: Mark Overmeer <mark [...] overmeer.net>
* Steve Simms via RT (bug-MIME-Types@rt.cpan.org) [100617 01:17]:
> Queue: MIME-Types
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=58467 >
>
> On Wed Jun 16 18:36:50 2010, solutions@overmeer.net wrote:
> > It is unhealthy to read the whole list in each child: better read it
> > before the fork.
>
> That works, thanks. It's definitely a "gotcha" though.

It's a gotcha from mod_perl, not from my module. Many modules will show initialization problems when run under mod_perl.

> Another solution that seems to work is to add the following to
> MIME/Types.pm:
>     __PACKAGE__->init();

Well, the <DATA> list is only read once; only with the first call to MIME::Types->new. new() calls init(), in some OO fashion.

    my $mimetypes = MIME::Types->new;

is by far the best way to do it.

> Would it be possible to add that to the distribution, and/or could this
> issue be noted in the documentation?

Since this problem is 100% caused by (mis)behavior of mod_perl, nearly any other Perl module could add such a warning. All mod_perl books have a chapter on these issues. (Creating your own webserver in pure perl, for instance with HTTP::Daemon, is very simple)
--
Regards,
MarkOv
not a Mime::Types problem
From: tlhackque [...] yahoo.com
On Fri Jul 02 05:14:45 2010, MARKOV wrote:
> not a Mime::Types problem

Having just been bitten by this, I respectfully disagree.

Yes, MOD_PERL does impose some restrictions - but being compatible would make MIME::Types more useful. MOD_PERL is a widely-used accelerator for web applications, and can produce 100x performance increases for such applications. I don't think that MIME::Types should be a barrier to its use.

The problem is that on each request, MIME::Types tries to read from DATA to rebuild the index - but after the first request serviced by a given httpd process, DATA is at EOF. Of course, this means that no types are defined.

Ideally, one would make the indices persist - which I thought ought to be as simple as changing the my's to our's. I'm missing something, because that didn't work. (My second day with MOD_PERL, so I'm no expert.)

As a workaround, I extracted the __DATA__ section to MIME::Types.pm.data & made the following changes, which do allow it to function.

A permanent solution - preferably a better one from a MOD_PERL expert - would be appreciated. But even this patch allows my application to run at mod_perl speeds.

--- /usr/lib/perl5/site_perl/5.8.8/MIME/Types.pm~ 2010-06-03 06:00:43.000000000 -0400
+++ /usr/lib/perl5/site_perl/5.8.8/MIME/Types.pm 2010-08-25 12:48:40.000000000 -0400
@@ -22,11 +22,17 @@
     unless(keys %list) # already read
     {   local $_;
         local $/ = "\n";
 
-        while(<DATA>)
+        # MOD_PERL doesn't support <DATA>; copy it out to a .data file.
+        # In particular, it won't re-seek it to the right place for each request.
+        # Install should extract it to a .data file.
+
+        my $f = __FILE__ . '.data';
+        open( my $fh, '<', $f ) or die "Can't open data file $f: $!";
+        while( <$fh> )
         {   chomp;
             next if !length $_ or substr($_, 0, 1) eq '#';
 
             my $os = s/^(\w+)\:// ? qr/$1/i : undef;
             my ($type, $extensions, $encoding) = split /\;/;
 
@@ -40,11 +46,11 @@
               , extensions => $extent
               , encoding => $encoding
               , system => $os
               );
         }
-        close DATA;
+        close $fh;
     }
 
     $self;
 }
Subject: Re: [rt.cpan.org #58467] Reading from DATA isn't working with mod_perl
Date: Wed, 25 Aug 2010 21:33:04 +0200
To: via RT <bug-MIME-Types [...] rt.cpan.org>
From: Mark Overmeer <mark [...] overmeer.net>
* via RT (bug-MIME-Types@rt.cpan.org) [100825 17:39]:
> Queue: MIME-Types
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=58467 >
>
> On Fri Jul 02 05:14:45 2010, MARKOV wrote:
> > not a Mime::Types problem
>
> Having just been bitten by this, I respectfully disagree.
> ...
> The problem is that on each request, MIME::Types tries to read from
> DATA to rebuild the index - but after the first request serviced by a
> given httpd process, DATA is at EOF. Of course, this means that no
> types are defined.

I know. Modperl does reuse a thread for multiple requests. To do so, it cleans out the variables but does not restore file-handles. That's a very nasty hack.

> Ideally, one would make the indices persist - which I thought ought to
> be as simple as changing the my's to our's. I'm missing something,
> because that didn't work. (My second day with MOD_PERL, so I'm no
> expert.)

If you load MIME::Types before mod-perl/apache starts forking clients, then all clients will see the table and you do not need tricks. If I remember well, you should call MIME::Types once within a BEGIN block or some mod_perl init scripts... that will be much faster as well, because the table processing only happens once.

If you use mod_perl, you will suffer the consequences. Each module you use will need different things to get them to work under mod_perl. Writing a pure perl webserver without mod_perl or apache is very simple. That's what I usually do, for instance based on HTTP::Daemon.
--
Regards,
MarkOv
On Wed 25 Aug 2010 15:33:18, Mark@Overmeer.net wrote:
+1 to fixing this for mod_perl (and no, haven't really encountered similar problems with modern modules).

Shouldn't the list of mimetypes be put in a constant at "compile"-time anyway, or are there some really good points in favour of reading it at instantiation?

Anyway, we solved it by having this in startup.pl:

    BEGIN {
        use MIME::Types;
        my $mimetypes = MIME::Types->new();
    }

Thanks for your module, nonetheless! :-)
Subject: Re: [rt.cpan.org #58467] Reading from DATA isn't working with mod_perl
Date: Thu, 21 Oct 2010 12:38:17 +0200
To: Nicolas Mendoza via RT <bug-MIME-Types [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
* Nicolas Mendoza via RT (bug-MIME-Types@rt.cpan.org) [101020 15:12]:
> Queue: MIME-Types
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=58467 >
>
> +1 to fixing this for mod_perl,

Fix mod_perl: don't try to fix the modules which are not broken.

> (and no, haven't really encountered similar problems with modern modules).

You're lucky

> Anyway, we solved it by having this in startup.pl:
> BEGIN {
>     use MIME::Types;
>     my $mimetypes = MIME::Types->new();
> }

Next version of the module will contain this in the documentation:

    =section MIME::Types and mod_perl

    This module uses a DATA handle to read all the types at first
    instantiation. That doesn't work well with the module abuse by
    mod_perl, which reuses compiled Perl instances by simply clearing
    the variables.

    When you use this module with mod_perl, add this to C<startup.pl>

        use MIME::Types;
        BEGIN { MIME::Types->new() }

    Now, the type definitions will get parsed before the processes
    are spawned.

In older versions, the whole table was in a single scalar. The table has recently exploded in size. Perl contains multiple copies of scalars, for instance to be able to produce in-context error messages. So, the whole table was copied at least three times. With the DATA section, only the processed version is kept in memory.
--
MarkOv
not a bug. Typical mod_perl usage problem
On Thu 21 Oct 2010 06:38:32, solutions@overmeer.net wrote:
[...]
> Next version of the module will contain this in the documentation:

Please do include something in the doc as you promised. We just encountered the same problem in our project and wasted some time understanding what was wrong. You may dislike mod_perl, but nevertheless there are quite a lot of mod_perl users, so if you can make their life better, that's worth it, isn't it?
From: demerphq [...] gmail.com
On Tue Nov 16 03:21:24 2010, MARKOV wrote:
> not a bug. Typical mod_perl usage problem

I don't agree. Earlier versions of MIME::Types did not have these problems. Upgrading to use this code breaks things. Therefore the code DOES have a bug, in that it assumes that things always work a specific way when they don't.

Revert the code which moves this data to the DATA block, or check if DATA is *closed* when reading the DATA handle and this stops happening.

Yes, you are right, the BEST solution is to use the workaround and ensure that the module is loaded pre-fork, but you could easily A) add a check to see if you are running under mod-perl, and then a) warn if the data section is reloaded, and b) ensure that the code works properly, even if it is efficient.

However that means forcing all your mod-perl users to change their code, when in fact this is triggered by you changing MIME::Types. That isn't right.

cheers,
Yves
Subject: Re: [rt.cpan.org #58467] Reading from DATA isn't working with mod_perl
Date: Mon, 16 May 2011 13:23:46 +0200
To: demerphq via RT <bug-MIME-Types [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
* demerphq via RT (bug-MIME-Types@rt.cpan.org) [110516 11:10]:
> Queue: MIME-Types
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=58467 >
>
> On Tue Nov 16 03:21:24 2010, MARKOV wrote:
> > not a bug. Typical mod_perl usage problem
>
> I don't agree. Earlier versions of MIME::Types did not have these problems.
> Upgrading to use this code breaks things.

True that the previous version worked differently. On the other hand, I did not change the interface of the module. What you experience is a nasty consequence of using mod-perl.

> Yes, you are right, the BEST solution is to use the workaround and
> ensure that the module is loaded pre-fork, but you could easily A) add
> a check to see if you are running under mod-perl, and then a) warn if
> the data section is reloaded, and

I do agree: we should have (A) and (a).
Actually, (a) is already supported: you can instantiate as many Mime::Types objects as you want, the table is only read once.

You may be able to contribute code for (A). Creating Mime::Type objects before the fork is far more efficient than later.

> b) ensure that the code works properly, even if it is efficient.

The code is working properly: it works according to the manual-page. mod_perl is perl+nasty_tricks.

> However that means forcing all your mod-perl users to
> change their code, when in fact this is triggered by you changing
> MIME::Types. That isn't right.

Other people complained that the list of MIME::Types grew so large that it consumed too much memory and start-up time. Not everyone is using mod-perl; there are people who run other kinds of applications. Really!
--
Regards,
MarkOv
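
For readers following point (A), here is a minimal sketch of how a check for mod_perl could look. It relies on the MOD_PERL environment variable, which mod_perl sets inside its embedded interpreter. The function name running_under_mod_perl is hypothetical and not part of MIME::Types. Note that such a check cannot tell whether the code runs before or after the fork, so it only covers the detection half of the suggestion.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of MIME::Types: true when the code runs
# inside a mod_perl interpreter, which sets the MOD_PERL environment variable.
sub running_under_mod_perl {
    return exists $ENV{MOD_PERL} ? 1 : 0;
}

print running_under_mod_perl()
    ? "running under mod_perl\n"
    : "not running under mod_perl\n";
```

A module could use a check of this kind to decide whether to emit a one-time hint about preloading in startup.pl; whether that is desirable is exactly what the thread debates.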
From: demerphq [...] gmail.com
On Mon May 16 07:24:06 2011, solutions@overmeer.net wrote:
> * demerphq via RT (bug-MIME-Types@rt.cpan.org) [110516 11:10]:
> > Queue: MIME-Types
> > Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=58467 >
> >
> > On Tue Nov 16 03:21:24 2010, MARKOV wrote:
> > > not a bug. Typical mod_perl usage problem
> >
> > I don't agree. Earlier versions of MIME::Types did not have these problems.
> > Upgrading to use this code breaks things.
>
> True that the previous version worked differently. On the other hand, I
> did not change the interface of the module. What you experience is a
> nasty consequence of using mod-perl.

Combined with a change to the code. :-)

> > Yes, you are right, the BEST solution is to use the workaround and
> > ensure that the module is loaded pre-fork, but you could easily A) add
> > a check to see if you are running under mod-perl, and then a) warn if
> > the data section is reloaded, and
>
> I do agree: we should have (A) and (a).
> Actually, (a) is already supported: you can instantiate as many
> Mime::Types objects as you want, the table is only read once.
>
> You may be able to contribute code for (A). Creating Mime::Type
> objects before the fork is far more efficient than later.

If you want a patch I can arrange that no problem.

> > b) ensure that the code works properly, even if it is efficient.
>
> The code is working properly: it works according to the manual-page.
> mod_perl is perl+nasty_tricks.

I don't think this is relevant. mod_perl is part of the perl universe, and if the only thing that changes in a system is this module and things start breaking then I think it is fair to hold the module to account.

> > However that means forcing all your mod-perl users to
> > change their code, when in fact this is triggered by you changing
> > MIME::Types. That isn't right.
>
> Other people complained that the list of MIME::Types grew so large
> that it consumed too much memory and start-up time. Not everyone is
> using mod-perl; there are people who run other kinds of applications.
> Really!

This doesn't make sense. You are saying that breaking your mod_perl user base is an acceptable tradeoff for reducing memory footprint. Forcing all your mod_perl users to change their code so that other users can have a reduced memory footprint does not seem like a very neighborly solution. Why are they more important than us? Why is reduced memory footprint more important than broken systems?
Subject: Re: [rt.cpan.org #58467] Reading from DATA isn't working with mod_perl
Date: Tue, 14 Jun 2011 17:14:35 +0200
To: demerphq via RT <bug-MIME-Types [...] rt.cpan.org>
From: Mark Overmeer <mark [...] overmeer.net>
* demerphq via RT (bug-MIME-Types@rt.cpan.org) [110527 14:57]:
> Queue: MIME-Types
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=58467 >
>
> On Mon May 16 07:24:06 2011, solutions@overmeer.net wrote:
> > What you experience is a nasty consequence of using mod-perl.
> Combined with a change to the code. :-)

There is no release without change of code.

>>> Yes, you are right, the BEST solution is to use the workaround and
>>> ensure that the module is loaded pre-fork, but you could easily A)
>>> add check to see if you running under mod-perl
>> You may be able to contribute code for (A). Creating Mime::Type
>> objects before the fork is far more efficient than later.
> If you want a patch I can arrange that no problem.

Yes please.
--
Regards,
MarkOv
On Mon May 16 07:24:06 2011, solutions@overmeer.net wrote:
> mod_perl is perl+nasty_tricks.

mod_perl is, in this regard, working exactly like any other preforking server. Its "nasty tricks" are a red herring; they exist but have nothing to do with this problem. In particular, the stuff mod_perl does with clearing symbol tables (or whatever) is irrelevant -- you can reproduce the problem with a one-liner:

    perl -e 'use MIME::Types; for (1..10) { fork || last; wait } warn "$$: " . MIME::Types::by_suffix("rtf")->[0]'
    809: text/rtf at -e line 1.
    810: at -e line 1.
    811: at -e line 1.
    812: at -e line 1.
    813: at -e line 1.
    814: at -e line 1.
    815: at -e line 1.
    816: at -e line 1.
    818: at -e line 1.
    819: at -e line 1.
    808: at -e line 1.

This bug bit me in a fastcgi daemon, where I followed the incredibly common practice of loading modules in the parent process that may not be used until after forking. Nothing in the documentation indicates that this incredibly common practice won't work with MIME::Types, so I find your insistence that it's "100% caused by (mis)behavior of mod_perl" a little hard to swallow.
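
The behaviour in the one-liner above can be shown without MIME::Types at all. The following is a minimal sketch, assuming a Unix-like system with a real fork(): a temporary file stands in for the DATA section, and the helper lines_seen_per_child is hypothetical. A handle opened before the fork shares a single file offset among all children, so once the first child has read to EOF the later ones find nothing left.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use POSIX ();

# Stand-in for the module's DATA section: a small temporary file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "line $_\n" for 1 .. 5;
close $tmp;

# Hypothetical helper: opens $file once in the parent, forks $children
# processes one after another, and returns how many lines each child
# could read from the inherited handle.
sub lines_seen_per_child {
    my ($file, $children) = @_;
    open my $shared, '<', $file or die "open $file: $!";
    my @counts;
    for (1 .. $children) {
        pipe(my $reader, my $writer) or die "pipe: $!";
        my $pid = fork();
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: read whatever is left on the shared handle.
            close $reader;
            my $n = 0;
            $n++ while <$shared>;
            print {$writer} "$n\n";
            close $writer;
            POSIX::_exit(0);    # leave without running the parent's END blocks
        }
        # Parent: collect the child's line count, then reap it.
        close $writer;
        chomp(my $n = <$reader>);
        close $reader;
        waitpid($pid, 0);
        push @counts, $n;
    }
    close $shared;
    return @counts;
}

my @counts = lines_seen_per_child($path, 3);
print "child $_ read $counts[$_ - 1] lines\n" for 1 .. 3;
```

Under these assumptions the first child reports 5 lines and the following children report 0, which mirrors the single "text/rtf" result followed by empty results in the report.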
Subject: Re: [rt.cpan.org #58467] Reading from DATA isn't working with mod_perl
Date: Thu, 18 Aug 2011 23:33:45 +0200
To: Hans Dieter Pearcey via RT <bug-MIME-Types [...] rt.cpan.org>
From: Mark Overmeer <mark [...] overmeer.net>
* Hans Dieter Pearcey via RT (bug-MIME-Types@rt.cpan.org) [110817 14:45]:
> Queue: MIME-Types
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=58467 >
>
> mod_perl is, in this regard, working exactly like any other preforking
> server. Its "nasty tricks" are a red herring; they exist but have
> nothing to do with this problem.

Multiprocessing is not supported by Perl, although you may use fork() if you understand what you are doing. One of the things is that it always causes problems with programs using DATA handles, END, DESTROY, %SIG, AUTOLOAD, etc.

On the other hand, adding a simple seek to the code will make it work for people who use fork or mod_perl without reading the man-pages, although slowly, because each process parses the huge table again. Just released as 1.32
--
Regards,
MarkOv
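
As an illustration of the "simple seek" idea, here is a minimal sketch; it is not necessarily how release 1.32 implements it. An in-memory file stands in for the DATA handle, and the helper read_table and the variable $table_start are hypothetical. The point is to remember the offset where the table begins (a module would record tell(DATA) when it is loaded) and to seek back to it before every read.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Hypothetical helper: read the type table from $fh, always starting at
# byte offset $start, skipping empty lines and '#' comments.
sub read_table {
    my ($fh, $start) = @_;
    seek($fh, $start, SEEK_SET) or die "seek: $!";
    my @lines;
    while (my $line = <$fh>) {
        chomp $line;
        next if !length $line or substr($line, 0, 1) eq '#';
        push @lines, $line;
    }
    return @lines;
}

# Stand-in for the DATA handle: the first line plays the role of the module
# code that precedes __DATA__ in the file.
my $content = "code before the table\ntext/rtf;rtf\n# comment\nimage/png;png\n";
open my $fh, '<', \$content or die "open: $!";
my $header      = <$fh>;       # consume the "code" part
my $table_start = tell($fh);   # a module would record tell(DATA) at load time

my @first  = read_table($fh, $table_start);
my @second = read_table($fh, $table_start);   # handle was at EOF; seek rewinds it

print scalar(@first), " entries on first read, ", scalar(@second), " on second\n";
```

This covers the case where a process finds the handle already at EOF. Processes that share one file offset and read at the same moment can still interfere with each other, so preloading before the fork remains the more robust approach.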