
This queue is for tickets about the Cache-Cache CPAN distribution.

Report information
The Basics
Id: 66329
Status: rejected
Priority: 0/
Queue: Cache-Cache

People
Owner: Nobody in particular
Requestors: makk384 [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 1.05
Fixed in: (no value)



Subject: scaling of Cache::SizeAwareFileCache
Cache::SizeAwareFileCache doesn't scale well when doing sets on even a middling-sized cache. It does a full scan of the cache on each set, presumably to ensure that it doesn't go over the given size.

As an example, a set of 103 items on a cache with ~10K records resulted in 179K checks of the cache to see if it was over size.

The offending lines are actually in Cache::CacheSizer, in the limit_size method. Here are the NYTProf stats on that method:

    103  1.09ms  309  189s
        _Limit_Size( $self->_get_cache( ),
            # spent 189s making 103 calls to Cache::CacheSizer::_build_cache_meta_data, avg 1.83s/call
            # spent 2.28ms making 103 calls to Cache::CacheSizer::_Limit_Size, avg 22µs/call
            # spent 176µs making 103 calls to Cache::CacheSizer::_get_cache, avg 2µs/call
                     $self->_build_cache_meta_data( ),
                     $p_new_size );
    }

Within _Limit_Size, this balloons to a *huge* number of checks on the cache, as per:

    103  15.8ms  206  77.8s
        foreach my $key ( $self->_get_cache( )->get_keys( ) )
            # spent 77.8s making 103 calls to Cache::BaseCache::get_keys, avg 756ms/call
            # spent 169µs making 103 calls to Cache::CacheSizer::_get_cache, avg 2µs/call
        {
    179046  570ms  358092  97.3s
            my $object = $self->_get_cache( )->get_object( $key ) or
            # spent 96.9s making 179046 calls to Cache::BaseCache::get_object, avg 541µs/call
            # spent 322ms making 179046 calls to Cache::CacheSizer::_get_cache, avg 2µs/call
              next;

    179046  624ms  179046  12.2s
            $cache_meta_data->insert( $object );
            # spent 12.2s making 179046 calls to Cache::CacheMetaData::insert, avg 68µs/call
        }

Sorry for the horrible formatting.

As to how to resolve this, a few options come to mind:

- Allow some flexibility in how often size constraints are enforced within a process: only check size details after x sets, or when an item that exceeds threshold y is set.
- Maintain additional metadata on sizing at the top level, which is updated on each write and each deletion (with appropriate locking ;) ).

Take care,
Mark.
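The two options above can be illustrated with a minimal, in-memory sketch. It is not Cache::Cache code and does not touch the file system: the ThrottledSizer package, its check_every parameter, and the oldest-first eviction order are all assumptions made for illustration. It keeps a running size total that is updated on every write and deletion, and it enforces the limit only after every check_every sets, so no full scan of the stored items is needed.

```perl
use strict;
use warnings;

# Hypothetical illustration only; none of these names exist in Cache::Cache.
package ThrottledSizer;

sub new {
    my ( $class, %args ) = @_;
    my $self = {
        max_size    => $args{max_size},            # size limit in bytes
        check_every => $args{check_every} || 1,    # enforce after this many sets
        store       => {},                         # key => value
        order       => [],                         # keys, oldest first
        total_size  => 0,                          # running total of value sizes
        sets_since  => 0,                          # sets since last enforcement
        passes      => 0,                          # enforcement passes so far
    };
    return bless $self, $class;
}

sub set {
    my ( $self, $key, $value ) = @_;
    $self->remove($key) if exists $self->{store}{$key};
    $self->{store}{$key} = $value;
    push @{ $self->{order} }, $key;
    $self->{total_size} += length $value;          # update metadata on write

    # Enforce the limit only every check_every sets.
    if ( ++$self->{sets_since} >= $self->{check_every} ) {
        $self->{sets_since} = 0;
        $self->_limit_size;
    }
    return;
}

sub remove {
    my ( $self, $key ) = @_;
    return unless exists $self->{store}{$key};
    $self->{total_size} -= length( delete $self->{store}{$key} );    # update on delete
    # Linear scan of the key order; acceptable for a sketch.
    @{ $self->{order} } = grep { $_ ne $key } @{ $self->{order} };
    return;
}

# Evict oldest entries until the running total is within the limit.
sub _limit_size {
    my ($self) = @_;
    $self->{passes}++;
    while ( $self->{total_size} > $self->{max_size} && @{ $self->{order} } ) {
        my $oldest = shift @{ $self->{order} };
        $self->{total_size} -= length( delete $self->{store}{$oldest} );
    }
    return;
}

sub total_size { return $_[0]{total_size} }
sub passes     { return $_[0]{passes} }

package main;

my $cache = ThrottledSizer->new( max_size => 1000, check_every => 10 );
$cache->set( "key$_", 'x' x 50 ) for 1 .. 103;
printf "size=%d enforcement passes=%d\n", $cache->total_size, $cache->passes;
```

With these settings, 103 sets trigger 10 enforcement passes rather than 103, at the cost of the cache temporarily exceeding max_size between passes. A file-backed version would also need the locking mentioned above, which this sketch omits.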
You're right, it doesn't scale well. :)

This isn't going to be fixed, since Cache::Cache is no longer under active development. Have you seen CHI, the follow-up to Cache::Cache? It has a size-aware file cache that operates much more reasonably, though it still has to do two writes for each set (one for the set, one to update the size). The basic speed of the File cache is also much improved (this was one of the motivators for creating CHI). See http://search.cpan.org/~jswartz/CHI-0.41/lib/CHI/Benchmarks.pod

On Wed Mar 02 11:01:06 2011, MMORGAN wrote:
> The Cache::SizeAwareFileCache doesn't scale very well, when doing sets
> under even a middling-sized cache. It is doing a full scan of the cache
> upon each set, presumably to ensure that it doesn't go over the given
> size.
>
> As an example, a set of 103 items on a cache with ~10K records resulted
> in 179K checks of the cache to see if over size.
>
> The offending lines are actually in Cache::CacheSizer, in the limit_size
> method. Here's the NYTProf stats on that method:
>
>     103  1.09ms  309  189s
>         _Limit_Size( $self->_get_cache( ),
>             # spent 189s making 103 calls to Cache::CacheSizer::_build_cache_meta_data, avg 1.83s/call
>             # spent 2.28ms making 103 calls to Cache::CacheSizer::_Limit_Size, avg 22µs/call
>             # spent 176µs making 103 calls to Cache::CacheSizer::_get_cache, avg 2µs/call
>                      $self->_build_cache_meta_data( ),
>                      $p_new_size );
>     }
>
> Within _Limit_Size, this balloons to a *huge* number of checks on the
> cache, as per:
>
>     103  15.8ms  206  77.8s
>         foreach my $key ( $self->_get_cache( )->get_keys( ) )
>             # spent 77.8s making 103 calls to Cache::BaseCache::get_keys, avg 756ms/call
>             # spent 169µs making 103 calls to Cache::CacheSizer::_get_cache, avg 2µs/call
>         {
>     179046  570ms  358092  97.3s
>             my $object = $self->_get_cache( )->get_object( $key ) or
>             # spent 96.9s making 179046 calls to Cache::BaseCache::get_object, avg 541µs/call
>             # spent 322ms making 179046 calls to Cache::CacheSizer::_get_cache, avg 2µs/call
>               next;
>
>     179046  624ms  179046  12.2s
>             $cache_meta_data->insert( $object );
>             # spent 12.2s making 179046 calls to Cache::CacheMetaData::insert, avg 68µs/call
>         }
>
> Sorry for the horrible formatting.
>
> As to how to resolve this, a few options come to mind:
>
> - having some flexibility on how often size constraints are enforced
> within a process. Only check size details after x sets or when an item
> that exceeds threshold y is set.
> - maintain additional metadata on sizing at the top-level, which is
> updated on each write and each deletion (with appropriate locking ;) ).
>
> Take care,
> Mark.