
This queue is for tickets about the Cache-Memcached CPAN distribution.

Report information
The Basics
Id: 35611
Status: open
Priority: 0/
Queue: Cache-Memcached

People
Owner: Nobody in particular
Requestors: KAPPA [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Wishlist
Broken in: 1.24
Fixed in: (no value)



Subject: [patch] Do not try to store in cache too big values
memcached does not accept values bigger than 1 MB. Moreover, caching big values is usually not very useful, since they are probably rare.

We here at Rambler have an option "too_big_threshold" in Cache::Memcached which sets the size in bytes after which no storing is even attempted.

Such logic cannot be elegantly implemented on the application side because serialization is done inside Cache::Memcached.

I attach my Perl patch against BRADFITZ's latest Cache::Memcached; it could be useful.
Subject: toobig-2.patch
--- Memcached.pm.orig 2007-07-17 22:40:23.000000000 +0400
+++ Memcached.pm 2008-05-03 04:09:26.000000000 +0400
@@ -25,6 +25,7 @@
     bucketcount _single_sock _stime
     connect_timeout cb_connect_fail
     parser_class
+    too_big_threshold
 };
 
 # flag definitions
@@ -76,6 +77,7 @@
     $self->{'stat_callback'} = $args->{'stat_callback'} || undef;
     $self->{'readonly'} = $args->{'readonly'};
     $self->{'parser_class'} = $args->{'parser_class'} || $parser_class;
+    $self->{'too_big_threshold'} = $args->{'too_big_threshold'};
 
     # TODO: undocumented
     $self->{'connect_timeout'} = $args->{'connect_timeout'} || 0.25;
@@ -474,6 +476,10 @@
         }
     }
 
+    if ($self->{'too_big_threshold'} && $len >= $self->{'too_big_threshold'}) {
+        return 1; # positive NOP
+    }
+
     $exptime = int($exptime || 0);
 
     local $SIG{'PIPE'} = "IGNORE" unless $FLAG_NOSIGNAL;
@@ -959,6 +965,10 @@
 Values larger than this threshold will be compressed by C<set> and
 decompressed by C<get>.
 
+Use C<too_big_threshold> to set the upper limit on size of cached data
+items. Values larger than this threshold won't be sent over the
+network. It's no use to send >1MB values anyway, they are not stored.
+
 Use C<no_rehash> to disable finding a new memcached server when one
 goes down. Your application may or may not need this, depending on
 your expirations and key usage.
@@ -1002,6 +1012,10 @@
 Sets the compression threshold. See C<new> constructor for more
 information.
 
+=item C<set_too_big_threshold>
+
+Sets the overall size threshold. See C<new> constructor for more information.
+
 =item C<enable_compress>
 
 Temporarily enable or disable compression. Has no effect if C<compress_threshold>
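The guard added in the third hunk can be illustrated in isolation. The following is only a rough sketch: TinyClient is a hypothetical stand-in, not Cache::Memcached itself, and it records which keys would have been sent instead of talking to a server. It mirrors the patch's condition and its "positive NOP" return value, which means the caller sees the same true result whether the value was sent or silently skipped.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the client (an assumption for illustration only).
# It records the keys that would be sent rather than contacting a server.
package TinyClient;

sub new {
    my ($class, %args) = @_;
    return bless {
        too_big_threshold => $args{too_big_threshold},
        sent              => [],
    }, $class;
}

sub set {
    my ($self, $key, $val) = @_;
    my $len = length $val;

    # Same condition as the guard in the patch.
    if ($self->{too_big_threshold} && $len >= $self->{too_big_threshold}) {
        return 1;    # positive NOP: nothing goes over the network
    }

    push @{ $self->{sent} }, $key;
    return 1;
}

package main;

my $client   = TinyClient->new(too_big_threshold => 1024);
my $small_ok = $client->set(small => 'x' x 100);
my $big_ok   = $client->set(big   => 'x' x 4096);
my @sent     = @{ $client->{sent} };

print "small returned $small_ok, big returned $big_ok, sent: @sent\n";
```

Both calls return true here, but only the small value is recorded as sent.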
On Sat May 03 07:45:59 2008, KAPPA wrote:
> memcached does not accept values bigger than 1mb. Moreover, it's
> commonly not very useful to cache big values -- they are probably rare.
>
> We here at Rambler have an option "too_big_threshold" in
> Cache::Memcached which sets the size in bytes after which no storing is
> even attempted.
>
> Such logic cannot be elegantly implemented on the application side
> because serialization is done inside Cache::Memcached.
The value of "too_big_threshold" is really a server-side parameter. You can set this by recompiling (on older versions) or setting the -I flag (for newer versions) - why would the client need to be told about the maximum size? If anything, it should be obtained from issuing a "stats settings" command upon connection, and looking at "item_size_max", I think.

Moreover, why isn't it sufficient to check the return value of the set method? If your item wasn't cached, you'll get a false return value, at which point your application can handle or ignore this as required.
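As a rough sketch of the "stats settings" idea above: the memcached text protocol answers a stats command with lines of the form "STAT <name> <value>" followed by "END". The parser below is a hypothetical helper (parse_stats is not part of Cache::Memcached), and the reply string is a made-up sample rather than output captured from a real server.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a text-protocol stats reply into a hash ref.
sub parse_stats {
    my ($reply) = @_;
    my %stats;
    for my $line (split /\r?\n/, $reply) {
        last if $line eq 'END';
        $stats{$1} = $2 if $line =~ /^STAT (\S+) (.*)$/;
    }
    return \%stats;
}

# Illustrative reply only; a real client would read this from the socket
# after sending "stats settings\r\n".
my $sample = join "\r\n",
    'STAT maxbytes 67108864',
    'STAT item_size_max 1048576',
    'END',
    '';

my $settings = parse_stats($sample);
my $max_item = $settings->{item_size_max};

print "server-side item limit: $max_item bytes\n";
```

A client that wanted to discover the limit automatically could run something like this once per connection and use the result as its size threshold.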
On Sat Dec 01 17:06:32 2012, DOHERTY wrote:
> The value of "too_big_threshold" is really a server-side parameter. You
> can set this by recompiling (on older versions) or setting the -I flag
> (for newer versions) - why would the client need to be told about the
> maximum size? If anything, it should be obtained from issuing a "stats
> settings" command upon connection, and looking at "item_size_max", I
> think.
>
> Moreover, why isn't it sufficient to check the return value of the set
> method? If your item wasn't cached, you'll get a false return value, at
> which point your application can handle or ignore this as required.
I know this is an old bug, but this is a good feature.

The reason it is insufficient to simply let set() fail for large values is that it is a huge waste of time and resources. For example:

* assume you have some 5 MB data structure
* assume it's normally not quite that big
* assume you want to cache it
* you cannot check in the frontend code, because 5 MB may compress down to < 1 MB and cache just fine... it is likely to compress down to around 500 KB or less in most cases.
* it's sent through Cache::Memcached... the data is too random to compress well, and ends up being 4 MB.
* now that 4 MB of data will be sent across the wire to the memcached server, which will only reject it once the whole message is received.

The network part is the bottleneck for most operations, especially for larger values. This is entirely avoidable.

Cache::Memcached::Fast implements "max_size". The only change needed in this patch, AFAICT, would be replacing "too_big_threshold" with "max_size" so that Cache::Memcached and Cache::Memcached::Fast have the same API.

The fact that C:M:Fast implemented it should be a good example of the desire for the feature.
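The scenario above can be sketched with core modules only. This is an approximation of what a client might do internally, not the actual Cache::Memcached code path: wire_length and fits are hypothetical helpers, and the 10_000-byte compression threshold and the 1 MB limit are assumptions chosen for illustration. The point is that the size that matters is only known after serialization and compression, so the check has to sit inside the client.

```perl
use strict;
use warnings;
use Storable qw(nfreeze);
use Compress::Zlib ();    # provides memGzip

# Hypothetical helper: bytes that would go on the wire for $value after
# optional serialization (references only) and optional gzip compression.
sub wire_length {
    my ($value, $compress_threshold) = @_;
    my $payload = ref $value ? nfreeze($value) : $value;
    if ($compress_threshold && length($payload) >= $compress_threshold) {
        my $gz = Compress::Zlib::memGzip($payload);
        # Keep the compressed form only if it is actually smaller.
        $payload = $gz if defined $gz && length($gz) < length($payload);
    }
    return length $payload;
}

# Hypothetical helper: true if the final payload is below $max_size.
sub fits {
    my ($value, $max_size, $compress_threshold) = @_;
    return wire_length($value, $compress_threshold) < $max_size;
}

my $max_size = 1024 * 1024;    # assumed 1 MB limit

# 5 MB of highly repetitive data: large before compression, tiny after.
my $repetitive = 'a' x (5 * 1024 * 1024);

# 2 MB of random bytes: compression does not help.
my $random = '';
$random .= chr(int rand 256) for 1 .. 2 * 1024 * 1024;

printf "repetitive fits: %s\n", fits($repetitive, $max_size, 10_000) ? 'yes' : 'no';
printf "random fits:     %s\n", fits($random,     $max_size, 10_000) ? 'yes' : 'no';
```

Application code that looked only at the raw length would wrongly refuse the first value, while letting the second through would mean sending it to the server only for it to be rejected.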
This patch is still relevant. I updated it against the latest Cache::Memcached and also renamed the parameter to 'max_size' to be compatible with Cache::Memcached::Fast.

On Mon Feb 17 15:04:00 2014, UNRTST wrote:
> On Sat Dec 01 17:06:32 2012, DOHERTY wrote:
> > The value of "too_big_threshold" is really a server-side parameter.
> > You can set this by recompiling (on older versions) or setting the -I
> > flag (for newer versions) - why would the client need to be told about
> > the maximum size? If anything, it should be obtained from issuing a
> > "stats settings" command upon connection, and looking at
> > "item_size_max", I think.
> >
> > Moreover, why isn't it sufficient to check the return value of the
> > set method? If your item wasn't cached, you'll get a false return
> > value, at which point your application can handle or ignore this as
> > required.
>
> I know this is an old bug, but this is a good feature.
>
> The reason it is insufficient to simple let set() fail for large
> values is because it is a huge waste of time and resources. For
> example:
>
> * assume you have some 5mb data structure
> * assume it's normally not quite that big
> * assume you want to cache it
> * you can not check in the frontend code, because 5mb may compress
> down to < 1mb and cache just fine... it is likely to compress down to
> around 500kb or less in most cases.
> * its sent through Cache::Memcached... the data is too random to
> compress well, and ends up being 4mb.
> * now that 4mb of data will be sent across the wire to the memcached
> server, which will only reject it once the whole message is received.
>
> The network part is the bottleneck for most operations, especially for
> larger values. This is entirely avoidable.
>
> Cache::Memcached::Fast implements "max_size". The only change needed
> in this patch, AFAICT, would be replacing "too_big_threshold" with
> "max_size" so that Cache::Memcached and Cache::Memcached::Fast have
> the same API.
>
> The fact that C:M:Fast implemented it should be a good example of the
> desire for the feature.
Subject: patch-max_size
application/octet-stream 1.9k

Message body not shown because it is not plain text.