Bug #112760 for CHI-Driver-Redis: Key listing set is never cleaned

Mon Mar 07 10:12:51 2016 jsemaan [...] inverse.ca - Ticket created

Subject:	Key listing set is never cleaned
Date:	Mon, 7 Mar 2016 10:12:44 -0500
To:	bug-CHI-Driver-Redis [...] rt.cpan.org
From:	Julien Semaan <jsemaan [...] inverse.ca>

Hi all, I am an engineer part of the PacketFence project (http://packetfence.org) and we have been using your driver actively for a few months. We have discovered that there is a leak in the library that causes the key listing sets for a namespace to never be cleaned up. When we look at the code, an expiry is placed on the key/value that is stored but there is no expiration mechanism in place for the entry that is added to the key listing set. Here is the problematic part : $self->redis->sadd($self->prefix . 'chinamespaces', $self->namespace); $self->redis->sadd($ns, $skey); $self->redis->set($realkey, $data); if (defined($expires_in)) { $self->redis->expire($realkey, $expires_in); } Now, this causes major issues on our end since a lot of our keys are temporary so the key set grows rapidly and causes excessive memory usage. Before we try to fix it on our side, we would like to know what you be the approach from your side to ensure the key listing doesn't leak. Would using the SCAN method of redis be a good idea in your opinion ? That has the downside of not having the usual redis guarantee of having the latest data since it is non-locking. Cheers ! -- Julien Semaan jsemaan@inverse.ca :: +1 (866) 353-6153 *155 ::www.inverse.ca Inverse inc. :: Leaders behind SOGo (www.sogo.nu) and PacketFence (www.packetfence.org)

Tue Mar 08 13:38:37 2016 ianburrell [...] gmail.com - Correspondence added

It sounds like expiring members of set in Redis is a common problem. I found a few pages about different approaches: https://quickleft.com/blog/how-to-create-and-expire-list-items-in-redis/ https://github.com/antirez/redis/issues/135 https://groups.google.com/forum/#!topic/redis-db/rXXMCLNkNSs The main suggested approach is to use sorted set with expiration time as score, ZRANGEBYSCORE to retrieve unexpired elements, and ZREMRANGEBYSCORE to expire elements. It looks like there is a purge() method in CHI for doing the expiration. Would it work for you to call purge() regularly? We could implement a more efficient purge() that handles the Redis keys expiring. I wonder if it would be possible to set expiration on the set to the latest expiration time and then the set of keys will expire if all keys expire. But I'm guessing that doesn't happen for you.

Tue Mar 08 13:38:37 2016 The RT System itself - Status changed from 'new' to 'open'

Tue Mar 08 13:38:39 2016 ianburrell [...] gmail.com - Taken

Tue Mar 08 13:58:37 2016 jsemaan [...] inverse.ca - Correspondence added

Subject:	Re: [rt.cpan.org #112760] Key listing set is never cleaned
Date:	Tue, 8 Mar 2016 13:58:23 -0500
To:	bug-CHI-Driver-Redis [...] rt.cpan.org
From:	Julien Semaan <jsemaan [...] inverse.ca>

The scoring approach you define does seem like it will require a daemon maintaining this or will create latency on reads. We used to use the purge method when we used the file store but it involves code paths that we do not want to maintain anymore since one of the reasons we moved to redis was the auto expiry of keys. You are right that we never have all keys expiring so that cannot be used to expire the set on our side I opened a PR that uses SCAN for listing earlier today : https://github.com/rentrak/chi-driver-redis/pull/7 It obviously has downsides that you may not want to have in the library, but in our case the leak problem is bigger than the downsides we have with this new strategy as we do not list keys often. We are about to test this under a heavy load to comfort the design, but still as I said, we do not use listing extensively so we're a soft use-case for that change. Regards, On 03/08/2016 01:38 PM, Ian Burrell via RT wrote: Show quoted text

> <URL: https://rt.cpan.org/Ticket/Display.html?id=112760 > > > It sounds like expiring members of set in Redis is a common problem. I found a few pages about different approaches: > > https://quickleft.com/blog/how-to-create-and-expire-list-items-in-redis/ > https://github.com/antirez/redis/issues/135 > https://groups.google.com/forum/#!topic/redis-db/rXXMCLNkNSs > > The main suggested approach is to use sorted set with expiration time as score, ZRANGEBYSCORE to retrieve unexpired elements, and ZREMRANGEBYSCORE to expire elements. > > It looks like there is a purge() method in CHI for doing the expiration. Would it work for you to call purge() regularly? We could implement a more efficient purge() that handles the Redis keys expiring. > > I wonder if it would be possible to set expiration on the set to the latest expiration time and then the set of keys will expire if all keys expire. But I'm guessing that doesn't happen for you. > >

Wed Aug 31 03:01:55 2016 ATANCASIS [...] cpan.org - Correspondence added

Hi Ian, I've extended CHI::Driver::Redis and implemented the main approach of the fixes you've mentioned on your previous message —by keeping inventory on a sorted set in contrast to a set as there would otherwise be no way to discern which keys have already expired. For the meantime, I've released this modification as a separate module under CHI::Driver::Redis::SortedSet <https://metacpan.org/pod/CHI::Driver::Redis::SortedSet> but would like to check if you'd be interested to incorporate the modifications I've made into your module so the community does not to have to deal with multiple drivers which would essentially be doing the same thing. Some implementation notes BTW: 1. Upon calling store(), instead of adding to a set to keep track of namespace members, the same key is added to a sorted set with the score set to either: a. IF key has a finite expiration — now + $expires_at b. ELSE — "+inf" (as key will never expire) 2. Likewise, when store() gets called, "cleanup" of expired keys via ZREMRANGEBYSCORE is executed. This can be costly for sparsely-used namespaces with blasts in between as the number of expired entries can be huge. That said, I still would not imagine this would have huge costs as sorted sets are automatically "house-kept" with every usage. 3. chinamespaces is still left intact as a set (and not migrated as a sorted set) as this should not be a concern either way. However, if it did, the same strategy can be applied. 4. For existing Redis datasets pre-populated with entries using CHI::Driver::Redis, switching the driver to using the new one will fail (albeit silently as one of the design goals of the original module) as the namespace inventory list data type is now different. A transparent "migration" schema should not be too difficult to implement as it will only involve: a. check if TYPE "<ns>" eq "zset", if it is, no-op (i.e. bail-out) b. foreach SMEMBERS, ZADD to "<ns_temp>", with SCORE = now + TTL c. DEL "<ns>" d. RENAME "<ns_temp>" to "<ns>" I hope this provides an outline of my thoughts. Looking forward to hearing from you. Cheers, Arnold On Tue Mar 08 13:58:37 2016, jsemaan@inverse.ca wrote: Show quoted text

> The scoring approach you define does seem like it will require a > daemon > maintaining this or will create latency on reads. > > We used to use the purge method when we used the file store but it > involves code paths that we do not want to maintain anymore since one > of > the reasons we moved to redis was the auto expiry of keys. > > You are right that we never have all keys expiring so that cannot be > used to expire the set on our side > > I opened a PR that uses SCAN for listing earlier today : > https://github.com/rentrak/chi-driver-redis/pull/7 > > It obviously has downsides that you may not want to have in the > library, > but in our case the leak problem is bigger than the downsides we have > with this new strategy as we do not list keys often. > > We are about to test this under a heavy load to comfort the design, > but > still as I said, we do not use listing extensively so we're a soft > use-case for that change. > > Regards, > > On 03/08/2016 01:38 PM, Ian Burrell via RT wrote:

> > <URL: https://rt.cpan.org/Ticket/Display.html?id=112760 > > > > > It sounds like expiring members of set in Redis is a common problem. > > I found a few pages about different approaches: > > > > https://quickleft.com/blog/how-to-create-and-expire-list-items-in- > > redis/ > > https://github.com/antirez/redis/issues/135 > > https://groups.google.com/forum/#!topic/redis-db/rXXMCLNkNSs > > > > The main suggested approach is to use sorted set with expiration time > > as score, ZRANGEBYSCORE to retrieve unexpired elements, and > > ZREMRANGEBYSCORE to expire elements. > > > > It looks like there is a purge() method in CHI for doing the > > expiration. Would it work for you to call purge() regularly? We could > > implement a more efficient purge() that handles the Redis keys > > expiring. > > > > I wonder if it would be possible to set expiration on the set to the > > latest expiration time and then the set of keys will expire if all > > keys expire. But I'm guessing that doesn't happen for you. > > > >