Skip Menu |

This queue is for tickets about the Rose-DBx-Object-Cached-CHI CPAN distribution.

Report information
The Basics
Id: 40823
Status: resolved
Priority: 0/
Queue: Rose-DBx-Object-Cached-CHI

People
Owner: kmcgrath [...] baknet.com
Requestors: me [...] dbourget.com
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in:
  • 0.03
  • 0.04
  • 0.05
  • 0.06
  • 0.07
  • 0.08
Fixed in: (no value)



Subject: Failure to get created_at on nonexistent cache object
in remember(), a function is called on the return value of $cache->get_object(..) without checking that the return value is defined, but the value is not guaranteed to be defined. the relevant line (61) is: $self->{__xrdbopriv_chi_created_at} = $cache->get_object("${class}::Objects_By_Id" . LEVEL_SEP . $pk)->created_at() $cache->set_object(..) is called immediately before, but it can fail for a number of reasons (cache is full, connection has been lost, object is too large, etc). when this happens, created_at() is called on undef. the same problem occurs in other places where the cache is updated. relatedly, calling get_object adds unnecessary overhead. instead, the time could be set from system time after the cache has been updated. the cache_is_in_sync method would only have to be adapted to check that the object's ..created_at time is later or equal to the cache's created_at time.
On Mon Nov 10 19:01:55 2008, dbourget wrote: Show quoted text
> in remember(), a function is called on the return value of > $cache->get_object(..) without checking that the return value is > defined, but the value is not guaranteed to be defined. the relevant > line (61) is: > > $self->{__xrdbopriv_chi_created_at} = > $cache->get_object("${class}::Objects_By_Id" . LEVEL_SEP . > $pk)->created_at() > > $cache->set_object(..) is called immediately before, but it can fail for > a number of reasons (cache is full, connection has been lost, object is > too large, etc). when this happens, created_at() is called on undef. > > the same problem occurs in other places where the cache is updated.
I've put a quick fix for this out, version 0.09. According to the CHI code it looks like the value of the object is returned on a success and undef otherwise. If a value is returned on success then created_at() will be run. I feel like there is a better way to do error checking here, which I'll look into later this week. Show quoted text
> > relatedly, calling get_object adds unnecessary overhead. instead, the > time could be set from system time after the cache has been updated. the > cache_is_in_sync method would only have to be adapted to check that the > object's ..created_at time is later or equal to the cache's created_at > time.
The idea here is to make sure the same create_at time is being used by process using the cache. With FastMmap or Memcached caches I feel like this time could be different between processes without using the create_at() function. I could also be wrong on how the CHI created_at() function works, I assumes it gotten from the cache along with the object, I should prob. check the code to make sure this is true. Also, I made some updates so get_object is called less. I found a few places where I was able to replace calls to get() and get_object() with a single call to get_object() then use the ->value and ->create_at functions from there. Let me know if the new version helps out. I'm also open to more ideas about how to handle keeping objects in sync and handling cache errors. -Kevin
On Mon Nov 10 20:44:54 2008, KMCGRATH wrote: Show quoted text
> On Mon Nov 10 19:01:55 2008, dbourget wrote:
> > in remember(), a function is called on the return value of > > $cache->get_object(..) without checking that the return value is > > defined, but the value is not guaranteed to be defined. the relevant > > line (61) is: > > > > $self->{__xrdbopriv_chi_created_at} = > > $cache->get_object("${class}::Objects_By_Id" . LEVEL_SEP . > > $pk)->created_at() > > > > $cache->set_object(..) is called immediately before, but it can fail for > > a number of reasons (cache is full, connection has been lost, object is > > too large, etc). when this happens, created_at() is called on undef. > > > > the same problem occurs in other places where the cache is updated.
> > I've put a quick fix for this out, version 0.09. According to the CHI > code it looks like the value of the object is returned on a success and > undef otherwise. If a value is returned on success then created_at() > will be run. I feel like there is a better way to do error checking > here, which I'll look into later this week. >
> > > > relatedly, calling get_object adds unnecessary overhead. instead, the > > time could be set from system time after the cache has been updated. the > > cache_is_in_sync method would only have to be adapted to check that the > > object's ..created_at time is later or equal to the cache's created_at > > time.
> > The idea here is to make sure the same create_at time is being used by > process using the cache. With FastMmap or Memcached caches I feel like > this time could be different between processes without using the > create_at() function. I could also be wrong on how the CHI created_at() > function works, I assumes it gotten from the cache along with the > object, I should prob. check the code to make sure this is true. > > Also, I made some updates so get_object is called less. I found a few > places where I was able to replace calls to get() and get_object() with > a single call to get_object() then use the ->value and ->create_at > functions from there. > > Let me know if the new version helps out. I'm also open to more ideas > about how to handle keeping objects in sync and handling cache errors. > > -Kevin > >
well I jumped the gun a bit too quick on 0.09 so 0.10 is right on it's heels.... Realized after the push that a call to ->value on the CHI Object does not check for expire, so the calls to get are back in... for now.
Subject: Re: [rt.cpan.org #40823] Failure to get created_at on nonexistent cache object
Date: Tue, 11 Nov 2008 14:35:32 +1100
To: bug-Rose-DBx-Object-Cached-CHI [...] rt.cpan.org
From: "David Bourget" <david.bourget [...] gmail.com>
hi kevin, thanks for your updates. i won't upgrade yet because i'm happy with my more radical patch for now (all the offending lines commented out and the is_cache_in_sync function always returning 1), but it's good to know things won't break down when i update. by the way, i found that this package performs very slowly compared to the in-process cache provided as part of Rose. we're talking something like two orders of magnitude slower. i haven't tried to figure out what it is because it is still fast enough for my purposes (so far), but i suspect it's either the multiple get calls on the cache you mention or the overhead of stripping/rebuilding Rose objects (which is presumably not required for in-process caching). in my experience, it makes almost no difference what cache driver i use (fastmmap, memcached, memory), almost all the latency is in this package. cheers db On Tue, Nov 11, 2008 at 2:18 PM, Kevin Christopher McGrath via RT <bug-Rose-DBx-Object-Cached-CHI@rt.cpan.org> wrote: Show quoted text
> <URL: http://rt.cpan.org/Ticket/Display.html?id=40823 > > > On Mon Nov 10 20:44:54 2008, KMCGRATH wrote:
>> On Mon Nov 10 19:01:55 2008, dbourget wrote:
>> > in remember(), a function is called on the return value of >> > $cache->get_object(..) without checking that the return value is >> > defined, but the value is not guaranteed to be defined. the relevant >> > line (61) is: >> > >> > $self->{__xrdbopriv_chi_created_at} = >> > $cache->get_object("${class}::Objects_By_Id" . LEVEL_SEP . >> > $pk)->created_at() >> > >> > $cache->set_object(..) is called immediately before, but it can fail for >> > a number of reasons (cache is full, connection has been lost, object is >> > too large, etc). when this happens, created_at() is called on undef. >> > >> > the same problem occurs in other places where the cache is updated.
>> >> I've put a quick fix for this out, version 0.09. According to the CHI >> code it looks like the value of the object is returned on a success and >> undef otherwise. If a value is returned on success then created_at() >> will be run. I feel like there is a better way to do error checking >> here, which I'll look into later this week. >>
>> > >> > relatedly, calling get_object adds unnecessary overhead. instead, the >> > time could be set from system time after the cache has been updated. the >> > cache_is_in_sync method would only have to be adapted to check that the >> > object's ..created_at time is later or equal to the cache's created_at >> > time.
>> >> The idea here is to make sure the same create_at time is being used by >> process using the cache. With FastMmap or Memcached caches I feel like >> this time could be different between processes without using the >> create_at() function. I could also be wrong on how the CHI created_at() >> function works, I assumes it gotten from the cache along with the >> object, I should prob. check the code to make sure this is true. >> >> Also, I made some updates so get_object is called less. I found a few >> places where I was able to replace calls to get() and get_object() with >> a single call to get_object() then use the ->value and ->create_at >> functions from there. >> >> Let me know if the new version helps out. I'm also open to more ideas >> about how to handle keeping objects in sync and handling cache errors. >> >> -Kevin >> >>
> > > well I jumped the gun a bit too quick on 0.09 so 0.10 is right on it's > heels.... Realized after the push that a call to ->value on the CHI > Object does not check for expire, so the calls to get are back in... for > now. >
On Mon Nov 10 22:36:23 2008, david.bourget@gmail.com wrote: Show quoted text
> by the way, i found that this package performs very slowly compared to > the in-process cache provided as part of Rose. we're talking something > like two orders of magnitude slower. i haven't tried to figure out > what it is because it is still fast enough for my purposes (so far), > but i suspect it's either the multiple get calls on the cache you > mention or the overhead of stripping/rebuilding Rose objects (which is > presumably not required for in-process caching). in my experience, it > makes almost no difference what cache driver i use (fastmmap, > memcached, memory), almost all the latency is in this package.
Yes. Currently I have only used CHI with Storable. And Storable requires one rip out all the code refs before putting the object into the cache for any driver. Well at least I've had to. I tried going down the all the paths for Storable that allow code refs, etc... but my objects would not always thaw correctly. Rose::DB::Object::Cache does not need to do this, so it should be much faster to use. And if what you really need is a in memory cache then that is the module you should be using. CHI can use different serializer objects, so maybe when some free time crops up I'll mess around with something other than Storable. Thanks for your input, Kevin
Subject: Re: [rt.cpan.org #40823] Failure to get created_at on nonexistent cache object
Date: Tue, 11 Nov 2008 16:06:07 +1100
To: bug-Rose-DBx-Object-Cached-CHI [...] rt.cpan.org
From: "David Bourget" <david.bourget [...] gmail.com>
i just took a look at the code again and i'm a bit puzzled: there is no overhead when loading from the cache because the (stripped) structure is simply assigned to the existing handle. i guess the delay is due to the CHI module itself, not your driver (possibly to Storable, ultimately). one thing i noticed though is that in remember() the clone and strip methods are (potentially) called many more times than they need to be. you could make one stripped copy of the object and use that with all the set() calls instead. d Show quoted text
> Yes. Currently I have only used CHI with Storable. And Storable > requires one rip out all the code refs before putting the object into > the cache for any driver. Well at least I've had to. I tried going > down the all the paths for Storable that allow code refs, etc... but my > objects would not always thaw correctly. Rose::DB::Object::Cache does > not need to do this, so it should be much faster to use. And if what you > really need is a in memory cache then that is the module you should be > using. CHI can use different serializer objects, so maybe when some > free time crops up I'll mess around with something other than Storable. > > Thanks for your input, > Kevin >
On Tue Nov 11 00:06:32 2008, david.bourget@gmail.com wrote: Show quoted text
> i just took a look at the code again and i'm a bit puzzled: there is > no overhead when loading from the cache because the (stripped) > structure is simply assigned to the existing handle. i guess the delay > is due to the CHI module itself, not your driver (possibly to > Storable, ultimately).
I haven't done a lot of benchmarks on CHI. I started using it because of the convenience of pluggable drivers. It would be interesting to see the difference between using the FastMmap driver for CHI vs. the Cache::FastMmap driver side by side. Show quoted text
> > one thing i noticed though is that in remember() the clone and strip > methods are (potentially) called many more times than they need to be. > you could make one stripped copy of the object and use that with all > the set() calls instead.
You are correct. I'll create a $safe_obj at the beginning of the function and use that for all the sets. -Kevin
This issue should be resolved with the new version of the module. Also I have created a new module Rose::DBx::Object::FastMmap. I did some benchmarks and the CHI interface slows performance considerably. I took a Rose::DB::Object then just did a get and set with the default implementations of each module. Summed up version: my $rdbo = new MyObject->load->strip; my $chi = CHI->new(driver => 'FastMmap'); my $fast = Cache::FastMmap->new(); timethese(20000, { 'CHI set()' => sub { $chi->set('a',$rdbo) }, 'FastMmap set()' => sub { $fast->set('a',$rdbo) }, }); timethese(20000, { 'CHI get()' => sub { $chi->get('a') }, 'FastMmap set()' => sub { $fast->get('a') }, }); Benchmark: timing 20000 iterations of CHI set(), FastMmap set()... CHI set(): 3 wallclock secs ( 2.96 usr + 0.11 sys = 3.07 CPU) @ 6514.66/s (n=20000) FastMmap set(): 1 wallclock secs ( 1.10 usr + 0.12 sys = 1.22 CPU) @ 16393.44/s (n=20000) Benchmark: timing 20000 iterations of CHI get(), FastMmap set()... CHI get(): 3 wallclock secs ( 1.92 usr + 0.23 sys = 2.15 CPU) @ 9302.33/s (n=20000) FastMmap set(): 0 wallclock secs ( 0.43 usr + 0.16 sys = 0.59 CPU) @ 33898.31/s (n=20000)