
This queue is for tickets about the Data-Throttler CPAN distribution.

Report information
The Basics
Id: 47189
Status: resolved
Priority: 0/
Queue: Data-Throttler

People
Owner: Nobody in particular
Requestors: buzz [...] exotica.org.uk
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 0.02
Fixed in: (no value)



I have the following code:

    my $ip = $ENV{"REMOTE_ADDR"};
    my $throttler = Data::Throttler->new(
        max_items => 40,                    # 40 downloads every
        interval  => 7200,                  # 2 hours
        db_file   => "/tmp/throttle.dat",
    );
    $no_throttle = $throttler->try_push( key => $ip );

However, it seems that after running the script for some time, it is miscalculating. For example, I hit the script about 10 times today after not using it for some days, and I seem to be already throttled. I'm not sure how to debug this.
On Sat Jun 20 12:52:03 2009, exobuzz wrote:
> for example I hit the script about 10 times today after
> not using it for some days and I seem to be already throttled.
> I'm not sure how to debug this.
This is very easy to debug, just add

    use Log::Log4perl qw(:easy);
    Log::Log4perl->easy_init($DEBUG);

at the top of the script and you'll see how full the throttler's buckets are and how the decision to throttle or not throttle is made.
> at the top of the script and you'll see how full the throttler's buckets
> are and how the decision to throttle or not throttle is made.
I'm not sure I understand the log completely, but does it make sense to you that it increases the counter by 1 (25|40) and then the very next request is refused?

2009/06/20 18:34:15> Trying to push 87.194.172.115 18:34:15 1
2009/06/20 18:34:15> Searching bucket for time=18:34:15
2009/06/20 18:34:15> No bucket found for time=18:34:15
2009/06/20 18:34:15> Rotating buckets time=18:34:15 head=0
2009/06/20 18:34:15> Adding bucket: 18:07:53 - 18:19:52
2009/06/20 18:34:15> Adding bucket: 18:19:53 - 18:31:52
2009/06/20 18:34:16> Adding bucket: 18:31:53 - 18:43:52
2009/06/20 18:34:16> Adding bucket: 18:43:53 - 18:55:52
2009/06/20 18:34:16> After rotation: 16:55:53 - 18:55:52 (covers 18:34:15)
2009/06/20 18:34:16> Searching bucket for time=18:34:15
2009/06/20 18:34:16> Found bucket 18:31:53 - 18:43:52
2009/06/20 18:34:16> Increasing counter 87.194.172.115 by 1 (25|40)
2009/06/20 18:34:36> Trying to push 87.194.172.115 18:34:36 1
2009/06/20 18:34:36> Searching bucket for time=18:34:36
2009/06/20 18:34:36> Found bucket 18:31:53 - 18:43:52
2009/06/20 18:34:37> Not increasing counter 87.194.172.115 by 1 (already at max)
2009/06/20 20:45:46> Trying to push 87.194.172.115 20:45:46 1
2009/06/20 20:45:46> Searching bucket for time=20:45:46
Perhaps I understand the system wrong. If I set it to 40 per 2 hours, and then try to make 11 requests in under half an hour, will it throttle the last one? It seems to make 11 buckets if I don't make a request for 2 hours and then make one. If I want people to be able to "burst" 20 requests but make no more than 40 in a two-hour period, do I need to change the number of buckets?
On Sat Jun 20 16:05:23 2009, exobuzz wrote:
> Perhaps I understand the system wrong. If i set to 40 per 2 hours,
> and then try to make 11 requests in under half an hour, will it throttle
> the last one ?
No, you should be ok. If you have no history, and those 11 requests trickle in within half an hour, you still have 29 left to hit the specified quota of 40. I suspect that you have prefilled buckets already when you're starting the program and those previously filled buckets lift you over the limit because their values count toward the defined two-hour window as well. To see what's in the buckets, run

    print $throttler->buckets_dump();

as described in the last part of the documentation.
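To make the "prefilled buckets" point concrete, here is a minimal sketch of how a bucket-based throttler could arrive at its decision. It is not Data::Throttler's actual code: `can_push` is a hypothetical helper, and the ten 12-minute buckets are an assumption read off the log in this ticket (7200 s / 10 = 720 s per bucket).

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Data::Throttler: decide whether one
# more item fits, given the per-bucket counts recorded for a single key.
sub can_push {
    my ($bucket_counts, $max_items) = @_;
    my $total = 0;
    $total += $_ for @$bucket_counts;    # every bucket in the window counts
    return $total < $max_items ? 1 : 0;
}

# Assumed layout: ten buckets covering the two-hour window.
my @fresh     = (0) x 10;                        # no history at all
my @prefilled = (8, 8, 8, 8, 8, 0, 0, 0, 0, 0);  # 40 earlier hits

print "fresh:     ", can_push(\@fresh, 40),     "\n";   # prints 1
print "prefilled: ", can_push(\@prefilled, 40), "\n";   # prints 0
```

With an empty history the first request passes, but if older buckets in the same window already add up to 40, even a single new request is refused.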
Hmm, I'm not sure how to handle this then. The application is a web CGI that can be requested at any time. I want to limit visitors' use of it by IP. I had assumed I could just leave it running and your module would clean up as needed? If on Monday at 5pm I requested the CGI 5 times, then come back on Tuesday at 5pm, will it handle that? Do requests older than the 2-hour window (my setting) get removed?
On Sat Jun 20 19:30:54 2009, exobuzz wrote:
> Hmm, I'm not sure how to handle this then. The application is a
> web cgi, that can be requested at any time. I want to limit
> visitors use of it by ip.
That's exactly what it's for.
> I had assumed I could just leave it running and your thing would
> cleanup as it needs? if on monday at 5pm I requested the cgi 5 times,
> then come back on tuesday at 5pm, will it handle that?
If your window is two hours, then it'll limit the number of requests to the defined value N during any two hour window. In your example, there's 24 hours between the two requests, which means the previous value will be long forgotten. At Tuesday 5pm, you'll have the full quota of N available again. If you're looking to limit the maximum lifetime requests instead, you need a simple counter, not a throttler.
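The Monday/Tuesday case can be checked with a small sketch. This is a simplification that keeps exact timestamps instead of buckets, so it illustrates the intended window semantics rather than the module's implementation; `hits_in_window` and the epoch values are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the module's code: count only those hits whose
# timestamp still falls inside the window of $interval seconds ending at $now.
sub hits_in_window {
    my ($timestamps, $now, $interval) = @_;
    return scalar grep { $_ > $now - $interval && $_ <= $now } @$timestamps;
}

my $interval = 7200;                   # two hours, in seconds
my $monday   = 0;                      # arbitrary epoch standing in for Monday 5pm
my $tuesday  = $monday + 24 * 3600;    # Tuesday 5pm, 24 hours later

my @hits = map { $monday + $_ * 60 } 0 .. 4;    # five hits, one per minute

print hits_in_window(\@hits, $monday + 300, $interval), "\n";   # prints 5
print hits_in_window(\@hits, $tuesday,      $interval), "\n";   # prints 0
```

Five minutes after the Monday requests all five still count; a day later none of them fall inside the two-hour window, so the full quota is available again.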
Then there does seem to be something odd. At 18:34 I get throttled:

2009/06/20 18:34:37> Not increasing counter 87.194.172.115 by 1 (already at max)

I wait 2 hours and then I am able to do 11 requests (the number of buckets) before getting throttled again:

2009/06/20 20:45:46> Trying to push 87.194.172.115 20:45:46 1
2009/06/20 20:45:46> Searching bucket for time=20:45:46
2009/06/20 20:45:46> No bucket found for time=20:45:46
2009/06/20 20:45:46> Rotating buckets time=20:45:46 head=4
2009/06/20 20:45:46> Adding bucket: 18:55:53 - 19:07:52
2009/06/20 20:45:46> Adding bucket: 19:07:53 - 19:19:52
2009/06/20 20:45:47> Adding bucket: 19:19:53 - 19:31:52
2009/06/20 20:45:47> Adding bucket: 19:31:53 - 19:43:52
2009/06/20 20:45:47> Adding bucket: 19:43:53 - 19:55:52
2009/06/20 20:45:47> Adding bucket: 19:55:53 - 20:07:52
2009/06/20 20:45:47> Adding bucket: 20:07:53 - 20:19:52
2009/06/20 20:45:47> Adding bucket: 20:19:53 - 20:31:52
2009/06/20 20:45:47> Adding bucket: 20:31:53 - 20:43:52
2009/06/20 20:45:47> Adding bucket: 20:43:53 - 20:55:52
2009/06/20 20:45:47> Adding bucket: 20:55:53 - 21:07:52
2009/06/20 20:45:47> After rotation: 19:07:53 - 21:07:52 (covers 20:45:46)
2009/06/20 20:45:47> Searching bucket for time=20:45:46
2009/06/20 20:45:47> Found bucket 20:43:53 - 20:55:52
2009/06/20 20:45:47> Increasing counter 87.194.172.115 by 1 (0|40)
2009/06/20 20:46:27> Trying to push 87.194.172.115 20:46:27 1
2009/06/20 20:46:27> Searching bucket for time=20:46:27
2009/06/20 20:46:27> Found bucket 20:43:53 - 20:55:52
2009/06/20 20:46:28> Increasing counter 87.194.172.115 by 1 (1|40)
2009/06/20 20:46:55> Trying to push 87.194.172.115 20:46:55 1
2009/06/20 20:46:55> Searching bucket for time=20:46:55
2009/06/20 20:46:55> Found bucket 20:43:53 - 20:55:52
2009/06/20 20:46:55> Increasing counter 87.194.172.115 by 1 (2|40)
2009/06/20 20:47:38> Trying to push 87.194.172.115 20:47:38 1
2009/06/20 20:47:38> Searching bucket for time=20:47:38
2009/06/20 20:47:38> Found bucket 20:43:53 - 20:55:52
2009/06/20 20:47:39> Increasing counter 87.194.172.115 by 1 (3|40)
2009/06/20 20:48:16> Trying to push 87.194.172.115 20:48:16 1
2009/06/20 20:48:16> Searching bucket for time=20:48:16
2009/06/20 20:48:16> Found bucket 20:43:53 - 20:55:52
2009/06/20 20:48:16> Increasing counter 87.194.172.115 by 1 (4|40)
2009/06/20 20:51:46> Trying to push 87.194.172.115 20:51:46 1
2009/06/20 20:51:46> Searching bucket for time=20:51:46
2009/06/20 20:51:46> Found bucket 20:43:53 - 20:55:52
2009/06/20 20:51:46> Increasing counter 87.194.172.115 by 1 (5|40)
2009/06/20 20:57:31> Trying to push 87.194.172.115 20:57:31 1
2009/06/20 20:57:31> Searching bucket for time=20:57:31
2009/06/20 20:57:31> 20:57:31 covered by last bucket
2009/06/20 20:57:31> Increasing counter 87.194.172.115 by 1 (6|40)
2009/06/20 20:58:16> Trying to push 87.194.172.115 20:58:16 1
2009/06/20 20:58:16> Searching bucket for time=20:58:16
2009/06/20 20:58:16> 20:58:16 covered by last bucket
2009/06/20 20:58:16> Increasing counter 87.194.172.115 by 1 (13|40)
2009/06/20 20:58:32> Trying to push 87.194.172.115 20:58:32 1
2009/06/20 20:58:32> Searching bucket for time=20:58:32
2009/06/20 20:58:32> 20:58:32 covered by last bucket
2009/06/20 20:58:32> Increasing counter 87.194.172.115 by 1 (20|40)
2009/06/20 20:59:11> Trying to push 87.194.172.115 20:59:11 1
2009/06/20 20:59:11> Searching bucket for time=20:59:11
2009/06/20 20:59:11> 20:59:11 covered by last bucket
2009/06/20 20:59:11> Increasing counter 87.194.172.115 by 1 (27|40)
2009/06/20 20:59:22> Trying to push 87.194.172.115 20:59:22 1
2009/06/20 20:59:22> Searching bucket for time=20:59:22
2009/06/20 20:59:22> 20:59:22 covered by last bucket
2009/06/20 20:59:22> Increasing counter 87.194.172.115 by 1 (34|40)
2009/06/20 21:00:04> Trying to push 87.194.172.115 21:00:04 1
2009/06/20 21:00:05> Searching bucket for time=21:00:04
2009/06/20 21:00:05> 21:00:04 covered by last bucket
2009/06/20 21:00:05> Not increasing counter 87.194.172.115 by 1 (already at max)

which is why I thought I would need more buckets ?
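The anomaly in the log is easier to see when the counters are pulled out: once pushes land in the last bucket, the reported count goes 6, 13, 20, 27, 34, a step of 7 per request instead of 1. The sketch below only detects that symptom in log lines of the format shown in this ticket; `counter_jumps` is a hypothetical helper, and it says nothing about the underlying cause.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the "(count|max)" value from each
# "Increasing counter" line and report every step that is not exactly +1.
sub counter_jumps {
    my (@lines) = @_;
    my (@jumps, $prev);
    for my $line (@lines) {
        next unless $line =~ /Increasing counter \S+ by 1 \((\d+)\|(\d+)\)/;
        my $count = $1;
        push @jumps, [$prev, $count] if defined $prev && $count != $prev + 1;
        $prev = $count;
    }
    return @jumps;
}

# Four lines copied from the log in this ticket.
my @log = (
    '2009/06/20 20:51:46> Increasing counter 87.194.172.115 by 1 (5|40)',
    '2009/06/20 20:57:31> Increasing counter 87.194.172.115 by 1 (6|40)',
    '2009/06/20 20:58:16> Increasing counter 87.194.172.115 by 1 (13|40)',
    '2009/06/20 20:58:32> Increasing counter 87.194.172.115 by 1 (20|40)',
);

printf "%d -> %d\n", @$_ for counter_jumps(@log);
# prints:
# 6 -> 13
# 13 -> 20
```

Lines reading "Not increasing counter" are skipped because the match is case-sensitive and they carry no "(count|max)" pair.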
Subject: Re: [rt.cpan.org #47189] throttle calculation issue
Date: Tue, 23 Jun 2009 05:22:49 -0700 (PDT)
To: Jools Smyth via RT <bug-Data-Throttler [...] rt.cpan.org>
From: Mike Schilli <m [...] perlmeister.com>
On Sun, 21 Jun 2009, Jools Smyth via RT wrote:
> which is why I thought I would need more buckets ?
Hmm, I think you're right, something's iffy here. I'll investigate right after YAPC is over.

-- Mike

Mike Schilli
m@perlmeister.com
Yup, this was a bug, I just fixed it:

http://github.com/mschilli/data-throttler-perl/commit/b54eeed8029033998b4f9622160fcc5adcd45cfa

The 0.03 release is on its way to CPAN, please check it out and let me know if it fixes the problem you're seeing. Thanks for reporting this.