Bug #32815 for Schedule-RateLimiter: Feature request: "threshold" limiting

Wed Jan 30 16:21:21 2008 FERRENCY [...] cpan.org - Ticket created

Subject:

Feature request: "threshold" limiting

I would like to use Schedule::RateLimiter to limit events in an inverse fashion, which I call "threshold limiting." Currently, Rate Limiting allows you to specify, "Allow the event to happen at most once per hour." I would like to be able to specify, "Allow the event to happen only if it has been triggered at least 3 times in an hour." This only makes sense for non-blocking requests, which I would make the default for this sort of rate limiter object. The basic functionality is almost, but not quite, the same as returning: !$limiter->event(block => 0); The problem is, once the $limiter starts returning 0, it stops counting the event as having occurred. One idea on how it might look: my $limiter = Schedule::RateLimiter->new(iterations => 3, seconds => 1800, threshold => 1); Now, $limiter defaults to block => 0 when you call $limiter->event(); $limiter->event() always counts the current trigger as an event, and returns the logical negation of what it would return in the non-threshold version. Have I missed anything? Alan

Tue Feb 05 12:23:46 2008 FERRENCY [...] cpan.org - Ticket #32946: Ticket created

Subject:

Feature request: explicit event timestamp

It would be beneficial if the event() method of a RateLimiter object could accept an explicit timestamp for when the event occurred. This would replace the use of time() to get the current timestamp for use with an incoming event. I would expect the behavior to be undefined if events arrived out of order chronologically, and that's fine with me. However, the ability to specify an explicit timestamp with an event would allow RateLimiter to be used to "replay" a log of events which happened in the past, with the same behavior it had when the events were happening in real time. For this use case, the log of events will typically be provided in chronological order, so the RateLimiter should be able to be have identically to the way it would have the first time around. This would also be useful to compensate for network or other delays, and would allow events to be handled at the time we believe they happened instead of the time when RateLimiter is eventually told they happened. Alan

Wed Feb 20 15:48:44 2008 FERRENCY [...] cpan.org - Correspondence added

Attached is a patch which implements a variation on threshold limiting: the "min_iterations" parameter. It works like this: - if min_iterations is specified, the rate limiter's event() will not succeed until there have been min_iterations events (including this one). They will stop succeeding as soon as "iterations" events have occurred, and "iterations" is still required. - block is implicitly 0 for objects created with min_iterations This means that if you specify for example (min_iterations => 4, iterations => 4, seconds => 60), you should get exactly one successful event every minute. I have also included a test file which verifies this behavior. The patch also implements the "specify event time in the event() call" feature request, but I have not tested it. Using this feature seems like it would be a good way to speed up your entire test suite. Alan

use Test::More tests => 26; use Schedule::RateLimiter; ok(1, 'Did the Schedule::RateLimiter module load?'); # If we made it this far, we're ok. ######################### my $throttle = Schedule::RateLimiter->new( seconds => 2, iterations => 5, min_iterations => 3 ); ok ( ref( $throttle ), 'Did we build an Schedule::RateLimiter with more than one iteration?' ); ok !$throttle->event(); ok !$throttle->event(); ok $throttle->event(); ok $throttle->event(); ok $throttle->event(); ok !$throttle->event(); ok !$throttle->event(); ok !$throttle->event(); ok !$throttle->event(); ok !$throttle->event(); ok !$throttle->event(); ok !$throttle->event(); ok !$throttle->event(); ok !$throttle->event(); ok !$throttle->event(); ok !$throttle->event(); # Wait until we time out and try again. sleep 3; ok !$throttle->event(); ok !$throttle->event(); ok $throttle->event(); ok $throttle->event(); ok $throttle->event(); ok !$throttle->event(); ok !$throttle->event(); ok !$throttle->event();

*** ../Schedule-RateLimiter-0.01-5KjWIV/RateLimiter.pm Thu Dec 4 18:09:10 2003 --- RateLimiter.pm Wed Feb 20 15:46:03 2008 *************** *** 142,147 **** --- 142,148 ---- current => 0, list => \@list, iterations => $iterations, + min_iterations => $args{min_iterations}, seconds => $args{seconds}, block => ( exists($args{block}) ) ? $args{block} : 1, }, $proto; *************** *** 184,204 **** my $self = shift; my %args = @_; ! my $t = Time::HiRes::time(); my $last = $self->{list}[$self->{current}] || 0; ! my $block = exists( $args{block} ) ? $args{block} : $self->{block}; if ( ($t - $last) < $self->{seconds} ) { return 0 unless $block; Time::HiRes::sleep($self->{seconds} - ($t - $last)); } ! $self->{list}[$self->{current}] = $t; ! ! $self->{current} = ($self->{current}+1) % $self->{iterations}; return 1; } =head1 BUGS --- 185,219 ---- my $self = shift; my %args = @_; ! my $t = $args{timestamp} || Time::HiRes::time(); my $last = $self->{list}[$self->{current}] || 0; ! my $block = (exists( $args{block} ) ? $args{block} : $self->{block}) ! && !$self->{min_iterations}; ! ! $self->_record_event($t) if $self->{min_iterations}; ! ! if ( $self->{min_iterations} ) { ! # Find the nth previous iteration and make sure it's new enough ! my $pt = ($self->{list}[($self->{current} - $self->{min_iterations}) % $self->{iterations}] || 0); ! return 0 if $t > $pt + $self->{seconds}; ! } if ( ($t - $last) < $self->{seconds} ) { return 0 unless $block; Time::HiRes::sleep($self->{seconds} - ($t - $last)); } ! $self->_record_event($t) unless $self->{min_iterations}; return 1; + } + + sub _record_event { + my ($self, $t) = @_; + + $self->{list}[$self->{current}] = $t; + $self->{current} = ($self->{current}+1) % $self->{iterations}; } =head1 BUGS

Wed Feb 20 15:48:51 2008 FERRENCY [...] cpan.org - Status changed from 'new' to 'open'

Wed Feb 20 20:00:12 2008 Dan [...] DWright.Org - Ticket #32946: Correspondence added

On Tue Feb 05 12:23:46 2008, FERRENCY wrote: Show quoted text

> It would be beneficial if the event() method of a RateLimiter object > could accept an explicit timestamp for when the event occurred. This > would replace the use of time() to get the current timestamp for use > with an incoming event.

Since you sent a proposed patch, which included this implementation on ticket #32815, I'm going to merge the two tickets.

Wed Feb 20 20:00:13 2008 The RT System itself - Ticket #32946: Status changed from 'new' to 'open'

Wed Feb 20 20:00:29 2008 Dan [...] DWright.Org - Ticket #32946: Merged into ticket #32815

Wed Feb 20 20:00:29 2008 Dan [...] DWright.Org - Merged into ticket #32815

Wed Feb 20 20:21:51 2008 Dan [...] DWright.Org - Correspondence added

Hi, Here's a few notes on your idea. First off, the optional 'timestamp' argument seems somewhat problematic with respect to blocking. For example... if I have blocking turned on and I specify an event 10 seconds in the future, but the next event cannot happen for 20 seconds, how long do we sleep? Likewise, what if I specified something 10 seconds in the past? We could just come up with some arbitrary definition of what is the "correct" behavior, but I have a feeling somebody else would be able to argue why that's wrong. So, my current thinking is that the timestamp parameter is incompatible with blocking. Second, I tried to rationalize what would be the meaning if min_iterations existed, but was set to zero. I couldn't come up with anything. So, my current thinking is that min_iterations must be greater than zero or else that is the same as having it unset. Third, I tried integrating your patch, and then I started looking it over and re-arranging the code. And, eventually, I had sub even() to the point where it was if (min_iterations) {do this} else {do that} That made me think that while min_iterations is a neat feature, it's meaning is very contrary to what event() was intended to do. So, I propose let's not put this in event() at all. In fact, I'm proposing that event() could even be deprecated, and replaced with three new methods: hold_for_event, test_event, and record_event: hold_for_event: "This event has to happen, just tell me when it's safe to start." hold_for_event does a blocking wait until the next time an event can happen. It does not accept any parameters at all. It ignores min_iterations. It assumes that upon return, the event happened. There would be no return value. test_event: "If I wanted this event to happen at this time, could it?" test_event never blocks. It accepts an optional timestamp parameter. It also honors min_iterations. Returns true if the event could happen, false otherwise. record_event: "An event happened at this time. Deal with it" record_event doesn't care if an event should have happened. It only knows that the event did happen. It accepts an optional timestamp and records the event. There would be no return value. Although event() would be deprecated, I'd have to keep it around for backward compatibility. I'd just re-implment it so that it calls the above methods. I believe this gives you all of the functionality you are looking for, and them some. Please let me know if you see anything wrong with this approach.

Wed Feb 20 20:39:23 2008 alan [...] ferrency.com - Correspondence added

Subject:	Re: [rt.cpan.org #32815] Feature request: "threshold" limiting
Date:	Wed, 20 Feb 2008 20:47:51 -0500 (EST)
To:	"Daniel J. Wright via RT" <bug-Schedule-RateLimiter [...] rt.cpan.org>
From:	alan [...] ferrency.com

Show quoted text

> First off, the optional 'timestamp' argument seems somewhat > problematic with respect to blocking. For example... if I have > blocking turned on and I specify an event 10 seconds in the future, > but the next event cannot happen for 20 seconds, how long do we sleep? > Likewise, what if I specified something 10 seconds in the past? We > could just come up with some arbitrary definition of what is the > "correct" behavior, but I have a feeling somebody else would be able > to argue why that's wrong. So, my current thinking is that the > timestamp parameter is incompatible with blocking.

I don't personally need support for blocking with timestamp. However, I don't think there's any inherent problem with it, so I wouldn't bother purposefully removing support. Don't think about $args{timestamp} in relation to real time. It _defines_ what "now" is, so by definition it is always identical to what you should think of as the current time. If your $args{seconds} is 60, and your oldest event is 20 seconds prior to $args{timestamp}, then you have to wait 40 seconds. Whether the time stamp is in the past or the future in relation to the real wall clock doesn't matter at all. One timestamp is specified, there is no need to consider time() whatsoever. Show quoted text

> Second, I tried to rationalize what would be the meaning if > min_iterations existed, but was set to zero. I couldn't come up with > anything. So, my current thinking is that min_iterations must be > greater than zero or else that is the same as having it unset.

I'm fine with that interpretation, and I think my patch acts this way. Show quoted text

> Third, I tried integrating your patch, and then I started looking > it over and re-arranging the code. And, eventually, I had sub event() > to the point where it was if (min_iterations) {do this} else {do that}

I'd like to see the code before I interpret this: it's easy to unfactor any code to the point where it's separated into two disparate cases, but at what cost? Show quoted text

> hold_for_event:

<snip> Show quoted text

> test_event:

<snip> Show quoted text

> record_event: > "An event happened at this time. Deal with it"

(Is this equivalent to my factored out _record_event, or whatever I called it?) These seem fine to me overall. I'm not necessarily convinced the separation is necessary or useful, but I don't mind it. Does this separation eliminate the need for the (block => ) parameter to the constructor? Show quoted text

> I believe this gives you all of the functionality you are looking for, > and them some. Please let me know if you see anything wrong with > this approach.

Seems fine enough to me. Alan