Subject: More control over non-blocking callbacks and the Dispatcher
Hi. I've been bugging you about various things in Net::SNMP for a
while, but I'd like to offer some serious analysis of the non-blocking
behaviors and what I (and probably others) would want to see. I'm happy
to develop this and submit the patches myself. My goal is to take some
of the code wrappers I've written and figure out exactly how to
integrate them into the fold. From what I've found, non-blocking usage
really boils down to a few needs, and the missing functionality could be
added without requiring an entire new module or a different processing
engine. Other modules usually end up re-creating core subroutines
anyway, so why do that when it can just be added in?
So, in general, here are cases where non-blocking breakpoints/callbacks
are desired (the "problems"):
1. After a single request has been processed (get_request, set_request,
get_table, get_bulk, etc.)
2. After a single row is acquired (via get_entries)
3. After all of the requests for a single session have been processed
4. A way to process multiple sessions/hosts concurrently.
Solutions:
1. This is already built in as -callback.
2. There is an "undocumented and unsupported" feature called
-rowcallback. Why it is "undocumented and unsupported" is beyond me,
because it is perfect for taking large SNMP tables and writing them,
row by row, straight into CSV/XML/databases/etc. without having to
store the entire table of columns and parse it later. You throw a row
into the file and forget about it. What would it take to promote this
lovely feature to "supported" status?
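To illustrate the win, here's the row-streaming idea in plain Perl,
independent of Net::SNMP entirely. The callback signature and row shape
are made up for the example; the point is just that each row is emitted
as it arrives instead of the whole table being held in memory:

```perl
use strict;
use warnings;

# Illustrative only: a row callback that writes each completed table row
# out immediately (here, into an array of CSV lines), so the full table
# never has to be accumulated first.  This is the pattern -rowcallback
# enables; none of these names are Net::SNMP API.
my @csv_lines;
my $row_callback = sub {
    my ($index, @columns) = @_;
    push @csv_lines, join ',', $index, @columns;   # write-and-forget
};

# Simulate three rows of a two-column SNMP table arriving one at a time.
$row_callback->($_, "ifDescr$_", "ifSpeed$_") for 1 .. 3;

print "$_\n" for @csv_lines;
```

Memory stays flat no matter how many rows the table has, which is
exactly why I want this supported.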
3. There are a number of potential solutions for this one:
3a. Use the -callback function of each single request and collect the
data until you run your own callback - This has been my reason for
using things like snmp_dispatch_once. However, it has turned into my
own set of pre-loading and looping routines that really belong in the
module in some form. I end up caching all of the data until the host
has finished its data dump and then call my own callback. This should
be Net::SNMP::Dispatcher's job, since it already has caching and
callback functions of its own. Ideally, every non-blocking operation
should be able to throw its entire list of events into the queue and
fire snmp_dispatcher() once to finish everything up.
3b. Use something like a single get_bulk request for everything, plus
-callback - I'm not sure that even works. Can you request, say, 20
different tables and expect it not to choke? Would that be chopped up
into multiple get_table requests, or is it just going to be a low-level
get_bulk SNMP request that takes your requests verbatim as a single
packet and immediately pukes at the size of the request? This also
doesn't cover (admittedly rare) scenarios like getting and setting in
the same session.
3c. Use EV or AnyEvent - That doesn't actually solve anything. It's
just a different dispatching framework, and nothing there says "wait
until I'm done processing requests for this host". Besides, it's too
open-ended: you end up rolling your own subroutines and praying you've
done them right.
3d. Create a -hostcallback option tied to the $session - This sounds
like the best solution. It's built into Net::SNMP, requires the least
amount of code, and provides an easy way to add that hook. Requests can
be tracked with an internal counter bound to the $session object,
similar to how requests are registered and deregistered on the file
descriptors. When the counter reaches zero after the final request, the
dispatcher sends all of the cached data to the callback sub.
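As a sketch of what I mean, in plain Perl with made-up names
(register_request, deregister_request, and the %pending counter are
illustrative, not Net::SNMP internals):

```perl
use strict;
use warnings;

# Illustrative sketch of the proposed mechanism: bump a per-host counter
# when a request is registered, decrement it when the reply is processed,
# and fire the host-level callback with all cached results when the
# counter hits zero.
my (%pending, %cache, @finished);

sub register_request {
    my ($host) = @_;
    $pending{$host}++;
}

sub deregister_request {
    my ($host, $result) = @_;
    push @{ $cache{$host} }, $result;
    if (--$pending{$host} == 0) {
        # Final outstanding request for this host: fire the host callback.
        host_callback($host, delete $cache{$host});
    }
}

sub host_callback {
    my ($host, $results) = @_;
    push @finished, "$host: @$results";
}

# Simulate two hosts with interleaved replies.
register_request('10.0.0.1') for 1 .. 2;
register_request('10.0.0.2');
deregister_request('10.0.0.1', 'sysDescr');
deregister_request('10.0.0.2', 'ifTable');   # 10.0.0.2 finishes first
deregister_request('10.0.0.1', 'ifTable');   # now 10.0.0.1 finishes

print "$_\n" for @finished;
```

Note the hosts complete in whatever order their replies land, which is
exactly the behavior you want from a non-blocking dispatcher.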
4. There's more than one way for this one, too:
4a. When a host is finished, queue another one - If 3d were solved and
implemented, the hostcallback sub itself could do that job. Still, it's
a common operation to say "okay, I've processed 50 hosts and I want to
add one more to continue the batch"; why should that common code live
outside the module? However, this would also result in multiple event
queues: one for real-time processing, and one for processing after a
host has finished up its results. That's doable with another internal
"queuing" variable (or even some extra fields in the event objects).
But what should the queue size be? It could just be a variable on
the dispatcher, but that merely hands an arbitrary figure to the user
to "control". AnyEvent::SNMP does the same thing with a MIN/MAX queue
size, but it's still an arbitrary figure: I could process 500 hosts
just fine for one set of data, or process 20 hosts and have problems
on another set. Another way would be to...
4b. ...Be able to queue all of the hosts/requests prior to calling
the dispatcher - This sounds better than messing with queue numbers,
but it has potential overload issues of its own. Let's say I want to
query 4400 hosts to grab 75 different (real) SNMP tables. (Seriously,
this is my real-world scenario right now.) All of the send-PDU events
would fire against all of the hosts first, so a massive cluster of UDP
traffic would get dumped on the receive buffer (and eventually thrown
out). Some of the hosts would respond quickly enough to bring back
results, but it's questionable whether the zero-event queuing of
replies (including multi-reply requests, like get_bulk and get_table)
would balance out the massive list of requests.
A saving grace here is that the event scheduler somewhat prioritizes
existing replies. If the file descriptor is constantly available for
data (as it should be, but not so busy that it drops data, either), the
dispatcher would send only one new request and then process the file
descriptor right away. That would process the results, produce a new
PDU, send it out, and return control to the dispatcher. You still run
into the problem of new/old IP ratios there, though. Every new request
would likely spawn many follow-up replies, but you're only looking at
one file descriptor for UDP, and thus only processing a single reply
from an existing IP per cycle. The result is that all 4400 hosts would
eventually be asked for data while the dispatcher is still trying to
process the many more replies that generates. Again, that means buffer
overload. (One fix would be to add a file descriptor for each IP,
regardless of whether it's UDP or not. This would force the event
handler to process all of the busy descriptors before exiting the
handler.)
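For what it's worth, per-destination descriptors are cheap to get with
connected UDP sockets. A minimal sketch using the core IO::Socket::INET
module (the loopback addresses and port are placeholders, and this only
shows descriptor creation, not the dispatcher changes):

```perl
use strict;
use warnings;
use IO::Socket::INET;   # core module

# Illustrative: a connected UDP socket per destination gives each host
# its own file descriptor, so a select() loop can see per-host readiness
# instead of funneling every reply through one shared socket.
my %sock_for;
for my $host ('127.0.0.1', '127.0.0.2') {
    $sock_for{$host} = IO::Socket::INET->new(
        Proto    => 'udp',
        PeerAddr => $host,
        PeerPort => 161,          # placeholder SNMP port; no packet is sent
    ) or die "socket for $host: $!";
}

# Each destination now has a distinct descriptor to register with select().
my @fds = map { fileno $sock_for{$_} } sort keys %sock_for;
print "fds: @fds\n";
```

Since connect() on a UDP socket only sets the default peer, this costs
nothing on the wire until the first PDU actually goes out.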
Individual hosts would not be limited to one request at a time,
either, since the dispatcher could be sending new requests to a busy
host before it has received the previous result back. Depending on the
quantity of requests, this could overload the other side, or the host
might handle it well and send even more junk to your UDP buffer,
overloading THAT.
4c. Find a smart way to detect overload - Okay, so spamming thousands
of requests at many hosts is not a good idea, but being able to dump
the entire list of events into the event queue is ideal and would more
or less be proper use of the queue. Limits on the number of events are
arbitrary. The UDP buffer (unfortunately) can't be queried for its
current length. (And even if it could, would you want to read a large
buffer over and over again like that?)
I believe the file descriptor's idle time is the answer, though. If
it's always busy, we're either exactly keeping up or overloaded. If
it's idle at any point, then the buffer is empty most of the time. So,
an array of times could track the last 30-50 FD idle times versus the
code processing time between FD select calls. If 95% of the total time
is code processing and only 5% is idle time, that would be ideal. The
number of sessions with active events (i.e., sessions that have
processed their first PDU) can be tracked, and the event handler can be
forced to stop sending requests to new IPs once it meets that
threshold. The threshold could be adjustable according to how
"friendly" the user wants the process to be with CPU/resource time.
Setting it to 99.9% would also be an option, which would require 30
selects with zero idle time before new requests are held back.
A hard limit on the number of concurrent requests to a single IP could
also be imposed, possibly tied to that IP's response time. A second
processing-time threshold could start adding delay seconds to the
existing "send PDU" events so that the FD reads can keep up.
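Here's a rough sketch of that heuristic in plain Perl. The window size,
threshold, and subroutine names are all made-up knobs for illustration,
not anything in Net::SNMP today:

```perl
use strict;
use warnings;

# Illustrative: keep a sliding window of (idle, busy) durations measured
# around each select() call, and stop admitting new hosts while the idle
# fraction of the window falls below a threshold.
my @window;                     # [idle_seconds, busy_seconds] samples
my $window_size       = 30;
my $min_idle_fraction = 0.05;   # want >= 5% idle time before adding hosts

sub record_cycle {
    my ($idle, $busy) = @_;
    push @window, [ $idle, $busy ];
    shift @window while @window > $window_size;
}

sub may_start_new_host {
    my ($idle_total, $busy_total) = (0, 0);
    for (@window) {
        $idle_total += $_->[0];
        $busy_total += $_->[1];
    }
    my $total = $idle_total + $busy_total;
    return 1 if $total == 0;    # no samples yet: allow
    return ( $idle_total / $total ) >= $min_idle_fraction;
}

# Simulate a loaded dispatcher: no idle time at all for 30 cycles.
record_cycle(0, 0.010) for 1 .. 30;
print 'loaded: ',     may_start_new_host() ? 'admit' : 'hold', "\n";

# Some idle time returns, so new hosts are admitted again.
record_cycle(0.005, 0.010) for 1 .. 10;
print 'recovering: ', may_start_new_host() ? 'admit' : 'hold', "\n";
```

Raising $min_idle_fraction makes the process "friendlier" (more idle
headroom required); dropping it toward zero approximates the 99.9%
setting described above.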
Sorry for the long bug report, but let me know what you think. I'd
like to help drive this forward and contribute some patches.