Subject: Please add histograms for timing data
NYTProf provides average time per call in its reports, like so:
# spent 31.1ms (2.60+28.5) within Eval::Closure::eval_closure which was called 149 times, avg 209µs/call:
# 126 times (2.31ms+25.9ms) by Class::MOP::Method::Generated::_compile_code at line 63 of Class/MOP/Method/Generated.pm, avg 224µs/call
# 23 times (292µs+2.64ms) by Moose::Meta::TypeConstraint::_actually_compile_type_constraint at line 320 of Moose/Meta/TypeConstraint.pm, avg 128µs/call
However, averages hide a lot. One of the lessons learned from DTrace, e.g. in
http://dtrace.org/blogs/brendan/2011/12/18/visualizing-device-utilization/, is that averages
obscure many kinds of interesting data: the rare spike, bimodal distributions, and so on. At the
Moving to Moose Hackathon we saw a practical example of NYTProf's averages not telling the
whole story. regcomp was taking a significant chunk of a program's time, which NYTProf of
course pointed out like a champ. But we scratched our heads thinking it was a bug in Perl, until
we generated a histogram with DTrace and saw that three quarters of the regcomps were
extremely fast while the remaining quarter were many orders of magnitude slower. Once we had
that piece of information, we instantly saw that the problem was a particular regular expression,
and tweaked it so it no longer triggered the slow path. I'm sure there are many other cases where
a histogram would highlight performance bottlenecks that an average hides.
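To make that concrete with made-up numbers (this is just a throwaway Perl illustration, nothing
from NYTProf itself):

    use strict;
    use warnings;
    use List::Util qw(sum);

    # Fabricated per-call times (microseconds): three quarters of the calls are
    # fast, one quarter are orders of magnitude slower, like the regcomp case.
    my @times_us = ((50) x 112, (40_000) x 37);

    printf "avg %.0fus over %d calls\n", sum(@times_us) / @times_us, scalar @times_us;

    # The single average above gives no hint that the calls fall into two
    # clusters; counting calls per power-of-2 bucket makes it obvious.
    my %bucket;
    $bucket{ 2 ** int( log($_) / log(2) ) }++ for @times_us;
    printf "%7dus: %3d calls\n", $_, $bucket{$_} for sort { $a <=> $b } keys %bucket;

The average alone suggests every call costs roughly 10ms, when in reality three quarters of them
cost 50µs.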
So it would be fantastic if NYTProf's reports included a histogram of how long each call took.
DTrace is again a good model for how to render this kind of data: list each power of 2 in the first
column, representing milliseconds, and in the next column list the number of calls whose duration
fell into that bucket (so, for example, there might be 500 calls that took between 16 and 31ms,
and 4 calls that took between 1024 and 2048ms). That information can then also be rendered
visually. Here's an example of DTrace's output from quantize (note that the first and last values
generally have no hits; they're just there to bookend the data):
         value  ------------- Distribution ------------- count
         32768 |                                         0
         65536 |@@                                       1
        131072 |@@@@@@@@@@@@@@@@@@@@                     9
        262144 |@@@@@@@@@@@@@@@@                         7
        524288 |                                         0
       1048576 |@@                                       1
       2097152 |                                         0
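If it helps to see what I'm picturing, here's a rough standalone Perl sketch of the bucketing and
rendering (just the quantize-style output; I haven't looked at where the per-call times would come
from inside NYTProf, and the 40-column scaling and the zero-count bookend rows simply copy
what DTrace does):

    use strict;
    use warnings;

    # Quantize-style histogram: bucket a list of (positive) per-call times by
    # power of two and draw a scaled @-bar per bucket, DTrace-style.
    sub render_quantize {
        my @times = @_;
        my %bucket;
        $bucket{ int( log($_) / log(2) ) }++ for @times;

        my @exps  = sort { $a <=> $b } keys %bucket;
        my $total = scalar @times;

        printf "%10s  %s %s\n", 'value', '------------- Distribution -------------', 'count';
        # One empty bucket on each side to bookend the data, as DTrace does.
        for my $exp ( $exps[0] - 1 .. $exps[-1] + 1 ) {
            my $count = $bucket{$exp} // 0;
            my $bar   = '@' x int( 40 * $count / $total );
            printf "%10d |%-40s %d\n", 2 ** $exp, $bar, $count;
        }
    }

    # Made-up call times in microseconds, just to exercise the renderer.
    render_quantize( (120) x 9, (300) x 7, 80, 5_000 );

In the HTML reports the same per-bucket counts could of course be drawn as a proper bar chart
rather than @ signs.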
I'd be very happy to implement this if you're interested in the idea and could offer some guidance!