Support more than one statistic for summarizing time lists #127

MaineC · 2023-09-21T08:44:48Z · edited by zkoppert

This commit switches the code for computing averages to use numpy.

This is a first step towards supporting more than one statistic for summarizing lists of issue state-change times (#84).

Averages are easy to compute but sensitive to outliers. In particular, when thinking about SLAs for time to first response, it is often more helpful to look at the median, the quartiles, or even the 90th percentile.
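For illustration, here is a minimal sketch of how the statistics named above can be computed with numpy. The sample response times and variable names are made up for this example and are not taken from the PR's diff.

```python
from datetime import timedelta

import numpy as np

# Hypothetical times to first response (illustrative data only).
response_times = [timedelta(hours=h) for h in (1, 2, 4, 8, 24)]

# numpy works on plain numbers, so convert each timedelta to seconds first.
seconds = np.array([t.total_seconds() for t in response_times])

mean_s = np.mean(seconds)
median_s = np.median(seconds)
q1_s, q3_s = np.percentile(seconds, [25, 75])  # first and third quartile
p90_s = np.percentile(seconds, 90)

# Convert back to timedelta for readable output.
for name, value in [("mean", mean_s), ("median", median_s),
                    ("q1", q1_s), ("q3", q3_s), ("p90", p90_s)]:
    print(name, timedelta(seconds=float(np.round(value))))
```

Note that np.percentile interpolates linearly between neighbouring values by default, so the 90th percentile is not necessarily one of the observed times.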

Author/Contributor

- If documentation is needed for this change, has that been included in this pull request?
- Run make lint and fix any issues that you have introduced
- Run make test and ensure you have test coverage for the lines you are introducing
- Label as either bug, documentation, enhancement, infrastructure, or breaking

MaineC · 2023-10-10T08:52:31Z

This is not yet configurable, but it is a first pass through the code showing where changes would be needed to move to more than one statistic. At first glance, the table doesn't look that bad.

Personally, I always look at more than one statistic (I believe I started doing that when I discovered that R can produce "describe"-style summary stats, back when I first started looking at series of numbers). The question here is which stats exactly to look at.

As for the state of the change:

- There are still references to "average" in some of the function names that really should say "stats".
- While making these changes, I wondered whether it would make sense to pull some of the stats computation out of time_to_first_response, time_to_close, etc.
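One way the second point could look is a single shared helper that each metric module calls. This is only a sketch under assumptions: the function name summarize_durations and the dictionary keys "avg", "med" and "90p" are hypothetical choices for this example, not the names used in the PR.

```python
from datetime import timedelta

import numpy as np


def summarize_durations(durations):
    """Return avg / med / 90p for a list of timedeltas, or None if empty.

    Hypothetical shared helper: the name and the dict keys are
    assumptions made for this sketch.
    """
    if not durations:
        return None
    seconds = [d.total_seconds() for d in durations]
    return {
        "avg": timedelta(seconds=float(np.round(np.mean(seconds)))),
        "med": timedelta(seconds=float(np.round(np.median(seconds)))),
        "90p": timedelta(seconds=float(np.round(np.percentile(seconds, 90)))),
    }


# Both "time to first response" and "time to close" code paths could
# collect their timedeltas and delegate the statistics to the helper.
first_response = [timedelta(minutes=m) for m in (10, 20, 30)]
stats = summarize_durations(first_response)
print(stats["avg"], stats["med"], stats["90p"])
```

Keeping the computation in one place would mean that adding or removing a statistic later touches a single function rather than every metric module.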

MaineC · 2023-10-13T14:07:54Z

In the latest commit I replaced references to averages in places where the values are in fact already a combination of median, average, and 90th percentile.

This commit switches the code for computing averages to use numpy. This is a first step towards supporting more than one statistic for summarizing lists of issue state-change times. Averages are easy to compute but sensitive to outliers. In particular, when thinking about SLAs for time to first response, it is often more helpful to look at the median, the quartiles, or even the 90th percentile.

Currently we only look at averages. This adds the median and the 90th percentile to the issue stats; both are much less sensitive to outliers. (Think of issue cleanup sessions in which very old issues get closed out and then dominate the issue stats at the end of the month.)
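The cleanup-session effect can be made concrete with a small worked example. The numbers are invented for illustration, and the summarize helper is a hypothetical name, not a function from this PR.

```python
import numpy as np


def summarize(days):
    """Mean, median and 90th percentile of close times given in days."""
    values = np.asarray(days, dtype=float)
    return {
        "avg": float(np.mean(values)),
        "med": float(np.median(values)),
        "90p": float(np.percentile(values, 90)),
    }


# Hypothetical month: 19 issues closed after 1 to 19 days.
regular_month = list(range(1, 20))
# Same month, plus one very old issue closed during a cleanup session.
cleanup_month = regular_month + [700]

before = summarize(regular_month)
after = summarize(cleanup_month)

for key in ("avg", "med", "90p"):
    print(f"{key}: {before[key]:.1f} -> {after[key]:.1f} days")
```

With this data the average jumps from 10.0 to 44.5 days, while the median only moves from 10.0 to 10.5 and the 90th percentile from 17.2 to 18.1. The percentile does start to move once the old issues make up roughly a tenth of the sample or more.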

In previous commits, pure averages were replaced by a dict of avg, median, and 90th percentile. This commit makes that switch visible in docs, method names, and variable names.

zkoppert

Tested this out and it's working great! Love the new additional information. Thanks for the improvements @MaineC !

spier · 2023-10-16T20:24:17Z

@MaineC @zkoppert I noticed some changes introduced by this that might not have been intended.
I tried my best to capture them in #144.

MaineC force-pushed the configurable-stats branch from 6c40950 to 1e2ab07 Compare September 21, 2023 09:07

zkoppert added the enhancement and infrastructure labels Sep 22, 2023

MaineC force-pushed the configurable-stats branch from 8650250 to 4b67615 Compare October 10, 2023 08:34

MaineC added 4 commits October 13, 2023 16:10

Fix import order

57ddbc0

Switch to actual median and tidy up lint warnings.

6626a9b

MaineC force-pushed the configurable-stats branch from 3536281 to aabb02f Compare October 13, 2023 14:14

MaineC marked this pull request as ready for review October 13, 2023 14:14

MaineC requested a review from zkoppert as a code owner October 13, 2023 14:14

Replaces references to avg with stats.

6eeef9b

In previous commits, pure averages were replaced by a dict of avg, median, and 90th percentile. This commit makes that switch visible in docs, method names, and variable names.

MaineC force-pushed the configurable-stats branch from aabb02f to 6eeef9b Compare October 13, 2023 14:17

zkoppert approved these changes Oct 16, 2023

zkoppert added breaking and removed breaking labels Oct 16, 2023

zkoppert merged commit 5427ace into github:main Oct 16, 2023
5 checks passed

spier mentioned this pull request Oct 16, 2023

Formatting of stats output #144

Closed

This was referenced Oct 16, 2023

Feature: Report time-quantiles #84

Closed

feat: Add the result images in the README. #150

Closed
