Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support more than one statistic for summarizing time lists #127

Merged
merged 5 commits into from
Oct 16, 2023

Conversation

MaineC
Copy link
Contributor

@MaineC MaineC commented Sep 21, 2023

This commit switches code for computing averages to using numpy.

This is a first step towards supporting more than one statistic for summarizing lists of times for state change of issues. #84

Averages are easy to compute but sensitive to outliers. In particular when thinking of SLAs for times for first responses it is often more helpful to look at median, quartiles or even 90th percentile.

Author/Contributor

  • If documentation is needed for this change, has that been included in this pull request
  • run make lint and fix any issues that you have introduced
  • run make test and ensure you have test coverage for the lines you are introducing
  • Label as either bug, documentation, enhancement, infrastructure, or breaking

@MaineC
Copy link
Contributor Author

MaineC commented Oct 10, 2023

Not yet configurable, but a first run through the code of where changes would be needed to move to more than one statistic. At a first glance, the table doesn't look that bad.

At least me, I always look at more than one statistic (I believe I started that back when I discovered the possibility to create "describe" stats with R when starting to look at series of numbers). The question here is which stats exactly to look at.

As for the state of the change:

  • there are still references to "average" in some of the function names that really should be stats
  • going through making these changes I was wondering if it would make sense to pull some of the stats computation out of time_to_first_response, time_to_close etc.

@MaineC
Copy link
Contributor Author

MaineC commented Oct 13, 2023

In the latest commit I replaced references to averages where in reality they already are a combination of median, average, 90th percentile.

This commit switches code for computing averages to using numpy.

This is a first step towards supporting more than one statistic for
summarizing lists of times for state change of issues.

Averages are easy to compute but sensitive to outliers. In particular
when thinking of SLAs for times for first responses it is often more
helpful to look at median, quartiles or even 90th percentile.
Currently we are only looking at averages. This adds median and
90th percentile to issues stats. Those are much less sensitive
to outliers. (Think issue cleanup sessions where very old issues
get closed out but then dominate the issue stats at the end of the
month).
@MaineC MaineC marked this pull request as ready for review October 13, 2023 14:14
@MaineC MaineC requested a review from zkoppert as a code owner October 13, 2023 14:14
In previous commits pure averages were replace by a dict of avg, median, 90th percentile.
This makes that switch visible in docs, method names, variable names.
Copy link
Member

@zkoppert zkoppert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested this out and its working great! Love the new additional information. Thanks for the improvements @MaineC !

@zkoppert zkoppert merged commit 5427ace into github:main Oct 16, 2023
5 checks passed
@spier spier mentioned this pull request Oct 16, 2023
@spier
Copy link
Contributor

spier commented Oct 16, 2023

@MaineC @zkoppert I noticed some changes introduced by this that might not have been intended.
Tried my best to capture them in #144.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request infrastructure
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants