Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure even distribution of versions across interval/range #54

Merged
merged 1 commit into from
Aug 21, 2020

Conversation

owenfarrell
Copy link
Member

@owenfarrell owenfarrell commented Jul 21, 2020

This change implements a request by @vito to generate unique versions per pipeline.

This implementation hashes a combination of the BUILD_TEAM_NAME and BUILD_PIPELINE_NAME environmental variables using the FNV-1a algorithm. I chose FNV-1a based as the "best bang for the buck" (best distribution for algorithms readily available in Go).

The Offset function then determines what percentile the hashed value falls in, relative to the maximum possible value, and returns a time that is in the same percentile of the specified range/interval (as calculated by the TimeLord instance). All math rounds down to the nearest minute to align with the existing precision provided by this resource.

I think this approach creates:

  • an even distribution of versions generated by different pipelines across an identical range. So if lots of pipelines have resources that define a span of the same 4 hour range or are on a 4 hour interval, those pipelines will progressively trigger across the range/interval (but I need some help checking the logic here)
  • identical percentiles for versions generated by the same pipeline across different ranges. So if a pipeline has multiple resources defined that span a 4 hour range or are on a 4 hour interval, all of those resources will generate the exact same offset for each resource instance (🤔 is this is a problem?)

I know the common approach to these kinds of requests is to do some bitwise manipulation of inputs. But I took this approach in hopes of implementing a mechanism that is good enough to meet the need while promoting readability and maintainability.

One implication to this percentile approach is that non-standard days (e.g. the start/end of DST) will trigger pipelines in the same order as any other day. If/when days are 23 or 25 hours, the time that each pipeline triggers will adjust slightly (up to ~57 minutes), but each pipeline will have a similar offset. There won't be any wild swings of multiple hours due to a locale-related time zone change.

NOTE: This implementation is intentionally not integrated in to any of the executables and does not address any documentation updates (in order to avoid confusion).

Copy link
Member

@vito vito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in #53, sorry for the long wait. 🙏

The approach makes a lot of sense to me!

identical percentiles for versions generated by the same pipeline across different ranges. So if a pipeline has multiple resources defined that span a 4 hour range or are on a 4 hour interval, all of those resources will generate the exact same offset for each resource instance (🤔 is this is a problem?)

It is indeed a problem, but not one we need to solve today, and probably not a huge deal anyway. In the future, I plan to address this by modeling time as a var source, rather than a resource, which will allow for per-job intervals. For now, per-pipeline is as good as it gets.

All in all, I think this looks good, but I do have some somewhat minor feedback before merging.

offset.go Show resolved Hide resolved
offset_test.go Show resolved Hide resolved
Copy link
Member

@vito vito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving again but have another tiny nitpick in case you agree with my suggestion. Thanks!

offset_test.go Outdated
reference = time.Date(2012, time.November, 4, 13, now.Minute(), now.Second(), now.Nanosecond(), loc)
})

dayDuration = 25 * time.Hour
Copy link
Member

@vito vito Aug 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These "naked" (out of BeforeEach) assignments to a shared var here are a little spooky since they require somewhat intimate knowledge of how things are evaluated in the test framework, while looking like the other regular old vars which are set during test setup/execution.

Maybe this could be passed as an argument to RunIntervalAndOrRangeTests instead? 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I don't think there's a good reason for this assignment to live outside of a BeforeEach setup. Similarly, I moved the rangeDuration assignment in to a BeforeEach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better - thanks! Gonna go ahead and merge.

offset_test.go Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants