state: remove TimeTable and rely on objects' modify times instead #24112

pkazmierczak · 2024-10-02T14:45:03Z

The core scheduler relies on a special table in the state store, the TimeTable, to figure out which objects can be GC'd. The TimeTable correlates Raft indices with objects' insertion times, a solution we used before most of the objects we store in the state contained timestamps. This introduced some memory overhead and complexity, but most importantly it meant that any GC threshold users set greater than timeTableLimit = 72 * time.Hour was ignored. This PR removes the TimeTable and relies on object timestamps to determine whether they can be GC'd.

Fixes #16359
Resolves #17233

internal ref: https://hashicorp.atlassian.net/browse/NET-10269

schmichael

Will terminal objects that didn't have a timestamp before be GC'd on their next gc interval? That seems fine.

I think we should make sure to really call this out in Version Specific Upgrade notes. While the vast majority of users shouldn't notice anything, I think there are 2 classes of users who will:

1. Users who either intentionally or unintentionally used gc thresholds > 72h are in for a surprise.
2. Users, especially those with custom high gc thresholds, with lots of terminal objects that suddenly get GC'd "early" post-upgrade (as long as my understanding of the upgrade path is accurate).

This is a pretty astounding PR from an archeological standpoint: lots of 2015 Armon code getting axed! The fact that it Just Worked for most use cases for almost a decade is quite an achievement! 🎉

schmichael · 2024-10-31T22:53:12Z

nomad/core_sched_test.go

+	eval.CreateTime = time.Now().Add(-6 * time.Hour).UnixNano() // make sure objects we insert are older than GC thresholds
+	eval.ModifyTime = time.Now().Add(-5 * time.Hour).UnixNano()


This code is fine in this test, but in general I prefer using hardcoded times to avoid flakes. For example, if you run a test that relies on Now().Add(-6) being 1 hour before Add(-5) at just the wrong time, it will fail because -6 and -5 are the same time: https://go.dev/play/p/JpWEEWUO7i4

Using Now().UTC() would also fix this since UTC is blessedly free from daylight saving, but I think there are other fun time hijinks that could occur (leap seconds?).

Again: this test is fine, but in the future let's hardcode a time.Date(...).

schmichael · 2024-10-31T23:02:26Z

nomad/core_sched_test.go

+			ModifyTime:       now.UnixNano(),
+			CutoffTime:       now.Add(-1 * time.Hour),


Oh hey! I think this is susceptible to daylight saving time bugs! Don't change it to half an hour either, as that could still break if @philrenaud runs the tests at just the wrong time while visiting Newfoundland: https://en.wikipedia.org/wiki/Newfoundland_Time_Zone (I may enjoy timezone hijinks too much.)

tgross

LGTM once comments are resolved.

We should also include both a changelog entry and an upgrade guide note for 1.9.2. One thing in particular that stands out is that because some objects like Deployments are created in the scheduler (which could be on a follower), this requires that servers have at least roughly-sync'd clocks for correct GC. We probably already do implicitly require this somewhere in the code base but it'd be good to have a note about it just in case someone is doing something weird and has been getting away with it for a while.

schmichael · 2024-11-01T15:42:43Z

> One thing in particular that stands out is that because some objects like Deployments are created in the scheduler (which could be on a follower), this requires that servers have at least roughly-sync'd clocks for correct GC.

The TimeTable also had the problem of using followers' clocks. I have typed and deleted a lot of conjectures around local skew vs remote skew and what the optimal way to handle time in a distributed system is... but I don't think any of it is worth worrying about. Our time-based GC can't escape clock skew problems of some kind. Even if we always used the leader's time for insertion and sent the leader's time with gc evals... leadership can change at any time! Clock skew is inescapable, and hopefully our users understand that any time-based parameters are only as accurate as their clocks.

From follow-up PR #24389 (gc: be consistent with setting create/modify timestamp tz): Whenever we set an object's create/modify time, we should always use UTC. #24112 introduced some inconsistencies in this area, and that PR fixes them.

From follow-up PR #24412 (fsm: fix bug in snapshot restore for removed timetable): When we removed the time table in #24112, we introduced a bug where, if a previous version of Nomad had written a time table entry, we'd return from the restore loop early and never load the rest of the FSM. This results in a mostly or partially wiped state for that Nomad node, which would then be out of sync with its peers (which would also have the same problem on upgrade). The bug only occurs when the FSM is being restored from a snapshot, which isn't the case if you test with a server that has only written Raft logs and not snapshotted them. While fixing this bug, we still need to ensure we read the time table entries even if we throw them away, so that we move the snapshot reader along to the next full entry. Fixes: #24411

pkazmierczak marked this pull request as draft October 2, 2024 15:43

pkazmierczak force-pushed the f-gc-limits-3-days branch from b01bf87 to 0643dc0 Compare October 21, 2024 09:45

pkazmierczak self-assigned this Oct 30, 2024

pkazmierczak added 16 commits October 30, 2024 11:29

fsm: adjust timeTableLimit according to longest GC threshold

5741bc7

simplify

807efc5

remove timeTable from fsm

5e79452

remove timetable

acd05b0

remove tt completely

811cde5

adjust core sched for jobs, nodes and deployments

9763110

add create and modify time to deployments

e79d38f

remove threshold index from other objects in the core scheduler

065f03e

i love that we mix unix and unixnano

fdead8e

csi volumes create/modify time

4583cb6

deployment create/modify times

11e29a0

csi plugin create/modify time on upsert

a97b72b

oh this is tedious

b3fc1cb

i am miserable now

a3b1538

removed time.Now from fsm and state store methods

e92767d

signatures change

d5c378c

pkazmierczak force-pushed the f-gc-limits-3-days branch from cdf7e35 to d5c378c Compare October 30, 2024 13:41

pkazmierczak added 2 commits October 30, 2024 14:46

revert more tests

0d25ee1

state store and test fixes

2e722c6

pkazmierczak added 2 commits October 31, 2024 17:56

TestCoreScheduler_CSIPluginGC fix

3e1feee

pruneUnblockIndexes

d88e6b9

pkazmierczak marked this pull request as ready for review October 31, 2024 20:44

pkazmierczak requested review from schmichael and tgross October 31, 2024 20:44

schmichael approved these changes Oct 31, 2024

review comments

6f60a45

tgross approved these changes Nov 1, 2024

pkazmierczak added 2 commits November 1, 2024 14:02

Tim's comment about API package UnixNano explanations

c31a097

cl

a3ff3a2

pkazmierczak added 2 commits November 1, 2024 19:18

review comment

3e43b56

TestCoreScheduler_EvalGC_Batch fix

da2d741

pkazmierczak added the backport/1.9.x backport to 1.9.x release line label Nov 1, 2024

upgrade guide entry

752d927

pkazmierczak merged commit f7847c6 into main Nov 1, 2024
28 checks passed

pkazmierczak deleted the f-gc-limits-3-days branch November 1, 2024 18:38

hc-github-team-nomad-core mentioned this pull request Nov 1, 2024

Backport of state: remove TimeTable and rely on objects' modify times instead into release/1.9.x #24355

Merged

pkazmierczak mentioned this pull request Nov 7, 2024

gc: be consistent with setting create/modify timestamp tz #24389

Merged

hc-github-team-nomad-core mentioned this pull request Nov 7, 2024

Backport of gc: be consistent with setting create/modify timestamp tz into release/1.9.x #24400

Merged

tgross mentioned this pull request Nov 9, 2024

fsm: fix bug in snapshot restore for removed timetable #24412

Merged

hc-github-team-nomad-core mentioned this pull request Nov 11, 2024

Backport of fsm: fix bug in snapshot restore for removed timetable into release/1.9.x #24417

Merged
