
Modeling linearizable keys with the list-append workload #27

Open
snaury opened this issue Feb 12, 2024 · 7 comments

snaury commented Feb 12, 2024

I'm trying to test our database with Jepsen, and I'm having trouble modeling our (probably unusual) guarantees. Currently our transactions guarantee serializable isolation, plus non-stale reads (all committed changes must be visible to subsequent transactions), and due to the implementation they are probably linearizable per key (or even per shard). Due to single-shard optimizations we are not yet strict-serializable, however: independent single-key writes may be assigned local shard timestamps, which can appear reordered from a global-time perspective.

Example of a possible history:

(require '[clojure.pprint :refer [pprint]]
         '[elle.list-append :as a]
         '[jepsen.history :as h])

(def h (h/history
        [{:process 0, :time 1000, :type :invoke, :f :txn, :value [[:r :x nil] [:r :y nil]]}
         {:process 1, :time 1001, :type :invoke, :f :txn, :value [[:append :x 1]]}
         {:process 1, :time 1010, :type :ok, :f :txn, :value [[:append :x 1]]}
         {:process 1, :time 1011, :type :invoke, :f :txn, :value [[:append :y 2]]}
         {:process 1, :time 1020, :type :ok, :f :txn, :value [[:append :y 2]]}
         {:process 0, :time 1030, :type :ok, :f :txn, :value [[:r :x []] [:r :y [2]]]}]))

(pprint (a/check {:consistency-models [:serializable]} h))
(pprint (a/check {:consistency-models [:strict-serializable]} h))

This history is obviously serializable, but not strict serializable.

Consider this history however:

(def h (h/history
        [{:process 1, :time 1001, :type :invoke, :f :txn, :value [[:append :x 1]]}
         {:process 1, :time 1010, :type :ok, :f :txn, :value [[:append :x 1]]}
         {:process 1, :time 1011, :type :invoke, :f :txn, :value [[:r :x nil]]}
         {:process 1, :time 1020, :type :ok, :f :txn, :value [[:r :x []]]}]))

This history is also serializable, and not strict serializable. But it also exhibits a stale read, and is not per-key linearizable.

Unfortunately, both :strong-read-committed and :linearizable report this history as valid for some reason:

...=> (pprint (a/check {:consistency-models [:strong-read-committed]} h))
{:valid? true}
nil
...=> (pprint (a/check {:consistency-models [:linearizable]} h))
{:valid? true}
nil

I'm not sure whether that's because :strong-read-committed and :linearizable are not implemented, or whether I misunderstand what they mean. Unfortunately, testing with :strict-serializable reports G-single-item-realtime and G-nonadjacent-item-realtime anomalies, which appear to be due to the reordered writes. And :serializable is too weak.

I'm a novice in Clojure, so I don't quite understand how to extend Elle to do what I want. Can you please point me in the right direction on how to check for additional realtime restrictions on top of serializable isolation?

aphyr commented Feb 12, 2024 via email

snaury commented Feb 12, 2024

> I think the easiest way to get to this is probably to write an alternative variant of elle.core/realtime-graph (on my phone, guessing at the name) which partitions the realtime graph it builds by separate keys. Redefine the existing Elle realtime graph fn to be your custom version. Then ask for strict serializability.

Oh, thanks! I was thinking about using :additional-graphs to add a modified realtime-graph, although txn/additional-graphs seems to add elle/realtime-graph automatically whenever realtime-anomaly-types is present. I didn't realize it could be redefined; I'll look into how to do that (I guess it's possible using with-redefs).
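
For concreteness, the with-redefs approach might look like this. This is only a sketch: per-key-realtime-graph is a hypothetical stand-in for the per-key variant that still has to be written, and I'm assuming elle.core/realtime-graph is the right var to redefine:

```clojure
(require '[elle.core :as elle]
         '[elle.list-append :as a])

;; Hypothetical per-key variant of elle.core/realtime-graph. A real
;; implementation would emit realtime edges only between transactions
;; that touch the same key, and return an Elle-compatible
;; {:graph ... :explainer ...} map.
(defn per-key-realtime-graph [history]
  ;; ... build per-key realtime edges here ...
  )

;; Temporarily swap in the per-key graph, then ask for strict
;; serializability, so only per-key realtime anomalies are reported.
(with-redefs [elle/realtime-graph per-key-realtime-graph]
  (a/check {:consistency-models [:strict-serializable]} h))
```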

aphyr commented Feb 12, 2024 via email

snaury commented Feb 12, 2024

> Another trick I've used is to test only a projection of the history. For example, you could check the whole history for serializable, then project the history to just transactions on a single key, and check that for strict serializability. Repeat for all keys. That would get you per-key linearizability!

Interesting. I wrote a serializability checker in Python some years ago, which helped me find quite a few post-refactoring out-of-order anomalies back then (it could only handle a few seconds of shotgun-style transactions, though, since it used an rw-register-style workload and graph inference was very inefficient). One of the most interesting cases I had to investigate was reordering in staircase-style transactions, where e.g. tx 1 works on keys 1 and 2, tx 2 works on keys 2 and 3, tx 3 works on keys 3 and 4, and so on, until finally some tx N works on both keys N and 1 (I don't remember the details too well), with some additional transactions adding linearizability edges. For an rw-register-style workload those additional edges were very important for inference, so I assumed they would be important here too.

But the append workload is a lot less ambiguous: every read establishes a full order over all prior writes to a given key, so only the order between multiple same-result reads is ambiguous, even when the history is filtered to a single key. I wonder whether the additional edges lost due to filtering matter, or whether any violation involving them would already cause serializability anomalies. I can't think of a counterexample so far (e.g. where a tx would have a read reordered with a write for one key, but be causally restricted via a different key, without also being causally restricted with the first key or causing anomalies).

Just to be on the safe side I would prefer checking the unfiltered graph for now, but thanks for the idea!

snaury commented Feb 12, 2024

I think I managed to construct a counterexample:

tx1          [:append 1 1]               [:append 3 1]
 |  tx2                      [:r 2 nil]  [:append 3 2]
 |      tx3  [:r 1 nil/[1]]  [:r 2 nil]
 |  tx4                      [:r 2 nil]  [:append 3 3]
tx5                                      [:r 3 [1 2 3]]

Or with code:

(require '[clojure.pprint :refer [pprint]]
         '[elle.list-append :as a]
         '[jepsen.history :as h])

(def h (h/history
	[{:process 0, :time 1000, :type :invoke, :f :txn, :value [[:append 1 1] [:append 3 1]]}
	 {:process 1, :time 1010, :type :invoke, :f :txn, :value [[:r 2 nil] [:append 3 2]]}
	 {:process 1, :time 1020, :type :ok,     :f :txn, :value [[:r 2 nil] [:append 3 2]]}
	 {:process 2, :time 1030, :type :invoke, :f :txn, :value [[:r 1 nil] [:r 2 nil]]}
	 {:process 2, :time 1040, :type :ok,     :f :txn, :value [[:r 1 nil] [:r 2 nil]]} ; [:r 1 [1]] passes
	 {:process 1, :time 1050, :type :invoke, :f :txn, :value [[:r 2 nil] [:append 3 3]]}
	 {:process 1, :time 1060, :type :ok,     :f :txn, :value [[:r 2 nil] [:append 3 3]]}
	 {:process 0, :time 1070, :type :ok,     :f :txn, :value [[:append 1 1] [:append 3 1]]}
	 {:process 0, :time 1080, :type :invoke, :f :txn, :value [[:r 3 nil]]}
	 {:process 0, :time 1090, :type :ok,     :f :txn, :value [[:r 3 [1 2 3]]]}]))

(pprint (a/check {:consistency-models [:serializable]} h))
(pprint (a/check {:consistency-models [:strict-serializable]} h))

Note how tx1 and tx3 overlap in time, and [:r 1] can return either nil or [1]. Suppose it returns nil:

  • When checking with :serializable, tx3 can be reordered to the top, and the history passes.
  • Filtered to key 1, there are only tx1 and tx3, with no realtime violations: [:append 1 1] and [:r 1] overlap in time, so either order is valid.
  • Filtered to key 2, there are tx2, tx3 and tx4, with no realtime violations either; besides, any order would be valid.
  • Filtered to key 3, there are tx1, tx2, tx4 and tx5, and they match realtime order.
  • Treated as a whole, tx3 is sandwiched between tx2 and tx4, and the order of tx1, tx2 and tx4 is established by key 3; but the nil result of [:r 1] in tx3 implies tx3 happened before tx1, a contradiction.

Maybe I misunderstood how the history should be projected, but this counterexample seems to show that filtering out non-matching keys wouldn't work. And including non-matching ops wouldn't work either? (E.g. it might include a :r of a key while skipping some :append.)
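
To make the per-key filtering above concrete, here's roughly how I understood the projection (a hypothetical helper; it assumes each op's :value is a vector of [f k v] micro-ops, as in the histories above):

```clojure
(require '[jepsen.history :as h])

;; Keep only the micro-ops that touch key k, dropping ops whose
;; transactions become empty. Sketch only: invocations with nil read
;; values are kept as-is.
(defn project-to-key [history k]
  (->> history
       (keep (fn [op]
               (let [txn' (filterv (fn [[_ mk _]] (= mk k))
                                   (:value op))]
                 (when (seq txn')
                   (assoc op :value txn')))))
       (h/history)))
```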

snaury commented Feb 13, 2024

I think I managed to write a modified realtime-graph that adds per-key realtime edges, along with some tests that I expected to pass or fail:

https://gist.github.com/snaury/eeac381106a377c3d3f6069157cd37e4

Not sure whether it's efficient, but it seems to work OK so far (on histories of ~10 minutes). To be honest, I can't think of a way to turn it into a proper PR for Elle. I thought about duplicating the -realtime anomalies with a -realtime-per-key suffix, then dispatching those to a modified realtime-graph, then somehow hooking all that up to a :linearizable model, but I'm at a loss for how to do the latter parts, especially since strict-serializable anomalies would need to match only when :linearizable and :serializable are combined. It looks way too messy and hacky even before starting.
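
For anyone skimming, the core idea boils down to something like this (a condensed sketch with hypothetical helper names; keys-of, ops-touching, and graph-union don't exist in Elle, and the real code has to return Elle-compatible {:graph ... :explainer ...} maps rather than bare graphs):

```clojure
(require '[elle.core :as elle])

;; Sketch: union of realtime edges computed independently per key,
;; rather than one global realtime order.
(defn per-key-realtime-graph [history]
  (->> (keys-of history)               ; every key the history touches
       (map (fn [k]
              ;; realtime edges among only the txns touching k
              (elle/realtime-graph (ops-touching history k))))
       (reduce graph-union)))
```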

Anyway, thanks for your help and suggestions, it was spot on! :)

aphyr commented Feb 13, 2024

Ah, yes, quite right: if you want to couple the per-key linearizable order to full transaction serializable order, you'll need a combined graph--splitting up the history and checking keys individually won't work. It sounds like you've managed to get something like that working though!

I'm honestly not sure how we'd go about establishing a linearizable consistency model either. If this particular use case is common, it might be best to add a specific option that uses the per-key realtime-graph as opposed to the normal global realtime-graph. The per-key one should be... not a subset, exactly, but compatible with the global graph, so calling those edges "realtime" is probably fine; then we can re-use all the current anomalies prohibited under strict-1SR.
