
Modeling linearizable keys with the list-append workload #27

Open
snaury opened this issue Feb 12, 2024 · 7 comments

snaury commented Feb 12, 2024

I'm trying to test our database with Jepsen, and I'm having trouble modeling our (probably unusual) guarantees. Currently our transactions guarantee serializable isolation, plus non-stale reads (all committed changes must be visible to subsequent transactions), and due to the implementation they are probably linearizable per key (or even per shard). Due to single-shard optimizations we are not yet strict-serializable, however: independent single-key writes may be assigned local shard timestamps, which can appear reordered from a global-time perspective.

Example of a possible history:

(require '[clojure.pprint :refer [pprint]]
         '[elle.list-append :as a]
         '[jepsen.history :as h])

(def h (h/history
        [{:process 0, :time 1000, :type :invoke, :f :txn, :value [[:r :x nil] [:r :y nil]]}
         {:process 1, :time 1001, :type :invoke, :f :txn, :value [[:append :x 1]]}
         {:process 1, :time 1010, :type :ok, :f :txn, :value [[:append :x 1]]}
         {:process 1, :time 1011, :type :invoke, :f :txn, :value [[:append :y 2]]}
         {:process 1, :time 1020, :type :ok, :f :txn, :value [[:append :y 2]]}
         {:process 0, :time 1030, :type :ok, :f :txn, :value [[:r :x []] [:r :y [2]]]}]))

(pprint (a/check {:consistency-models [:serializable]} h))
(pprint (a/check {:consistency-models [:strict-serializable]} h))

This history is obviously serializable, but not strict serializable.

Consider this history however:

(def h (h/history
        [{:process 1, :time 1001, :type :invoke, :f :txn, :value [[:append :x 1]]}
         {:process 1, :time 1010, :type :ok, :f :txn, :value [[:append :x 1]]}
         {:process 1, :time 1011, :type :invoke, :f :txn, :value [[:r :x nil]]}
         {:process 1, :time 1020, :type :ok, :f :txn, :value [[:r :x []]]}]))

This history is also serializable, and not strict serializable. But it also exhibits a stale read, and is not per-key linearizable.

Unfortunately, both :strong-read-committed and :linearizable report this history as valid for some reason:

...=> (pprint (a/check {:consistency-models [:strong-read-committed]} h))
{:valid? true}
nil
...=> (pprint (a/check {:consistency-models [:linearizable]} h))
{:valid? true}
nil

I'm not sure whether that's because :strong-read-committed and :linearizable are not implemented, or whether I misunderstand what they mean. Unfortunately, testing with :strict-serializable reports G-single-item-realtime and G-nonadjacent-item-realtime anomalies, which appear to be due to the reordered writes. And :serializable is too weak.

I'm a novice in Clojure, so I don't quite understand how to extend Elle to do what I want. Can you please point me in the right direction on how to check for additional realtime restrictions on top of serializable isolation?

aphyr commented Feb 12, 2024 via email

snaury commented Feb 12, 2024

> I think the easiest way to get to this is probably to write an alternative variant of elle.core/realtime-graph (on my phone, guessing at the name) which partitions the realtime graph it builds by separate keys. Redefine the existing Elle realtime graph fn to be your custom version. Then ask for strict serializability.

Oh, thanks! I was thinking about using :additional-graphs to add a modified realtime-graph, although txn/additional-graphs seems to add elle/realtime-graph automatically whenever realtime-anomaly-types is present. I didn't realize it could be redefined; I'll look into how to do that (I guess it's possible using with-redefs).
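
For concreteness, the with-redefs approach might look like this. This is only a sketch: per-key-realtime-graph is a hypothetical stand-in for the per-key variant that still has to be written, and I'm assuming elle.core/realtime-graph is the right var to redefine:

```clojure
(require '[elle.core :as elle]
         '[elle.list-append :as a])

;; Hypothetical per-key variant of elle.core/realtime-graph. A real
;; implementation would emit realtime edges only between transactions
;; that touch the same key, and return an Elle-compatible
;; {:graph ... :explainer ...} map.
(defn per-key-realtime-graph [history]
  ;; ... build per-key realtime edges here ...
  )

;; Temporarily swap in the per-key graph, then ask for strict
;; serializability, so only per-key realtime anomalies are reported.
(with-redefs [elle/realtime-graph per-key-realtime-graph]
  (a/check {:consistency-models [:strict-serializable]} h))
```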

aphyr commented Feb 12, 2024 via email

snaury commented Feb 12, 2024

> Another trick I've used is to test only a projection of the history. For example, you could check the whole history for serializable, then project the history to just transactions on a single key, and check that for strict serializability. Repeat for all keys. That would get you per-key linearizability!

Interesting. I wrote a serializability checker in Python some years ago, which helped me find quite a few post-refactoring out-of-order anomalies back then (it could only handle a few seconds of shotgun-style transactions, though, since it used an rw-register-style workload and graph inference was very inefficient). One of the most interesting cases I had to investigate was reordering in staircase-style transactions, where e.g. tx 1 works on keys 1 and 2, tx 2 works on keys 2 and 3, tx 3 works on keys 3 and 4, and so on, until finally some tx N works on both keys N and 1 (I don't remember the details too well), with some additional transactions adding linearizability edges. For an rw-register-style workload those additional edges were very important for inference, so I assumed they would be important here too.

But the append workload is a lot less ambiguous: every read establishes a full order over all prior writes to a given key, so only the order between multiple same-result reads is ambiguous, even when the history is filtered to a single key. I wonder whether the additional edges lost due to filtering matter, or whether any violation involving them would already cause serializability anomalies. I can't think of a counterexample so far (e.g. where a tx would have a read reordered with a write for one key, but be causally restricted via a different key, without also being causally restricted with the first key or causing anomalies).

Just to be on the safe side I would prefer checking the unfiltered graph for now, but thanks for the idea!

snaury commented Feb 12, 2024

I think I managed to construct a counterexample:

tx1          [:append 1 1]               [:append 3 1]
 |  tx2                      [:r 2 nil]  [:append 3 2]
 |      tx3  [:r 1 nil/[1]]  [:r 2 nil]
 |  tx4                      [:r 2 nil]  [:append 3 3]
tx5                                      [:r 3 [1 2 3]]

Or with code:

(require '[clojure.pprint :refer [pprint]]
         '[elle.list-append :as a]
         '[jepsen.history :as h])

(def h (h/history
	[{:process 0, :time 1000, :type :invoke, :f :txn, :value [[:append 1 1] [:append 3 1]]}
	 {:process 1, :time 1010, :type :invoke, :f :txn, :value [[:r 2 nil] [:append 3 2]]}
	 {:process 1, :time 1020, :type :ok,     :f :txn, :value [[:r 2 nil] [:append 3 2]]}
	 {:process 2, :time 1030, :type :invoke, :f :txn, :value [[:r 1 nil] [:r 2 nil]]}
	 {:process 2, :time 1040, :type :ok,     :f :txn, :value [[:r 1 nil] [:r 2 nil]]} ; [:r 1 [1]] passes
	 {:process 1, :time 1050, :type :invoke, :f :txn, :value [[:r 2 nil] [:append 3 3]]}
	 {:process 1, :time 1060, :type :ok,     :f :txn, :value [[:r 2 nil] [:append 3 3]]}
	 {:process 0, :time 1070, :type :ok,     :f :txn, :value [[:append 1 1] [:append 3 1]]}
	 {:process 0, :time 1080, :type :invoke, :f :txn, :value [[:r 3 nil]]}
	 {:process 0, :time 1090, :type :ok,     :f :txn, :value [[:r 3 [1 2 3]]]}]))

(pprint (a/check {:consistency-models [:serializable]} h))
(pprint (a/check {:consistency-models [:strict-serializable]} h))

Note how tx1 and tx3 overlap in time, and [:r 1] can return either nil or [1]. Suppose it returns nil:

  • When checking with :serializable, tx3 can be reordered to the top, and the history passes.
  • Filtered to key 1, there are only tx1 and tx3, with no realtime violations: [:append 1 1] and [:r 1] overlap in time, so either order is valid.
  • Filtered to key 2, there are tx2, tx3 and tx4, with no realtime violations either; besides, any order would be valid.
  • Filtered to key 3, there are tx1, tx2, tx4 and tx5, and they match realtime order.
  • Treated as a whole, tx3 is sandwiched between tx2 and tx4, and the order of tx1, tx2 and tx4 is established by key 3; but the nil result of [:r 1] in tx3 implies tx3 happened before tx1, a contradiction.

Maybe I misunderstood how the history should be projected, but this counterexample seems to show that filtering out non-matching keys wouldn't work. And including non-matching ops wouldn't work either? (E.g. it might include a :r of a key while skipping some :append.)
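
To make the per-key filtering above concrete, here's roughly how I understood the projection (a hypothetical helper; it assumes each op's :value is a vector of [f k v] micro-ops, as in the histories above):

```clojure
(require '[jepsen.history :as h])

;; Keep only the micro-ops that touch key k, dropping ops whose
;; transactions become empty. Sketch only: invocations with nil read
;; values are kept as-is.
(defn project-to-key [history k]
  (->> history
       (keep (fn [op]
               (let [txn' (filterv (fn [[_ mk _]] (= mk k))
                                   (:value op))]
                 (when (seq txn')
                   (assoc op :value txn')))))
       (h/history)))
```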

snaury commented Feb 13, 2024

I think I managed to write a modified realtime-graph that adds per-key realtime edges, along with some tests that I expected to pass or fail:

https://gist.github.com/snaury/eeac381106a377c3d3f6069157cd37e4

Not sure whether it's efficient, but it seems to work OK so far (on histories of ~10 minutes). To be honest, I can't think of a way to turn it into a proper PR for Elle. I thought about duplicating the -realtime anomalies with a -realtime-per-key suffix, then dispatching those to a modified realtime-graph, then somehow hooking all that up to a :linearizable model, but I'm at a loss for how to do the latter parts, especially since strict-serializable anomalies would need to match only when :linearizable and :serializable are combined. It looks way too messy and hacky even before starting.
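
For anyone skimming, the core idea boils down to something like this (a condensed sketch with hypothetical helper names; keys-of, ops-touching, and graph-union don't exist in Elle, and the real code has to return Elle-compatible {:graph ... :explainer ...} maps rather than bare graphs):

```clojure
(require '[elle.core :as elle])

;; Sketch: union of realtime edges computed independently per key,
;; rather than one global realtime order.
(defn per-key-realtime-graph [history]
  (->> (keys-of history)               ; every key the history touches
       (map (fn [k]
              ;; realtime edges among only the txns touching k
              (elle/realtime-graph (ops-touching history k))))
       (reduce graph-union)))
```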

Anyway, thanks for your help and suggestions, it was spot on! :)

aphyr commented Feb 13, 2024

Ah, yes, quite right: if you want to couple the per-key linearizable order to full transaction serializable order, you'll need a combined graph--splitting up the history and checking keys individually won't work. It sounds like you've managed to get something like that working though!

I'm honestly not sure how we'd go about establishing a linearizable consistency model either. If this particular use case is common, it might be best to add a specific option that uses the per-key realtime-graph as opposed to the normal global realtime-graph. The per-key one should be... not a subset, exactly, but compatible with the global graph, so calling those edges "realtime" is probably fine; then we can re-use all the current anomalies prohibited under strict-1SR.
