-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kafka workflow crashes with NullPointerException #79
Comments
That's probably best understood as a checker bug--I'll bet you that part of the checker is trying to find e.g. the time between the first and last successful ops, and if there aren't any, it blows up, or something like that. Feel free to dig into the Kafka checker in Jepsen if you'd like to fix this one--I imagine it should be a simple check to make sure at least a handful of required ops actually exist.On Sep 13, 2023 05:37, Andrea Ambu ***@***.***> wrote:
If maelstrom gets a binary that ignores all the kafka commands, for example like this modified demo/clojure/echo.clj, then maelstrom fails with a NullPointerException rather than some useful errors.
For other workflows, eg broadcast/g-counter, maelstroms fails with some timeout exceptions that are more understandable.
Is this the correct way to fail for the kafka workflow?
$ lein run -- test -w kafka --bin demo/clojure/lenient_echo.clj # see gist
[...]
INFO [2023-09-13 10:24:40,229] jepsen node n1 - maelstrom.net Shutting down Maelstrom network
INFO [2023-09-13 10:24:40,233] jepsen test runner - jepsen.core Analyzing...
INFO [2023-09-13 10:24:40,294] clojure-agent-send-off-pool-5 - jepsen.tests.kafka Wrote /Users/ndr/coding/maelstrom-src/maelstrom/store/kafka/20230913T102325.079+0100/consume-counts.edn
WARN [2023-09-13 10:24:40,295] clojure-agent-send-off-pool-5 - jepsen.checker Error while checking history:
java.lang.NullPointerException: Cannot invoke "java.lang.Number.doubleValue()" because "x" is null
at clojure.lang.Numbers.divide(Numbers.java:3899)
at jepsen.util$nanos__GT_secs.invokeStatic(util.clj:386)
at jepsen.util$nanos__GT_secs.invoke(util.clj:386)
at clojure.core$update.invokeStatic(core.clj:6231)
at clojure.core$update.invoke(core.clj:6223)
at jepsen.tests.kafka$checker$reify__19272.check(kafka.clj:2080)
at jepsen.checker$check_safe.invokeStatic(checker.clj:86)
at jepsen.checker$check_safe.invoke(checker.clj:79)
at jepsen.checker$compose$reify__10709$fn__10711.invoke(checker.clj:102)
at clojure.core$pmap$fn__8552$fn__8553.invoke(core.clj:7089)
at clojure.core$binding_conveyor_fn$fn__5823.invoke(core.clj:2047)
at clojure.lang.AFn.call(AFn.java:18)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1623)
INFO [2023-09-13 10:24:40,428] jepsen test runner - jepsen.core Analysis complete
INFO [2023-09-13 10:24:40,434] jepsen results - jepsen.store Wrote /Users/ndr/coding/maelstrom-src/maelstrom/store/kafka/20230913T102325.079+0100/results.edn
INFO [2023-09-13 10:24:40,456] jepsen test runner - jepsen.core {:perf {:latency-graph {:valid? true},
:rate-graph {:valid? true},
:valid? true},
:timeline {:valid? true},
:exceptions {:valid? true},
:stats {:valid? false,
:count 67,
:ok-count 0,
:fail-count 11,
:info-count 56,
:by-f {:assign {:valid? false,
:count 11,
:ok-count 0,
:fail-count 11,
:info-count 0},
:crash {:valid? false,
:count 7,
:ok-count 0,
:fail-count 0,
:info-count 7},
:poll {:valid? false,
:count 20,
:ok-count 0,
:fail-count 0,
:info-count 20},
:send {:valid? false,
:count 29,
:ok-count 0,
:fail-count 0,
:info-count 29}}},
:availability {:valid? true, :ok-fraction 0.0},
:net {:all {:send-count 70,
:recv-count 70,
:msg-count 70,
:msgs-per-op 1.0447761},
:clients {:send-count 70, :recv-count 70, :msg-count 70},
:servers {:send-count 0,
:recv-count 0,
:msg-count 0,
:msgs-per-op 0.0},
:valid? true},
:workload {:valid? :unknown,
:error "java.lang.NullPointerException: Cannot invoke \"java.lang.Number.doubleValue()\" because \"x\" is null\n at clojure.lang.Numbers.divide (Numbers.java:3899)\n jepsen.util$nanos__GT_secs.invokeStatic (util.clj:386)\n jepsen.util$nanos__GT_secs.invoke (util.clj:386)\n clojure.core$update.invokeStatic (core.clj:6231)\n clojure.core$update.invoke (core.clj:6223)\n jepsen.tests.kafka$checker$reify__19272.check (kafka.clj:2080)\n jepsen.checker$check_safe.invokeStatic (checker.clj:86)\n jepsen.checker$check_safe.invoke (checker.clj:79)\n jepsen.checker$compose$reify__10709$fn__10711.invoke (checker.clj:102)\n clojure.core$pmap$fn__8552$fn__8553.invoke (core.clj:7089)\n clojure.core$binding_conveyor_fn$fn__5823.invoke (core.clj:2047)\n clojure.lang.AFn.call (AFn.java:18)\n java.util.concurrent.FutureTask.run (FutureTask.java:317)\n java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1144)\n java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:642)\n java.lang.Thread.run (Thread.java:1623)\n"},
:valid? false}
Analysis invalid! (ノಥ益ಥ)ノ ┻━┻
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>
|
Thanks, I've shared a tentative fix at jepsen-io/jepsen#580 - comments welcome! |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
If maelstrom gets a binary that ignores all the kafka commands, for example like this modified demo/clojure/echo.clj, then maelstrom fails with a NullPointerException rather than some useful errors.
For other workflows, eg broadcast/g-counter, maelstroms fails with some timeout exceptions that are more understandable.
Is this the correct way to fail for the kafka workflow?
The text was updated successfully, but these errors were encountered: