A JVM implementation of the Jsonnet configuration language.
Sjsonnet can be used from Java:
<dependency>
<groupId>com.databricks</groupId>
<artifactId>sjsonnet_2.13</artifactId>
<version>0.4.13</version>
</dependency>
sjsonnet.SjsonnetMain.main0(
new String[]{"foo.jsonnet"},
new sjsonnet.DefaultParseCache(),
System.in,
System.out,
System.err,
os.package$.MODULE$.pwd(),
scala.None$.empty()
);
From Scala:
"com.databricks" %% "sjsonnet" % "0.4.13" // SBT
ivy"com.databricks::sjsonnet:0.4.13" // Mill
sjsonnet.SjsonnetMain.main0(
Array("foo.jsonnet"),
new DefaultParseCache,
System.in,
System.out,
System.err,
os.pwd, // working directory
None
)
As a standalone executable assembly:
$ curl -L https://github.com/databricks/sjsonnet/releases/download/0.4.13/sjsonnet-0.4.13.jar > sjsonnet.jar
$ chmod +x sjsonnet.jar
$ ./sjsonnet.jar
error: Need to pass in a jsonnet file to evaluate
usage: sjsonnet [sjsonnet-options] script-file
-i, --interactive Run Mill in interactive mode, suitable for opening REPLs and taking user input
-n, --indent How much to indent your output JSON
-J, --jpath Specify an additional library search dir (left-most wins)
-o, --output-file Write to the output file rather than stdout
...
$ ./sjsonnet.jar foo.jsonnet
Or from Javascript:
$ curl -L https://github.com/databricks/sjsonnet/releases/download/0.4.13/sjsonnet-0.4.13.js > sjsonnet.js
$ node
> require("./sjsonnet.js")
> SjsonnetMain.interpret("local f = function(x) x * x; f(11)", {}, {}, "", (wd, imported) => null)
121
> SjsonnetMain.interpret(
"local f = import 'foo'; f + 'bar'", // code
{}, // extVars
{}, // tlaVars
"", // initial working directory
// import callback: receives a base directory and the imported path string,
// returns a tuple of the resolved file path and file contents or file contents resolve method
(wd, imported) => [wd + "/" + imported, "local bar = 123; bar + bar"],
// loader callback: receives the tuple from the import callback and returns the file contents
([path, content]) => content
)
'246bar'
Note that since Javascript does not necessarily have access to the filesystem, you have to provide an explicit import callback that you can use to resolve imports yourself (whether through Node's fs module, or by emulating a filesystem in-memory).
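One way to emulate a filesystem in memory is sketched below. It assumes the callback shapes shown in the example above: an import callback that returns a [path, contents] pair, and a loader callback that receives that pair. The names makeInMemoryImporter and files are hypothetical, and returning null for an unknown path is an assumption modeled on the first example, not documented behavior.

```javascript
// Hypothetical helper (not part of sjsonnet.js): builds both callbacks from
// an in-memory map of path -> Jsonnet source text.
function makeInMemoryImporter(files) {
  // Join a base directory and an imported path without touching the disk.
  const join = (wd, imported) =>
    imported.startsWith("/") || wd === ""
      ? imported
      : wd.replace(/\/+$/, "") + "/" + imported;

  // Import callback: returns [resolvedPath, contents], or null if unknown
  // (the null convention is an assumption of this sketch).
  const importCallback = (wd, imported) => {
    const path = join(wd, imported);
    return Object.prototype.hasOwnProperty.call(files, path)
      ? [path, files[path]]
      : null;
  };

  // Loader callback: receives the pair above and returns the file contents.
  const loaderCallback = ([path, content]) => content;

  return { importCallback, loaderCallback };
}

const files = { "lib/foo.libsonnet": "local bar = 123; bar + bar" };
const { importCallback, loaderCallback } = makeInMemoryImporter(files);

// With sjsonnet.js loaded, the two callbacks would be passed as the last
// two arguments of SjsonnetMain.interpret, as in the example above.
console.log(importCallback("", "lib/foo.libsonnet")[0]); // lib/foo.libsonnet
```

Keeping the resolver separate from the interpreter call makes it easy to swap the in-memory map for Node's fs module later without changing the call site.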
The depth of recursion is limited by JVM stack size. You can run Sjsonnet with increased stack size as follows:
java -Xss100m -cp sjsonnet.jar sjsonnet.SjsonnetMain foo.jsonnet
The -Xss option above sets the JVM stack size. If you ever run into sjsonnet.Error: Internal Error ... Caused by: java.lang.StackOverflowError ..., try increasing it.
There is no analog of the --max-stack/-s option of google/jsonnet. The only stack size limit is that of the JVM.
Sjsonnet is implemented as an optimizing interpreter. There are roughly 4 phases:
- sjsonnet.Parser: parses an input String into a sjsonnet.Expr, which is a Syntax Tree representing the Jsonnet document syntax, using the Fastparse parsing library.
- sjsonnet.StaticOptimizer: a single AST transform that performs static checking, essential rewriting (e.g. assigning indices in the symbol table for variables) and optimizations. The result is another sjsonnet.Expr per input file that can be stored in the parse cache and reused.
- sjsonnet.Evaluator: recurses over the sjsonnet.Expr produced by the optimizer and converts it into a sjsonnet.Val, a data structure representing the Jsonnet runtime values (basically lazy JSON which can contain function values).
- sjsonnet.Materializer: recurses over the sjsonnet.Val and converts it into an output ujson.Value: a non-lazy JSON structure without any remaining un-evaluated function values. This can be serialized to a string formatted in a variety of ways.
These four phases are encapsulated in the sjsonnet.Interpreter object.
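To make the division of labor concrete, here is a toy four-phase pipeline for a made-up language of sums such as 1 + 2 + x. It only illustrates the structure described above: the functions parse, optimize, evaluate, materialize and interpret are hypothetical and do not correspond to the real Sjsonnet APIs.

```javascript
// Phase 1 (parser): source text -> syntax tree for sums of numbers and names.
function parse(src) {
  const nodes = src.split("+").map((t) => t.trim()).map((t) =>
    /^\d+$/.test(t) ? { kind: "num", value: Number(t) } : { kind: "name", name: t });
  return nodes.reduce((left, right) => ({ kind: "add", left, right }));
}

// Phase 2 (static optimizer): check names, replace them with scope indices,
// and fold additions of two constants. No user value is evaluated here.
function optimize(expr, names) {
  if (expr.kind === "name") {
    const index = names.indexOf(expr.name);
    if (index < 0) throw new Error(`Unknown variable: ${expr.name}`);
    return { kind: "ref", index };
  }
  if (expr.kind === "add") {
    const left = optimize(expr.left, names);
    const right = optimize(expr.right, names);
    return left.kind === "num" && right.kind === "num"
      ? { kind: "num", value: left.value + right.value }
      : { kind: "add", left, right };
  }
  return expr; // numbers need no rewriting
}

// Phase 3 (evaluator): optimized tree + scope of thunks -> runtime value.
function evaluate(expr, scope) {
  switch (expr.kind) {
    case "num": return expr.value;
    case "ref": return scope[expr.index](); // force the bound thunk
    case "add": return evaluate(expr.left, scope) + evaluate(expr.right, scope);
    default: throw new Error(`Unexpected node: ${expr.kind}`);
  }
}

// Phase 4 (materializer): runtime value -> output JSON text.
const materialize = (value) => JSON.stringify(value);

// The phases chained together, playing the role of an "interpreter" object.
function interpret(src, bindings) {
  const names = Object.keys(bindings);
  const scope = names.map((n) => bindings[n]);
  return materialize(evaluate(optimize(parse(src), names), scope));
}

console.log(interpret("1 + 2 + x", { x: () => 39 })); // 42
```

In this sketch the optimizer output depends only on the source text and the variable names, which is the property that makes the optimized tree cacheable and reusable across evaluations.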
Some notes on the values used in parts of the pipeline:
- sjsonnet.Expr: this represents {...} object literal nodes, a + b binary operation nodes, function(a) {...} definitions and f(a) invocations, etc. Also keeps track of source-offset information so failures can be correlated with line numbers.
- sjsonnet.Val: essentially the JSON structure (objects, arrays, primitives) but with two modifications. The first is that functions like function(a){...} can still be present in the structure: in Jsonnet you can pass around functions as values and call them later on. The second is that object values & array entries are lazy: e.g. [error 123, 456][1] does not raise an error because the first (erroneous) entry of the array is un-used and thus not evaluated.
- Classes representing literals extend sjsonnet.Val.Literal, which in turn extends both Expr and Val. This allows the evaluator to skip over them instead of having to convert them from one representation to the other.
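The difference between a lazy runtime value and the materialized output can be sketched as follows. This is a hypothetical model, not the sjsonnet.Val API: Thunk and materialize are illustrative names, and rejecting function values during materialization is an assumption of the sketch.

```javascript
// Hypothetical model of a lazy cell: a computation that runs only when forced.
class Thunk {
  constructor(compute) { this.compute = compute; }
  force() { return this.compute(); }
}

// Arrays and objects hold Thunks; Jsonnet functions are modeled as JS functions.
const arr = [new Thunk(() => { throw new Error("123"); }), new Thunk(() => 456)];

// Like [error 123, 456][1]: only the selected cell is forced, so no error.
console.log(arr[1].force()); // 456

// Toy materializer: forces every cell and returns plain JSON-compatible data.
function materialize(value) {
  if (typeof value === "function") {
    throw new Error("cannot materialize a function value");
  }
  if (Array.isArray(value)) {
    return value.map((cell) => materialize(cell.force()));
  }
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const [key, cell] of Object.entries(value)) {
      out[key] = materialize(cell.force());
    }
    return out;
  }
  return value; // strings, numbers, booleans, null
}

const config = {
  name: new Thunk(() => "app"),
  ports: new Thunk(() => [new Thunk(() => 80), new Thunk(() => 8000 + 80)]),
};
console.log(JSON.stringify(materialize(config))); // {"name":"app","ports":[80,8080]}
```

Materializing arr itself would raise the error from its first cell, because materialization forces every entry rather than only the one that was indexed.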
Due to pervasive caching, sjsonnet is much faster than google/jsonnet. See this blog post for more details:
Here's the latest set of benchmarks I've run (as of 18 May 2023) comparing Sjsonnet against google/go-jsonnet and google/jsonnet, measuring the time taken to evaluate an arbitrary config file in the Databricks codebase:
| Config file | Sjsonnet 0.4.3 | google/go-jsonnet 0.20.0 | google/jsonnet 0.20.0 |
|---|---|---|---|
| staging/runbot-app.jsonnet (~6.6 MB output JSON) | ~0.10s | ~6.5s | ~67s |
Sjsonnet was run as a long-lived daemon to keep the JVM warm, while go-jsonnet and google/jsonnet were run as subprocesses, following typical usage patterns. The Sjsonnet command line run by the benchmark and the profiler is defined in MainBenchmark.mainArgs. You need to change it to point to a suitable input before running a benchmark or the profiler.
Benchmark example:
sbt bench/jmh:run -jvmArgs "-XX:+UseStringDeduplication" sjsonnet.MainBenchmark
Profiler:
sbt bench/run
There's also a benchmark for memory usage:
Execute and print stats:
sbt 'set fork in run := true' 'set javaOptions in run ++= Seq("-Xmx6G", "-XX:+UseG1GC")' 'bench/runMain sjsonnet.MemoryBenchmark'
Execute and pause - this is useful if you want to attach a profiler after the run and deep dive the object utilization.
sbt 'set fork in run := true' 'set javaOptions in run ++= Seq("-Xmx6G", "-XX:+UseG1GC")' 'bench/runMain sjsonnet.MemoryBenchmark --pause'
The Jsonnet language is lazy: expressions don't get evaluated unless their value is needed, and thus even erroneous expressions do not cause a failure if un-used. This is represented in the Sjsonnet codebase by sjsonnet.Lazy: a wrapper type that encapsulates an arbitrary computation that returns a sjsonnet.Val.
sjsonnet.Lazy is used in several places, representing where laziness is present in the language:
- Inside sjsonnet.Scope, representing local variable name bindings
- Inside sjsonnet.Val.Arr, representing the contents of array cells
- Inside sjsonnet.Val.Obj, representing the contents of object values
Val extends Lazy so that an already computed value can be treated as lazy without having to wrap it.
Unlike google/jsonnet, Sjsonnet caches the results of lazy computations the first time they are evaluated, avoiding wasteful re-computation when a value is used more than once.
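The caching behavior can be pictured with a small memoizing wrapper. This is a minimal sketch of the general technique, assuming nothing about how sjsonnet.Lazy is actually written; the class and field names are hypothetical.

```javascript
// Hypothetical stand-in for a cached lazy value; not the sjsonnet.Lazy API.
class Lazy {
  constructor(compute) {
    this.compute = compute; // computation producing the value
    this.cached = undefined;
    this.done = false;
  }
  force() {
    if (!this.done) {
      this.cached = this.compute(); // runs at most once on success
      this.done = true;
      this.compute = null; // release the closure once evaluated
    }
    return this.cached;
  }
}

let evaluations = 0;
const square = new Lazy(() => { evaluations += 1; return 11 * 11; });

// The value is used twice but computed once.
console.log(square.force() + square.force()); // 242
console.log(evaluations); // 1
```

A value that is never forced never runs its computation, so the same wrapper gives both laziness and the single-evaluation guarantee.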
In contrast to google/jsonnet, Sjsonnet does not implement the Jsonnet standard library std in Jsonnet code. Rather, those functions are implemented as intrinsics directly in the host language (in Std.scala). This allows both better error messages when the input types are wrong, as well as better performance for the more computationally-intense builtin functions.
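As an illustration of why host-language intrinsics can report type errors clearly, the sketch below wraps a native implementation with an argument check that names the function and the offending argument. It is not taken from Std.scala: the builtin helper, the typeName function and the error wording are all hypothetical, and std.clamp is used only because it appears in the changelog.

```javascript
// Map a host value to a Jsonnet-style type name (illustrative only).
function typeName(v) {
  if (v === null) return "null";
  if (Array.isArray(v)) return "array";
  return typeof v; // "string", "number", "boolean", "object", "function"
}

// Hypothetical helper: wraps a native implementation with arity and type checks.
function builtin(name, paramTypes, impl) {
  return (...args) => {
    if (args.length !== paramTypes.length) {
      throw new Error(
        `std.${name}: expected ${paramTypes.length} arguments, got ${args.length}`);
    }
    args.forEach((arg, i) => {
      if (typeName(arg) !== paramTypes[i]) {
        throw new Error(
          `std.${name}: argument ${i + 1} must be a ${paramTypes[i]}, got ${typeName(arg)}`);
      }
    });
    return impl(...args); // native code path, no interpreted Jsonnet involved
  };
}

const std = {
  clamp: builtin("clamp", ["number", "number", "number"],
    (x, lo, hi) => Math.min(Math.max(x, lo), hi)),
};

console.log(std.clamp(15, 0, 10)); // 10
try {
  std.clamp("15", 0, 10);
} catch (e) {
  console.log(e.message); // std.clamp: argument 1 must be a number, got string
}
```

Because the check runs at the call boundary, the error points at the caller's mistake instead of surfacing deep inside library code.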
Sjsonnet comes with a built-in thin client and background server, which reduce the JVM warmup overhead from ~1s per invocation to 0.2-0.3s. To create the simple non-client-server executable, run:
./mill -i show sjsonnet[2.13.15].jvm.assembly
To create the client-server executable, run:
./mill -i show sjsonnet[2.13.15].server.assembly
By default, the Sjsonnet background server lives in ~/.sjsonnet, and lasts 5 minutes before shutting itself down when inactive.
Since the Sjsonnet client still has 0.2-0.3s of overhead, if using Sjsonnet heavily it is still better to include it in your JVM classpath and invoke it programmatically via new Interpreter(...).interpret(...).
To publish, make sure the version number in build.sc is correct, then run the following commands:
./mill -i mill.scalalib.PublishModule/publishAll \
--sonatypeCreds $SONATYPE_USER:$SONATYPE_PASSWORD --publishArtifacts __.publishArtifacts --release true \
--gpgArgs --passphrase=$GPG_PASSPHRASE,--batch,--yes,-a,-b,--pinentry-mode=loopback
./mill -i show "sjsonnet.js[2.13.15].fullOpt"
./mill -i show "sjsonnet.jvm[2.13.15].assembly"
- Fix a bug in new strict mode for set in std.setUnion #242
- Added support for Java 21 and dropped support for Java 11.
- Implemented all missing methods in std.
- Improved readability of stack traces when std methods are involved.
- Cleaned up default Main class to allow for repeated flags.
- Updated mill to 0.11.9 and added JDK17 build.
- Fixed sjsonnet handling of 64bits integers #191
- Stopped manifesting functions with no arguments in objects by default #168
- Fixed "Duplicate Local Variables in Object Scope" #178
- Fix a bug leading to the truncation of the data returned by std.gzip and std.xz #221.
- Fix a bug introduced with 0.4.11 with synthetic paths #215
- Fix thread-safety bug in Obj.getAllKeys #217
- Implement std.isEmpty, std.xor, std.xnor, std.trim, std.equalsIgnoreCase, std.sha1, std.sha256, std.sha512, std.sha3 #204
- fix: std.manifestJsonMinified and empty arrays/objects #207
- fix: Use different chars for synthetic paths. #208
- Fix sorting algorithm to work for all array types #211
- Add better error handling for format #212
- Switch from CRC32 to XXHash64 for import cache keys #198
- Ensure almost-octal-number-like strings are quoted when --yaml-out is passed, to avoid issues with non-compliant YAML parsers #183
- Add std.parseYaml
- Make Jsonnet standard library configurable and non-global #166, fixing a race condition on library initialization in multithreaded environments and allowing custom std.* functions to be passed in by the user
- Added a flag --no-duplicate-keys-in-comprehension to follow upstream google/jsonnet behavior of failing if a dictionary comprehension produces duplicate keys #156 ea8720f. This is optional for migration purposes but will likely become the default in future
- Disallow Jsonnet identifiers that start with numbers #161
- Fix parsing of +: in dictionary comprehensions #155
- Fix parse failure when passing id == ... as a function argument #151
- Add a flag --strict-import-syntax to disallow the syntax import "foo".bar, when it should be (import "foo").bar, following upstream google/jsonnet. This is optional for migration purposes but will likely become the default in future #153 ccbe6e
- Allow assertions to take non-string error messages #170
- Fix parsing of object keys starting with the prefix assert #169
- Properly propagate assertions during inheritance #172, and add the flag --strict-inherited-assertions to fix a bug where assertions inherited from a shared object were only triggering once, rather than once-per-inheritor 95342. This is optional for migration purposes but will likely become the default in future
- Add std.slice, std.manifestJsonMinified, fix handling of numbers in manifestXmlJsonml, handling of code in extCode #171
- Add the ability to include code in --tla-code and --tla-code-file #175
- Add std.reverse a425342
- Fixes to main method handling of various combinations of --exec, --yaml-out, --yaml-stream, --multi, and --output-file #174
- Update Mill to 0.10.12
- Fix parsing of k/v cli arguments with an "=" in the value
- Make lazy initialization of static Val.Obj thread-safe #136
- Deduplicate strings in the parser #137
- Update the JS example #141
- Performance improvements with lots of internal changes #117
- Bump uJson version to 1.3.7
- Bump uJson version to 1.3.0
- Avoid catching fatal exceptions during evaluation
- Add --yaml-debug flag to add source-line comments showing where each line of YAML came from #105
- Add objectValues and objectValuesAll to stdlib #104
- Allow direct YAML output generation via --yaml-out
- Do not allow duplicate field in object when evaluating list comprehension #100
- Fix compiler crash when '+' signal is true in a field declaration inside a list comprehension #98
- Fix error message for too many arguments with at least one named arg #97
- Streaming JSON output to disk for lower memory usage #85
- Static detection of duplicate fields #86
- Strict mode to disallow error-prone adjacent object literals #88
- Add std.flatMap, std.repeat, std.clamp, std.member, std.stripChars, std.rstripChars, std.lstripChars
- Add support for syntactical key ordering #53
- Bump dependency versions
- Bump version of Scalatags, uPickle
- Bump version of FastParse
- Bump versions of OS-Lib, uJson, Scalatags
- Support std lib methods that take a key lambda #40
- Handle hex in unicode escapes #41
- Add encodeUTF8, decodeUTF8 std lib methods #42
- Properly fail on non-boolean conditionals #44
- Support YAML-stream output #45
- ~2x performance increase
- Javascript support, allowing Sjsonnet to be used in the browser or on Node.js
- Performance improvements
- Scala 2.13 support
- Performance improvements
- Add std.mod, std.min and std.max
- Performance improvements
- Improvements to error reporting when types do not match
- Performance improvements to the parser via upgrading to Fastparse 2.x
- First release