Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add coll.JQ using gojq library #1585

Merged
merged 16 commits into from
Dec 29, 2022
Merged

Conversation

ahochsteger
Copy link
Contributor

@ahochsteger ahochsteger commented Dec 21, 2022

This PR adds support for the jq language to filter/transform complex data structures.
See [docs/content/functions/coll.md] for usage documentation.

Tasks:

  • Add coll.JQ implementation
  • Add documentation
  • Add tests

Questions:

  • Is it allowed for collection functions (coll.JQ in this case) to return nil if the expression evaluates to nil (or JSON null) or should an error be returned instead?
    -> Yes, should be ok. Should be mentioned in the documentation.
  • I'm not sure if it is a good idea to magically turn an array with one result into a single item as it is done for jsonpath and currently done for jq the same way. If an array returns just one element by accident it may break the template.
    -> Since there's no easy way to distinguish between an array with a single element and just a single element in the result it is kept the way it is done for jsonpath.

Future enhancements:

  • Add variable binding (use gomplate variables in JQ expressions). It's not straight forward to implement in go templates.

Resolves #1584

@ahochsteger
Copy link
Contributor Author

@hairyhenderson, I'd like to allow variables defined in the template to be used within JQ expressions.

gojq provides a way to do that using gojq.WithVariables(), but I'm unable to find a way to get the required information (names + values of variables).

This is how I'd use it (using hard-coded variables + values as an example):

func JQ(jqExpr string, in interface{}) (interface{}, error) {
	query, err := gojq.Parse(jqExpr)
	variables := []string{"$testVariable1", "$testVariable2"} // TODO: Use variable names from context
	code, err := gojq.Compile(query, gojq.WithVariables(variables))
	iter := code.Run(in, "test value 1", "test value 2") // TODO: Use variable values from context
        ...

Is there a way to do what I want?

@ahochsteger ahochsteger marked this pull request as draft December 22, 2022 09:16
@ahochsteger ahochsteger changed the title [WIP] Add coll.JQ using gojq library Add coll.JQ using gojq library Dec 22, 2022
coll/jq.go Outdated Show resolved Hide resolved
coll/jq_test.go Outdated Show resolved Hide resolved
ahochsteger and others added 3 commits December 29, 2022 16:31
coll/jq.go Outdated Show resolved Hide resolved
@hairyhenderson
Copy link
Owner

Thanks for this, @ahochsteger - as to your questions:

Add variable binding (use gomplate variables in JQ expressions)

I'm not sure if this is going to be straightforward. Template functions don't have access to the template's context other than through passing in arguments. To allow this would probably open up security risks in the general case.

Perhaps an approach similar to the template keyword (like tmpl.Exec) could be used, where a context value (often a map) is passed as an argument. For coll.JQ, it could look like this:

...
{{ "some input" | coll.JQ "some expression" (dict "a" "foo") }}

In that case a variable $a could be defined, with a value of "foo".

However, it also may be a good idea to simply defer this and not support variable binding to begin with. It can be added later if there's demand.

Is it allowed for collection functions (coll.JQ in this case) to return nil if the expression evaluates to nil (or JSON null) or should an error be returned instead?

I don't know that I see a reason to not allow this, especially as it's certainly possible to have an explicit JSON value of null:

$ echo '[null]' | jq '.[0]'
null

What's unfortunate about jq (and most JSON parsers) is there's effectively no difference between {"a": null} and {}. If I query .a for each of these inputs the result is always null.

Either way, I think it's OK to return nil in these sorts of cases. Just be sure to document it.

@ahochsteger
Copy link
Contributor Author

What's unfortunate about jq (and most JSON parsers) is there's effectively no difference between {"a": null} and {}. If I query .a for each of these inputs the result is always null.

For this use case jq allows to check if a key exists using the has function:

$ gojq -n '{"a":null} | has("a")' 
true
$ gojq -n '{"a":null} | has("b")' 
false

coll/jq.go Show resolved Hide resolved
@ahochsteger ahochsteger marked this pull request as ready for review December 29, 2022 17:11
coll/jq_test.go Outdated
Comment on lines 136 to 137
// TODO: Check if this is a valid test case (taken from jsonpath_test.go) since the struct
// had to be converted to JSON and parsed from it again to be able to process using gojq.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔

I suppose this is because JQ can't be used with arbitrary structs or other types, which is perhaps something we should allow...

In the same vein, it might be useful to be able to pass the template context (gomplate.tmplctx) as the input for something like:

$ gomplate -c books=https://openlibrary.org/subjects/fantasy.json -i '{{ jq `.works[].title` . }}'

Right now this fails with:

executing JQ failed: %!v(PANIC=Error method: invalid type: *gomplate.tmplctx (&map[books:map[key:/subjects/fantasy...

What's interesting in that particular case is that gomplate.tmplctx is a map[string]interface{}, which is in theory OK.

I'm going to hack on this a bit and make a commit if I can figure It out...

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've gotten this to work:

$ bin/gomplate -c books=https://openlibrary.org/subjects/fantasy.json \
-i '{{ range jq `.books.works[].title` . -}}
title: {{.}}
{{end}}'
title: Alice's Adventures in Wonderland
title: Gulliver's Travels
title: Treasure Island
title: Through the Looking-Glass
title: A Midsummer Night's Dream
title: Il principe
title: The Wonderful Wizard of Oz
title: Avventure di Pinocchio
title: Alice's Adventures in Wonderland / Through the Looking Glass
title: Five Children and It
title: The Hobbit
title: Harry Potter and the Philosopher's Stone

Note that I've referenced the books datasource in the . context passed through as an argument.

coll/jq.go Outdated Show resolved Hide resolved
Copy link
Owner

@hairyhenderson hairyhenderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for working on this, @ahochsteger!

@hairyhenderson
Copy link
Owner

I've added an integration test, and once the CI passes, this'll merge.

@hairyhenderson hairyhenderson enabled auto-merge (squash) December 29, 2022 21:52
@hairyhenderson hairyhenderson merged commit e045af5 into hairyhenderson:main Dec 29, 2022
@ahochsteger
Copy link
Contributor Author

Thanks for merging it, @hairyhenderson!
Any plans for a new release containing this PR?

@hairyhenderson
Copy link
Owner

@ahochsteger it'll be in 4.0.0 - too much has changed in main for it to be straightforward to go into a new 3.x release unfortunately 😞

I have no target date for 4.0, but in the meantime you can build your own binary from main to use this...

moshloop pushed a commit to flanksource/gomplate that referenced this pull request Apr 1, 2023
* feat: add coll.JQ using gojq library

* fix: jq function naming (gojq -> jq)

* test: add tests (take from jsonpath_test.go)

* chore: add TODO for nil values (are they allowed?)

* refactor: use fmt.Errorf instead of errors.Wrapf

Co-authored-by: Dave Henderson <[email protected]>

* fix: wrong alias for coll.JQ

Co-authored-by: Dave Henderson <[email protected]>

* docs: add links to JQ

Co-authored-by: Dave Henderson <[email protected]>

* test: add assertions after json marshal/unmarshal

Co-authored-by: Dave Henderson <[email protected]>

* refactor: use fmt.Errorf instead of errors.Wrapf

Co-authored-by: Dave Henderson <[email protected]>

* fix: test syntax and null handling

* docs: improve documentation

* docs: add blank line

* Support cancellation

Signed-off-by: Dave Henderson <[email protected]>

* Support (almost) all types, not just map[string]interface{} and []interface{}

Signed-off-by: Dave Henderson <[email protected]>

* add an integration test for coll.JQ

Signed-off-by: Dave Henderson <[email protected]>

Signed-off-by: Dave Henderson <[email protected]>
Co-authored-by: Andreas Hochsteger <[email protected]>
Co-authored-by: Dave Henderson <[email protected]>
@hairyhenderson hairyhenderson added this to the v4.0.0 milestone Jun 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: Add jq support
2 participants