Literal whitespace vs literal node #49

Open
vinipsmaker opened this issue Nov 14, 2020 · 7 comments

@vinipsmaker
Contributor

So, I was playing with Lua bindings to trial.protocol (actually it's a bigger project that happens to have JSON support as well, and I'm using trial.protocol for that) and I finally ended up with a use case for the writer API. So, the following Lua code:

local json = require('json')

local writer = json.writer.new()

writer:begin_object()
writer:value('foo')
writer:value('bar')
writer:end_object()
print(writer:generate())

will print:

{"foo":"bar"}

So far so good, but the literal API seems incomplete. There are two types of literals: nodes and insignificant linear whitespace. Linear whitespace literals are useful to indent the generated document, while node literals are useful in serialization libraries. For my Lua bindings, the user might write a __tojson() metamethod as:

function __tojson(self, state)
    local writer = state.writer

    writer:begin_object()
    writer:value('type')
    writer:value(0)
    writer:end_object()
end

But that's kind of verbose. And given that the type to be serialized has a constant representation, it's also inefficient. I would like to be able to write the following:

function __tojson(self, state)
    local writer = state.writer

    writer:literal('{"type":0}')
end

But that obviously isn't going to work when called as part of a bigger object, as in:

writer:begin_object()
writer:value('hello')
writer:value('world')
writer:value('foo')
encode_foo(foo, writer)
writer:end_object()
print(writer:generate())

The output will be:

{"hello":"world","foo"{"type":0}}

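That is invalid JSON: the colon between "foo" and the raw object is missing, because literal() does not insert the usual separators.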

So, we need a type of literal that accounts for a raw node to be written. I don't think that's hard. I think the bikeshedding to choose the function name is going to be more demanding than the implementation itself.

Anyway, I'm not in a rush to see this issue solved. There are other features besides JSON I have to work on before releasing my project.

@breese
Owner

breese commented Nov 15, 2020

What kind of API do you have in mind?

@vinipsmaker
Contributor Author

I think a...

size_t raw_node(view_type v);

It has the same signature as literal(). The difference is that the usual separators will be inserted as if a node were inserted.

An alternative approach would be to change literal() parameters so you can state your intention and implementation choice there, but I don't like this idea.

TBH I think the hard task will be to choose the name. Is raw_node() a good name? The implementation itself doesn't seem challenging.
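To make the difference concrete, here is a minimal sketch of the two behaviours. It is a toy, not the trial.protocol implementation: the class toy_writer, its members, and the helper functions with_literal() and with_raw_node() are all hypothetical names invented for this example, and it only handles objects with string keys. The only thing it is meant to show is that raw_node() goes through the same separator bookkeeping as value(), while literal() bypasses it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Toy writer for illustration only; NOT the trial.protocol implementation.
// It supports objects and string nodes, which is enough to show the issue.
class toy_writer {
public:
    void begin_object() {
        write_separator();      // an object is itself a node in its parent scope
        out_ += '{';
        counts_.push_back(0);
    }
    void end_object() {
        assert(!counts_.empty());
        counts_.pop_back();
        out_ += '}';
    }
    // String node. Escaping is omitted to keep the sketch short.
    void value(std::string_view text) {
        write_separator();
        out_ += '"';
        out_ += text;
        out_ += '"';
    }
    // Raw bytes, no separator bookkeeping (what literal() does in this toy).
    std::size_t literal(std::string_view raw) {
        out_ += raw;
        return raw.size();
    }
    // Hypothetical raw_node(): same signature, but counted as a node.
    std::size_t raw_node(std::string_view raw) {
        write_separator();
        out_ += raw;
        return raw.size();
    }
    const std::string& str() const { return out_; }

private:
    // Emits ':' after a key, ',' between pairs, nothing at the top level
    // or before the first node of a scope, then counts the new node.
    void write_separator() {
        if (counts_.empty()) return;
        std::size_t& n = counts_.back();
        if (n > 0) out_ += (n % 2 == 1) ? ':' : ',';
        ++n;
    }
    std::string out_;
    std::vector<std::size_t> counts_;  // nodes written so far, per open object
};

// Reproduces the broken output from the first post.
std::string with_literal() {
    toy_writer w;
    w.begin_object();
    w.value("hello");
    w.value("world");
    w.value("foo");
    w.literal(R"({"type":0})");
    w.end_object();
    return w.str();  // {"hello":"world","foo"{"type":0}}
}

// Same sequence, but the fragment is inserted as a node.
std::string with_raw_node() {
    toy_writer w;
    w.begin_object();
    w.value("hello");
    w.value("world");
    w.value("foo");
    w.raw_node(R"({"type":0})");
    w.end_object();
    return w.str();  // {"hello":"world","foo":{"type":0}}
}
```

The point of keeping both functions is that whitespace literals must not advance the node count, whereas a raw node must.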

@breese
Owner

breese commented Nov 15, 2020

The requirement is that the inserted fragment is a valid JSON element. Otherwise the writer could end up being confused about the current separator and nesting level.

So we could call it element().
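As an illustration of how the nesting level could get confused, the sketch below checks that brackets in a fragment are balanced outside of string literals. The function name looks_balanced() is hypothetical and this is not something the library is known to do. It is also only a necessary condition, not a full validity check: a fragment such as `1,2` passes it even though it is two elements rather than one.

```cpp
#include <string_view>
#include <vector>

// Hypothetical debug helper: returns false when a raw fragment would leave
// the writer with a different nesting level than it had before the insert.
// This is a necessary condition only; it does not validate JSON.
bool looks_balanced(std::string_view fragment) {
    std::vector<char> open;   // stack of currently open '{' and '['
    bool in_string = false;
    bool escaped = false;
    for (char c : fragment) {
        if (in_string) {
            if (escaped) escaped = false;          // character after a backslash
            else if (c == '\\') escaped = true;
            else if (c == '"') in_string = false;  // closing quote
            continue;
        }
        switch (c) {
        case '"':
            in_string = true;
            break;
        case '{':
        case '[':
            open.push_back(c);
            break;
        case '}':
        case ']': {
            const char expected = (c == '}') ? '{' : '[';
            if (open.empty() || open.back() != expected) return false;
            open.pop_back();
            break;
        }
        default:
            break;
        }
    }
    return open.empty() && !in_string;
}
```

For example, `{"type":0}` passes, while `{"type":0` (missing closing brace) and a lone `]` do not. A bracket inside a string, as in `{"a":"}"}`, is ignored.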

@vinipsmaker
Contributor Author

The name element() works for me.

@vinipsmaker
Contributor Author

On second thought, I think the code will be more readable if you use the name raw_element(). To the uninitiated user, there is no difference between value and element (and indeed there is little difference between these two grammar rules in the JSON spec). To someone who has just started to hack on a new codebase, the choice between the names value() and element() would seem arbitrary and unintuitive. raw_value() would be yet another option (but then again we're just one jump away from literal_value() and two jumps from literal()). Maybe rename literal() to literal_ws() and use the name literal() for this function (but that would be a bold API breakage).

@breese
Owner

breese commented Dec 6, 2020

An alternative solution is a separator() function that writes the correct separator depending on context -- a comma separator in a JSON array, alternating colon and comma separators in a JSON object, and nothing in the top-level scope.

separator() must be called before inserting raw data. This would allow us to insert raw data in chunks using multiple literal() calls.

  writer.value<begin_array>();
  writer.separator();
  writer.literal("null");
  writer.separator();
  writer.literal("nu");
  writer.literal("ll");
  writer.value<end_array>();
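A rough sketch of how such a context-dependent separator() could behave is given below. Everything in it is an assumption made for illustration: chunk_writer, its begin_array()/end_array()/begin_object()/end_object() members (stand-ins for the token-based value<...>() calls above), and the two demo functions are hypothetical and not taken from trial.protocol.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Toy writer for illustration only; NOT the trial.protocol implementation.
class chunk_writer {
public:
    void begin_array()  { open('[', false); }
    void begin_object() { open('{', true); }
    void end_array()    { close(']'); }
    void end_object()   { close('}'); }

    // Writes the separator required before the next node and counts that node:
    // nothing at the top level or before the first node of a scope,
    // ',' in arrays, alternating ':' and ',' in objects.
    void separator() {
        if (scopes_.empty()) return;
        scope& s = scopes_.back();
        if (s.count > 0) out_ += (s.is_object && s.count % 2 == 1) ? ':' : ',';
        ++s.count;
    }

    // Raw bytes only. The caller decides where a node starts via separator(),
    // so one node may be written with several literal() calls.
    std::size_t literal(std::string_view raw) {
        out_ += raw;
        return raw.size();
    }

    const std::string& str() const { return out_; }

private:
    struct scope {
        bool is_object;
        std::size_t count;  // nodes started so far in this scope
    };
    void open(char bracket, bool is_object) {
        separator();        // the container is itself a node in its parent
        out_ += bracket;
        scopes_.push_back(scope{is_object, 0});
    }
    void close(char bracket) {
        assert(!scopes_.empty());
        scopes_.pop_back();
        out_ += bracket;
    }
    std::string out_;
    std::vector<scope> scopes_;
};

// Mirrors the call sequence above: the second null is written in two chunks.
std::string chunked_array() {
    chunk_writer w;
    w.begin_array();
    w.separator();
    w.literal("null");
    w.separator();
    w.literal("nu");
    w.literal("ll");
    w.end_array();
    return w.str();  // [null,null]
}

// The same mechanism applied to the raw object from the first post.
std::string raw_member() {
    chunk_writer w;
    w.begin_object();
    w.separator();
    w.literal(R"("foo")");
    w.separator();
    w.literal(R"({"type":0})");
    w.end_object();
    return w.str();  // {"foo":{"type":0}}
}
```

The trade-off compared with a dedicated raw-node function is that the caller must remember to call separator() exactly once per node; forgetting it reproduces the missing-colon output from the first post.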

@vinipsmaker
Contributor Author

That works.
