Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

State greediness of GraphQL #564

Closed
wants to merge 1 commit into from
Closed

Conversation

budde377
Copy link

Explicitly state that the GraphQL grammar is greedy to clear up possible disambiguity.

Closes #539

@facebook-github-bot
Copy link

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign up at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need the corporate CLA signed.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@facebook-github-bot
Copy link

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@IvanGoncharov IvanGoncharov added the 🤷‍♀️ Ambiguity An issue/PR which identifies or fixes spec ambiguity label Feb 13, 2019
Explicitly state that the GraphQL grammar is greedy to clear up possible disambiguity.

Closes graphql#539
@leebyron
Copy link
Collaborator

leebyron commented Jul 3, 2019

Sorry, just coming through these editorial changes now.

After reading this change, honestly I'm still confused. In fact I'm worried that this clarification is actually not how we expect parsers to behave. I attempted to translate this to the leftmost / rightmost terminology but that just made it worse.

I think the answer here is not to describe rules for interpreting ambiguous grammars but instead to edit the grammar to be unambiguous.

I think we can address the specific case for #539 - but we'll need to address individual other cases if they occur. @budde377 did you encounter any other grammar ambiguities?

@budde377
Copy link
Author

budde377 commented Jul 4, 2019

@leebyron I did not encounter any other ambiguities, then this one from the extension + query-shorthand syntax.

@IvanGoncharov
Copy link
Member

I did not encounter any other ambiguities, then this one from the extension + query-shorthand syntax.

@budde377 @leebyron Just as a clarification since we support "empty" types they are also affected:

interface Foo
{
  bar: String
}

Since interface Foo is a valid interface definition on its own.

@IvanGoncharov
Copy link
Member

@leebyron I just remembered that we use the same notation both for lexer and parser rules so this problem is even more prominent in the lexer, e.g.

{
  foo(arg: [""""""])
}

Since spaces and commas are optional this arg can be interpreted as a single blockstring or as 3 separate strings, like this:

{
  foo(arg: ["" "" ""])
}

So if rules are non-greedy we need a special rule just for lexer or we would trouble even with simple numbers as I pointed here: #539 (comment)

leebyron added a commit that referenced this pull request Jul 23, 2019
Partial fix to #564, adds lookahead constraints to type system extensions to remove ambiguity or inefficiency from the grammar.

The GraphQL grammar should be parsed in linear type with at most one lookahead. Since extensions have an optional `{ }` body, finding the token `{` should always attempt to parse the relevant portion of the type extension rather than completing the type extension and attempting to parse the query shorthand SelectionSet.
leebyron added a commit that referenced this pull request Jul 23, 2019
GraphQL syntactical grammars intend to be unambiguous. While lexical grammars should also be - there has historically been an assumption that lexical parsing is greedy. This is obvious for numbers and words, but less obvious for empty block strings.

Either way, the additional clarity removes ambiguity from the spec

Partial fix for #564

Specifically addresses #564 (comment)
@leebyron
Copy link
Collaborator

I just added two PRs for clarifying that the lexical grammar is greedy, and removing grammar ambiguity from type system extensions.

However:

Just as a clarification since we support "empty" types they are also affected

I'm thinking we want to revisit this. I can't remember our original rational for supporting this since later there are rules that object and interface types must have fields. Did we support this before there was support for type extensions? @IvanGoncharov & @mjmahone are you either aware of where or why we support this?

Our options here are to either apply the same restriction as #598 or just require the full definition. I favor the later for simplicity if this just removes a historical oddity instead of a reasonably supported feature

leebyron added a commit that referenced this pull request Jul 23, 2019
GraphQL syntactical grammars intend to be unambiguous. While lexical grammars should also be - there has historically been an assumption that lexical parsing is greedy. This is obvious for numbers and words, but less obvious for empty block strings.

Either way, the additional clarity removes ambiguity from the spec

Partial fix for #564

Specifically addresses #564 (comment)
leebyron added a commit that referenced this pull request Jul 23, 2019
GraphQL syntactical grammars intend to be unambiguous. While lexical grammars should also be - there has historically been an assumption that lexical parsing is greedy. This is obvious for numbers and words, but less obvious for empty block strings.

Either way, the additional clarity removes ambiguity from the spec

Partial fix for #564

Specifically addresses #564 (comment)
leebyron added a commit that referenced this pull request Jul 23, 2019
GraphQL syntactical grammars intend to be unambiguous. While lexical grammars should also be - there has historically been an assumption that lexical parsing is greedy. This is obvious for numbers and words, but less obvious for empty block strings.

Either way, the additional clarity removes ambiguity from the spec

Partial fix for #564

Specifically addresses #564 (comment)
leebyron added a commit that referenced this pull request Jul 23, 2019
GraphQL syntactical grammars intend to be unambiguous. While lexical grammars should also be - there has historically been an assumption that lexical parsing is greedy. This is obvious for numbers and words, but less obvious for empty block strings.

Either way, the additional clarity removes ambiguity from the spec

Partial fix for #564

Specifically addresses #564 (comment)
@IvanGoncharov
Copy link
Member

I'm thinking we want to revisit this. I can't remember our original rational for supporting this since later there are rules that object and interface types must have fields.

This rule concern fully built an "executable schema" that is used to execute all queries (including introspection) but it doesn't affect the intermediate schemas.

Did we support this before there was support for type extensions? @IvanGoncharov & @mjmahone are you either aware of where or why we support this?

People using SDL to defining schema wanted to split it into multiple files so they needed the ability to define empty root types and latter extend it. Here is the relevant thread:
#90 (comment)

Reflecting back on this discussion I would actually prefer allowing empty list both for definitions and extensions as in legacy SDL syntax:
https://github.com/graphql/graphql-js/blob/670bbaca5ed897d0ee2caadbc8ac3a0bbf281b87/src/language/parser.js#L73-L81

@IvanGoncharov
Copy link
Member

@leebyron Also forget to mention that "empty types" is very important for the ability to print intermediate schema or to define types reusable between different APIs.

leebyron added a commit that referenced this pull request Jul 23, 2019
GraphQL syntactical grammars intend to be unambiguous. While lexical grammars should also be - there has historically been an assumption that lexical parsing is greedy. This is obvious for numbers and words, but less obvious for empty block strings.

Either way, the additional clarity removes ambiguity from the spec

Partial fix for #564

Specifically addresses #564 (comment)
leebyron added a commit that referenced this pull request Jul 30, 2019
Partial fix to #564, adds lookahead constraints to type system extensions to remove ambiguity or inefficiency from the grammar.

The GraphQL grammar should be parsed in linear type with at most one lookahead. Since extensions have an optional `{ }` body, finding the token `{` should always attempt to parse the relevant portion of the type extension rather than completing the type extension and attempting to parse the query shorthand SelectionSet.
leebyron added a commit that referenced this pull request Jul 30, 2019
GraphQL syntactical grammars intend to be unambiguous. While lexical grammars should also be - there has historically been an assumption that lexical parsing is greedy. This is obvious for numbers and words, but less obvious for empty block strings.

This also removes regular expression representation from the lexical grammar notation, since it wasn't always clear.

Either way, the additional clarity removes ambiguity from the spec

Partial fix for #564

Specifically addresses #564 (comment)
leebyron added a commit that referenced this pull request Jul 30, 2019
GraphQL syntactical grammars intend to be unambiguous. While lexical grammars should also be - there has historically been an assumption that lexical parsing is greedy. This is obvious for numbers and words, but less obvious for empty block strings.

This also removes regular expression representation from the lexical grammar notation, since it wasn't always clear.

Either way, the additional clarity removes ambiguity from the spec

Partial fix for #564

Specifically addresses #564 (comment)
leebyron added a commit that referenced this pull request Aug 7, 2019
GraphQL syntactical grammars intend to be unambiguous. While lexical grammars should also be - there has historically been an assumption that lexical parsing is greedy. This is obvious for numbers and words, but less obvious for empty block strings.

This also removes regular expression representation from the lexical grammar notation, since it wasn't always clear.

Either way, the additional clarity removes ambiguity from the spec

Partial fix for #564

Specifically addresses #564 (comment)
leebyron added a commit that referenced this pull request Aug 7, 2019
GraphQL syntactical grammars intend to be unambiguous. While lexical grammars should also be - there has historically been an assumption that lexical parsing is greedy. This is obvious for numbers and words, but less obvious for empty block strings.

This also removes regular expression representation from the lexical grammar notation, since it wasn't always clear.

Either way, the additional clarity removes ambiguity from the spec

Partial fix for #564

Specifically addresses #564 (comment)
leebyron added a commit that referenced this pull request Aug 7, 2019
GraphQL syntactical grammars intend to be unambiguous. While lexical grammars should also be - there has historically been an assumption that lexical parsing is greedy. This is obvious for numbers and words, but less obvious for empty block strings.

This also removes regular expression representation from the lexical grammar notation, since it wasn't always clear.

Either way, the additional clarity removes ambiguity from the spec

Partial fix for #564

Specifically addresses #564 (comment)
leebyron added a commit that referenced this pull request Jan 10, 2020
GraphQL syntactical grammars intend to be unambiguous. While lexical grammars should also be - there has historically been an assumption that lexical parsing is greedy. This is obvious for numbers and words, but less obvious for empty block strings.

This also removes regular expression representation from the lexical grammar notation, since it wasn't always clear.

Either way, the additional clarity removes ambiguity from the spec

Partial fix for #564

Specifically addresses #564 (comment)
leebyron added a commit that referenced this pull request Jan 10, 2020
GraphQL syntactical grammars intend to be unambiguous. While lexical grammars should also be - there has historically been an assumption that lexical parsing is greedy. This is obvious for numbers and words, but less obvious for empty block strings.

This also removes regular expression representation from the lexical grammar notation, since it wasn't always clear.

Either way, the additional clarity removes ambiguity from the spec

Partial fix for #564

Specifically addresses #564 (comment)
leebyron added a commit that referenced this pull request Jan 10, 2020
GraphQL syntactical grammars intend to be unambiguous. While lexical grammars should also be - there has historically been an assumption that lexical parsing is greedy. This is obvious for numbers and words, but less obvious for empty block strings.

This also removes regular expression representation from the lexical grammar notation, since it wasn't always clear.

Either way, the additional clarity removes ambiguity from the spec

Partial fix for #564

Specifically addresses #564 (comment)
Base automatically changed from master to main February 3, 2021 04:50
@leebyron leebyron added this to the May2021 milestone Apr 6, 2021
leebyron added a commit that referenced this pull request Apr 8, 2021
Partial fix to #564, adds lookahead constraints to type system extensions to remove ambiguity or inefficiency from the grammar.

The GraphQL grammar should be parsed in linear type with at most one lookahead. Since extensions have an optional `{ }` body, finding the token `{` should always attempt to parse the relevant portion of the type extension rather than completing the type extension and attempting to parse the query shorthand SelectionSet.
leebyron added a commit that referenced this pull request Apr 8, 2021
Partial fix to #564, adds lookahead constraints to type system extensions to remove ambiguity or inefficiency from the grammar.

The GraphQL grammar should be parsed in linear type with at most one lookahead. Since extensions have an optional `{ }` body, finding the token `{` should always attempt to parse the relevant portion of the type extension rather than completing the type extension and attempting to parse the query shorthand SelectionSet.
leebyron added a commit that referenced this pull request Apr 8, 2021
Partial fix to #564, adds lookahead constraints to type system extensions to remove ambiguity or inefficiency from the grammar.

The GraphQL grammar should be parsed in linear type with at most one lookahead. Since extensions have an optional `{ }` body, finding the token `{` should always attempt to parse the relevant portion of the type extension rather than completing the type extension and attempting to parse the query shorthand SelectionSet.
@leebyron
Copy link
Collaborator

Coming back to this to recap what's happened since this PR was opened.

In a comment above, I wanted to resolve this by removing grammar ambiguities rather than introducing a rule for how to interpret them

I think the answer here is not to describe rules for interpreting ambiguous grammars but instead to edit the grammar to be unambiguous.

After that, two RFCs were introduced and ultimately approved and landed:

#598 - Solves the specific ambiguity pointed out in this PR - incomplete type definitions/extensions followed by a query shorthand.

#599 - Requires that the lexer be greedy, adds prose describing this behavior, and adds negative lookaheads so the grammar alone produces the greedy behavior.

At this point, I think we no longer need this PR and my plan is to close it. Any thoughts otherwise @graphql/tsc ?

@leebyron
Copy link
Collaborator

Given the above (most clear result https://spec.graphql.org/draft/#note-ba22f) I'll close this PR.

I'm happy to revisit this later if there are additional issues

@leebyron leebyron closed this Apr 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤷‍♀️ Ambiguity An issue/PR which identifies or fixes spec ambiguity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clear-up disambiguity when using TypeExtension's and query shorthand
4 participants