tsql: escaped data types don't parse #2102

Closed
dmoore247 opened this issue Aug 22, 2023 · 2 comments

@dmoore247
Contributor

Thanks for cranking through these parse errors.
One more, and it's a weird one: in T-SQL you can escape data type names.

sqlglot raises a parse error when square-bracket or double-quote escapes are used on the data type name.
I stumbled upon this while working with real code.

Fully reproducible code snippet

sqlglot.parse("CREATE TABLE [dbo].[zipcode]([zip_cd] [int] NOT NULL)",read="tsql")
sqlglot.errors.ParseError: Expecting ). Line 1, Col: 43.
  CREATE TABLE [dbo].[zipcode]([zip_cd] [int] NOT NULL)

and
image

Works in sqlfiddle with square bracket escape:
image

with double quote escape:
image

And works as you might expect with the square bracket escapes
image

Official Documentation
Create Table https://learn.microsoft.com/en-us/sql/t-sql/statements/create-table-transact-sql?view=sql-server-ver16
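
For anyone pinned to a sqlglot version that still shows this error, one possible stopgap is to strip the escapes from well-known type names before handing the SQL to the parser. The sketch below is only that, a sketch: `TYPE_NAMES` and `unescape_types` are hypothetical names, the type list is deliberately short, and the rewrite is purely textual. It only touches an escaped type name that directly follows an escaped identifier, so it would also rewrite an escaped alias that happens to be spelled like a type.

```python
import re

# Hypothetical, deliberately short list of type names to unescape; extend as needed.
TYPE_NAMES = ("bigint", "int", "smallint", "tinyint", "bit", "date", "datetime", "varchar")

# An escaped type name ([int] or "int") that follows an escaped identifier plus whitespace.
_ESCAPED_TYPE = re.compile(
    r'(?<=[\]"])(\s+)(?:\[(?P<b>{0})\]|"(?P<q>{0})")'.format("|".join(TYPE_NAMES)),
    re.IGNORECASE,
)


def unescape_types(sql: str) -> str:
    """Replace [type] / "type" with the bare type name in column definitions."""
    return _ESCAPED_TYPE.sub(
        lambda m: m.group(1) + (m.group("b") or m.group("q")), sql
    )


print(unescape_types('CREATE TABLE [dbo].[zipcode]([zip_cd] [int] NOT NULL)'))
print(unescape_types('CREATE TABLE [dbo].[zipcode]([zip_cd] "int" NOT NULL)'))
# both print: CREATE TABLE [dbo].[zipcode]([zip_cd] int NOT NULL)
```

The rewritten string can then be passed to `sqlglot.parse(..., read="tsql")` as in the snippet above. Escaped table and column names are left untouched.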

@dmoore247
Contributor Author

Log & traceback

ParseError                                Traceback (most recent call last)
File <command-1570387049187564>, line 1
----> 1 sqlglot.parse("""CREATE TABLE [dbo].[zipcode]([zip_cd] "int" NOT NULL)""",read="tsql")

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-40d0a615-c306-4575-b7f6-dcdca910ee33/lib/python3.10/site-packages/sqlglot/__init__.py:86, in parse(sql, read, dialect, **opts)
     73 """
     74 Parses the given SQL string into a collection of syntax trees, one per parsed SQL statement.
     75 
   (...)
     83     The resulting syntax tree collection.
     84 """
     85 dialect = Dialect.get_or_raise(read or dialect)()
---> 86 return dialect.parse(sql, **opts)

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-40d0a615-c306-4575-b7f6-dcdca910ee33/lib/python3.10/site-packages/sqlglot/dialects/dialect.py:289, in Dialect.parse(self, sql, **opts)
    288 def parse(self, sql: str, **opts) -> t.List[t.Optional[exp.Expression]]:
--> 289     return self.parser(**opts).parse(self.tokenize(sql), sql)

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-40d0a615-c306-4575-b7f6-dcdca910ee33/lib/python3.10/site-packages/sqlglot/parser.py:901, in Parser.parse(self, raw_tokens, sql)
    887 def parse(
    888     self, raw_tokens: t.List[Token], sql: t.Optional[str] = None
    889 ) -> t.List[t.Optional[exp.Expression]]:
    890     """
    891     Parses a list of tokens and returns a list of syntax trees, one tree
    892     per parsed SQL statement.
   (...)
    899         The list of the produced syntax trees.
    900     """
--> 901     return self._parse(
    902         parse_method=self.__class__._parse_statement, raw_tokens=raw_tokens, sql=sql
    903     )

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-40d0a615-c306-4575-b7f6-dcdca910ee33/lib/python3.10/site-packages/sqlglot/parser.py:967, in Parser._parse(self, parse_method, raw_tokens, sql)
    964 self._tokens = tokens
    965 self._advance()
--> 967 expressions.append(parse_method(self))
    969 if self._index < len(self._tokens):
    970     self.raise_error("Invalid expression / Unexpected token")

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-40d0a615-c306-4575-b7f6-dcdca910ee33/lib/python3.10/site-packages/sqlglot/parser.py:1151, in Parser._parse_statement(self)
   1148     return None
   1150 if self._match_set(self.STATEMENT_PARSERS):
-> 1151     return self.STATEMENT_PARSERS[self._prev.token_type](self)
   1153 if self._match_set(Tokenizer.COMMANDS):
   1154     return self._parse_command()

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-40d0a615-c306-4575-b7f6-dcdca910ee33/lib/python3.10/site-packages/sqlglot/parser.py:521, in Parser.<lambda>(self)
    452 COLUMN_OPERATORS = {
    453     TokenType.DOT: None,
    454     TokenType.DCOLON: lambda self, this, to: self.expression(
   (...)
    483     ),
    484 }
    486 EXPRESSION_PARSERS = {
    487     exp.Cluster: lambda self: self._parse_sort(exp.Cluster, TokenType.CLUSTER_BY),
    488     exp.Column: lambda self: self._parse_column(),
   (...)
    512     "JOIN_TYPE": lambda self: self._parse_join_parts(),
    513 }
    515 STATEMENT_PARSERS = {
    516     TokenType.ALTER: lambda self: self._parse_alter(),
    517     TokenType.BEGIN: lambda self: self._parse_transaction(),
    518     TokenType.CACHE: lambda self: self._parse_cache(),
    519     TokenType.COMMIT: lambda self: self._parse_commit_or_rollback(),
    520     TokenType.COMMENT: lambda self: self._parse_comment(),
--> 521     TokenType.CREATE: lambda self: self._parse_create(),
    522     TokenType.DELETE: lambda self: self._parse_delete(),
    523     TokenType.DESC: lambda self: self._parse_describe(),
    524     TokenType.DESCRIBE: lambda self: self._parse_describe(),
    525     TokenType.DROP: lambda self: self._parse_drop(),
    526     TokenType.FROM: lambda self: exp.select("*").from_(
    527         t.cast(exp.From, self._parse_from(skip_from_token=True))
    528     ),
    529     TokenType.INSERT: lambda self: self._parse_insert(),
    530     TokenType.LOAD: lambda self: self._parse_load(),
    531     TokenType.MERGE: lambda self: self._parse_merge(),
    532     TokenType.PIVOT: lambda self: self._parse_simplified_pivot(),
    533     TokenType.PRAGMA: lambda self: self.expression(exp.Pragma, this=self._parse_expression()),
    534     TokenType.ROLLBACK: lambda self: self._parse_commit_or_rollback(),
    535     TokenType.SET: lambda self: self._parse_set(),
    536     TokenType.UNCACHE: lambda self: self._parse_uncache(),
    537     TokenType.UPDATE: lambda self: self._parse_update(),
    538     TokenType.USE: lambda self: self.expression(
    539         exp.Use,
    540         kind=self._match_texts(("ROLE", "WAREHOUSE", "DATABASE", "SCHEMA"))
    541         and exp.var(self._prev.text),
    542         this=self._parse_table(schema=False),
    543     ),
    544 }
    546 UNARY_PARSERS = {
    547     TokenType.PLUS: lambda self: self._parse_unary(),  # Unary + is handled as a no-op
    548     TokenType.NOT: lambda self: self.expression(exp.Not, this=self._parse_equality()),
    549     TokenType.TILDA: lambda self: self.expression(exp.BitwiseNot, this=self._parse_unary()),
    550     TokenType.DASH: lambda self: self.expression(exp.Neg, this=self._parse_unary()),
    551 }
    553 PRIMARY_PARSERS = {
    554     TokenType.STRING: lambda self, token: self.expression(
    555         exp.Literal, this=token.text, is_string=True
   (...)
    574     TokenType.SESSION_PARAMETER: lambda self, _: self._parse_session_parameter(),
    575 }

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-40d0a615-c306-4575-b7f6-dcdca910ee33/lib/python3.10/site-packages/sqlglot/dialects/tsql.py:583, in TSQL.Parser._parse_create(self)
    582 def _parse_create(self) -> exp.Create | exp.Command:
--> 583     create = super()._parse_create()
    585     if isinstance(create, exp.Create):
    586         table = create.this.this if isinstance(create.this, exp.Schema) else create.this

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-40d0a615-c306-4575-b7f6-dcdca910ee33/lib/python3.10/site-packages/sqlglot/parser.py:1254, in Parser._parse_create(self)
   1251 self._match(TokenType.COMMA)
   1252 extend_props(self._parse_properties(before=True))
-> 1254 this = self._parse_schema(this=table_parts)
   1256 # exp.Properties.Location.POST_SCHEMA and POST_WITH
   1257 extend_props(self._parse_properties())

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-40d0a615-c306-4575-b7f6-dcdca910ee33/lib/python3.10/site-packages/sqlglot/parser.py:3503, in Parser._parse_schema(self, this)
   3499     return this
   3501 args = self._parse_csv(lambda: self._parse_constraint() or self._parse_field_def())
-> 3503 self._match_r_paren()
   3504 return self.expression(exp.Schema, this=this, expressions=args)

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-40d0a615-c306-4575-b7f6-dcdca910ee33/lib/python3.10/site-packages/sqlglot/parser.py:4879, in Parser._match_r_paren(self, expression)
   4877 def _match_r_paren(self, expression: t.Optional[exp.Expression] = None) -> None:
   4878     if not self._match(TokenType.R_PAREN, expression=expression):
-> 4879         self.raise_error("Expecting )")

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-40d0a615-c306-4575-b7f6-dcdca910ee33/lib/python3.10/site-packages/sqlglot/parser.py:1011, in Parser.raise_error(self, message, token)
    999 error = ParseError.new(
   1000     f"{message}. Line {token.line}, Col: {token.col}.\n"
   1001     f"  {start_context}\033[4m{highlight}\033[0m{end_context}",
   (...)
   1007     end_context=end_context,
   1008 )
   1010 if self.error_level == ErrorLevel.IMMEDIATE:
-> 1011     raise error
   1013 self.errors.append(error)

ParseError: Expecting ). Line 1, Col: 43.
  CREATE TABLE [dbo].[zipcode]([zip_cd] "int" NOT NULL)
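
To see where the parser gave up, the position in the message can be mapped back onto the statement. This is a minimal sketch using only the standard library; `error_prefix` is a hypothetical helper, and it assumes the reported line and column are 1-based offsets into the SQL text, which is consistent with the message above but not something confirmed here.

```python
import re


def error_prefix(message: str, sql: str) -> str:
    """Return the part of `sql` up to the column named in a 'Line N, Col: M' message."""
    match = re.search(r"Line (\d+), Col: (\d+)", message)
    if match is None:
        raise ValueError("no 'Line N, Col: M' position found in message")
    line_no, col = int(match.group(1)), int(match.group(2))
    # Assumption: both numbers are 1-based, so col is also the prefix length.
    line = sql.splitlines()[line_no - 1]
    return line[:col]


sql = 'CREATE TABLE [dbo].[zipcode]([zip_cd] "int" NOT NULL)'
print(error_prefix("Expecting ). Line 1, Col: 43.", sql))
# prints: CREATE TABLE [dbo].[zipcode]([zip_cd] "int"
```

Under that assumption, column 43 falls on the closing quote of the escaped type name, so the `Expecting )` error is raised at the type token rather than at the column name.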

@dmoore247
Contributor Author

Wow, amazing; 61 changes, 10 files. Thank you!
