Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FieldNotFound but field is shown in valid_fields #6699

Open
MidasLamb opened this issue Jun 16, 2023 · 2 comments
Open

FieldNotFound but field is shown in valid_fields #6699

MidasLamb opened this issue Jun 16, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@MidasLamb
Copy link

Describe the bug

I'm doing a query with aliases, and when trying to execute it, it gives back a SchemaError saying FieldNotFound, however that error also contains a list of valid fields, and the field it's saying isn't found, is listed in there.
The error (formatted):

SchemaError(FieldNotFound {
field:
    Column { relation: Some(Bare { table: "linkedtwintag" }), name: "776ff32c7d7bfb1c1a49d5c0d94e0db8" },
valid_fields: [
    Column { relation: Some(Bare { table: "75620d04a8b7b34c70c68f7af68953af" }), name: "qid" },
    Column { relation: Some(Bare { table: "75620d04a8b7b34c70c68f7af68953af" }), name: "timestamp" },
    Column { relation: Some(Bare { table: "75620d04a8b7b34c70c68f7af68953af" }), name: "createdDate" },
    Column { relation: Some(Bare { table: "75620d04a8b7b34c70c68f7af68953af" }), name: "82ad7656c6b6be56e9abfac772c89b4c" },
    Column { relation: Some(Bare { table: "linkedtwintag" }), name: "dataScope" },
    Column { relation: Some(Bare { table: "linkedtwintag" }), name: "qid" },
    Column { relation: Some(Bare { table: "linkedtwintag" }), name: "timestamp" },
    Column { relation: Some(Bare { table: "linkedtwintag" }), name: "createdDate" },
    Column { relation: Some(Bare { table: "linkedtwintag" }), name: "776ff32c7d7bfb1c1a49d5c0d94e0db8" }
] })

As you can see, the field that is not found is the last one in the valid_fields list.

To Reproduce

Create two mysql tables, and use the following crate to pull them in: https://github.com/splitgraph/seafowl/tree/main/datafusion_remote_tables.

Execute the following (or similar query):

SELECT "75620d04a8b7b34c70c68f7af68953af".*
FROM "75620d04a8b7b34c70c68f7af68953af" 
JOIN "f4be58656cd5e54b9bffa1706c3c49ff" AS "linkedTwintag" ON "75620d04a8b7b34c70c68f7af68953af"."82ad7656c6b6be56e9abfac772c89b4c" = "linkedTwintag"."776ff32c7d7bfb1c1a49d5c0d94e0db8" 
WHERE 1 = 1  AND "linkedTwintag"."createdDate" = ?  AND "linkedTwintag"."qid" = ?  ORDER BY "75620d04a8b7b34c70c68f7af68953af"."qid" ASC"

Expected behavior

I get back a normal resultbatch.

Additional context

I've not used datafusion a lot, so I'm not 100% sure whether its a bug in the crate for pulling in mysql tables, or if its a bug in datafusion itself.

@MidasLamb MidasLamb added the bug Something isn't working label Jun 16, 2023
@MidasLamb
Copy link
Author

I'm digging what deeper and it seems to be caused by the capitalization inside of "linkedTwintag", as putting some dbg! statements everywhere, I see that at some point it compares

Some(
        Bare {
            table: "linkedTwintag",
        },
    ),
    Some(
        Bare {
            table: "linkedtwintag",
        },
    ),

which results in it not being a match

@alamb
Copy link
Contributor

alamb commented Jun 30, 2023

I saw something similar the other day: #6790

i wonder if it is possible to create a datafusion only reproducer (with the datafusion-cli for example)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants