Issue with Extra Slash in S3 Location Path Affecting Table Accessibility Across Trino Versions #24230

Closed
rvishureddy opened this issue Nov 22, 2024 · 1 comment

rvishureddy commented Nov 22, 2024

When an Iceberg table is created in Trino 427 with an S3 location that has an extra slash after the bucket name (e.g., s3a://bucket-name//table-path/), the table becomes inaccessible when queried from a different Trino version (463). The issue seems specific to Iceberg tables with such paths; tables whose location has a single slash (s3a://bucket-name/table-path/) work correctly in both versions.

Steps to Reproduce:
In Trino 427, create an Iceberg table with an S3 location containing an extra slash:

CREATE TABLE iceberg.analytics.tableCreatedViaTrino427(
    region varchar,
    sim varchar,
    icid varchar,
    lastupdated timestamp(6)
)
WITH (
    format = 'ORC',
    format_version = 2,
    location = 's3a://bucketName//tableCreatedViaTrino427'
);

Now attempt to access this table using Trino 463 in the same environment.

SELECT * FROM iceberg.analytics.tableCreatedViaTrino427_1;

SQL Error [84148239]: Query failed (#20241122_143137_00011_sckk8): Metadata not found in metadata location for table analytics.tablecreatedviatrino427_1

Debug Log from Trino 463
Caused by: org.apache.iceberg.exceptions.NotFoundException: Failed to open input stream for file: s3a://bucketName//tableCreatedViaTrino427/metadata/00103-74b8fcc3-29ac-4aa6-9068-d6d2ab827885.metadata.json
at io.trino.plugin.iceberg.fileio.ForwardingInputFile.newStream(ForwardingInputFile.java:55)
at org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:279)

I verified through the AWS console that the metadata file exists at that location.

Expected behavior: The table should be accessible across both Trino versions.
Actual behavior: The table is not accessible in Trino 463; the query fails with "Metadata not found in metadata location", even though the metadata file exists.

This issue seems to stem from Trino's handling of the extra slash in the location path. The problem doesn't occur when the path has a single slash (e.g., s3a://bucketName/tableCreatedViaTrino427_1). We believe the extra slash may cause inconsistencies in how Trino resolves the location across different versions, resulting in table inaccessibility.
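
To illustrate the suspected mismatch, here is a minimal sketch of how one location string can map to two different S3 object keys depending on whether repeated slashes are preserved or collapsed. This is not Trino's code: `split_location`, the `collapse_slashes` flag, and the `v1.metadata.json` file name are hypothetical and exist only for this example. Which of the two behaviors each Trino version actually applies is an assumption that has not been verified here.

```python
import re
from urllib.parse import urlsplit


def split_location(location: str, collapse_slashes: bool) -> tuple[str, str]:
    """Split an s3a:// location into (bucket, key).

    Hypothetical helper for illustration only; it does not reproduce
    the logic of any Trino version.
    """
    parts = urlsplit(location)
    # Drop only the single slash that separates the bucket from the key.
    key = parts.path[1:] if parts.path.startswith("/") else parts.path
    if collapse_slashes:
        # Collapse runs of slashes and drop any leading slash.
        key = re.sub(r"/+", "/", key).lstrip("/")
    return parts.netloc, key


# Hypothetical metadata file name under the location from this issue.
location = "s3a://bucketName//tableCreatedViaTrino427/metadata/v1.metadata.json"

preserved = split_location(location, collapse_slashes=False)
collapsed = split_location(location, collapse_slashes=True)

print(preserved)
print(collapsed)
```

If a writer stores objects under the first form of the key (with the leading slash) and a reader looks them up under the second form, the reader would not find the metadata file even though it exists in the bucket.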

Is there a way to configure Trino to ignore extra slashes in the location path for Iceberg tables, or should this be addressed directly in the Iceberg connector?

Request for Help:
We would appreciate any guidance on whether this behavior is a known issue. A configuration parameter or workaround that lets both versions of Trino handle the extra slash correctly would also be extremely helpful.

A table created with a double-slash location is accessible from the Trino version that created it, but not from a different version of Trino.

In Trino 463 I am using native S3 support in all connectors, and Hive is my metadata store.

ebyhr commented Nov 23, 2024

Let's continue in #23097

ebyhr closed this as not planned Nov 23, 2024