Trino Iceberg s3 path handling incompatibility when using s3a:// or s3n:// paths with // #23097

Open
sopel39 opened this issue Aug 21, 2024 · 11 comments
Labels: bug (Something isn't working)

sopel39 (Member) commented Aug 21, 2024

Introduction:

  • Trino's legacy S3 FS converts // -> / for s3a/s3n paths, but preserves // for s3 paths (see the sketch below)
  • Trino's native S3 FS preserves // for all of s3/s3a/s3n
  • Other engines, e.g. Spark, preserve // for s3/s3a/s3n paths
  • The Iceberg catalog stores the location of the metadata file (with //) as part of the table information
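For illustration, the collapsing behaviour comes from the legacy FS wrapping locations in Hadoop's org.apache.hadoop.fs.Path, whose constructor normalizes consecutive slashes. A minimal sketch, assuming Hadoop is on the classpath (Trino's HadoopPaths#hadoopPath then restores the original path, but only for the s3 scheme, as shown in the code quoted later in this thread):

import org.apache.hadoop.fs.Path;

public class SlashCollapseDemo
{
    public static void main(String[] args)
    {
        // Hadoop's Path normalizes the path component, collapsing "//" into "/"
        System.out.println(new Path("s3a://bucket/schema//table"));
        // prints: s3a://bucket/schema/table -- the "//" is gone
        // the native S3 FS, by contrast, keeps the location verbatim
    }
}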

Trino breaking behaviour

  1. Create a table in Trino with the legacy FS:
trino> create table iceberg.schema.table (x bigint) with (location='s3a://bucket/schema//table')
CREATE TABLE
trino> show create table iceberg.schema.table;
                                               Create Table
----------------------------------------------------------------------------------------------------------
 CREATE TABLE iceberg.schema.table (
    x bigint
 )
 WITH (
    format = 'PARQUET',
    format_version = 2,
    location = 's3a://bucket/schema//table'
 )
  2. Query with the native FS:
trino> select * from iceberg.schema.table;
...
Caused by: java.io.FileNotFoundException: s3a://bucket/schema//table/metadata/00002-279c8ac6-0628-4687-aeb5-ef58b53d3b4a.metadata.json
	at io.trino.filesystem.s3.S3InputStream.seekStream(S3InputStream.java:199)
  3. Query the data with other engines, e.g. Spark:
spark-sql ()> select * from iceberg.schema.table;
24/08/21 13:35:46 ERROR SparkSQLDriver: Failed in [select * from iceberg.schema.table]
org.apache.iceberg.exceptions.ServiceFailureException: Server error: NotFoundException: Location does not exist: s3a://bucket/schema//table/metadata/00002-279c8ac6-0628-4687-aeb5-ef58b53d3b4a.metadata.json

Problem

  • How do we preserve forward/backward compatibility when the native FS is finally turned on, assuming it is HARD to rewrite existing, potentially broken data?

Solution proposals

  1. Make the native FS behave like the legacy FS with regards to s3a/s3n (under a feature flag)?
  2. Try to open both locations (with and without //), but write only with // for both legacy and native FS? (A sketch follows below.)
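A rough, self-contained sketch of what proposal 2 could look like on the read path (the helper below is hypothetical, not an existing Trino API): given a stored location, compute the candidate locations to try in order -- the location exactly as written first, then a variant with duplicate slashes collapsed:

import java.util.ArrayList;
import java.util.List;

public class CandidateLocations
{
    // Hypothetical helper for proposal 2: the stored location first,
    // then the same location with "//" collapsed in the path component
    // (scheme and bucket left untouched).
    static List<String> candidates(String location)
    {
        List<String> result = new ArrayList<>();
        result.add(location);
        int pathStart = location.indexOf('/', location.indexOf("://") + 3);
        if (pathStart > 0) {
            String collapsed = location.substring(0, pathStart)
                    + location.substring(pathStart).replaceAll("/+", "/");
            if (!collapsed.equals(location)) {
                result.add(collapsed);
            }
        }
        return result;
    }

    public static void main(String[] args)
    {
        // [s3a://bucket/schema//table/metadata/x.json, s3a://bucket/schema/table/metadata/x.json]
        System.out.println(candidates("s3a://bucket/schema//table/metadata/x.json"));
    }
}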

cc @electrum @osscm @rstyp

findepi (Member) commented Aug 21, 2024

There was a problem with // in paths which resulted in both (a) // being replaced with / and (b) a url-encoded path being appended to the path. It looks like this is a different problem, but it would be great if you could confirm.

findepi (Member) commented Aug 22, 2024

#17803 might be the issue, but I'm not sure; it does not have examples.

sopel39 (Member, Author) commented Aug 22, 2024

#17803 might be the issue, but I'm not sure; it does not have examples.

Seems like a similar issue. #17958 doesn't help if another engine creates such a Hive table with //. Anyway, my issue is more about Iceberg than Hive.

@findepi changed the title from "Trino Iceberg s3 path handling incompatibility" to "Trino Iceberg s3 path handling incompatibility when using s3a:// or s3n:// paths with //" on Aug 22, 2024
findinpath (Contributor) commented

@sopel39 is this issue reproducible with Trino 454?

sopel39 (Member, Author) commented Aug 23, 2024

@findinpath yes, the issue has been reproducible since io.trino.filesystem.hdfs.HadoopPaths#hadoopPath was introduced, so many years back.

We are actually leaning towards creating an Iceberg table property, e.g. native_fs_legacy_path_mode, which would remove // from S3 paths for selected tables when the native FS is used (a sketch follows the list below). This would:

  1. allow accessing "broken" tables from the native FS,
  2. make new Iceberg tables correct going forward.
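A minimal sketch of what that could mean inside the native FS (the property name and the helper are hypothetical; the logic just mimics the legacy Hadoop normalization, which collapses "//" for s3a/s3n but not for plain s3):

// Hypothetical: applied only when the proposed native_fs_legacy_path_mode
// table property is set. Collapse "//" in the path component of s3a/s3n
// locations; leave s3 locations untouched, exactly like the legacy FS.
static String toLegacyPath(String location)
{
    if (location.startsWith("s3a://") || location.startsWith("s3n://")) {
        int pathStart = location.indexOf('/', location.indexOf("://") + 3);
        if (pathStart > 0) {
            return location.substring(0, pathStart)
                    + location.substring(pathStart).replaceAll("/+", "/");
        }
    }
    return location;
}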

osscm (Contributor) commented Aug 23, 2024

Yes, we need this: the legacy FS will soon be gone, so we need a way to read the broken tables from the native FS.

We are also leaning towards keeping this mode/functionality exactly the same as the legacy FS, where // is preserved only for s3 and not for s3a or s3n.

zhaner08 (Contributor) commented

We definitely have people relying on the legacy behavior; at the least, add a flag to support the behavior if we don't want to make it the default.

findinpath (Contributor) commented Aug 30, 2024

We went through with the investigation on this one (although it took a longer time to get to look deeper into it). Apologies for the delay in replying.

tl;dr:

Moving the files on the object storage from s3://bucket/double/slashes/table1 to s3://bucket/double//slashes/table1 (notice / -> // before slashes) fixes the problem and matches the locations of the files linked from the metadata and manifest files of the Iceberg tables. (A sketch of such a move is below.)
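For reference, a hedged sketch of such a move with the AWS SDK for Java v2 (bucket name and prefixes are taken from the example above; deleting the old objects afterwards is omitted):

import software.amazon.awssdk.services.s3.S3Client;

public class MoveToDoubleSlashPrefix
{
    public static void main(String[] args)
    {
        // Copy every object from the single-slash prefix to the double-slash
        // prefix that the Iceberg metadata actually references. S3 keys are
        // plain strings, so "double//slashes/..." is a perfectly valid key.
        try (S3Client s3 = S3Client.create()) {
            String bucket = "bucket";
            String from = "double/slashes/table1/";
            String to = "double//slashes/table1/";
            s3.listObjectsV2Paginator(request -> request.bucket(bucket).prefix(from))
                    .contents()
                    .forEach(object -> s3.copyObject(copy -> copy
                            .sourceBucket(bucket)
                            .sourceKey(object.key())
                            .destinationBucket(bucket)
                            .destinationKey(to + object.key().substring(from.length()))));
        }
    }
}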

What's left to do is to adjust HadoopPaths in the legacy FS to handle not only s3, but also s3a & s3n:

if ("s3".equals(hadoopPath.toUri().getScheme()) && !path.equals(hadoopPath.toString())) {

We'll follow up soon with a PR, and we'll work with @mosabua on documenting the best "fix" plan for this unfortunate hiccup.

Why is "move" (copy to new location & delete old files) in detriment of a backward-compatibility table property being suggested?

The table created on s3a or s3n without properly encoding the paths is effectively broken because the it has metadata files pointing to invalid locations.
I see no reason to bend the trinodb/trino s3 native file system to deal with broken tables.

mosabua (Member) commented Aug 30, 2024

Why is "move" (copy to new location & delete old files) in detriment of a backward-compatibility table property being suggested?

The table created on s3a or s3n without properly encoding the paths is effectively broken because the it has metadata files pointing to invalid locations. I see no reason to bend the trinodb/trino s3 native file system to deal with broken tables.

I think we cannot force users to perform this move when they migrate from the legacy file system to the new file system support. After all, they used Trino in the first place to get the tables in place. The move potentially comes with large cost and risk, and we should offer users a way to migrate to the new file system without a move.

mosabua (Member) commented Aug 30, 2024

Also, we are actively discussing this on Slack, and it will take some time to make a decision together with @electrum and others.

https://trinodb.slack.com/archives/C07ABNN828M/p1724946906029559

mayankvadariya (Contributor) commented Sep 6, 2024

Meeting minutes with @mosabua @electrum @findinpath @mayankvadariya

We've decided to add a Trino Iceberg table migration procedure to update the table metadata/manifests to contain the correct S3 paths, so that the table will be queryable with the native file system.
