
Create plan for Kinesis connector changes #23921

Closed
mosabua opened this issue Oct 25, 2024 · 7 comments
Assignees
Labels
roadmap Top level issues for major efforts in the project

Comments

@mosabua
Member

mosabua commented Oct 25, 2024

The Kinesis connector seems to have very limited usage, based on Slack conversations and vendor data.

The current connector codebase uses an old, deprecated SDK for Kinesis that is now in maintenance mode and will be fully deprecated in 2025. Nobody is available to improve the connector and upgrade it to the newer SDK.

This issue was raised in a maintainer call on the 24th of October 2024 by @wendigo and discussed with all the present maintainers.

We are contemplating removal of the connector, similar to #23792.

We are currently looking for more input and data about potential usage. Please comment in this ticket if you are using the Kinesis connector.
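If you are unsure whether this affects you: a catalog backed by this connector is typically defined in a file such as `etc/catalog/kinesis.properties`. A minimal sketch for checking your deployments — only `connector.name=kinesis` is certain here; the other property names and the file paths are from memory and should be verified against the connector docs:

```properties
# etc/catalog/kinesis.properties -- illustrative sketch only
connector.name=kinesis
kinesis.access-key=EXAMPLE_ACCESS_KEY
kinesis.secret-key=EXAMPLE_SECRET_KEY
# directory with JSON table description files mapping streams to tables
kinesis.table-description-location=etc/kinesis
```

If any catalog in your cluster sets `connector.name=kinesis`, you are using this connector and should comment here.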

More importantly we are also looking for contributors who are willing to update the connector to the new SDK.

@mosabua mosabua added the roadmap Top level issues for major efforts in the project label Oct 25, 2024
@mosabua mosabua self-assigned this Oct 25, 2024
@raunaqmorarka
Member

That connector was contributed to Trino by one of my colleagues at Qubole. We contributed it because we had no customers using it, didn't want to invest any more in maintaining it and thought that giving it to the community might breathe some life into it.
My impression is that it has stayed unused in Trino as well and it should be okay to remove it.

@mosabua
Member Author

mosabua commented Oct 25, 2024

I prepared a PR in case we want to go ahead with removal - which seems likely at this stage.

@mosabua
Member Author

mosabua commented Oct 29, 2024

I reached out to the AWS team and the Amazon Kinesis team; hopefully we get some input from them.

mosabua added a commit to simpligility/trino that referenced this issue Oct 29, 2024
The connector seems unused from all interaction on the community
and following an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921
@mosabua
Member Author

mosabua commented Dec 9, 2024

At this stage we will ask again at Trino Summit, but will most likely remove the connector. No vendors chimed in with any interest either.

@mosabua
Member Author

mosabua commented Jan 7, 2025

User survey from https://trino.io/blog/2025/01/07/2024-and-beyond includes a question for this change.

mosabua added a commit to simpligility/trino that referenced this issue Jan 27, 2025
The connector seems unused from all interaction on the community
and following an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921
@mosabua
Member Author

mosabua commented Jan 27, 2025

No usage of the connector was reported in the survey, in the January 2025 contributor call, or in further discussions on Slack. As such we are proceeding to remove the connector with the Trino 470 release.

mosabua added a commit to simpligility/trino that referenced this issue Jan 28, 2025
The connector seems unused from all interaction on the community
and following an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921
mosabua added a commit to simpligility/trino that referenced this issue Jan 28, 2025
The connector seems unused from all interaction on the community
and following an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921
mosabua added a commit that referenced this issue Jan 28, 2025
The connector seems unused from all interaction on the community
and following an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
#23921
@mosabua
Member Author

mosabua commented Jan 28, 2025

PR for removal is merged. It will be part of the upcoming Trino 470 release.

@mosabua mosabua closed this as completed Jan 28, 2025
mayurjpatel pushed a commit to mayurjpatel/trino that referenced this issue Jan 29, 2025
The connector seems unused from all interaction on the community
and following an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921
hantmac pushed a commit to hantmac/trino that referenced this issue Feb 3, 2025
The connector seems unused from all interaction on the community
and following an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921
hantmac pushed a commit to hantmac/trino that referenced this issue Feb 4, 2025
Inline method addPrimaryKeyToCopyTable

Inline method `addPrimaryKeyToCopyTable` in `TestPostgreSqlJdbcConnectionAccesses`
and `TestPostgreSqlJdbcConnectionCreation` for readability

Fix session property description for NON_TRANSACTIONAL_MERGE

Also change the description of `NON_TRANSACTIONAL_INSERT`
to the "Enables support for non-transactional INSERT"

Fix storage table data clean up while dropping iceberg materialized view

This is for cleaning up data files in legacy mode i.e iceberg.materialized-views.hide-storage-table is set to false.
Delegating the drop table to metastore does not clean up the data files since for HMS,
the iceberg table is registered as an "external" table. So to fix this instead of delegating to metastore,
have the connector do the drop of the table and data files associated with it.

Rename partitions to partition_summaries in Iceberg manifests table

Restore removed assertions

These assertions are still useful since they don't use the MERGE code path. These were mistakenly removed in e88e2b1.

Only enable MERGE for MERGE specific tests

Before this change MERGE code path could inadvertently be used in places where we are not interested in testing MERGE.

Use getFileSystemFactory in BaseIcebergMaterializedViewTest

Extract helper method to get HiveMetastore

Inject execution interceptors using multibinder

The GlueMetastoreModule is not truly following the inversion of control
paradigm. Many things are created directly in the methods without using
injection. Using set multibinder for execution interceptors allows to
independently define execution interceptors and use Guice injection.

Co-authored-by: Grzegorz Kokosiński <[email protected]>

Extract table features constants and add ProtocolEntry builder

Add missing table features when adding timestamp_ntz in Delta

Update Hudi library to 1.0.0

Update airbase to 204

Update airlift to 292

Update metrics-core to 4.2.29

Update reactor-core to 3.7.1

Update swagger to 2.2.27

Update AWS SDK v2 to 2.29.31

Update JLine to 3.28.0

Update nessie to 0.101.1

Update s3mock testcontainers to 3.12.0

Update exasol to 24.2.1

Update google-sheets api to v4-rev20241203-2.0.0

Add assertConsistently

Use BigQuery storage read API when reading external BigLake tables

The storage APIs support reading BigLake external tables (ie external
tables with a connection). But the current implementation uses views
which can be expensive, because it requires a query. This PR adds
support to read BigLake tables directly using the storage API.

There are no behavior changes for external tables and BQ native tables -
they use the view and storage APIs respectively.

Added a new test for BigLake tables.

Co-authored-by: Marcin Rusek <[email protected]>

Make OutputBufferInfo not comparable

This is not needed

Expose exchange sink metrics in operator and stage stats

Inline constant

Expose output buffer metrics in query completion event

Expose filesystem exchange sink stats

Correctly categorize filesystem error in Iceberg connector

Webapp Preview: Cluster Overview with sparklines

Use executor service for iceberg scan planning system tables

Add Python UDF support to binaries

Use data size for delta metadata cache

Reduces chances of coordinator OOM by accounting
for retained size of objects in delta metadata cache

Increase default delta.metadata.cache-ttl to 30m

TTL can be higher because the cached metadata is immutable
and the space occupied by it in memory is accounted for

Update config description for insert.non-transactional-insert.enabled

Document default value of Iceberg object_store_enabled table property

Improve performance of Python functions

Fix and enable kudu update test

Add iceberg.bucket-execution to documentation

Introduce NodeStateManagerModule

A refactor - rename to prepare for adding new logic.

Reactivation of worker nodes

Adds new node states to enable full control over shutdown and reactivation of workers.
- state: DRAINING - a reversible shutdown,
- state: DRAINED - all tasks are finished, server can be safely and quickly stopped. Can still go back to ACTIVE.

Update AWS SDK v2 to 1.12.780

Update docker-java to 3.4.1

Update flyway to 11.1.0

Update AWS SDK v2 to 2.29.34

Update airbase to 205

Restructure SQL routine docs

Move them in appropriate folders for user-defined functions
and SQL user-defined functions. Update all references so that
the docs build process fully works.

Add redirectors for SQL routine change

Reword from SQL routine to SQL UDF

And generally introduce user-defined functions (UDF) as a term.

Move SQL UDF content

Into the separate page, and adjust the generic content to be suitable
for any UDF language.

Move SQL UDF content

Into the separate page, and adjust the generic content to be suitable
for any UDF language.

Add docs for Python UDFs

Remove unnecessary annotation in Kudu connector test

Update to oryd/hydra:v1.11.10 for OAuth testing

Fix build issues on newer OS/hardware. Use latest 1.x release since
2.x causes container start issues without further changes.

Update airbase to 206 and airlift to 293

Update AWS SDK v2 to 2.29.35

Update netty to 4.1.116.Final

Update gcs connector to hadoop3-2.2.26

Add example for inline and catalog Python UDF

Improve inline and catalog SQL UDF docs

Add Trino 468 release notes

[maven-release-plugin] prepare release 468

[maven-release-plugin] prepare for next development iteration

Do not require Java presence for RPM installation

This allows for custom JDK to be used when running Trino
with launcher --jvm-dir argument.

Allow left side as update target in join pushdown

Allow left side as update target when try to pushdown join
into table scan. The change prevent the pushdown join into the
table scan instead of throwing exception

Co-Authored-By: Łukasz Osipiuk <[email protected]>

Bind filesystem cache metrics per catalog

Cleanup InformationSchemaPageSource projection

Avoids unnecessary Integer boxing and Block[] allocations in
InformationSchemaPageSource by using Page#getColumns.

Update docker image version to 107

Rename HiveMinioDataLake to Hive3MinioDataLake

Extract HiveMinioDataLake class

Extract BaseTestHiveOnDataLake to reuse it across Hive3/4 test

Add TestHive3OnDataLake test

Add S3 Hive4 query runner

Add TestHive4OnDataLake test

Add TestTrinoHive4CatalogWithHiveMetastore test

Extract requireEnv in a util class SystemEnvironmentUtils

Remove unnecessay requirenment of password in test

in TestSalesforceBasicAuthenticator#createAuthenticatedPrincipalRealBadPassword

Use SystemEnvironmentUtils#requireEnv

Add and use SystemEnvironmentUtils#isEnvSet method

Fix misspelling

Test concurrent update without partition

Extract KuduColumnProperties from KuduTableProperties

Fix S3InputStream's handling of large skips

When the skip(n) method is called the MAX_SKIP_BYTES check is skipped,
resulting in the call potentially blocking for a long time.

Instead of delegating to the underlying stream, set the nextReadPosition
value. This allows the next read to decide if it is best to keep the existing
s3 object stream or open a new one.

This behavior matches the implementations for Azure and GCS.

Update google cloud SDK to 26.52.0

Update AWS SDK v2 to 2.29.37

Use QUERY_EXCEEDED_COMPILER_LIMIT error code

Include deletion vector when filtering active add entries in Delta

Add nonnull check for directExchangeClientSupplier

Refactor the PlanTester to pass the nonnull `directExchangeClientSupplier`.
Also add nonnull check for the `sourceId`, `serdeFactory` in `ExchangeOperator`

Make FilesTable.toJson method package-private

Add $entries metadata table to Iceberg

Run Iceberg concurrent tests multiple times

Update client driver and application sections

Document sort direction and null order in Iceberg

Rename executor to icebergScanExecutor

Improve performance when listing columns in Iceberg

Remove support for Databricks 9.1 LTS

Pin openpolicyagent/opa version as 0.70.0

Add tests for dropMaterializedView

The dropMaterializedView on TrinoHiveCatalog that uses FileHiveMetastore
works only if unique table locations are enabled.

Replace testView method override with getViewType

Extend testListTables with other relation types

TrinoCatalog.listTables returns not only iceberg tables but also
other relations like views, materialized view or non-iceberg tables.

Add HiveMetastore.getTableNamesWithParameters

Add TrinoCatalog.listIcebergTables

Add system.iceberg_tables table function

Add the ability to list only iceberg tables from the iceberg catalog.
Before this change, there was no way to list only iceberg tables.
The SHOW TABLES statement, information_schema.tables, and jdbc.tables will all
return all tables that exist in the underlying metastore, even if the table cannot
be handled in any way by the iceberg connector. This can happen if other connectors
like hive or delta, use the same metastore, catalog, and schema to store its tables.
The function accepts an optional parameter with the schema name.
Sample statements:
SELECT * FROM TABLE(iceberg.system.iceberg_tables());
SELECT * FROM TABLE(iceberg.system.iceberg_tables(SCHEMA_NAME => 'test'));

Fix failures in iceberg cloud tests

Support MERGE for Ignite connector

Remove unused delta-lake-databricks-104 test group

Add $all_entries metadata table to Iceberg

Delete the oldest tracked version metadata files after commit

Using airlift log in SimulationController

Stream large transaction log jsons instead of storing in-memory

Operations fetching metadata and protocol entries can skip reading
the rest of the json file after those entries are found

Move `databricksTestJdbcUrl()` method after the constructor

Use `ZSTD` Parquet compression codec for Delta Lake by default

Add query execution metrics to JDBC QueryStats

Adds planningTimeMillis, analysisTimeMillis, finishingTimeMillis,
physicalInputBytes, physicalWrittenBytes and internalNetworkInputBytes
to allow JDBC clients to get some important metrics about query execution

Allow configuring parquet_bloom_filter_columns in Iceberg

Verify invalid bloom filter properties in Iceberg

Remove unspecified bloom filter when setting properties in Iceberg

Support managing views in the Faker connector

Add views support to the Faker connector docs

Add min, max, and allowed_values column properties in Faker connector

Allow constraining generated values by setting the min, max, or
allowed_values column properties.

Remove predicate pushdowns in the Faker connector

Predicate pushdown in the Faker connector violates the SQL semantics,
because when applied to separate columns, correlation between columns is
not preserved, and returned results are not deterministic. The `min`,
`max`, and `options` column properties should be used instead.

Refactor Faker tests to be more readable

Remove outdated limitations in Faker's docs

Rename to Trino in product tests

Extract dep.gib.version property

Convert testRollbackToSnapshotWithNullArgument to integration test

Add newTrinoTable method to AbstractTestQueryFramework

Allow running product tests on IPv6 stack

Update nimbus-jose-jwt to 9.48

Update jna to 5.16.0

Update AWS SDK v2 to 2.29.43

Update openlineage-java to 1.26.0

Update airbase to 209

Update airlift to 294

Update freemarker 2.3.34

Remove unreachable code in OrderedPeriodParser

Avoid parsing min and max twice

Avoid and/or in Faker docs

Refactor FakerPageSource to have fewer faker references

Extract typed ranges in Faker's page source

Fix handling upper bounds in FakerPageSource

Fix handling upper bounds for floating point types. The implementation
did not account for rounding issue near the bound, and the test was
using values outside of the allowed range.

Refactor FakerPageSource

Refactor to make subsequent commit's diff smaller

Extract a method in FakerColumnHandle

Support generating sequences in the Faker connector

Configure SSL for unauthenticated client

Unauthenticated client can connect to the Trino cluster when loading segment data.
If cluster has its' own certificate chain - client needs to accept it according to the configuration.

Simplify conditions

Allow resource access type on class-level

For io.trino resources it's now impossible to use class-level @ResourceType
annotations due to the invalid condition.

Check whole class hierarchy for @ResourceSecurity

Use class-level @ResourceSecurity annotations

Fix parsing of negative 0x, 0b, 0o long literals

Update lucene-analysis-common to 10.1.0

Correctly merge multiple splits info

Previously SplitOperatorInfo wasn't Mergeable and hence base
OperatorInfo(OperatorStats#getMergeableInfoOrNull) was null.

Prepare to implement a page source provider for Redshift

Fetch Redshift query results unloaded to S3

Co-authored-by: Mayank Vadariya <[email protected]>

Copy all TPCH tables during initialization in TestRedshiftUnload

Add physicalInputTimeMillis to io.trino.jdbc.QueryStats

Fix listing of files in AlluxioFileSystem

Co-authored by: JiamingMai <[email protected]>

Add info about views in memory only for Faker connector

Minor improvements to Python UDF docs

Improve docs for non-transactional merge

As applicable for PostgreSQL connector for now. Also extract into
a fragment so it can be reused in other connectors.

Improve SQL support section in JDBC connectors

- No content changes but...
- Consistent wording
_ Markdown link syntax
- Move related configs to SQL support section
- Improve list and rejig as small local ToC, add links

Add non-transactional MERGE docs for Phoenix

Add non-transactional MERGE docs for Ignite

Remove unused metastore classes

Remove unused TestingIcebergHiveMetastoreCatalogModule

Replace usage of RetryDriver in HiveMetadata

Move partition utility methods to Partitions class

Move HiveMetastoreFactory to metastore module

Cleanup binding of FlushMetadataCacheProcedure

Move CachingHiveMetastore to metastore module

Move Glue v1 converters into Glue package

Move already exists exceptions to metastore module

Move ForHiveMetastore annotation to thrift package

Move RetryDriver to thrift package

Move Avro utility method to ThriftMetastoreUtil

Add SSE-C option on native-filesystem security mapping

Remove SPI exclusion from previous release

Remove support for connector event listeners

Inline partition projection name in HiveConfig

Remove shadowed field from MockConnectorMetadata

Remove optional binder for HivePageSourceProvider

Remove unused HiveMaterializedViewPropertiesProvider

Remove unused HiveRedirectionsProvider

Remove unused DeltaLakeRedirectionsProvider

Remove unused function providers from Hive connector

Remove deprecated ConnectorMetadata.getTableHandleForExecute

Remove deprecated ConnectorMetadata.beginMerge

Remove optional binder for IcebergPageSourceProviderFactory

Make SqlVarbinary map serialization consistent with other types

Other Sql* classes are serialized to JSON on-the-wire format,
using @JsonValue annotated toString methods, except for SqlVarbinary
which was serialized using its' getBytes() method that was Base64-encoded
to a map key.

Decouple Sql types from JSON serialization

The new JSON serialization is not using ObjectMapper to serialize these values anymore.
We want to decouple SPI types from JSON representation to be able to introduce
alternative encoding formats.

Derive aws sdk retry count from request count

Minor cleanup in Hive procedures

Move createParquetMetadata to ParquetMetadata

Parse parquet footer row groups lazily

Write row group fileOffset in parquet file footer

Parse only required row groups from parquet footer

Extend AbstractTestQueryFramework in TestRedshiftUnload

Use correct table operations provider for Thrift metastore in Delta

Update zstd-jni to 1.5.6-9

Update nimbus-jose-jwt to 10.0

Update AWS SDK v2 to 2.29.44

Update nessie to 0.101.3

Allow configuring gcs service endpoint

Update delta-kernel to 3.3.0

Allow configuring orc_bloom_filter_columns table property in Iceberg

Lower log level

Add rollback_to_snapshot table procedure in Iceberg

Remove <name> from trino-ranger pom

We are not using it for other modules which results in the build output being inconsistent

Instantiate tpch connector using bootstrap

Synchronize types in (Client)StandardTypes

Enumerate all types while decoding

This makes it explicit in regard to the ClientStandardTypes list of types

Fix deserialization of KDB_Tree and BingTile

These types have a custom serialization logic utilizing JsonCreator/JsonProperty
annotations.

Improve SetDigest serialization

Inline SetDigestType.NAME

Use StandardTypes.BING_TILE const

Use StandardTypes.GEOMETRY const

Use StandardTypes.KDB_TREE const

Use StandardTypes.SPHERICAL_GEOMETRY const

Fix deserialization of Color type

Clarify that spooling locations must not be shared

Add docs for gcs.endpoint

Enable Phoenix product test

Fix wrong config name in OPA documentation

Aplhabetize additional IDE configurations on docs

Remove defunct hive metastore property from docs
Removing property `hive.metastore.thrift.batch-fetch.enabled` from docs, which was marked defunct in Trino 443.

Fix correctness issue when writing deletion vectors in Delta Lake

Improve developer docs for connector MERGE support

Fix gcs.endpoint property name in docs

Enable dynamic catalogs for product-tests

Since this uses the default catalog store set to `file`, there should not be issues with switching it all over to dynamic catalog management.

Improve docs for Redshift parallel read

Add counter for tasks created on worker

Expose worker tracked tasks count

Update okio to 3.10.1

Update nimbus-jose-jwt to 10.0.1

Update minio to 8.5.15

Update snowflake-jdbc to 3.21.0

Update checker-qual to 3.48.4

Update flyway to 11.1.1

Update AWS SDK v2 to 2.29.47

Update org.json to 20250107

Update airbase to 210

Test timestamp parsing in Delta Lake

Allow parsing ISO8601 timestamp in Delta lake transaction log

Update docker-images to 108

Disable spooling through session property

Remove obsolete JVM configuration

Performance-wise this doesn't improve memory usage

Reduce the query and task expiration times

This reduces the amount of memory needed to run the product test cluster

Suppress warning when building with mvnd

Require JDK 23.0.0

Add docs for spooling_protocol_enabled session property

Document $entries table in Iceberg

Document limit pushdown in BigQuery connector

Grant privileges only on TPCH tables in Redshift query runner

Granting privileges on all tables may cause unintended failures, as
temporary tables created in one test class may not have been fully
dropped or cleaned up from internal Redshift tables while other test
class executes grant privileges on all tables.

Remove unused NoopFunctionProvider

Fix incorrect result when reading deletion vectors in Delta Lake

This issue happens when Parquet file contains several pages and
predicates filter page when parquet_use_column_index is set to true.

Fix query runners failing to expose local ports

The logic was mistakenly inverted in
a99d96e.

Geospatial function ST_GeomFromKML

Add executeWithoutResults to StandaloneQueryRunner

It is needed to test the query execution without reading the
final query info which triggers the QueryCompletedEvent

Always fire QueryCompletedEvent for DDL queries

Previously the event was fired because client protocol
is reading the final query info. That is brittle and theoretically
could be removed making DDL queries fail to trigger query completed event.

Add authentication to the Preview Web UI

Ensure visibility of finalSinkMetrics

Without exchangeSink not being volatile it was possible that other
thread could observe nulled exchangeSink but still not set
finalSinkMetrics.

Fix logical merge conflict

Fix missing ts dependencies

Move JdbcRecordSetProvider construction to a module

Fix error message to actual variable name

Update supported clickhouse versions

Revert "Update supported clickhouse versions"

This reverts commit e188f97.

Reapply "Update supported clickhouse versions"

This reverts commit 523a305.

Add support for validating JDBC connections

Add docs for optimizer push filter

Clarify where to set legacy Hive properties for S3

The `trino.s3.use-web-identity-token-credentials-provider` property must
be set in the Hadoop config file, not as the connector property. This
needs to be clarified in the docs.

Clarify docs for multiple access controls

Bump Scylla docker images version to 6.2

Bump latest Cassandra docker images version to 5.0.2

WebUI: Sort queries on worker by descending reserved memory

Adjust docs for  metadata table in Iceberg

Improve code blocks and wording in Iceberg connector docs

Fix code formatting in Preview Web UI

Support unpartitioned tables in Kudu

Fix flaky TestQueryManagerConfig

This test relies on rounding large integers and will sometimes fail
depending on the amount of memory available to the test runner.

This commit reuses the exact same calculation for tests as it does in
the code, so that we will always get the correct value for the default.

Replace airlift's auth preserving client with okhttp

Update AWS SDK v2 to 2.29.50

Update httpcore5 to 5.3.2

Update google-api-client to 2.7.1

Update airbase to 211

Update okio to 3.10.2

Update commons-codec to 1.17.2

Update mongo to 5.3.0

Update airlift to 295

Add retry_policy session property docs

- also fix a bunch of style and formatting issues on the page

Avoid writing NaN and Infinity with json format table

Adjust docs for  manifests metadata table in Iceberg

Expose BigQuery RPC configuration settings

Document BigQuery RPC settings

Make the sentence assertive

Use only the enforced partition constraints dependency

Fix typo

Extract logic for appending transaction log file for OPTIMIZE

Add concurrent writes reconciliation for OPTIMIZE in Delta Lake

Allow committing OPTIMIZE operations in a concurrent context
by placing these operations right after any other previously
concurrently completed write operations.

Add info about Python client spooling support

Update airlift to 296

Add support for filtering by client tags in Web UI

Configure Phoenix server scan page timeout

Update netty to 4.1.117.Final

Update AWS SDK v2 to 2.29.50

Update metrics-core to 4.2.30

Update commons-text to 1.13.0

Update elasticsearch to 7.17.27

Update google sheets api to v4-rev20250106-2.0.0

Update reactor-core to 3.7.2

Add test for JsonSerializer handling

Use Java APIs instead of ByteStreams

Avoid exposing not required ports in Exasol

Co-Authored-By: YotillaAntoni <[email protected]>

Update Exasol image version to 8.32.0

Co-Authored-By: YotillaAntoni <[email protected]>

Remove system.iceberg_tables table function

We decided to add a system table instead.

This reverts commit 5ce80be.

Fix failure when setting NULL in UPDATE statement in JDBC-based connectors

Co-Authored-By: Yuya Ebihara <[email protected]>

Use <arg> instead of <compilerArg> to enable incubator module

`<compilerArg>` does not seem to be a valid child element of
`<compilerArgs>`. Even if it seems to work, it conflicts with the
`errorprone-compiler` profile, which appends additional compiler
arguments using `<arg>` (which is documented to be the correct child
element). When there is also `<compilerArg>`, it gets ignored (at least
by IDEA's POM importer).

Update google-cloud-sdk to 26.53.0

Update nimbus oauth2 sdk to 11.21

Update opencsv to 5.10

Update localstack image to 4.0.3

Update minio to RELEASE.2024-12-18T13-15-44Z

Decode JSON directly from Slice without materialization

Use newInputStream instead of FileInputStream

Support FIRST and AFTER clause when adding a new column in engine

Support FIRST and AFTER clause when adding a new column in Iceberg

Add docs for json_table

Co-authored-by: Michael Eby <[email protected]>

Make ElasticsearchServer Closeable

Derive predicate support from Trino type in ElasticSearch

Add test for column name containing special characters in Elasticsearch

Add dereference pushdown support in ElasticSearch

Reduce task.info.max-age to 5m

Default 15m sometiems causes some significant OOM issues on workers.

Bump docker/setup-qemu-action in the dependency-updates group

Bumps the dependency-updates group with 1 update: [docker/setup-qemu-action](https://github.com/docker/setup-qemu-action).

Updates `docker/setup-qemu-action` from 3.2.0 to 3.3.0
- [Release notes](https://github.com/docker/setup-qemu-action/releases)
- [Commits](docker/setup-qemu-action@49b3bc8...53851d1)

---
updated-dependencies:
- dependency-name: docker/setup-qemu-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: dependency-updates
...

Signed-off-by: dependabot[bot] <[email protected]>

Improve test coverage for ThriftHttpMetastoreClient

Remove meaningless call of Math.max

Update caffeine to 3.2.0

Update flyway to 11.2.0

Update swagger to 2.2.28

Update airbase to 212

Test the effectiveness of partial timestamp partition pruning

This test showcases that there is partition pruning done
at the Iceberg metadata layer even though EXPLAIN showcases
that a filter that does not fully match the partition transform
is not being pushed down.

Co-authored-by: Michiel De Smet <[email protected]>

Remove experimental warning for Python UDF

Rename BigQuery test credentials key config

`bigquery.credentials-key` testing config name clashes with the actual
config name and hence any new bigquery catalogs created through `CREATE
CATALOG` command by running BigQueryQueryRunner may not pick the
different key if provided in command.

Update AWS SDK v2 to 2.30.2

Add validation to segment size configuration

Add spooling session properties

Rename session property for consistency

Update graalvm to 24.1.2

Update AWS SDK v2 to 2.30.3

Update nessie to 0.102.0

Update openlineage to 1.26.0

Update google oauth2 client to 1.37.0

Update oauth2-oidc-sdk to 11.21.2

Remove unused argument

Verify that TableScanNode's assignments match the output symbols

Before this change, only one-way match was verified: that each
output symbol is backed by an assignment.

Suffix S3 path with separator for recursive delete

Without trailing path separator the recursive delete operation fails for
directory buckets (e.g. S3 Express)

Remove deprecated method call

Simplify boolean comparisons

Simplify conditionals

Move return statement to the empty catch block

Use switch expressions

Add configuration for maximum Arrow allocation in BigQuery

This helps bound the memory allocations

Log failed buffer allocations

Add Arrow allocation stats to BigQuery

Add allocator stats to PageSource metrics

Fix split generation in Iceberg TableChangesSplitSource

Make configuration and session property names consistent

It makes it easier to reason about which session property maps to which configuration property.

More details about literals

Add Parquet writer session properties docs

Document spooling session properties

Refactor commitUpdateAndTransaction in Iceberg

Improve error handling for delete and truncate in Iceberg

Allow add column with position in base jdbc module

Support FIRST and AFTER clause when adding a new column in Mysql

Support FIRST and AFTER clause when adding a new column in MariaDb
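
A brief sketch of the syntax these commits enable, assuming the standard
ALTER TABLE ... ADD COLUMN form (table and column names are illustrative):

```sql
-- Add a column at the front of the table
ALTER TABLE orders ADD COLUMN region varchar FIRST;

-- Add a column directly after an existing column
ALTER TABLE orders ADD COLUMN note varchar AFTER region;
```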

Temporarily disable Snowflake tests

Update Java to 23.0.2

This brings an updated timezonedb, version 2024b
(openjdk/jdk23u@73b2341),
which amends historical timezone definitions for Mexico/Bahia_Banderas
that we use for testing the timezone gap around the Unix timestamp epoch.

The corresponding Joda-Time update also includes these updated timezone
definitions. The PostgreSQL test server was upgraded to 12 to correctly
handle UTC around the epoch, and MySQL was updated to 8.0.41 for the same reason.

Temporarily downgrade JDK for ppc64 to make CI happy

Simplify char type docs

Add comparison example queries

Improve formatting and style

Fix error with multiple nested partition columns on Iceberg (trinodb#24629)

Adjust release template for 2025

Add Trino 469 release notes

[maven-release-plugin] prepare release 469

[maven-release-plugin] prepare for next development iteration

Remove deprecated method addColumn from JdbcClient

Support MERGE for MySQL connector

Co-Authored-By: Yuya Ebihara <[email protected]>

Remove the Kinesis connector

The connector appears unused, based on all community interaction
and an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921

Fix failure when adding columns with dots in Iceberg

Update airbase to 213

Update oshi-core to 6.6.6

Update AWS SDK v2 to 2.30.6

Update mongo to 5.3.1

Update protobuf to 3.25.6

Update MySQL to 9.2.0

Update airlift to 298

Update client libraries to JDK 11

Use newTrinoTable in more places

Sometimes the new and old ways were mixed in a single method. Also make
`createTestTableForWrites` reuse this method to avoid duplication.

Fix null check in the array_histogram

Update Azure SDK to 1.2.31

Update AWS SDK v2 to 2.30.7

Update google api client to 2.7.2

Allow nessie to 0.102.2

Update jetbrains annotations to 26.0.2

Update airbase to 214 and airlift to 299

Set validation to WHEN_REQUIRED for FTE exchange

Use isEmpty instead of not isPresent

Modernize client dependencies

Use ImmutableList.copyOf

Remove redundant format call

Mark fields as final

Use more functional style for Optionals

Use String.isEmpty

Use StandardCharsets.UTF_8

Remove unnecessary .toString() call

Drop inferred type arguments

Replace statement lambda with expression lambda

Drop dead variable that is never read

Remove S3 legacy migration guide considerations

Cast table properties based on its type in Faker connector

Add test for Faker connector renameTable

Introduce Loki connector

Co-authored-by: Janos <[email protected]>

Extract helper method to get table comment in tests

Co-Authored-By: Sandeep Thandassery <[email protected]>

Test COMMENT ON TABLE in Faker connector

Co-Authored-By: Sandeep Thandassery <[email protected]>

Allow setting catalog type in Iceberg query runner

Fix Timestamp assertion test for metrics query

Minor cleanup in Loki connector

Inform user which date/time fields are extractable

Make the error message for invalid `extract` more helpful.

Support non-lower case variables in functions

Update docs with updated Postgres version

Since 1350a8b, testing was switched to PostgreSQL 12

Close StatementClient when request is timed out

Add JMX metrics for S3 HTTP client usage in native FS

When the S3 client was changed from AWS SDK v1 to AWS SDK v2, the set of S3 pool metrics was lost.

With this change, these HTTP pool metrics are exposed via JMX beans. The HTTP metrics exposed by the AWS SDK are documented at:
https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/HttpMetric.html

Fix a wrong issue link in 469 release notes

Sort connector names in EnvMultinodeAllConnectors

Tweak validation of BaseCaseInsensitiveMappingTest

Some databases may map varchar(5) to different types.

Add DuckDB connector

Clean up leftover schema for BigQuery tests

Clean up test schema in connector tests

Determine range constraints in Faker connector

When creating a table in the Faker connector from an existing table,
gather column statistics to determine range constraints, and set them as
column properties.

Use actual row count as default limit

Generate low cardinality values in Faker connector

When creating a table in the Faker connector from an existing table,
use column statistics to determine low-cardinality columns, and generate
values from a randomly generated set.

Set null_probability based on stats

When creating tables in the Faker connector using CREATE TABLE AS
SELECT, use the NUMBER_OF_NON_NULL_VALUES column statistic to set the
null_probability column property.

Remove unreachable code from Exasol

Fix failure when equality delete updated nested fields in Iceberg

Remove 'LOCAL TEMPORARY' from DuckDbClient

Update gson to 2.12.0

Update AWS SDK v2 to 2.30.8

Update exasol-testcontainers to 7.1.3

Update httpcore5 to 5.3.3

Update resteasy-core to 6.0.3.Final

Update duckdb_jdbc to 1.1.3

Improve pom formatting

Update airbase to 215

Update airlift to 300

Update gcs-connector to 3.0.4

Fix connection leakage in the native Azure filesystem

This fixes connection leaks that happen in the Azure Storage SDK
when OkHttp is used. OkHttp is not actively maintained, which makes the
default Netty implementation a better choice for the future, as it is
actively maintained and tested.

Simplify code

The event loop group and maximum number of concurrent requests are already
configured, so the removed setting was a no-op.

Propagate function lifecycle events to SystemSecurityMetadata

Use arrays instead of lists in OpenX RowDecoder

Use array instead of list for Hive JSON decoders

Reuse fieldWritten array between rows

Check non-null constructor arguments

Avoid re-checking isScalarType

Improve case insensitive matching fragment

Reenable Snowflake tests

This reverts commit 5c43149
and applies necessary fixes to the changes that were introduced
in the meantime.

Add docs for authentication with Preview Web UI

Make number of commit retries configurable in Iceberg

Correctly categorize external errors in Iceberg REST catalog

Add TestLokiPlugin

Add Loki to EnvMultinodeAllConnectors

Add more connectors to labeler-config

Support multiple state filters on getAllQueryInfo API

Add workers pages to the Preview Web UI

Allow Hive metastore caching for Iceberg

Remove table listing cache from FileHiveMetastore

Fix typo in Python docs

Remove deprecated precomputed hash optimizer

Update snowflake-jdbc to 3.22.0

Update JLine to 3.29.0

Update commons-codec to 1.18.0

Update flyway to 11.3.0

Update AWS SDK v2 to 2.30.10

Update npm to 11.1.0

Update node to 22.13.1

Deprecate support for IBM COS via Hive

Bump actions/stale from 9.0.0 to 9.1.0 in the dependency-updates group

Bumps the dependency-updates group with 1 update: [actions/stale](https://github.com/actions/stale).

Updates `actions/stale` from 9.0.0 to 9.1.0
- [Release notes](https://github.com/actions/stale/releases)
- [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md)
- [Commits](actions/stale@v9.0.0...v9.1.0)

---
updated-dependencies:
- dependency-name: actions/stale
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: dependency-updates
...

Signed-off-by: dependabot[bot] <[email protected]>

Fix Vacuum deleting files with whitespace in paths

When a Delta Lake file path contains whitespace, the VACUUM procedure
unexpectedly removes it. This commit fixes the issue by wrapping the
file path using RFC 2396 URI encoding, ensuring consistency with how
file paths are handled when writing add or remove entries

Update airlift to 301

Close FileInputStream while loading keystore

Update reactor-netty-core to 1.1.26

Update AWS SDK v2 to 2.30.11

Update minio to 8.5.17

Update oauth2-oidc-sdk to 11.21.3

Update gson to 2.12.1

Update commons-pool2 to 2.12.1

Update loki-client to 0.0.3

Update nessie to 0.102.4

Update httpclient5 to 5.4.2

Reorder connector behavior in TestPostgreSqlConnectorTest

Inline method addPrimaryKeyToCopyTable

Inline method `addPrimaryKeyToCopyTable` in `TestPostgreSqlJdbcConnectionAccesses`
and `TestPostgreSqlJdbcConnectionCreation` for readability

Fix session property description for NON_TRANSACTIONAL_MERGE

Also change the description of `NON_TRANSACTIONAL_INSERT`
to the "Enables support for non-transactional INSERT"

Fix storage table data clean up while dropping iceberg materialized view

This cleans up data files in legacy mode, i.e. when iceberg.materialized-views.hide-storage-table is set to false.
Delegating the table drop to the metastore does not clean up the data files, since for HMS
the iceberg table is registered as an "external" table. To fix this, instead of delegating to the metastore,
the connector drops the table and the data files associated with it.

Rename partitions to partition_summaries in Iceberg manifests table

Restore removed assertions

These assertions are still useful since they don't use the MERGE code path. These were mistakenly removed in e88e2b1.

Only enable MERGE for MERGE specific tests

Before this change, the MERGE code path could inadvertently be used in places where we are not interested in testing MERGE.

Use getFileSystemFactory in BaseIcebergMaterializedViewTest

Extract helper method to get HiveMetastore

Inject execution interceptors using multibinder

The GlueMetastoreModule does not truly follow the inversion of control
paradigm; many things are created directly in its methods without using
injection. Using a set multibinder for execution interceptors allows
defining execution interceptors independently and using Guice injection.

Co-authored-by: Grzegorz Kokosiński <[email protected]>

Extract table features constants and add ProtocolEntry builder

Add missing table features when adding timestamp_ntz in Delta

Update Hudi library to 1.0.0

Update airbase to 204

Update airlift to 292

Update metrics-core to 4.2.29

Update reactor-core to 3.7.1

Update swagger to 2.2.27

Update AWS SDK v2 to 2.29.31

Update JLine to 3.28.0

Update nessie to 0.101.1

Update s3mock testcontainers to 3.12.0

Update exasol to 24.2.1

Update google-sheets api to v4-rev20241203-2.0.0

Add assertConsistently

Use BigQuery storage read API when reading external BigLake tables

The storage API supports reading BigLake external tables (i.e. external
tables with a connection), but the current implementation uses views,
which can be expensive because they require a query. This change adds
support for reading BigLake tables directly using the storage API.

There are no behavior changes for external tables and BigQuery native
tables; they use the view and storage APIs, respectively.

Added a new test for BigLake tables.

Co-authored-by: Marcin Rusek <[email protected]>

Make OutputBufferInfo not comparable

This is not needed

Expose exchange sink metrics in operator and stage stats

Inline constant

Expose output buffer metrics in query completion event

Expose filesystem exchange sink stats

Correctly categorize filesystem error in Iceberg connector

Webapp Preview: Cluster Overview with sparklines

Use executor service for iceberg scan planning system tables

Add Python UDF support to binaries

Use data size for delta metadata cache

Reduces chances of coordinator OOM by accounting
for retained size of objects in delta metadata cache

Increase default delta.metadata.cache-ttl to 30m

TTL can be higher because the cached metadata is immutable
and the space occupied by it in memory is accounted for

Update config description for insert.non-transactional-insert.enabled

Document default value of Iceberg object_store_enabled table property

Improve performance of Python functions

Fix and enable kudu update test

Add iceberg.bucket-execution to documentation

Introduce NodeStateManagerModule

A refactor - rename to prepare for adding new logic.

Reactivation of worker nodes

Add new node states to enable full control over shutdown and reactivation of workers:
- DRAINING: a reversible shutdown.
- DRAINED: all tasks are finished and the server can be safely and quickly stopped. Can still go back to ACTIVE.

Update AWS SDK v1 to 1.12.780

Update docker-java to 3.4.1

Update flyway to 11.1.0

Update AWS SDK v2 to 2.29.34

Update airbase to 205

Restructure SQL routine docs

Move them in appropriate folders for user-defined functions
and SQL user-defined functions. Update all references so that
the docs build process fully works.

Add redirectors for SQL routine change

Reword from SQL routine to SQL UDF

And generally introduce user-defined functions (UDF) as a term.

Move SQL UDF content

Into a separate page, and adjust the generic content to be suitable
for any UDF language.

Add docs for Python UDFs

Remove unnecessary annotation in Kudu connector test

Update to oryd/hydra:v1.11.10 for OAuth testing

Fix build issues on newer OS/hardware. Use latest 1.x release since
2.x causes container start issues without further changes.

Update airbase to 206 and airlift to 293

Update AWS SDK v2 to 2.29.35

Update netty to 4.1.116.Final

Update gcs connector to hadoop3-2.2.26

Add example for inline and catalog Python UDF

Improve inline and catalog SQL UDF docs

Add Trino 468 release notes

[maven-release-plugin] prepare release 468

[maven-release-plugin] prepare for next development iteration

Do not require Java presence for RPM installation

This allows a custom JDK to be used when running Trino
with the launcher --jvm-dir argument.

Allow left side as update target in join pushdown

Allow left side as update target when try to pushdown join
into table scan. The change prevent the pushdown join into the
table scan instead of throwing exception

Co-Authored-By: Łukasz Osipiuk <[email protected]>

Bind filesystem cache metrics per catalog

Cleanup InformationSchemaPageSource projection

Avoids unnecessary Integer boxing and Block[] allocations in
InformationSchemaPageSource by using Page#getColumns.

Update docker image version to 107

Rename HiveMinioDataLake to Hive3MinioDataLake

Extract HiveMinioDataLake class

Extract BaseTestHiveOnDataLake to reuse it across Hive3/4 test

Add TestHive3OnDataLake test

Add S3 Hive4 query runner

Add TestHive4OnDataLake test

Add TestTrinoHive4CatalogWithHiveMetastore test

Extract requireEnv in a util class SystemEnvironmentUtils

Remove unnecessary requirement of password in test

in TestSalesforceBasicAuthenticator#createAuthenticatedPrincipalRealBadPassword

Use SystemEnvironmentUtils#requireEnv

Add and use SystemEnvironmentUtils#isEnvSet method

Fix misspelling

Test concurrent update without partition

Extract KuduColumnProperties from KuduTableProperties

Fix S3InputStream's handling of large skips

When the skip(n) method is called, the MAX_SKIP_BYTES check is bypassed,
resulting in the call potentially blocking for a long time.

Instead of delegating to the underlying stream, set the nextReadPosition
value. This allows the next read to decide whether it is best to keep the
existing S3 object stream or open a new one.

This behavior matches the implementations for Azure and GCS.

Update google cloud SDK to 26.52.0

Update AWS SDK v2 to 2.29.37

Use QUERY_EXCEEDED_COMPILER_LIMIT error code

Include deletion vector when filtering active add entries in Delta

Add nonnull check for directExchangeClientSupplier

Refactor the PlanTester to pass the nonnull `directExchangeClientSupplier`.
Also add nonnull check for the `sourceId`, `serdeFactory` in `ExchangeOperator`

Make FilesTable.toJson method package-private

Add $entries metadata table to Iceberg
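
Iceberg metadata tables are queried by suffixing the table name; a
hedged sketch for the new table (the table name is illustrative):

```sql
SELECT * FROM "test_table$entries";
```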

Run Iceberg concurrent tests multiple times

Update client driver and application sections

Document sort direction and null order in Iceberg

Rename executor to icebergScanExecutor

Improve performance when listing columns in Iceberg

Remove support for Databricks 9.1 LTS

Pin openpolicyagent/opa version as 0.70.0

Add tests for dropMaterializedView

The dropMaterializedView on TrinoHiveCatalog that uses FileHiveMetastore
works only if unique table locations are enabled.

Replace testView method override with getViewType

Extend testListTables with other relation types

TrinoCatalog.listTables returns not only Iceberg tables but also
other relations like views, materialized views, or non-Iceberg tables.

Add HiveMetastore.getTableNamesWithParameters

Add TrinoCatalog.listIcebergTables

Add system.iceberg_tables table function

Add the ability to list only Iceberg tables from the Iceberg catalog.
Before this change, there was no way to list only Iceberg tables.
The SHOW TABLES statement, information_schema.tables, and jdbc.tables all
return every table that exists in the underlying metastore, even if the table cannot
be handled in any way by the Iceberg connector. This can happen if other connectors,
like Hive or Delta, use the same metastore, catalog, and schema to store their tables.
The function accepts an optional parameter with the schema name.
Sample statements:
SELECT * FROM TABLE(iceberg.system.iceberg_tables());
SELECT * FROM TABLE(iceberg.system.iceberg_tables(SCHEMA_NAME => 'test'));

Fix failures in iceberg cloud tests

Support MERGE for Ignite connector

Remove unused delta-lake-databricks-104 test group

Add $all_entries metadata table to Iceberg

Delete the oldest tracked version metadata files after commit

Use airlift log in SimulationController

Stream large transaction log JSON files instead of storing them in memory

Operations fetching metadata and protocol entries can skip reading
the rest of the JSON file after those entries are found

Move `databricksTestJdbcUrl()` method after the constructor

Use `ZSTD` Parquet compression codec for Delta Lake by default

Add query execution metrics to JDBC QueryStats

Adds planningTimeMillis, analysisTimeMillis, finishingTimeMillis,
physicalInputBytes, physicalWrittenBytes and internalNetworkInputBytes
to allow JDBC clients to get some important metrics about query execution

Allow configuring parquet_bloom_filter_columns in Iceberg

Verify invalid bloom filter properties in Iceberg

Remove unspecified bloom filter when setting properties in Iceberg
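
A sketch of setting the new table property, assuming it follows the
connector's array-of-varchar convention (table and column names are
illustrative):

```sql
CREATE TABLE example (id bigint, name varchar)
WITH (parquet_bloom_filter_columns = ARRAY['id', 'name']);
```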

Support managing views in the Faker connector

Add views support to the Faker connector docs

Add min, max, and allowed_values column properties in Faker connector

Allow constraining generated values by setting the min, max, or
allowed_values column properties.
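
A hedged sketch of how such column properties might look (catalog,
table, column names, and values are illustrative):

```sql
CREATE TABLE faker.default.prices (
    price decimal(8, 2) NOT NULL WITH (min = '0.00', max = '999.99'),
    currency varchar NOT NULL WITH (allowed_values = ARRAY['EUR', 'USD'])
);
```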

Remove predicate pushdowns in the Faker connector

Predicate pushdown in the Faker connector violates the SQL semantics,
because when applied to separate columns, correlation between columns is
not preserved, and returned results are not deterministic. The `min`,
`max`, and `options` column properties should be used instead.

Refactor Faker tests to be more readable

Remove outdated limitations in Faker's docs

Rename to Trino in product tests

Extract dep.gib.version property

Convert testRollbackToSnapshotWithNullArgument to integration test

Add newTrinoTable method to AbstractTestQueryFramework

Allow running product tests on IPv6 stack

Update nimbus-jose-jwt to 9.48

Update jna to 5.16.0

Update AWS SDK v2 to 2.29.43

Update openlineage-java to 1.26.0

Update airbase to 209

Update airlift to 294

Update freemarker to 2.3.34

Remove unreachable code in OrderedPeriodParser

Avoid parsing min and max twice

Avoid and/or in Faker docs

Refactor FakerPageSource to have fewer faker references

Extract typed ranges in Faker's page source

Fix handling upper bounds in FakerPageSource

Fix handling upper bounds for floating point types. The implementation
did not account for rounding issue near the bound, and the test was
using values outside of the allowed range.

Refactor FakerPageSource

Refactor to make subsequent commit's diff smaller

Extract a method in FakerColumnHandle

Support generating sequences in the Faker connector

Configure SSL for unauthenticated client

An unauthenticated client can connect to the Trino cluster when loading segment data.
If the cluster has its own certificate chain, the client needs to accept it according to the configuration.

Simplify conditions

Allow resource access type on class-level

For io.trino resources it was impossible to use class-level @ResourceType
annotations due to an invalid condition.

Check whole class hierarchy for @ResourceSecurity

Use class-level @ResourceSecurity annotations

Fix parsing of negative 0x, 0b, 0o long literals

Update lucene-analysis-common to 10.1.0

Correctly merge multiple splits info

Previously, SplitOperatorInfo wasn't Mergeable, and hence the base
OperatorInfo (OperatorStats#getMergeableInfoOrNull) was null.

Prepare to implement a page source provider for Redshift

Fetch Redshift query results unloaded to S3

Co-authored-by: Mayank Vadariya <[email protected]>

Copy all TPCH tables during initialization in TestRedshiftUnload

Add physicalInputTimeMillis to io.trino.jdbc.QueryStats

Fix listing of files in AlluxioFileSystem

Co-authored-by: JiamingMai <[email protected]>

Add info about views in memory only for Faker connector

Minor improvements to Python UDF docs

Improve docs for non-transactional merge

As applicable for PostgreSQL connector for now. Also extract into
a fragment so it can be reused in other connectors.

Improve SQL support section in JDBC connectors

- No content changes but...
- Consistent wording
- Markdown link syntax
- Move related configs to SQL support section
- Improve list and rejig as small local ToC, add links

Add non-transactional MERGE docs for Phoenix

Add non-transactional MERGE docs for Ignite

Remove unused metastore classes

Remove unused TestingIcebergHiveMetastoreCatalogModule

Replace usage of RetryDriver in HiveMetadata

Move partition utility methods to Partitions class

Move HiveMetastoreFactory to metastore module

Cleanup binding of FlushMetadataCacheProcedure

Move CachingHiveMetastore to metastore module

Move Glue v1 converters into Glue package

Move already exists exceptions to metastore module

Move ForHiveMetastore annotation to thrift package

Move RetryDriver to thrift package

Move Avro utility method to ThriftMetastoreUtil

Add SSE-C option on native-filesystem security mapping

Remove SPI exclusion from previous release

Remove support for connector event listeners

Inline partition projection name in HiveConfig

Remove shadowed field from MockConnectorMetadata

Remove optional binder for HivePageSourceProvider

Remove unused HiveMaterializedViewPropertiesProvider

Remove unused HiveRedirectionsProvider

Remove unused DeltaLakeRedirectionsProvider

Remove unused function providers from Hive connector

Remove deprecated ConnectorMetadata.getTableHandleForExecute

Remove deprecated ConnectorMetadata.beginMerge

Remove optional binder for IcebergPageSourceProviderFactory

Make SqlVarbinary map serialization consistent with other types

Other Sql* classes are serialized to the JSON on-the-wire format
using @JsonValue annotated toString methods, except for SqlVarbinary,
which was serialized using its getBytes() method, Base64-encoded
into a map key.

Decouple Sql types from JSON serialization

The new JSON serialization is not using ObjectMapper to serialize these values anymore.
We want to decouple SPI types from JSON representation to be able to introduce
alternative encoding formats.

Derive aws sdk retry count from request count

Minor cleanup in Hive procedures

Move createParquetMetadata to ParquetMetadata

Parse parquet footer row groups lazily

Write row group fileOffset in parquet file footer

Parse only required row groups from parquet footer

Extend AbstractTestQueryFramework in TestRedshiftUnload

Use correct table operations provider for Thrift metastore in Delta

Update zstd-jni to 1.5.6-9

Update nimbus-jose-jwt to 10.0

Update AWS SDK v2 to 2.29.44

Update nessie to 0.101.3

Allow configuring gcs service endpoint

Update delta-kernel to 3.3.0

Allow configuring orc_bloom_filter_columns table property in Iceberg

Lower log level

Add rollback_to_snapshot table procedure in Iceberg
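
As a table procedure, this is presumably invoked via ALTER TABLE ...
EXECUTE, in contrast to the older CALL-style system procedure (table
name and snapshot ID are illustrative):

```sql
ALTER TABLE example EXECUTE rollback_to_snapshot(snapshot_id => 8954597657624845406);
```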

Remove <name> from trino-ranger pom

We do not use it for other modules, which makes the build output inconsistent

Instantiate tpch connector using bootstrap

Synchronize types in (Client)StandardTypes

Enumerate all types while decoding

This makes decoding explicit with regard to the ClientStandardTypes list of types

Fix deserialization of KDB_Tree and BingTile

These types have a custom serialization logic utilizing JsonCreator/JsonProperty
annotations.

Improve SetDigest serialization

Inline SetDigestType.NAME

Use StandardTypes.BING_TILE const

Use StandardTypes.GEOMETRY const

Use StandardTypes.KDB_TREE const

Use StandardTypes.SPHERICAL_GEOMETRY const

Fix deserialization of Color type

Clarify that spooling locations must not be shared

Add docs for gcs.endpoint

Enable Phoenix product test

Fix wrong config name in OPA documentation

Alphabetize additional IDE configurations in docs

Remove defunct hive metastore property from docs

Remove the `hive.metastore.thrift.batch-fetch.enabled` property from the docs; it was marked defunct in Trino 443.

Fix correctness issue when writing deletion vectors in Delta Lake

Improve developer docs for connector MERGE support

Fix gcs.endpoint property name in docs

Enable dynamic catalogs for product-tests

Since this uses the default catalog store set to `file`, there should not be issues with switching it all over to dynamic catalog management.

Improve docs for Redshift parallel read

Add counter for tasks created on worker

Expose worker tracked tasks count

Update okio to 3.10.1

Update nimbus-jose-jwt to 10.0.1

Update minio to 8.5.15

Update snowflake-jdbc to 3.21.0

Update checker-qual to 3.48.4

Update flyway to 11.1.1

Update AWS SDK v2 to 2.29.47

Update org.json to 20250107

Update airbase to 210

Test timestamp parsing in Delta Lake

Allow parsing ISO8601 timestamp in Delta lake transaction log

Update docker-images to 108

Disable spooling through session property

Remove obsolete JVM configuration

Performance-wise this doesn't improve memory usage

Reduce the query and task expiration times

This reduces the amount of memory needed to run the product test cluster

Suppress warning when building with mvnd

Require JDK 23.0.0

Add docs for spooling_protocol_enabled session property

Document $entries table in Iceberg

Document limit pushdown in BigQuery connector

Grant privileges only on TPCH tables in Redshift query runner

Granting privileges on all tables may cause unintended failures, as
temporary tables created in one test class may not have been fully
dropped or cleaned up from internal Redshift tables while another test
class executes grant privileges on all tables.

Remove unused NoopFunctionProvider

Fix incorrect result when reading deletion vectors in Delta Lake

This issue happens when a Parquet file contains several pages and
predicates filter pages while parquet_use_column_index is set to true.

Fix query runners failing to expose local ports

The logic was mistakenly inverted in
a99d96e.

Geospatial function ST_GeomFromKML

Add executeWithoutResults to StandaloneQueryRunner

It is needed to test query execution without reading the
final query info, which triggers the QueryCompletedEvent

Always fire QueryCompletedEvent for DDL queries

Previously, the event was fired because the client protocol
reads the final query info. That is brittle, and theoretically the read
could be removed, making DDL queries fail to trigger the query completed event.

Add authentication to the Preview Web UI

Ensure visibility of finalSinkMetrics

Without exchangeSink being volatile, it was possible that another
thread could observe a nulled exchangeSink while finalSinkMetrics
was still not set.

Fix logical merge conflict

Fix missing ts dependencies

Move JdbcRecordSetProvider construction to a module

Fix error message to actual variable name

Update supported clickhouse versions

Revert "Update supported clickhouse versions"

This reverts commit e188f97.

Reapply "Update supported clickhouse versions"

This reverts commit 523a305.

Add support for validating JDBC connections

Add docs for optimizer push filter

Clarify where to set legacy Hive properties for S3

The `trino.s3.use-web-identity-token-credentials-provider` property must
be set in the Hadoop config file, not as a connector property. This
needs to be clarified in the docs.

Clarify docs for multiple access controls

Bump Scylla docker images version to 6.2

Bump latest Cassandra docker images version to 5.0.2

WebUI: Sort queries on worker by descending reserved memory

Adjust docs for  metadata table in Iceberg

Improve code blocks and wording in Iceberg connector docs

Fix code formatting in Preview Web UI

Support unpartitioned tables in Kudu

Fix flaky TestQueryManagerConfig

This test relies on rounding large integers and will sometimes fail
depending on the amount of memory available to the test runner.

This commit reuses the exact same calculation for tests as it does in
the code, so that we will always get the correct value for the default.

Replace airlift's auth preserving client with okhttp

Update AWS SDK v2 to 2.29.50

Update httpcore5 to 5.3.2

Update google-api-client to 2.7.1

Update airbase to 211

Update okio to 3.10.2

Update commons-codec to 1.17.2

Update mongo to 5.3.0

Update airlift to 295

Add retry_policy session property docs

- also fix a bunch of style and formatting issues on the page

Avoid writing NaN and Infinity with json format table

Adjust docs for manifests metadata table in Iceberg

Expose BigQuery RPC configuration settings

Document BigQuery RPC settings

Make the sentence assertive

Use only the enforced partition constraints dependency

Fix typo

Extract logic for appending transaction log file for OPTIMIZE

Add concurrent writes reconciliation for OPTIMIZE in Delta Lake

Allow committing OPTIMIZE operations in a concurrent context
by placing these operations right after any other previously
concurrently completed write operations.

Add info about Python client spooling support

Update airlift to 296

Add support for filtering by client tags in Web UI

Configure Phoenix server scan page timeout

Update netty to 4.1.117.Final

Update AWS SDK v2 to 2.29.50

Update metrics-core to 4.2.30

Update commons-text to 1.13.0

Update elasticsearch to 7.17.27

Update google sheets api to v4-rev20250106-2.0.0

Update reactor-core to 3.7.2

Add test for JsonSerializer handling

Use Java APIs instead of ByteStreams

Avoid exposing unneeded ports in Exasol

Co-Authored-By: YotillaAntoni <[email protected]>

Update Exasol image version to 8.32.0

Co-Authored-By: YotillaAntoni <[email protected]>

Remove system.iceberg_tables table function

We decided to add a system table instead.

This reverts commit 5ce80be.

Fix failure when setting NULL in UPDATE statement in JDBC-based connectors

Co-Authored-By: Yuya Ebihara <[email protected]>

Use <arg> instead of <compilerArg> to enable incubator module

`<compilerArg>` does not seem to be a valid child element of
`<compilerArgs>`. Even if it seems to work, it conflicts with the
`errorprone-compiler` profile, which appends additional compiler
arguments using `<arg>` (which is documented to be the correct child
element). When there is also `<compilerArg>`, it gets ignored (at least
by IDEA's POM importer).

Update google-cloud-sdk to 26.53.0

Update nimbus oauth2 sdk to 11.21

Update opencsv to 5.10

Update localstack image to 4.0.3

Update minio to RELEASE.2024-12-18T13-15-44Z

Decode JSON directly from Slice without materialization

Use newInputStream instead of FileInputStream

Support FIRST and AFTER clause when adding a new column in engine

Support FIRST and AFTER clause when adding a new column in Iceberg

Add docs for json_table

Co-authored-by: Michael Eby <[email protected]>

Make ElasticsearchServer Closeable

Derive predicate support from Trino type in ElasticSearch

Add test for column name containing special characters in Elasticsearch

Add dereference pushdown support in ElasticSearch

Reduce task.info.max-age to 5m

The default of 15m sometimes causes significant OOM issues on workers.
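
As a sketch, the corresponding entry in a Trino config properties file (the value shown is the new default):

```properties
# Cap how long completed task info is retained in memory on workers
task.info.max-age=5m
```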

Bump docker/setup-qemu-action in the dependency-updates group

Bumps the dependency-updates group with 1 update: [docker/setup-qemu-action](https://github.com/docker/setup-qemu-action).

Updates `docker/setup-qemu-action` from 3.2.0 to 3.3.0
- [Release notes](https://github.com/docker/setup-qemu-action/releases)
- [Commits](docker/setup-qemu-action@49b3bc8...53851d1)

---
updated-dependencies:
- dependency-name: docker/setup-qemu-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: dependency-updates
...

Signed-off-by: dependabot[bot] <[email protected]>

Improve test coverage for ThriftHttpMetastoreClient

Remove meaningless call of Math.max

Update caffeine to 3.2.0

Update flyway to 11.2.0

Update swagger to 2.2.28

Update airbase to 212

Test the effectiveness of partial timestamp partition pruning

This test shows that partition pruning is done at the Iceberg
metadata layer even though EXPLAIN indicates that a filter that
does not fully match the partition transform is not pushed down.

Co-authored-by: Michiel De Smet <[email protected]>

Remove experimental warning for Python UDF

Rename BigQuery test credentials key config

The `bigquery.credentials-key` testing config name clashes with the
actual config name, so new BigQuery catalogs created through the `CREATE
CATALOG` command when running BigQueryQueryRunner may not pick up a
different key provided in the command.

Update AWS SDK v2 to 2.30.2

Add validation to segment size configuration

Add spooling session properties

Rename session property for consistency

Update graalvm to 24.1.2

Update AWS SDK v2 to 2.30.3

Update nessie to 0.102.0

Update openlineage to 1.26.0

Update google oauth2 client to 1.37.0

Update oauth2-oidc-sdk to 11.21.2

Remove unused argument

Verify that TableScanNode's assignments match the output symbols

Before this change, only one-way match was verified: that each
output symbol is backed by an assignment.

Suffix S3 path with separator for recursive delete

Without a trailing path separator, the recursive delete operation fails
for directory buckets (e.g. S3 Express).
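
As a sketch of the fix (the helper name is hypothetical, not the actual Trino code), recursive deletes list objects by prefix, so "dir" must become "dir/" to avoid also matching sibling keys such as "dir2/file":

```java
// Hypothetical helper: normalize a location into a directory-style S3 prefix.
class S3Prefix {
    static String directoryPrefix(String location) {
        // Append "/" only when it is not already present
        return location.endsWith("/") ? location : location + "/";
    }
}
```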

Remove deprecated method call

Simplify boolean comparisons

Simplify conditionals

Move return statement to the empty catch block

Use switch expressions

Add configuration for maximum Arrow allocation in BigQuery

This helps bound the memory allocations

Log failed buffer allocations

Add Arrow allocation stats to BigQuery

Add allocator stats to PageSource metrics

Fix splits generation from Iceberg TableChangesSplitSource

Make configuration and session property names consistent

It makes it easier to reason about which session property maps to which configuration property.

More details about literals

Add Parquet writer session properties docs

Document spooling session properties

Refactor commitUpdateAndTransaction in Iceberg

Improve error handling for delete and truncate in Iceberg

Allow add column with position in base JDBC module

Support FIRST and AFTER clause when adding a new column in MySQL

Support FIRST and AFTER clause when adding a new column in MariaDB

Temporarily disable Snowflake tests

Update Java to 23.0.2

This brings updated timezonedb to version 2024b
(openjdk/jdk23u@73b2341)
which amends historical timezone definitions for Mexico/Bahia_Banderas
that we use for testing timezone gap around the unix timestamp epoch.

Corresponding Joda time update also has these timezone definitions
updated. PostgreSQL test server was upgraded to 12 to correctly handle
UTC around epoch. MySQL was updated to 8.0.41 due to the same reason.

Temporarily downgrade JDK for ppc64 to make CI happy

Simplify char type docs

Add comparison example queries

Improve formatting and style

Fix error with multiple nested partition columns on Iceberg (trinodb#24629)

Adjust release template for 2025

Add Trino 469 release notes

[maven-release-plugin] prepare release 469

[maven-release-plugin] prepare for next development iteration

Remove deprecated method addColumn from JdbcClient

Support MERGE for MySQL connector

Co-Authored-By: Yuya Ebihara <[email protected]>

Remove the Kinesis connector

The connector appears unused, based on community interactions
and an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921

Fix failure when adding columns with dots in Iceberg

Update airbase to 213

Update oshi-core to 6.6.6

Update AWS SDK v2 to 2.30.6

Update mongo to 5.3.1

Update protobuf to 3.25.6

Update MySQL to 9.2.0

Update airlift to 298

Update client libraries to JDK 11

Use newTrinoTable in more places

Sometimes the new and old ways were mixed in a single method. This also
makes `createTestTableForWrites` reuse this method to avoid duplication.

Fix null check in the array_histogram

Update Azure SDK to 1.2.31

Update AWS SDK v2 to 2.30.7

Update google api client to 2.7.2

Allow nessie to 0.102.2

Update jetbrains annotations to 26.0.2

Update airbase to 214 and airlift to 299

Set validation to WHEN_REQUIRED for FTE exchange

Use isEmpty instead of not isPresent

Modernize client dependencies

Use ImmutableList.copyOf

Remove redundant format call

Mark fields as final

Use more functional style for Optionals

Use String.isEmpty

Use StandardCharsets.UTF_8

Remove unnecessary .toString() call

Drop inferred type arguments

Replace statement lambda with expression lambda

Drop dead variable that is never read

Remove S3 legacy migration guide considerations

Cast table properties based on their types in Faker connector

Add test for Faker connector renameTable

Introduce Loki connector

Co-authored-by: Janos <[email protected]>

Extract helper method to get table comment in tests

Co-Authored-By: Sandeep Thandassery <[email protected]>

Test COMMENT ON TABLE in Faker connector

Co-Authored-By: Sandeep Thandassery <[email protected]>

Allow setting catalog type in Iceberg query runner

Fix Timestamp assertion test for metrics query

Minor cleanup in Loki connector

Inform user which date/time fields are extractable

Make the error message for invalid `extract` more helpful.

Support non-lower case variables in functions

Update docs with updated Postgres version

Since 1350a8b, testing uses PostgreSQL 12

Close StatementClient when request is timed out

Add JMX metrics for S3 HTTP client usage in native FS

When the S3 client was migrated from AWS SDK v1 to AWS SDK v2, the S3 pool metrics were missed.

With this PR, these HTTP pool metrics are now exposed via JMX beans. The reference for the HTTP metrics exposed by the AWS SDK can be found here:
https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/HttpMetric.html

Fix a wrong issue link in 469 release notes

Sort connector names in EnvMultinodeAllConnectors

Tweak validation of BaseCaseInsensitiveMappingTest

Some databases may map varchar(5) to different types.

Add DuckDB connector

Clean up leftover schema for BigQuery tests

Clean up test schema in connector tests

Determine range constraints in Faker connector

When creating a table in the Faker connector from an existing table,
gather column statistics to determine range constraints, and set them
as column properties.

Use actual row count as default limit

Generate low cardinality values in Faker connector

When creating a table in the Faker connector from an existing table,
use column statistics to determine low-cardinality columns, and generate
values from a randomly generated set.
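
The approach can be sketched as follows (illustrative code, not the actual Faker connector implementation): pre-generate a small dictionary once, then draw every row value from it, yielding a low-cardinality column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of low-cardinality value generation
class LowCardinality {
    static List<String> generate(int dictionarySize, int rows, long seed) {
        Random random = new Random(seed);
        // Build a small dictionary of random values
        List<String> dictionary = new ArrayList<>();
        for (int i = 0; i < dictionarySize; i++) {
            dictionary.add("v" + random.nextInt(1_000_000));
        }
        // Every row value is drawn from the dictionary
        List<String> values = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            values.add(dictionary.get(random.nextInt(dictionary.size())));
        }
        return values;
    }
}
```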

Set null_probability based on stats

When creating tables in the Faker connector using CREATE TABLE AS
SELECT, use the NUMBER_OF_NON_NULL_VALUES column statistic to set the
null_probability column property.
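
The arithmetic can be sketched as follows (names are hypothetical, not the connector's actual code):

```java
// Derive null_probability from the non-null value count statistic
// and the table row count
class NullProbability {
    static double nullProbability(long rowCount, long nonNullCount) {
        if (rowCount == 0) {
            return 0.0; // no rows, nothing to infer
        }
        return 1.0 - (double) nonNullCount / rowCount;
    }
}
```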

Remove unreachable code from Exasol

Fix failure when equality delete updated nested fields in Iceberg

Remove 'LOCAL TEMPORARY' from DuckDbClient

Update gson to 2.12.0

Update AWS SDK v2 to 2.30.8

Update exasol-testcontainers to 7.1.3

Update httpcore5 to 5.3.3

Update resteasy-core to 6.0.3.Final

Update duckdb_jdbc to 1.1.3

Improve pom formatting

Update airbase to 215

Update airlift to 300

Update gcs-connector to 3.0.4

Fix connection leakage in the native Azure filesystem

This solves an issue with connection leaks that happen in the Azure Storage SDK
when OkHttp is used. OkHttp is not actively maintained, which makes the default
Netty implementation a better choice for the future, as it is actively maintained
and tested.

Simplify code

The event loop group and maximum number of concurrent requests are already
configured, so the removed setting was a no-op.

Propagate function lifecycle events to SystemSecurityMetadata

Use arrays instead of lists in OpenX RowDecoder

Use array instead of list for Hive JSON decoders

Reuse fieldWritten array between rows

Check non-null constructor arguments

Avoid re-checking isScalarType

Improve case insensitive matching fragment

Reenable Snowflake tests

This reverts commit 5c43149
and applies necessary fixes to the changes that were introduced
in the meantime.

Add docs for authentication with Preview Web UI

Make number of commit retries configurable in Iceberg

Correctly categorize external errors in Iceberg REST catalog

Add TestLokiPlugin

Add Loki to EnvMultinodeAllConnectors

Add more connectors to labeler-config

Support multiple state filters on getAllQueryInfo API

Add workers pages to the Preview Web UI

Allow Hive metastore caching for Iceberg

Remove table listing cache from FileHiveMetastore

Fix typo in Python docs

Remove deprecated precomputed hash optimizer

Update snowflake-jdbc to 3.22.0

Update JLine to 3.29.0

Update commons-codec to 1.18.0

Update flyway to 11.3.0

Update AWS SDK v2 to 2.30.10

Update npm to 11.1.0

Update node to 22.13.1

Deprecate support for IBM COS via Hive

Bump actions/stale from 9.0.0 to 9.1.0 in the dependency-updates group

Bumps the dependency-updates group with 1 update: [actions/stale](https://github.com/actions/stale).

Updates `actions/stale` from 9.0.0 to 9.1.0
- [Release notes](https://github.com/actions/stale/releases)
- [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md)
- [Commits](actions/stale@v9.0.0...v9.1.0)

---
updated-dependencies:
- dependency-name: actions/stale
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: dependency-updates
...

Signed-off-by: dependabot[bot] <[email protected]>

Fix Vacuum deleting files with whitespace in paths

When a Delta Lake file path contains whitespace, the VACUUM procedure
unexpectedly removes it. This commit fixes the issue by wrapping the
file path using RFC 2396 URI encoding, ensuring consistency with how
file paths are handled when writing add or remove entries.
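
A minimal sketch of such encoding with `java.net.URI`, whose multi-argument constructors apply RFC 2396 escaping (the class name here is hypothetical):

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch: wrap a raw file path so that characters such as
// whitespace are percent-encoded per RFC 2396 (a space becomes %20)
class PathEncoding {
    static String encodePath(String rawPath) {
        try {
            return new URI(null, null, rawPath, null).toASCIIString();
        }
        catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```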

Update airlift to 301

Close FileInputStream while loading keystore

Update reactor-netty-core to 1.1.26

Update AWS SDK v2 to 2.30.11

Update minio to 8.5.17

Update oauth2-oidc-sdk to 11.21.3

Update gson to 2.12.1

Update commons-pool2 to 2.12.1

Update loki-client to 0.0.3

Update nessie to 0.102.4

Update httpclient5 to 5.4.2

upgrade