
Create plan for Kinesis connector changes #23921

Closed
mosabua opened this issue Oct 25, 2024 · 7 comments
Assignees
Labels
roadmap Top level issues for major efforts in the project

Comments

@mosabua
Member

mosabua commented Oct 25, 2024

The Kinesis connector seems to have very limited usage, based on Slack conversations and vendor data.

The current connector codebase uses an old, deprecated SDK for Kinesis that is now in maintenance mode and will be fully deprecated in 2025. Nobody is available to improve the connector and upgrade it to the newer SDK.

This issue was raised in a maintainer call on the 24th of October 2024 by @wendigo and discussed with all the present maintainers.

We are contemplating removal of the connector, similar to #23792.

We are currently looking for more input and data about potential usage. Please comment in this ticket if you are using the Kinesis connector.
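If you are unsure whether this affects you: a catalog backed by this connector is typically defined in a file such as `etc/catalog/kinesis.properties`. A minimal sketch for checking your deployments — only `connector.name=kinesis` is certain here; the other property names and the file paths are from memory and should be verified against the connector docs:

```properties
# etc/catalog/kinesis.properties -- illustrative sketch only
connector.name=kinesis
kinesis.access-key=EXAMPLE_ACCESS_KEY
kinesis.secret-key=EXAMPLE_SECRET_KEY
# directory with JSON table description files mapping streams to tables
kinesis.table-description-location=etc/kinesis
```

If any catalog in your cluster sets `connector.name=kinesis`, you are using this connector and should comment here.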

More importantly we are also looking for contributors who are willing to update the connector to the new SDK.

@mosabua mosabua added the roadmap Top level issues for major efforts in the project label Oct 25, 2024
@mosabua mosabua self-assigned this Oct 25, 2024
@raunaqmorarka
Member

That connector was contributed to Trino by one of my colleagues at Qubole. We contributed it because we had no customers using it, didn't want to invest any more in maintaining it and thought that giving it to the community might breathe some life into it.
My impression is that it has stayed unused in Trino as well and it should be okay to remove it.

@mosabua
Member Author

mosabua commented Oct 25, 2024

I prepared a PR in case we want to go ahead with removal - which seems likely at this stage.

@mosabua
Member Author

mosabua commented Oct 29, 2024

I reached out to the AWS team and the Amazon Kinesis team; hopefully we get some input from them.

mosabua added a commit to simpligility/trino that referenced this issue Oct 29, 2024
The connector seems unused from all interaction on the community
and following an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921
@mosabua
Member Author

mosabua commented Dec 9, 2024

At this stage we will ask again at Trino Summit, but will most likely remove the connector. No vendors chimed in with any interest either.

@mosabua
Member Author

mosabua commented Jan 7, 2025

User survey from https://trino.io/blog/2025/01/07/2024-and-beyond includes a question for this change.

mosabua added a commit to simpligility/trino that referenced this issue Jan 27, 2025
The connector seems unused from all interaction on the community
and following an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921
@mosabua
Member Author

mosabua commented Jan 27, 2025

No usage of the connector was reported in the survey, in the January 2025 contributor call, or in further discussions on Slack. As such we are proceeding to remove the connector with the Trino 470 release.

mosabua added a commit to simpligility/trino that referenced this issue Jan 28, 2025
The connector seems unused from all interaction on the community
and following an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921
mosabua added a commit to simpligility/trino that referenced this issue Jan 28, 2025
The connector seems unused from all interaction on the community
and following an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921
mosabua added a commit that referenced this issue Jan 28, 2025
The connector seems unused from all interaction on the community
and following an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
#23921
@mosabua
Member Author

mosabua commented Jan 28, 2025

PR for removal is merged. It will be part of the upcoming Trino 470 release.

@mosabua mosabua closed this as completed Jan 28, 2025
mayurjpatel pushed a commit to mayurjpatel/trino that referenced this issue Jan 29, 2025
The connector seems unused from all interaction on the community
and following an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921
hantmac pushed a commit to hantmac/trino that referenced this issue Feb 3, 2025
The connector seems unused from all interaction on the community
and following an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921
hantmac pushed a commit to hantmac/trino that referenced this issue Feb 4, 2025
Inline method addPrimaryKeyToCopyTable

Inline method `addPrimaryKeyToCopyTable` in `TestPostgreSqlJdbcConnectionAccesses`
and `TestPostgreSqlJdbcConnectionCreation` for readability

Fix session property description for NON_TRANSACTIONAL_MERGE

Also change the description of `NON_TRANSACTIONAL_INSERT`
to the "Enables support for non-transactional INSERT"

Fix storage table data clean up while dropping iceberg materialized view

This is for cleaning up data files in legacy mode i.e iceberg.materialized-views.hide-storage-table is set to false.
Delegating the drop table to metastore does not clean up the data files since for HMS,
the iceberg table is registered as an "external" table. So to fix this instead of delegating to metastore,
have the connector do the drop of the table and data files associated with it.

Rename partitions to partition_summaries in Iceberg manifests table

Restore removed assertions

These assertions are still useful since they don't use the MERGE code path. These were mistakenly removed in e88e2b1.

Only enable MERGE for MERGE specific tests

Before this change MERGE code path could inadvertently be used in places where we are not interested in testing MERGE.

Use getFileSystemFactory in BaseIcebergMaterializedViewTest

Extract helper method to get HiveMetastore

Inject execution interceptors using multibinder

The GlueMetastoreModule is not truly following the inversion of control
paradigm. Many things are created directly in the methods without using
injection. Using set multibinder for execution interceptors allows to
independently define execution interceptors and use Guice injection.

Co-authored-by: Grzegorz Kokosiński <[email protected]>

Extract table features constants and add ProtocolEntry builder

Add missing table features when adding timestamp_ntz in Delta

Update Hudi library to 1.0.0

Update airbase to 204

Update airlift to 292

Update metrics-core to 4.2.29

Update reactor-core to 3.7.1

Update swagger to 2.2.27

Update AWS SDK v2 to 2.29.31

Update JLine to 3.28.0

Update nessie to 0.101.1

Update s3mock testcontainers to 3.12.0

Update exasol to 24.2.1

Update google-sheets api to v4-rev20241203-2.0.0

Add assertConsistently

Use BigQuery storage read API when reading external BigLake tables

The storage APIs support reading BigLake external tables (ie external
tables with a connection). But the current implementation uses views
which can be expensive, because it requires a query. This PR adds
support to read BigLake tables directly using the storage API.

There are no behavior changes for external tables and BQ native tables -
they use the view and storage APIs respectively.

Added a new test for BigLake tables.

Co-authored-by: Marcin Rusek <[email protected]>

Make OutputBufferInfo not comparable

This is not needed

Expose exchange sink metrics in operator and stage stats

Inline constant

Expose output buffer metrics in query completion event

Expose filesystem exchange sink stats

Correctly categorize filesystem error in Iceberg connector

Webapp Preview: Cluster Overview with sparklines

Use executor service for iceberg scan planning system tables

Add Python UDF support to binaries

Use data size for delta metadata cache

Reduces chances of coordinator OOM by accounting
for retained size of objects in delta metadata cache

Increase default delta.metadata.cache-ttl to 30m

TTL can be higher because the cached metadata is immutable
and the space occupied by it in memory is accounted for

Update config description for insert.non-transactional-insert.enabled

Document default value of Iceberg object_store_enabled table property

Improve performance of Python functions

Fix and enable kudu update test

Add iceberg.bucket-execution to documentation

Introduce NodeStateManagerModule

A refactor - rename to prepare for adding new logic.

Reactivation of worker nodes

Adds new node states to enable full control over shutdown and reactivation of workers.
- state: DRAINING - a reversible shutdown,
- state: DRAINED - all tasks are finished, server can be safely and quickly stopped. Can still go back to ACTIVE.

Update AWS SDK v2 to 1.12.780

Update docker-java to 3.4.1

Update flyway to 11.1.0

Update AWS SDK v2 to 2.29.34

Update airbase to 205

Restructure SQL routine docs

Move them in appropriate folders for user-defined functions
and SQL user-defined functions. Update all references so that
the docs build process fully works.

Add redirectors for SQL routine change

Reword from SQL routine to SQL UDF

And generally introduce user-defined functions (UDF) as a term.

Move SQL UDF content

Into the separate page, and adjust the generic content to be suitable
for any UDF language.

Move SQL UDF content

Into the separate page, and adjust the generic content to be suitable
for any UDF language.

Add docs for Python UDFs

Remove unnecessary annotation in Kudu connector test

Update to oryd/hydra:v1.11.10 for OAuth testing

Fix build issues on newer OS/hardware. Use latest 1.x release since
2.x causes container start issues without further changes.

Update airbase to 206 and airlift to 293

Update AWS SDK v2 to 2.29.35

Update netty to 4.1.116.Final

Update gcs connector to hadoop3-2.2.26

Add example for inline and catalog Python UDF

Improve inline and catalog SQL UDF docs

Add Trino 468 release notes

[maven-release-plugin] prepare release 468

[maven-release-plugin] prepare for next development iteration

Do not require Java presence for RPM installation

This allows for custom JDK to be used when running Trino
with launcher --jvm-dir argument.

Allow left side as update target in join pushdown

Allow left side as update target when try to pushdown join
into table scan. The change prevent the pushdown join into the
table scan instead of throwing exception

Co-Authored-By: Łukasz Osipiuk <[email protected]>

Bind filesystem cache metrics per catalog

Cleanup InformationSchemaPageSource projection

Avoids unnecessary Integer boxing and Block[] allocations in
InformationSchemaPageSource by using Page#getColumns.

Update docker image version to 107

Rename HiveMinioDataLake to Hive3MinioDataLake

Extract HiveMinioDataLake class

Extract BaseTestHiveOnDataLake to reuse it across Hive3/4 test

Add TestHive3OnDataLake test

Add S3 Hive4 query runner

Add TestHive4OnDataLake test

Add TestTrinoHive4CatalogWithHiveMetastore test

Extract requireEnv in a util class SystemEnvironmentUtils

Remove unnecessay requirenment of password in test

in TestSalesforceBasicAuthenticator#createAuthenticatedPrincipalRealBadPassword

Use SystemEnvironmentUtils#requireEnv

Add and use SystemEnvironmentUtils#isEnvSet method

Fix misspelling

Test concurrent update without partition

Extract KuduColumnProperties from KuduTableProperties

Fix S3InputStream's handling of large skips

When the skip(n) method is called the MAX_SKIP_BYTES check is skipped,
resulting in the call potentially blocking for a long time.

Instead of delegating to the underlying stream, set the nextReadPosition
value. This allows the next read to decide if it is best to keep the existing
s3 object stream or open a new one.

This behavior matches the implementations for Azure and GCS.

Update google cloud SDK to 26.52.0

Update AWS SDK v2 to 2.29.37

Use QUERY_EXCEEDED_COMPILER_LIMIT error code

Include deletion vector when filtering active add entries in Delta

Add nonnull check for directExchangeClientSupplier

Refactor the PlanTester to pass the nonnull `directExchangeClientSupplier`.
Also add nonnull check for the `sourceId`, `serdeFactory` in `ExchangeOperator`

Make FilesTable.toJson method package-private

Add $entries metadata table to Iceberg

Run Iceberg concurrent tests multiple times

Update client driver and application sections

Document sort direction and null order in Iceberg

Rename executor to icebergScanExecutor

Improve performance when listing columns in Iceberg

Remove support for Databricks 9.1 LTS

Pin openpolicyagent/opa version as 0.70.0

Add tests for dropMaterializedView

The dropMaterializedView on TrinoHiveCatalog that uses FileHiveMetastore
works only if unique table locations are enabled.

Replace testView method override with getViewType

Extend testListTables with other relation types

TrinoCatalog.listTables returns not only iceberg tables but also
other relations like views, materialized view or non-iceberg tables.

Add HiveMetastore.getTableNamesWithParameters

Add TrinoCatalog.listIcebergTables

Add system.iceberg_tables table function

Add the ability to list only iceberg tables from the iceberg catalog.
Before this change, there was no way to list only iceberg tables.
The SHOW TABLES statement, information_schema.tables, and jdbc.tables will all
return all tables that exist in the underlying metastore, even if the table cannot
be handled in any way by the iceberg connector. This can happen if other connectors
like hive or delta, use the same metastore, catalog, and schema to store its tables.
The function accepts an optional parameter with the schema name.
Sample statements:
SELECT * FROM TABLE(iceberg.system.iceberg_tables());
SELECT * FROM TABLE(iceberg.system.iceberg_tables(SCHEMA_NAME => 'test'));

Fix failures in iceberg cloud tests

Support MERGE for Ignite connector

Remove unused delta-lake-databricks-104 test group

Add $all_entries metadata table to Iceberg

Delete the oldest tracked version metadata files after commit

Using airlift log in SimulationController

Stream large transaction log jsons instead of storing in-memory

Operations fetching metadata and protocol entries can skip reading
the rest of the json file after those entries are found

Move `databricksTestJdbcUrl()` method after the constructor

Use `ZSTD` Parquet compression codec for Delta Lake by default

Add query execution metrics to JDBC QueryStats

Adds planningTimeMillis, analysisTimeMillis, finishingTimeMillis,
physicalInputBytes, physicalWrittenBytes and internalNetworkInputBytes
to allow JDBC clients to get some important metrics about query execution

Allow configuring parquet_bloom_filter_columns in Iceberg

Verify invalid bloom filter properties in Iceberg

Remove unspecified bloom filter when setting properties in Iceberg

Support managing views in the Faker connector

Add views support to the Faker connector docs

Add min, max, and allowed_values column properties in Faker connector

Allow constraining generated values by setting the min, max, or
allowed_values column properties.

Remove predicate pushdowns in the Faker connector

Predicate pushdown in the Faker connector violates the SQL semantics,
because when applied to separate columns, correlation between columns is
not preserved, and returned results are not deterministic. The `min`,
`max`, and `options` column properties should be used instead.

Refactor Faker tests to be more readable

Remove outdated limitations in Faker's docs

Rename to Trino in product tests

Extract dep.gib.version property

Convert testRollbackToSnapshotWithNullArgument to integration test

Add newTrinoTable method to AbstractTestQueryFramework

Allow running product tests on IPv6 stack

Update nimbus-jose-jwt to 9.48

Update jna to 5.16.0

Update AWS SDK v2 to 2.29.43

Update openlineage-java to 1.26.0

Update airbase to 209

Update airlift to 294

Update freemarker 2.3.34

Remove unreachable code in OrderedPeriodParser

Avoid parsing min and max twice

Avoid and/or in Faker docs

Refactor FakerPageSource to have fewer faker references

Extract typed ranges in Faker's page source

Fix handling upper bounds in FakerPageSource

Fix handling upper bounds for floating point types. The implementation
did not account for rounding issue near the bound, and the test was
using values outside of the allowed range.

Refactor FakerPageSource

Refactor to make subsequent commit's diff smaller

Extract a method in FakerColumnHandle

Support generating sequences in the Faker connector

Configure SSL for unauthenticated client

Unauthenticated client can connect to the Trino cluster when loading segment data.
If cluster has its' own certificate chain - client needs to accept it according to the configuration.

Simplify conditions

Allow resource access type on class-level

For io.trino resources it's now impossible to use class-level @ResourceType
annotations due to the invalid condition.

Check whole class hierarchy for @ResourceSecurity

Use class-level @ResourceSecurity annotations

Fix parsing of negative 0x, 0b, 0o long literals

Update lucene-analysis-common to 10.1.0

Correctly merge multiple splits info

Previously SplitOperatorInfo wasn't Mergeable and hence base
OperatorInfo(OperatorStats#getMergeableInfoOrNull) was null.

Prepare to implement a page source provider for Redshift

Fetch Redshift query results unloaded to S3

Co-authored-by: Mayank Vadariya <[email protected]>

Copy all TPCH tables during initialization in TestRedshiftUnload

Add physicalInputTimeMillis to io.trino.jdbc.QueryStats

Fix listing of files in AlluxioFileSystem

Co-authored by: JiamingMai <[email protected]>

Add info about views in memory only for Faker connector

Minor improvements to Python UDF docs

Improve docs for non-transactional merge

As applicable for PostgreSQL connector for now. Also extract into
a fragment so it can be reused in other connectors.

Improve SQL support section in JDBC connectors

- No content changes but...
- Consistent wording
_ Markdown link syntax
- Move related configs to SQL support section
- Improve list and rejig as small local ToC, add links

Add non-transactional MERGE docs for Phoenix

Add non-transactional MERGE docs for Ignite

Remove unused metastore classes

Remove unused TestingIcebergHiveMetastoreCatalogModule

Replace usage of RetryDriver in HiveMetadata

Move partition utility methods to Partitions class

Move HiveMetastoreFactory to metastore module

Cleanup binding of FlushMetadataCacheProcedure

Move CachingHiveMetastore to metastore module

Move Glue v1 converters into Glue package

Move already exists exceptions to metastore module

Move ForHiveMetastore annotation to thrift package

Move RetryDriver to thrift package

Move Avro utility method to ThriftMetastoreUtil

Add SSE-C option on native-filesystem security mapping

Remove SPI exclusion from previous release

Remove support for connector event listeners

Inline partition projection name in HiveConfig

Remove shadowed field from MockConnectorMetadata

Remove optional binder for HivePageSourceProvider

Remove unused HiveMaterializedViewPropertiesProvider

Remove unused HiveRedirectionsProvider

Remove unused DeltaLakeRedirectionsProvider

Remove unused function providers from Hive connector

Remove deprecated ConnectorMetadata.getTableHandleForExecute

Remove deprecated ConnectorMetadata.beginMerge

Remove optional binder for IcebergPageSourceProviderFactory

Make SqlVarbinary map serialization consistent with other types

Other Sql* classes are serialized to JSON on-the-wire format,
using @JsonValue annotated toString methods, except for SqlVarbinary
which was serialized using its' getBytes() method that was Base64-encoded
to a map key.

Decouple Sql types from JSON serialization

The new JSON serialization is not using ObjectMapper to serialize these values anymore.
We want to decouple SPI types from JSON representation to be able to introduce
alternative encoding formats.

Derive aws sdk retry count from request count

Minor cleanup in Hive procedures

Move createParquetMetadata to ParquetMetadata

Parse parquet footer row groups lazily

Write row group fileOffset in parquet file footer

Parse only required row groups from parquet footer

Extend AbstractTestQueryFramework in TestRedshiftUnload

Use correct table operations provider for Thrift metastore in Delta

Update zstd-jni to 1.5.6-9

Update nimbus-jose-jwt to 10.0

Update AWS SDK v2 to 2.29.44

Update nessie to 0.101.3

Allow configuring gcs service endpoint

Update delta-kernel to 3.3.0

Allow configuring orc_bloom_filter_columns table property in Iceberg

Lower log level

Add rollback_to_snapshot table procedure in Iceberg

Remove <name> from trino-ranger pom

We are not using it for other modules which results in the build output being inconsistent

Instantiate tpch connector using bootstrap

Synchronize types in (Client)StandardTypes

Enumerate all types while decoding

This makes it explicit in regard to the ClientStandardTypes list of types

Fix deserialization of KDB_Tree and BingTile

These types have a custom serialization logic utilizing JsonCreator/JsonProperty
annotations.

Improve SetDigest serialization

Inline SetDigestType.NAME

Use StandardTypes.BING_TILE const

Use StandardTypes.GEOMETRY const

Use StandardTypes.KDB_TREE const

Use StandardTypes.SPHERICAL_GEOMETRY const

Fix deserialization of Color type

Clarify that spooling locations must not be shared

Add docs for gcs.endpoint

Enable Phoenix product test

Fix wrong config name in OPA documentation

Aplhabetize additional IDE configurations on docs

Remove defunct hive metastore property from docs
Removing property `hive.metastore.thrift.batch-fetch.enabled` from docs, which was marked defunct in Trino 443.

Fix correctness issue when writing deletion vectors in Delta Lake

Improve developer docs for connector MERGE support

Fix gcs.endpoint property name in docs

Enable dynamic catalogs for product-tests

Since this uses the default catalog store set to `file`, there should not be issues with switching it all over to dynamic catalog management.

Improve docs for Redshift parallel read

Add counter for tasks created on worker

Expose worker tracked tasks count

Update okio to 3.10.1

Update nimbus-jose-jwt to 10.0.1

Update minio to 8.5.15

Update snowflake-jdbc to 3.21.0

Update checker-qual to 3.48.4

Update flyway to 11.1.1

Update AWS SDK v2 to 2.29.47

Update org.json to 20250107

Update airbase to 210

Test timestamp parsing in Delta Lake

Allow parsing ISO8601 timestamp in Delta lake transaction log

Update docker-images to 108

Disable spooling through session property

Remove obsolete JVM configuration

Performance-wise this doesn't improve memory usage

Reduce the query and task expiration times

This reduces the amount of memory needed to run the product test cluster

Suppress warning when building with mvnd

Require JDK 23.0.0

Add docs for spooling_protocol_enabled session property

Document $entries table in Iceberg

Document limit pushdown in BigQuery connector

Grant privileges only on TPCH tables in Redshift query runner

Granting privileges on all tables may cause unintended failures, as
temporary tables created in one test class may not have been fully
dropped or cleaned up from internal Redshift tables while other test
class executes grant privileges on all tables.

Remove unused NoopFunctionProvider

Fix incorrect result when reading deletion vectors in Delta Lake

This issue happens when Parquet file contains several pages and
predicates filter page when parquet_use_column_index is set to true.

Fix query runners failing to expose local ports

The logic was mistakenly inverted in
a99d96e.

Geospatial function ST_GeomFromKML

Add executeWithoutResults to StandaloneQueryRunner

It is needed to test the query execution without reading the
final query info which triggers the QueryCompletedEvent

Always fire QueryCompletedEvent for DDL queries

Previously the event was fired because client protocol
is reading the final query info. That is brittle and theoretically
could be removed making DDL queries fail to trigger query completed event.

Add authentication to the Preview Web UI

Ensure visibility of finalSinkMetrics

Without exchangeSink not being volatile it was possible that other
thread could observe nulled exchangeSink but still not set
finalSinkMetrics.

Fix logical merge conflict

Fix missing ts dependencies

Move JdbcRecordSetProvider construction to a module

Fix error message to actual variable name

Update supported clickhouse versions

Revert "Update supported clickhouse versions"

This reverts commit e188f97.

Reapply "Update supported clickhouse versions"

This reverts commit 523a305.

Add support for validating JDBC connections

Add docs for optimizer push filter

Clarify where to set legacy Hive properties for S3

The `trino.s3.use-web-identity-token-credentials-provider` property must
be set in the Hadoop config file, not as the connector property. This
needs to be clarified in the docs.

Clarify docs for multiple access controls

Bump Scylla docker images version to 6.2

Bump latest Cassandra docker images version to 5.0.2

WebUI: Sort queries on worker by descending reserved memory

Adjust docs for  metadata table in Iceberg

Improve code blocks and wording in Iceberg connector docs

Fix code formatting in Preview Web UI

Support unpartitioned tables in Kudu

Fix flaky TestQueryManagerConfig

This test relies on rounding large integers and will sometimes fail
depending on the amount of memory available to the test runner.

This commit reuses the exact same calculation for tests as it does in
the code, so that we will always get the correct value for the default.

Replace airlift's auth preserving client with okhttp

Update AWS SDK v2 to 2.29.50

Update httpcore5 to 5.3.2

Update google-api-client to 2.7.1

Update airbase to 211

Update okio to 3.10.2

Update commons-codec to 1.17.2

Update mongo to 5.3.0

Update airlift to 295

Add retry_policy session property docs

- also fix a bunch of style and formatting issues on the page

Avoid writing NaN and Infinity with json format table

Adjust docs for  manifests metadata table in Iceberg

Expose BigQuery RPC configuration settings

Document BigQuery RPC settings

Make the sentence assertive

Use only the enforced partition constraints dependency

Fix typo

Extract logic for appending transaction log file for OPTIMIZE

Add concurrent writes reconciliation for OPTIMIZE in Delta Lake

Allow committing OPTIMIZE operations in a concurrent context
by placing these operations right after any other previously
concurrently completed write operations.

Add info about Python client spooling support

Update airlift to 296

Add support for filtering by client tags in Web UI

Configure Phoenix server scan page timeout

Update netty to 4.1.117.Final

Update AWS SDK v2 to 2.29.50

Update metrics-core to 4.2.30

Update commons-text to 1.13.0

Update elasticsearch to 7.17.27

Update google sheets api to v4-rev20250106-2.0.0

Update reactor-core to 3.7.2

Add test for JsonSerializer handling

Use Java APIs instead of ByteStreams

Avoid exposing not required ports in Exasol

Co-Authored-By: YotillaAntoni <[email protected]>

Update Exasol image version to 8.32.0

Co-Authored-By: YotillaAntoni <[email protected]>

Remove system.iceberg_tables table function

We decided to add a system table instead.

This reverts commit 5ce80be.

Fix failure when setting NULL in UPDATE statement in JDBC-based connectors

Co-Authored-By: Yuya Ebihara <[email protected]>

Use <arg> instead of <compilerArg> to enable incubator module

`<compilerArg>` does not seem to be a valid child element of
`<compilerArgs>`. Even if it seems to work, it conflicts with the
`errorprone-compiler` profile, which appends additional compiler
arguments using `<arg>` (which is documented to be the correct child
element). When there is also `<compilerArg>`, it gets ignored (at least
by IDEA's POM importer).

Update google-cloud-sdk to 26.53.0

Update nimbus oauth2 sdk to 11.21

Update opencsv to 5.10

Update localstack image to 4.0.3

Update minio to RELEASE.2024-12-18T13-15-44Z

Decode JSON directly from Slice without materialization

Use newInputStream instead of FileInputStream

Support FIRST and AFTER clause when adding a new column in engine

Support FIRST and AFTER clause when adding a new column in Iceberg

Add docs for json_table

Co-authored-by: Michael Eby <[email protected]>

Make ElasticsearchServer Closeable

Derive predicate support from Trino type in ElasticSearch

Add test for column name containing special characters in Elasticsearch

Add dereference pushdown support in ElasticSearch

Reduce task.info.max-age to 5m

Default 15m sometiems causes some significant OOM issues on workers.

Bump docker/setup-qemu-action in the dependency-updates group

Bumps the dependency-updates group with 1 update: [docker/setup-qemu-action](https://github.com/docker/setup-qemu-action).

Updates `docker/setup-qemu-action` from 3.2.0 to 3.3.0
- [Release notes](https://github.com/docker/setup-qemu-action/releases)
- [Commits](docker/setup-qemu-action@49b3bc8...53851d1)

---
updated-dependencies:
- dependency-name: docker/setup-qemu-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: dependency-updates
...

Signed-off-by: dependabot[bot] <[email protected]>

Improve test coverage for ThriftHttpMetastoreClient

Remove meaningless call of Math.max

Update caffeine to 3.2.0

Update flyway to 11.2.0

Update swagger to 2.2.28

Update airbase to 212

Test the effectiveness of partial timestamp partition pruning

This test showcases that there is partition pruning done
at the Iceberg metadata layer even though EXPLAIN showcases
that a filter that does not fully match the partition transform
is not being pushed down.

Co-authored-by: Michiel De Smet <[email protected]>

Remove experimental warning for Python UDF

Rename BigQuery test credentials key config

`bigquery.credentials-key` testing config name clashes with the actual
config name and hence any new bigquery catalogs created through `CREATE
CATALOG` command by running BigQueryQueryRunner may not pick the
different key if provided in command.

Update AWS SDK v2 to 2.30.2

Add validation to segment size configuration

Add spooling session properties

Rename session property for consistency

Update graalvm to 24.1.2

Update AWS SDK v2 to 2.30.3

Update nessie to 0.102.0

Update openlineage to 1.26.0

Update google oauth2 client to 1.37.0

Update oauth2-oidc-sdk to 11.21.2

Remove unused argument

Verify that TableScanNode's assignments match the output symbols

Before this change, only one-way match was verified: that each
output symbol is backed by an assignment.

Suffix S3 path with separator for recursive delete

Without trailing path separator the recursive delete operation fails for
directory buckets (e.g. S3 Express)

Remove deprecated method call

Simplify boolean comparisons

Simplify conditionals

Move return statement to the empty catch block

Use switch expressions

Add configuration for maximum Arrow allocation in BigQuery

This helps bound the memory allocations

Log failed buffer allocations

Add Arrow allocation stats to BigQuery

Add allocator stats to PageSource metrics

Fix split generation in Iceberg TableChangesSplitSource

Make configuration and session property names consistent

It makes it easier to reason about which session property maps to which configuration property.

More details about literals

Add Parquet writer session properties docs

Document spooling session properties

Refactor commitUpdateAndTransaction in Iceberg

Improve error handling for delete and truncate in Iceberg

Allow add column with position in base jdbc module

Support FIRST and AFTER clause when adding a new column in Mysql

Support FIRST and AFTER clause when adding a new column in MariaDb
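
A brief sketch of the syntax these commits enable, assuming the standard
ALTER TABLE ... ADD COLUMN form (table and column names are illustrative):

```sql
-- Add a column at the front of the table
ALTER TABLE orders ADD COLUMN region varchar FIRST;

-- Add a column directly after an existing column
ALTER TABLE orders ADD COLUMN note varchar AFTER region;
```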

Temporarily disable Snowflake tests

Update Java to 23.0.2

This brings an updated timezonedb, version 2024b
(openjdk/jdk23u@73b2341),
which amends historical timezone definitions for Mexico/Bahia_Banderas
that we use for testing the timezone gap around the Unix timestamp epoch.

The corresponding Joda-Time update also includes these updated timezone
definitions. The PostgreSQL test server was upgraded to 12 to correctly
handle UTC around the epoch, and MySQL was updated to 8.0.41 for the same reason.

Temporarily downgrade JDK for ppc64 to make CI happy

Simplify char type docs

Add comparison example queries

Improve formatting and style

Fix error with multiple nested partition columns on Iceberg (trinodb#24629)

Adjust release template for 2025

Add Trino 469 release notes

[maven-release-plugin] prepare release 469

[maven-release-plugin] prepare for next development iteration

Remove deprecated method addColumn from JdbcClient

Support MERGE for MySQL connector

Co-Authored-By: Yuya Ebihara <[email protected]>

Remove the Kinesis connector

The connector appears unused, based on all community interaction
and an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921

Fix failure when adding columns with dots in Iceberg

Update airbase to 213

Update oshi-core to 6.6.6

Update AWS SDK v2 to 2.30.6

Update mongo to 5.3.1

Update protobuf to 3.25.6

Update MySQL to 9.2.0

Update airlift to 298

Update client libraries to JDK 11

Use newTrinoTable in more places

Sometimes the new and old ways were mixed in a single method. Also make
`createTestTableForWrites` reuse this method to avoid duplication.

Fix null check in the array_histogram

Update Azure SDK to 1.2.31

Update AWS SDK v2 to 2.30.7

Update google api client to 2.7.2

Allow nessie to 0.102.2

Update jetbrains annotations to 26.0.2

Update airbase to 214 and airlift to 299

Set validation to WHEN_REQUIRED for FTE exchange

Use isEmpty instead of not isPresent

Modernize client dependencies

Use ImmutableList.copyOf

Remove redundant format call

Mark fields as final

Use more functional style for Optionals

Use String.isEmpty

Use StandardCharsets.UTF_8

Remove unnecessary .toString() call

Drop inferred type arguments

Replace statement lambda with expression lambda

Drop dead variable that is never read

Remove S3 legacy migration guide considerations

Cast table properties based on its type in Faker connector

Add test for Faker connector renameTable

Introduce Loki connector

Co-authored-by: Janos <[email protected]>

Extract helper method to get table comment in tests

Co-Authored-By: Sandeep Thandassery <[email protected]>

Test COMMENT ON TABLE in Faker connector

Co-Authored-By: Sandeep Thandassery <[email protected]>

Allow setting catalog type in Iceberg query runner

Fix Timestamp assertion test for metrics query

Minor cleanup in Loki connector

Inform user which date/time fields are extractable

Make the error message for invalid `extract` more helpful.

Support non-lower case variables in functions

Update docs with updated Postgres version

Since 1350a8b, testing was switched to PostgreSQL 12

Close StatementClient when request is timed out

Add JMX metrics for S3 HTTP client usage in native FS

When the S3 client was changed from AWS SDK v1 to AWS SDK v2, the set of S3 pool metrics was lost.

With this change, these HTTP pool metrics are exposed via JMX beans. The HTTP metrics exposed by the AWS SDK are documented at:
https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/HttpMetric.html

Fix a wrong issue link in 469 release notes

Sort connector names in EnvMultinodeAllConnectors

Tweak validation of BaseCaseInsensitiveMappingTest

Some databases may map varchar(5) to different types.

Add DuckDB connector

Clean up leftover schema for BigQuery tests

Clean up test schema in connector tests

Determine range constraints in Faker connector

When creating a table in the Faker connector from an existing table,
gather column statistics to determine range constraints, and set them as
column properties.

Use actual row count as default limit

Generate low cardinality values in Faker connector

When creating a table in the Faker connector from an existing table,
use column statistics to determine low-cardinality columns, and generate
values from a randomly generated set.

Set null_probability based on stats

When creating tables in the Faker connector using CREATE TABLE AS
SELECT, use the NUMBER_OF_NON_NULL_VALUES column statistic to set the
null_probability column property.

Remove unreachable code from Exasol

Fix failure when equality delete updated nested fields in Iceberg

Remove 'LOCAL TEMPORARY' from DuckDbClient

Update gson to 2.12.0

Update AWS SDK v2 to 2.30.8

Update exasol-testcontainers to 7.1.3

Update httpcore5 to 5.3.3

Update resteasy-core to 6.0.3.Final

Update duckdb_jdbc to 1.1.3

Improve pom formatting

Update airbase to 215

Update airlift to 300

Update gcs-connector to 3.0.4

Fix connection leakage in the native Azure filesystem

This fixes connection leaks that happen in the Azure Storage SDK
when OkHttp is used. OkHttp is not actively maintained, which makes the
default Netty implementation a better choice for the future, as it is
actively maintained and tested.

Simplify code

The event loop group and maximum number of concurrent requests are already
configured, so the removed setting was a no-op.

Propagate function lifecycle events to SystemSecurityMetadata

Use arrays instead of lists in OpenX RowDecoder

Use array instead of list for Hive JSON decoders

Reuse fieldWritten array between rows

Check non-null constructor arguments

Avoid re-checking isScalarType

Improve case insensitive matching fragment

Reenable Snowflake tests

This reverts commit 5c43149
and applies necessary fixes to the changes that were introduced
in the meantime.

Add docs for authentication with Preview Web UI

Make number of commit retries configurable in Iceberg

Correctly categorize external errors in Iceberg REST catalog

Add TestLokiPlugin

Add Loki to EnvMultinodeAllConnectors

Add more connectors to labeler-config

Support multiple state filters on getAllQueryInfo API

Add workers pages to the Preview Web UI

Allow Hive metastore caching for Iceberg

Remove table listing cache from FileHiveMetastore

Fix typo in Python docs

Remove deprecated precomputed hash optimizer

Update snowflake-jdbc to 3.22.0

Update JLine to 3.29.0

Update commons-codec to 1.18.0

Update flyway to 11.3.0

Update AWS SDK v2 to 2.30.10

Update npm to 11.1.0

Update node to 22.13.1

Deprecate support for IBM COS via Hive

Bump actions/stale from 9.0.0 to 9.1.0 in the dependency-updates group

Bumps the dependency-updates group with 1 update: [actions/stale](https://github.com/actions/stale).

Updates `actions/stale` from 9.0.0 to 9.1.0
- [Release notes](https://github.com/actions/stale/releases)
- [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md)
- [Commits](actions/stale@v9.0.0...v9.1.0)

---
updated-dependencies:
- dependency-name: actions/stale
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: dependency-updates
...

Signed-off-by: dependabot[bot] <[email protected]>

Fix Vacuum deleting files with whitespace in paths

When a Delta Lake file path contains whitespace, the VACUUM procedure
unexpectedly removes it. This commit fixes the issue by wrapping the
file path using RFC 2396 URI encoding, ensuring consistency with how
file paths are handled when writing add or remove entries

Update airlift to 301

Close FileInputStream while loading keystore

Update reactor-netty-core to 1.1.26

Update AWS SDK v2 to 2.30.11

Update minio to 8.5.17

Update oauth2-oidc-sdk to 11.21.3

Update gson to 2.12.1

Update commons-pool2 to 2.12.1

Update loki-client to 0.0.3

Update nessie to 0.102.4

Update httpclient5 to 5.4.2

Reorder connector behavior in TestPostgreSqlConnectorTest

Inline method addPrimaryKeyToCopyTable

Inline method `addPrimaryKeyToCopyTable` in `TestPostgreSqlJdbcConnectionAccesses`
and `TestPostgreSqlJdbcConnectionCreation` for readability

Fix session property description for NON_TRANSACTIONAL_MERGE

Also change the description of `NON_TRANSACTIONAL_INSERT`
to the "Enables support for non-transactional INSERT"

Fix storage table data clean up while dropping iceberg materialized view

This cleans up data files in legacy mode, i.e. when iceberg.materialized-views.hide-storage-table is set to false.
Delegating the table drop to the metastore does not clean up the data files, since for HMS
the iceberg table is registered as an "external" table. To fix this, instead of delegating to the metastore,
the connector drops the table and the data files associated with it.

Rename partitions to partition_summaries in Iceberg manifests table

Restore removed assertions

These assertions are still useful since they don't use the MERGE code path. These were mistakenly removed in e88e2b1.

Only enable MERGE for MERGE specific tests

Before this change, the MERGE code path could inadvertently be used in places where we are not interested in testing MERGE.

Use getFileSystemFactory in BaseIcebergMaterializedViewTest

Extract helper method to get HiveMetastore

Inject execution interceptors using multibinder

The GlueMetastoreModule does not truly follow the inversion of control
paradigm; many things are created directly in its methods without using
injection. Using a set multibinder for execution interceptors allows
defining execution interceptors independently and using Guice injection.

Co-authored-by: Grzegorz Kokosiński <[email protected]>

Extract table features constants and add ProtocolEntry builder

Add missing table features when adding timestamp_ntz in Delta

Update Hudi library to 1.0.0

Update airbase to 204

Update airlift to 292

Update metrics-core to 4.2.29

Update reactor-core to 3.7.1

Update swagger to 2.2.27

Update AWS SDK v2 to 2.29.31

Update JLine to 3.28.0

Update nessie to 0.101.1

Update s3mock testcontainers to 3.12.0

Update exasol to 24.2.1

Update google-sheets api to v4-rev20241203-2.0.0

Add assertConsistently

Use BigQuery storage read API when reading external BigLake tables

The storage API supports reading BigLake external tables (i.e. external
tables with a connection), but the current implementation uses views,
which can be expensive because they require a query. This change adds
support for reading BigLake tables directly using the storage API.

There are no behavior changes for external tables and BigQuery native
tables; they use the view and storage APIs, respectively.

Added a new test for BigLake tables.

Co-authored-by: Marcin Rusek <[email protected]>

Make OutputBufferInfo not comparable

This is not needed

Expose exchange sink metrics in operator and stage stats

Inline constant

Expose output buffer metrics in query completion event

Expose filesystem exchange sink stats

Correctly categorize filesystem error in Iceberg connector

Webapp Preview: Cluster Overview with sparklines

Use executor service for iceberg scan planning system tables

Add Python UDF support to binaries

Use data size for delta metadata cache

Reduces chances of coordinator OOM by accounting
for retained size of objects in delta metadata cache

Increase default delta.metadata.cache-ttl to 30m

TTL can be higher because the cached metadata is immutable
and the space occupied by it in memory is accounted for

Update config description for insert.non-transactional-insert.enabled

Document default value of Iceberg object_store_enabled table property

Improve performance of Python functions

Fix and enable kudu update test

Add iceberg.bucket-execution to documentation

Introduce NodeStateManagerModule

A refactor - rename to prepare for adding new logic.

Reactivation of worker nodes

Add new node states to enable full control over shutdown and reactivation of workers:
- DRAINING: a reversible shutdown.
- DRAINED: all tasks are finished and the server can be safely and quickly stopped. Can still go back to ACTIVE.

Update AWS SDK v1 to 1.12.780

Update docker-java to 3.4.1

Update flyway to 11.1.0

Update AWS SDK v2 to 2.29.34

Update airbase to 205

Restructure SQL routine docs

Move them in appropriate folders for user-defined functions
and SQL user-defined functions. Update all references so that
the docs build process fully works.

Add redirectors for SQL routine change

Reword from SQL routine to SQL UDF

And generally introduce user-defined functions (UDF) as a term.

Move SQL UDF content

Into a separate page, and adjust the generic content to be suitable
for any UDF language.

Add docs for Python UDFs

Remove unnecessary annotation in Kudu connector test

Update to oryd/hydra:v1.11.10 for OAuth testing

Fix build issues on newer OS/hardware. Use latest 1.x release since
2.x causes container start issues without further changes.

Update airbase to 206 and airlift to 293

Update AWS SDK v2 to 2.29.35

Update netty to 4.1.116.Final

Update gcs connector to hadoop3-2.2.26

Add example for inline and catalog Python UDF

Improve inline and catalog SQL UDF docs

Add Trino 468 release notes

[maven-release-plugin] prepare release 468

[maven-release-plugin] prepare for next development iteration

Do not require Java presence for RPM installation

This allows a custom JDK to be used when running Trino
with the launcher --jvm-dir argument.

Allow left side as update target in join pushdown

Allow left side as update target when try to pushdown join
into table scan. The change prevent the pushdown join into the
table scan instead of throwing exception

Co-Authored-By: Łukasz Osipiuk <[email protected]>

Bind filesystem cache metrics per catalog

Cleanup InformationSchemaPageSource projection

Avoids unnecessary Integer boxing and Block[] allocations in
InformationSchemaPageSource by using Page#getColumns.

Update docker image version to 107

Rename HiveMinioDataLake to Hive3MinioDataLake

Extract HiveMinioDataLake class

Extract BaseTestHiveOnDataLake to reuse it across Hive3/4 test

Add TestHive3OnDataLake test

Add S3 Hive4 query runner

Add TestHive4OnDataLake test

Add TestTrinoHive4CatalogWithHiveMetastore test

Extract requireEnv in a util class SystemEnvironmentUtils

Remove unnecessary requirement of password in test

in TestSalesforceBasicAuthenticator#createAuthenticatedPrincipalRealBadPassword

Use SystemEnvironmentUtils#requireEnv

Add and use SystemEnvironmentUtils#isEnvSet method

Fix misspelling

Test concurrent update without partition

Extract KuduColumnProperties from KuduTableProperties

Fix S3InputStream's handling of large skips

When the skip(n) method is called, the MAX_SKIP_BYTES check is bypassed,
resulting in the call potentially blocking for a long time.

Instead of delegating to the underlying stream, set the nextReadPosition
value. This allows the next read to decide whether it is best to keep the
existing S3 object stream or open a new one.

This behavior matches the implementations for Azure and GCS.

Update google cloud SDK to 26.52.0

Update AWS SDK v2 to 2.29.37

Use QUERY_EXCEEDED_COMPILER_LIMIT error code

Include deletion vector when filtering active add entries in Delta

Add nonnull check for directExchangeClientSupplier

Refactor the PlanTester to pass the nonnull `directExchangeClientSupplier`.
Also add nonnull check for the `sourceId`, `serdeFactory` in `ExchangeOperator`

Make FilesTable.toJson method package-private

Add $entries metadata table to Iceberg
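
Iceberg metadata tables are queried by suffixing the table name; a
hedged sketch for the new table (the table name is illustrative):

```sql
SELECT * FROM "test_table$entries";
```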

Run Iceberg concurrent tests multiple times

Update client driver and application sections

Document sort direction and null order in Iceberg

Rename executor to icebergScanExecutor

Improve performance when listing columns in Iceberg

Remove support for Databricks 9.1 LTS

Pin openpolicyagent/opa version as 0.70.0

Add tests for dropMaterializedView

The dropMaterializedView on TrinoHiveCatalog that uses FileHiveMetastore
works only if unique table locations are enabled.

Replace testView method override with getViewType

Extend testListTables with other relation types

TrinoCatalog.listTables returns not only Iceberg tables but also
other relations like views, materialized views, or non-Iceberg tables.

Add HiveMetastore.getTableNamesWithParameters

Add TrinoCatalog.listIcebergTables

Add system.iceberg_tables table function

Add the ability to list only Iceberg tables from the Iceberg catalog.
Before this change, there was no way to list only Iceberg tables.
The SHOW TABLES statement, information_schema.tables, and jdbc.tables all
return every table that exists in the underlying metastore, even if the table cannot
be handled in any way by the Iceberg connector. This can happen if other connectors,
like Hive or Delta, use the same metastore, catalog, and schema to store their tables.
The function accepts an optional parameter with the schema name.
Sample statements:
SELECT * FROM TABLE(iceberg.system.iceberg_tables());
SELECT * FROM TABLE(iceberg.system.iceberg_tables(SCHEMA_NAME => 'test'));

Fix failures in iceberg cloud tests

Support MERGE for Ignite connector

Remove unused delta-lake-databricks-104 test group

Add $all_entries metadata table to Iceberg

Delete the oldest tracked version metadata files after commit

Use airlift log in SimulationController

Stream large transaction log JSON files instead of storing them in memory

Operations fetching metadata and protocol entries can skip reading
the rest of the JSON file after those entries are found

Move `databricksTestJdbcUrl()` method after the constructor

Use `ZSTD` Parquet compression codec for Delta Lake by default

Add query execution metrics to JDBC QueryStats

Adds planningTimeMillis, analysisTimeMillis, finishingTimeMillis,
physicalInputBytes, physicalWrittenBytes and internalNetworkInputBytes
to allow JDBC clients to get some important metrics about query execution

Allow configuring parquet_bloom_filter_columns in Iceberg

Verify invalid bloom filter properties in Iceberg

Remove unspecified bloom filter when setting properties in Iceberg
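
A sketch of setting the new table property, assuming it follows the
connector's array-of-varchar convention (table and column names are
illustrative):

```sql
CREATE TABLE example (id bigint, name varchar)
WITH (parquet_bloom_filter_columns = ARRAY['id', 'name']);
```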

Support managing views in the Faker connector

Add views support to the Faker connector docs

Add min, max, and allowed_values column properties in Faker connector

Allow constraining generated values by setting the min, max, or
allowed_values column properties.
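
A hedged sketch of how such column properties might look (catalog,
table, column names, and values are illustrative):

```sql
CREATE TABLE faker.default.prices (
    price decimal(8, 2) NOT NULL WITH (min = '0.00', max = '999.99'),
    currency varchar NOT NULL WITH (allowed_values = ARRAY['EUR', 'USD'])
);
```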

Remove predicate pushdowns in the Faker connector

Predicate pushdown in the Faker connector violates the SQL semantics,
because when applied to separate columns, correlation between columns is
not preserved, and returned results are not deterministic. The `min`,
`max`, and `options` column properties should be used instead.

Refactor Faker tests to be more readable

Remove outdated limitations in Faker's docs

Rename to Trino in product tests

Extract dep.gib.version property

Convert testRollbackToSnapshotWithNullArgument to integration test

Add newTrinoTable method to AbstractTestQueryFramework

Allow running product tests on IPv6 stack

Update nimbus-jose-jwt to 9.48

Update jna to 5.16.0

Update AWS SDK v2 to 2.29.43

Update openlineage-java to 1.26.0

Update airbase to 209

Update airlift to 294

Update freemarker to 2.3.34

Remove unreachable code in OrderedPeriodParser

Avoid parsing min and max twice

Avoid and/or in Faker docs

Refactor FakerPageSource to have fewer faker references

Extract typed ranges in Faker's page source

Fix handling upper bounds in FakerPageSource

Fix handling upper bounds for floating point types. The implementation
did not account for rounding issue near the bound, and the test was
using values outside of the allowed range.

Refactor FakerPageSource

Refactor to make subsequent commit's diff smaller

Extract a method in FakerColumnHandle

Support generating sequences in the Faker connector

Configure SSL for unauthenticated client

An unauthenticated client can connect to the Trino cluster when loading segment data.
If the cluster has its own certificate chain, the client needs to accept it according to the configuration.

Simplify conditions

Allow resource access type on class-level

For io.trino resources it was impossible to use class-level @ResourceType
annotations due to an invalid condition.

Check whole class hierarchy for @ResourceSecurity

Use class-level @ResourceSecurity annotations

Fix parsing of negative 0x, 0b, 0o long literals

Update lucene-analysis-common to 10.1.0

Correctly merge multiple splits info

Previously, SplitOperatorInfo wasn't Mergeable, and hence the base
OperatorInfo (OperatorStats#getMergeableInfoOrNull) was null.

Prepare to implement a page source provider for Redshift

Fetch Redshift query results unloaded to S3

Co-authored-by: Mayank Vadariya <[email protected]>

Copy all TPCH tables during initialization in TestRedshiftUnload

Add physicalInputTimeMillis to io.trino.jdbc.QueryStats

Fix listing of files in AlluxioFileSystem

Co-authored-by: JiamingMai <[email protected]>

Add info about views in memory only for Faker connector

Minor improvements to Python UDF docs

Improve docs for non-transactional merge

As applicable for PostgreSQL connector for now. Also extract into
a fragment so it can be reused in other connectors.

Improve SQL support section in JDBC connectors

- No content changes but...
- Consistent wording
- Markdown link syntax
- Move related configs to SQL support section
- Improve list and rejig as small local ToC, add links

Add non-transactional MERGE docs for Phoenix

Add non-transactional MERGE docs for Ignite

Remove unused metastore classes

Remove unused TestingIcebergHiveMetastoreCatalogModule

Replace usage of RetryDriver in HiveMetadata

Move partition utility methods to Partitions class

Move HiveMetastoreFactory to metastore module

Cleanup binding of FlushMetadataCacheProcedure

Move CachingHiveMetastore to metastore module

Move Glue v1 converters into Glue package

Move already exists exceptions to metastore module

Move ForHiveMetastore annotation to thrift package

Move RetryDriver to thrift package

Move Avro utility method to ThriftMetastoreUtil

Add SSE-C option on native-filesystem security mapping

Remove SPI exclusion from previous release

Remove support for connector event listeners

Inline partition projection name in HiveConfig

Remove shadowed field from MockConnectorMetadata

Remove optional binder for HivePageSourceProvider

Remove unused HiveMaterializedViewPropertiesProvider

Remove unused HiveRedirectionsProvider

Remove unused DeltaLakeRedirectionsProvider

Remove unused function providers from Hive connector

Remove deprecated ConnectorMetadata.getTableHandleForExecute

Remove deprecated ConnectorMetadata.beginMerge

Remove optional binder for IcebergPageSourceProviderFactory

Make SqlVarbinary map serialization consistent with other types

Other Sql* classes are serialized to the JSON on-the-wire format
using @JsonValue annotated toString methods, except for SqlVarbinary,
which was serialized using its getBytes() method, Base64-encoded
into a map key.

Decouple Sql types from JSON serialization

The new JSON serialization is not using ObjectMapper to serialize these values anymore.
We want to decouple SPI types from JSON representation to be able to introduce
alternative encoding formats.

Derive aws sdk retry count from request count

Minor cleanup in Hive procedures

Move createParquetMetadata to ParquetMetadata

Parse parquet footer row groups lazily

Write row group fileOffset in parquet file footer

Parse only required row groups from parquet footer

Extend AbstractTestQueryFramework in TestRedshiftUnload

Use correct table operations provider for Thrift metastore in Delta

Update zstd-jni to 1.5.6-9

Update nimbus-jose-jwt to 10.0

Update AWS SDK v2 to 2.29.44

Update nessie to 0.101.3

Allow configuring gcs service endpoint

Update delta-kernel to 3.3.0

Allow configuring orc_bloom_filter_columns table property in Iceberg

Lower log level

Add rollback_to_snapshot table procedure in Iceberg
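
As a table procedure, this is presumably invoked via ALTER TABLE ...
EXECUTE, in contrast to the older CALL-style system procedure (table
name and snapshot ID are illustrative):

```sql
ALTER TABLE example EXECUTE rollback_to_snapshot(snapshot_id => 8954597657624845406);
```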

Remove <name> from trino-ranger pom

We do not use it for other modules, which makes the build output inconsistent

Instantiate tpch connector using bootstrap

Synchronize types in (Client)StandardTypes

Enumerate all types while decoding

This makes decoding explicit with regard to the ClientStandardTypes list of types

Fix deserialization of KDB_Tree and BingTile

These types have a custom serialization logic utilizing JsonCreator/JsonProperty
annotations.

Improve SetDigest serialization

Inline SetDigestType.NAME

Use StandardTypes.BING_TILE const

Use StandardTypes.GEOMETRY const

Use StandardTypes.KDB_TREE const

Use StandardTypes.SPHERICAL_GEOMETRY const

Fix deserialization of Color type

Clarify that spooling locations must not be shared

Add docs for gcs.endpoint

Enable Phoenix product test

Fix wrong config name in OPA documentation

Alphabetize additional IDE configurations in docs

Remove defunct hive metastore property from docs

Remove the `hive.metastore.thrift.batch-fetch.enabled` property from the docs; it was marked defunct in Trino 443.

Fix correctness issue when writing deletion vectors in Delta Lake

Improve developer docs for connector MERGE support

Fix gcs.endpoint property name in docs

Enable dynamic catalogs for product-tests

Since this uses the default catalog store set to `file`, there should not be issues with switching it all over to dynamic catalog management.

Improve docs for Redshift parallel read

Add counter for tasks created on worker

Expose worker tracked tasks count

Update okio to 3.10.1

Update nimbus-jose-jwt to 10.0.1

Update minio to 8.5.15

Update snowflake-jdbc to 3.21.0

Update checker-qual to 3.48.4

Update flyway to 11.1.1

Update AWS SDK v2 to 2.29.47

Update org.json to 20250107

Update airbase to 210

Test timestamp parsing in Delta Lake

Allow parsing ISO8601 timestamp in Delta lake transaction log

Update docker-images to 108

Disable spooling through session property

Remove obsolete JVM configuration

Performance-wise this doesn't improve memory usage

Reduce the query and task expiration times

This reduces the amount of memory needed to run the product test cluster

Suppress warning when building with mvnd

Require JDK 23.0.0

Add docs for spooling_protocol_enabled session property

Document $entries table in Iceberg

Document limit pushdown in BigQuery connector

Grant privileges only on TPCH tables in Redshift query runner

Granting privileges on all tables may cause unintended failures, as
temporary tables created in one test class may not have been fully
dropped or cleaned up from internal Redshift tables while another test
class executes grant privileges on all tables.

Remove unused NoopFunctionProvider

Fix incorrect result when reading deletion vectors in Delta Lake

This issue happens when a Parquet file contains several pages and
predicates filter pages while parquet_use_column_index is set to true.

Fix query runners failing to expose local ports

The logic was mistakenly inverted in
a99d96e.

Geospatial function ST_GeomFromKML

Add executeWithoutResults to StandaloneQueryRunner

It is needed to test query execution without reading the
final query info, which triggers the QueryCompletedEvent

Always fire QueryCompletedEvent for DDL queries

Previously, the event was fired because the client protocol
reads the final query info. That is brittle, and theoretically the read
could be removed, making DDL queries fail to trigger the query completed event.

Add authentication to the Preview Web UI

Ensure visibility of finalSinkMetrics

Without exchangeSink being volatile, it was possible that another
thread could observe a nulled exchangeSink while finalSinkMetrics
was still not set.

Fix logical merge conflict

Fix missing ts dependencies

Move JdbcRecordSetProvider construction to a module

Fix error message to actual variable name

Update supported clickhouse versions

Revert "Update supported clickhouse versions"

This reverts commit e188f97.

Reapply "Update supported clickhouse versions"

This reverts commit 523a305.

Add support for validating JDBC connections

Add docs for optimizer push filter

Clarify where to set legacy Hive properties for S3

The `trino.s3.use-web-identity-token-credentials-provider` property must
be set in the Hadoop config file, not as a connector property. This
needs to be clarified in the docs.

Clarify docs for multiple access controls

Bump Scylla docker images version to 6.2

Bump latest Cassandra docker images version to 5.0.2

WebUI: Sort queries on worker by descending reserved memory

Adjust docs for  metadata table in Iceberg

Improve code blocks and wording in Iceberg connector docs

Fix code formatting in Preview Web UI

Support unpartitioned tables in Kudu

Fix flaky TestQueryManagerConfig

This test relies on rounding large integers and will sometimes fail
depending on the amount of memory available to the test runner.

This commit reuses the exact same calculation for tests as it does in
the code, so that we will always get the correct value for the default.

Replace airlift's auth preserving client with okhttp

Update AWS SDK v2 to 2.29.50

Update httpcore5 to 5.3.2

Update google-api-client to 2.7.1

Update airbase to 211

Update okio to 3.10.2

Update commons-codec to 1.17.2

Update mongo to 5.3.0

Update airlift to 295

Add retry_policy session property docs

- also fix a bunch of style and formatting issues on the page

Avoid writing NaN and Infinity with json format table

Adjust docs for manifests metadata table in Iceberg

Expose BigQuery RPC configuration settings

Document BigQuery RPC settings

Make the sentence assertive

Use only the enforced partition constraints dependency

Fix typo

Extract logic for appending transaction log file for OPTIMIZE

Add concurrent writes reconciliation for OPTIMIZE in Delta Lake

Allow committing OPTIMIZE operations in a concurrent context
by placing these operations right after any other previously
concurrently completed write operations.

Add info about Python client spooling support

Update airlift to 296

Add support for filtering by client tags in Web UI

Configure Phoenix server scan page timeout

Update netty to 4.1.117.Final

Update AWS SDK v2 to 2.29.50

Update metrics-core to 4.2.30

Update commons-text to 1.13.0

Update elasticsearch to 7.17.27

Update google sheets api to v4-rev20250106-2.0.0

Update reactor-core to 3.7.2

Add test for JsonSerializer handling

Use Java APIs instead of ByteStreams

Avoid exposing unneeded ports in Exasol

Co-Authored-By: YotillaAntoni <[email protected]>

Update Exasol image version to 8.32.0

Co-Authored-By: YotillaAntoni <[email protected]>

Remove system.iceberg_tables table function

We decided to add a system table instead.

This reverts commit 5ce80be.

Fix failure when setting NULL in UPDATE statement in JDBC-based connectors

Co-Authored-By: Yuya Ebihara <[email protected]>

Use <arg> instead of <compilerArg> to enable incubator module

`<compilerArg>` does not seem to be a valid child element of
`<compilerArgs>`. Even if it seems to work, it conflicts with the
`errorprone-compiler` profile, which appends additional compiler
arguments using `<arg>` (which is documented to be the correct child
element). When there is also `<compilerArg>`, it gets ignored (at least
by IDEA's POM importer).

Update google-cloud-sdk to 26.53.0

Update nimbus oauth2 sdk to 11.21

Update opencsv to 5.10

Update localstack image to 4.0.3

Update minio to RELEASE.2024-12-18T13-15-44Z

Decode JSON directly from Slice without materialization

Use newInputStream instead of FileInputStream

Support FIRST and AFTER clause when adding a new column in engine

Support FIRST and AFTER clause when adding a new column in Iceberg

Add docs for json_table

Co-authored-by: Michael Eby <[email protected]>

Make ElasticsearchServer Closeable

Derive predicate support from Trino type in ElasticSearch

Add test for column name containing special characters in Elasticsearch

Add dereference pushdown support in ElasticSearch

Reduce task.info.max-age to 5m

The default of 15m sometimes causes significant OOM issues on workers.
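
As a sketch, the corresponding entry in a Trino config properties file (the value shown is the new default):

```properties
# Cap how long completed task info is retained in memory on workers
task.info.max-age=5m
```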

Bump docker/setup-qemu-action in the dependency-updates group

Bumps the dependency-updates group with 1 update: [docker/setup-qemu-action](https://github.com/docker/setup-qemu-action).

Updates `docker/setup-qemu-action` from 3.2.0 to 3.3.0
- [Release notes](https://github.com/docker/setup-qemu-action/releases)
- [Commits](docker/setup-qemu-action@49b3bc8...53851d1)

---
updated-dependencies:
- dependency-name: docker/setup-qemu-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: dependency-updates
...

Signed-off-by: dependabot[bot] <[email protected]>

Improve test coverage for ThriftHttpMetastoreClient

Remove meaningless call of Math.max

Update caffeine to 3.2.0

Update flyway to 11.2.0

Update swagger to 2.2.28

Update airbase to 212

Test the effectiveness of partial timestamp partition pruning

This test shows that partition pruning is done at the Iceberg
metadata layer even though EXPLAIN indicates that a filter that
does not fully match the partition transform is not pushed down.

Co-authored-by: Michiel De Smet <[email protected]>

Remove experimental warning for Python UDF

Rename BigQuery test credentials key config

The `bigquery.credentials-key` testing config name clashes with the
actual config name, so new BigQuery catalogs created through the `CREATE
CATALOG` command when running BigQueryQueryRunner may not pick up a
different key provided in the command.

Update AWS SDK v2 to 2.30.2

Add validation to segment size configuration

Add spooling session properties

Rename session property for consistency

Update graalvm to 24.1.2

Update AWS SDK v2 to 2.30.3

Update nessie to 0.102.0

Update openlineage to 1.26.0

Update google oauth2 client to 1.37.0

Update oauth2-oidc-sdk to 11.21.2

Remove unused argument

Verify that TableScanNode's assignments match the output symbols

Before this change, only one-way match was verified: that each
output symbol is backed by an assignment.

Suffix S3 path with separator for recursive delete

Without a trailing path separator, the recursive delete operation fails
for directory buckets (e.g. S3 Express).
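
As a sketch of the fix (the helper name is hypothetical, not the actual Trino code), recursive deletes list objects by prefix, so "dir" must become "dir/" to avoid also matching sibling keys such as "dir2/file":

```java
// Hypothetical helper: normalize a location into a directory-style S3 prefix.
class S3Prefix {
    static String directoryPrefix(String location) {
        // Append "/" only when it is not already present
        return location.endsWith("/") ? location : location + "/";
    }
}
```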

Remove deprecated method call

Simplify boolean comparisons

Simplify conditionals

Move return statement to the empty catch block

Use switch expressions

Add configuration for maximum Arrow allocation in BigQuery

This helps bound the memory allocations

Log failed buffer allocations

Add Arrow allocation stats to BigQuery

Add allocator stats to PageSource metrics

Fix splits generation from Iceberg TableChangesSplitSource

Make configuration and session property names consistent

It makes it easier to reason about which session property maps to which configuration property.

More details about literals

Add Parquet writer session properties docs

Document spooling session properties

Refactor commitUpdateAndTransaction in Iceberg

Improve error handling for delete and truncate in Iceberg

Allow add column with position in base JDBC module

Support FIRST and AFTER clause when adding a new column in MySQL

Support FIRST and AFTER clause when adding a new column in MariaDB

Temporarily disable Snowflake tests

Update Java to 23.0.2

This brings updated timezonedb to version 2024b
(openjdk/jdk23u@73b2341)
which amends historical timezone definitions for Mexico/Bahia_Banderas
that we use for testing timezone gap around the unix timestamp epoch.

Corresponding Joda time update also has these timezone definitions
updated. PostgreSQL test server was upgraded to 12 to correctly handle
UTC around epoch. MySQL was updated to 8.0.41 due to the same reason.

Temporarily downgrade JDK for ppc64 to make CI happy

Simplify char type docs

Add comparison example queries

Improve formatting and style

Fix error with multiple nested partition columns on Iceberg (trinodb#24629)

Adjust release template for 2025

Add Trino 469 release notes

[maven-release-plugin] prepare release 469

[maven-release-plugin] prepare for next development iteration

Remove deprecated method addColumn from JdbcClient

Support MERGE for MySQL connector

Co-Authored-By: Yuya Ebihara <[email protected]>

Remove the Kinesis connector

The connector appears unused, based on community interactions
and an investigation with users, vendors, and customers.

The SDK used is deprecated and will be removed.

More details about the removal are captured in
trinodb#23921

Fix failure when adding columns with dots in Iceberg

Update airbase to 213

Update oshi-core to 6.6.6

Update AWS SDK v2 to 2.30.6

Update mongo to 5.3.1

Update protobuf to 3.25.6

Update MySQL to 9.2.0

Update airlift to 298

Update client libraries to JDK 11

Use newTrinoTable in more places

Sometimes the new and old ways were mixed in a single method. This also
makes `createTestTableForWrites` reuse this method to avoid duplication.

Fix null check in the array_histogram

Update Azure SDK to 1.2.31

Update AWS SDK v2 to 2.30.7

Update google api client to 2.7.2

Allow nessie to 0.102.2

Update jetbrains annotations to 26.0.2

Update airbase to 214 and airlift to 299

Set validation to WHEN_REQUIRED for FTE exchange

Use isEmpty instead of not isPresent

Modernize client dependencies

Use ImmutableList.copyOf

Remove redundant format call

Mark fields as final

Use more functional style for Optionals

Use String.isEmpty

Use StandardCharsets.UTF_8

Remove unnecessary .toString() call

Drop inferred type arguments

Replace statement lambda with expression lambda

Drop dead variable that is never read

Remove S3 legacy migration guide considerations

Cast table properties based on their types in Faker connector

Add test for Faker connector renameTable

Introduce Loki connector

Co-authored-by: Janos <[email protected]>

Extract helper method to get table comment in tests

Co-Authored-By: Sandeep Thandassery <[email protected]>

Test COMMENT ON TABLE in Faker connector

Co-Authored-By: Sandeep Thandassery <[email protected]>

Allow setting catalog type in Iceberg query runner

Fix Timestamp assertion test for metrics query

Minor cleanup in Loki connector

Inform user which date/time fields are extractable

Make the error message for invalid `extract` more helpful.

Support non-lower case variables in functions

Update docs with updated Postgres version

Since 1350a8b, testing uses PostgreSQL 12

Close StatementClient when request is timed out

Add JMX metrics for S3 HTTP client usage in native FS

When the S3 client was migrated from AWS SDK v1 to AWS SDK v2, the S3 pool metrics were missed.

With this PR, these HTTP pool metrics are now exposed via JMX beans. The reference for the HTTP metrics exposed by the AWS SDK can be found here:
https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/HttpMetric.html

Fix a wrong issue link in 469 release notes

Sort connector names in EnvMultinodeAllConnectors

Tweak validation of BaseCaseInsensitiveMappingTest

Some databases may map varchar(5) to different types.

Add DuckDB connector

Clean up leftover schema for BigQuery tests

Clean up test schema in connector tests

Determine range constraints in Faker connector

When creating a table in the Faker connector from an existing table,
gather column statistics to determine range constraints, and set them
as column properties.

Use actual row count as default limit

Generate low cardinality values in Faker connector

When creating a table in the Faker connector from an existing table,
use column statistics to determine low-cardinality columns, and generate
values from a randomly generated set.
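
The approach can be sketched as follows (illustrative code, not the actual Faker connector implementation): pre-generate a small dictionary once, then draw every row value from it, yielding a low-cardinality column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of low-cardinality value generation
class LowCardinality {
    static List<String> generate(int dictionarySize, int rows, long seed) {
        Random random = new Random(seed);
        // Build a small dictionary of random values
        List<String> dictionary = new ArrayList<>();
        for (int i = 0; i < dictionarySize; i++) {
            dictionary.add("v" + random.nextInt(1_000_000));
        }
        // Every row value is drawn from the dictionary
        List<String> values = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            values.add(dictionary.get(random.nextInt(dictionary.size())));
        }
        return values;
    }
}
```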

Set null_probability based on stats

When creating tables in the Faker connector using CREATE TABLE AS
SELECT, use the NUMBER_OF_NON_NULL_VALUES column statistic to set the
null_probability column property.
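
The arithmetic can be sketched as follows (names are hypothetical, not the connector's actual code):

```java
// Derive null_probability from the non-null value count statistic
// and the table row count
class NullProbability {
    static double nullProbability(long rowCount, long nonNullCount) {
        if (rowCount == 0) {
            return 0.0; // no rows, nothing to infer
        }
        return 1.0 - (double) nonNullCount / rowCount;
    }
}
```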

Remove unreachable code from Exasol

Fix failure when equality delete updated nested fields in Iceberg

Remove 'LOCAL TEMPORARY' from DuckDbClient

Update gson to 2.12.0

Update AWS SDK v2 to 2.30.8

Update exasol-testcontainers to 7.1.3

Update httpcore5 to 5.3.3

Update resteasy-core to 6.0.3.Final

Update duckdb_jdbc to 1.1.3

Improve pom formatting

Update airbase to 215

Update airlift to 300

Update gcs-connector to 3.0.4

Fix connection leakage in the native Azure filesystem

This solves an issue with connection leaks that happen in the Azure Storage SDK
when OkHttp is used. OkHttp is not actively maintained, which makes the default
Netty implementation a better choice for the future, as it is actively maintained
and tested.

Simplify code

The event loop group and maximum number of concurrent requests are already
configured, so the removed setting was a no-op.

Propagate function lifecycle events to SystemSecurityMetadata

Use arrays instead of lists in OpenX RowDecoder

Use array instead of list for Hive JSON decoders

Reuse fieldWritten array between rows

Check non-null constructor arguments

Avoid re-checking isScalarType

Improve case insensitive matching fragment

Reenable Snowflake tests

This reverts commit 5c43149
and applies necessary fixes to the changes that were introduced
in the meantime.

Add docs for authentication with Preview Web UI

Make number of commit retries configurable in Iceberg

Correctly categorize external errors in Iceberg REST catalog

Add TestLokiPlugin

Add Loki to EnvMultinodeAllConnectors

Add more connectors to labeler-config

Support multiple state filters on getAllQueryInfo API

Add workers pages to the Preview Web UI

Allow Hive metastore caching for Iceberg

Remove table listing cache from FileHiveMetastore

Fix typo in Python docs

Remove deprecated precomputed hash optimizer

Update snowflake-jdbc to 3.22.0

Update JLine to 3.29.0

Update commons-codec to 1.18.0

Update flyway to 11.3.0

Update AWS SDK v2 to 2.30.10

Update npm to 11.1.0

Update node to 22.13.1

Deprecate support for IBM COS via Hive

Bump actions/stale from 9.0.0 to 9.1.0 in the dependency-updates group

Bumps the dependency-updates group with 1 update: [actions/stale](https://github.com/actions/stale).

Updates `actions/stale` from 9.0.0 to 9.1.0
- [Release notes](https://github.com/actions/stale/releases)
- [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md)
- [Commits](actions/stale@v9.0.0...v9.1.0)

---
updated-dependencies:
- dependency-name: actions/stale
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: dependency-updates
...

Signed-off-by: dependabot[bot] <[email protected]>

Fix Vacuum deleting files with whitespace in paths

When a Delta Lake file path contains whitespace, the VACUUM procedure
unexpectedly removes it. This commit fixes the issue by wrapping the
file path using RFC 2396 URI encoding, ensuring consistency with how
file paths are handled when writing add or remove entries.
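
A minimal sketch of such encoding with `java.net.URI`, whose multi-argument constructors apply RFC 2396 escaping (the class name here is hypothetical):

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch: wrap a raw file path so that characters such as
// whitespace are percent-encoded per RFC 2396 (a space becomes %20)
class PathEncoding {
    static String encodePath(String rawPath) {
        try {
            return new URI(null, null, rawPath, null).toASCIIString();
        }
        catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```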

Update airlift to 301

Close FileInputStream while loading keystore

Update reactor-netty-core to 1.1.26

Update AWS SDK v2 to 2.30.11

Update minio to 8.5.17

Update oauth2-oidc-sdk to 11.21.3

Update gson to 2.12.1

Update commons-pool2 to 2.12.1

Update loki-client to 0.0.3

Update nessie to 0.102.4

Update httpclient5 to 5.4.2

upgrade