Getting PAGE_TRANSPORT_TIMEOUT for any query that goes beyond 1 minute. #24149

Open
ak2766 opened this issue Nov 17, 2024 · 2 comments

ak2766 commented Nov 17, 2024

TL;DR - Which timeout setting do I need to change from its default to stop this PAGE_TRANSPORT_TIMEOUT, and does it need to go in the coordinator config, the worker config, or both?

I'm experimenting with Trino and DBeaver, and I'm hitting a roadblock with queries that take longer than 1 minute to complete.

For instance, the query below never finishes when run from the Trino CLI. However, if I run the same query in SSMS, it completes in ~3 minutes. The table is >30GB.

trino> SELECT id, REDACTED, REDACTED, REDACTED, count, REDACTED 
    -> from mssql.dbo.REDACTED 
    -> order by count desc
    -> limit 200;

Query 20241117_033406_00006_u26jx, FAILED, 1 node
Splits: 1 total, 0 done (0.00%)
1:07 [1 rows, 0B] [0 rows/s, 0B/s]

Query 20241117_033406_00006_u26jx failed: Encountered too many errors talking to a worker node. The node may have crashed or be under too much load. This is probably a transient issue, so please retry your query in a few minutes. (http://REDACTED:8080/v1/task/20241117_033406_00006_u26jx.0.0.0/results/0/0 - 104 failures, failure duration 60.02s, total failed request time 61.74s)

Initially, I thought it was a DBeaver issue and wasted hours researching timeouts. After exhausting all the timeout settings in DBeaver, I finally tried running the query directly in the Trino CLI (which I really ought to have done first) and discovered that the timeout is occurring somewhere inside Trino. I just started on Trino yesterday, so I'm very green.

Here are my configs:

coordinator config:
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
discovery-server.enabled=true
discovery.uri=http://trino-coordinator:8080
internal-communication.shared-secret=REDACTED
internal-communication.https.required=false

worker config:
coordinator=false
http-server.http.port=8080
discovery.uri=http://trino-coordinator:8080
internal-communication.shared-secret=REDACTED
internal-communication.https.required=false

Quick EDIT: If I comment out the order by count desc, the query completes in under 5 seconds.

@zachtrong

There are undocumented http-client config properties that can be used to increase the timeout:

  workerExtraConfig: |-
    exchange.http-client.request-timeout=60s
    exchange.http-client.idle-timeout=2m 
    exchange.http-client.max-connections-per-server=1000
  coordinatorExtraConfig: |-
    exchange.http-client.request-timeout=60s
    exchange.http-client.idle-timeout=2m 
    exchange.http-client.max-connections-per-server=1000
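
The snippet above is written as Helm chart values. For a deployment configured through plain properties files, like the coordinator and worker configs in the original post, the same settings could presumably be appended to each node's config file. This is a minimal sketch that assumes the Helm keys are simply appended to `etc/config.properties` on both the coordinator and the workers; the file path is an assumption, and the values are copied unchanged from the snippet above rather than tuned for this query:

```properties
# Assumed location: etc/config.properties on the coordinator AND on every worker.
# Values are copied from the Helm snippet above; adjust them to your workload.
exchange.http-client.request-timeout=60s
exchange.http-client.idle-timeout=2m
exchange.http-client.max-connections-per-server=1000
```

Each node would need a restart for changes to its config file to take effect.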


ak2766 commented Nov 18, 2024

Thanks @zachtrong.

I went through the logs and searched for timeouts. Trial and error led me to the correct setting, and I got it working a day ago but forgot to come back and update.
