optimize table ontime all returns Too many open files (os error 24) #4253
Have seen this in the lab dev box,
can mitigate this. The
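Not stated in the thread, but the usual operational mitigation for "Too many open files (os error 24)" is to check and raise the per-process file-descriptor limit; a sketch (the limit value is illustrative):

```shell
# Show the current soft limit on open file descriptors;
# os error 24 (EMFILE) means the process hit this limit.
ulimit -n
# Show the hard limit, the ceiling the soft limit can be raised to.
ulimit -Hn
# Raise the soft limit for the current shell session, e.g.:
# ulimit -n 65535
# For a persistent change on Linux, edit /etc/security/limits.conf.
```

This only hides the symptom, of course; the discussion below is about capping how many files the scan opens at once.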
Is it possible that we could use a home-brewed one, or shall we fall back to the
Thank you.
Another problem: After the
Query log:
After optimized:
Looks like there is no difference between them in the query log. cc @dantengsky
Got it. BTW, is the cache of table meta warmed up after
The meta cache is enabled:
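To make the warm-up question above concrete, here is a minimal sketch of a load-through table-meta cache with an explicit warm-up step. All names (`MetaCache`, `get_or_load`, `warm_up`) are illustrative, not Databend's actual API:

```rust
use std::collections::HashMap;

// Hypothetical table-meta cache: after an `optimize`, eagerly re-loading
// the meta avoids paying the cold-read cost on the first query.
struct MetaCache {
    // table name -> serialized meta (a plain String stands in for real meta)
    entries: HashMap<String, String>,
}

impl MetaCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    // Load-through read: on a miss, fall back to the (simulated) storage layer.
    fn get_or_load(&mut self, table: &str) -> &str {
        self.entries
            .entry(table.to_string())
            .or_insert_with(|| format!("meta-of-{}", table))
    }

    // Warm-up: eagerly populate the cache for the given tables.
    fn warm_up(&mut self, tables: &[&str]) {
        for t in tables {
            self.get_or_load(t);
        }
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With this shape, a warm-up pass right after `optimize` turns the first post-optimize query into a cache hit.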
I'm sorry for letting this linger. Beginning to work on it now.
The first question is: do we need this level of parallel IO? I suggest YES:
todo:
Please correct me if I misunderstand the question:
does it mean, should we allow simultaneous reading of several columns (while blocks are already being read in parallel)? IMO, yes. But there is a limit (a hardcoded constant) on the number of tasks that can be buffered, so the maximum number of FDs open simultaneously does not depend on the number of columns of the table being scanned, i.e. instead of
It used to be a hardcoded constant of 10. Not sure if we should have both
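The buffering idea above can be sketched with a fixed pool of workers draining a queue of column-read tasks: the number of simultaneously open readers (and hence FDs) is capped by a constant, independent of the table's column count. The constant and names are illustrative, not the actual implementation:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Cap on concurrent column reads; stands in for the hardcoded buffer limit.
const MAX_CONCURRENT_READS: usize = 10;

// Drains `columns` with a bounded worker pool and returns the peak number
// of concurrently "open" readers, to show the cap is respected.
fn read_all_columns(columns: Vec<String>) -> usize {
    let queue = Arc::new(Mutex::new(columns));
    let active = Arc::new(Mutex::new(0usize));
    let peak = Arc::new(Mutex::new(0usize));
    let mut workers = Vec::new();
    for _ in 0..MAX_CONCURRENT_READS {
        let (queue, active, peak) =
            (Arc::clone(&queue), Arc::clone(&active), Arc::clone(&peak));
        workers.push(thread::spawn(move || loop {
            // Take the next column task, or stop when the queue is empty.
            let col = queue.lock().unwrap().pop();
            let Some(_col) = col else { break };
            {
                let mut a = active.lock().unwrap();
                *a += 1;
                let mut p = peak.lock().unwrap();
                *p = (*p).max(*a);
            }
            // A real reader would open the column's file here.
            *active.lock().unwrap() -= 1;
        }));
    }
    for w in workers {
        w.join().unwrap();
    }
    let p = *peak.lock().unwrap();
    p
}
```

However many columns are queued, at most `MAX_CONCURRENT_READS` are in flight at once, which is exactly why the FD count stops scaling with the schema width.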
A little bit lost here: does it mean the factory will not produce another
Yeah, maybe we should open a feature request issue :D
Maybe I missed something, but as far as I can see from the code, before read_columns_many_async was introduced, reading was async but not parallel (the chunk decoding may be in parallel), because we had only one reader, which cannot be read simultaneously (that is why the parquet API changed), and I didn't find code like
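The single-reader limitation described above comes down to a shared cursor: one reader can only seek to one column chunk at a time, so reads serialize. Opening an independent reader per column (the shape `read_columns_many_async` enables) lets them proceed concurrently. A minimal thread-based sketch, with a fabricated two-chunk file rather than real Parquet:

```rust
use std::fs::File;
use std::io::{Read, Seek, SeekFrom, Write};
use std::thread;

// Reads two "column chunks" at known offsets with one independent
// reader (FD) per column, so neither read waits on the other's seek.
fn read_two_columns_concurrently() -> std::io::Result<(Vec<u8>, Vec<u8>)> {
    // Build a tiny file with two column chunks at fixed offsets.
    let path = std::env::temp_dir().join("two_chunks.bin");
    {
        let mut f = File::create(&path)?;
        f.write_all(b"AAAA")?; // chunk for column 0, offset 0
        f.write_all(b"BBBB")?; // chunk for column 1, offset 4
    }
    let read_chunk = |path: std::path::PathBuf, offset: u64| {
        thread::spawn(move || -> std::io::Result<Vec<u8>> {
            // Each task opens its own reader: no shared cursor to fight over.
            let mut f = File::open(path)?;
            f.seek(SeekFrom::Start(offset))?;
            let mut buf = vec![0u8; 4];
            f.read_exact(&mut buf)?;
            Ok(buf)
        })
    };
    let t0 = read_chunk(path.clone(), 0);
    let t1 = read_chunk(path.clone(), 4);
    Ok((t0.join().unwrap()?, t1.join().unwrap()?))
}
```

The trade-off is the one this thread is about: one FD per in-flight column, which is why the concurrency cap discussed earlier matters.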
It did not seem like a good idea to me at first, but it works well with the new parquet2 API.
Yes, I deleted several commits before. One of the versions that uses
Got it. Let's refine it later if necessary.
Summary
Using the ontime dataset.
Server log: