Listing contents of large s3 folders is slow #140
Comments
Fixed thanks to your PR :)
@ziprjoe @danielfrg First of all, many thanks for your precious work! 😄
Hey, this should just be a matter of catching the exception and ignoring it. In cases where there is no `.s3keep` file, there isn't a way to show the last update time, so a dummy date will be displayed.
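For illustration, a minimal sketch of that fallback; the helper name, the `.s3keep` lookup, and the placeholder date are assumptions, not the library's actual code:

```python
from datetime import datetime, timezone

# Hypothetical placeholder timestamp shown when no .s3keep marker exists.
DUMMY_DATE = datetime(1970, 1, 1, tzinfo=timezone.utc)

def last_modified(fs, dir_path):
    """Return the .s3keep timestamp for a directory, or a dummy date."""
    try:
        return fs.info(f"{dir_path}/.s3keep")["LastModified"]
    except FileNotFoundError:
        # No .s3keep placeholder: nothing to read a timestamp from,
        # so surface a dummy date instead of failing the listing.
        return DUMMY_DATE
```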
@ziprjoe @danielfrg Firstly, I'd like to express my gratitude for your excellent work on this library. It has been incredibly useful for my use case of connecting S3 with Jhub, compared to the alternatives. However, I've encountered an issue when using s3contents to connect to an S3 bucket with pre-existing directories: these directories aren't displayed in the UI unless I manually add a `.s3keep` file to each one. Once I do this, the issue is resolved. I'm wondering if you are aware of the cause of this problem, and if there's a way to use s3contents with a bucket that has pre-existing directories without having to manually add `.s3keep` files to each directory. Thank you for your time and attention!
Hi @ziprjoe. I think there are newer ways to handle directories in S3 that do not require the placeholder files. I have not tested them, and to be honest I am not using this lib anymore. I try to keep it updated, but since I am not using it, it is behind on needed features and I don't expect I will be able to add new ones in the near future. I basically just handle new releases from contributors at this point.
I handle that with a script called in a postStart lifecycle hook:

```bash
#!/bin/bash
file=$HOME/.dir.txt

# Save the S3 directory tree: list every object, strip the leading
# date/time/size columns, and keep one entry per directory.
aws s3 ls --recursive s3://<bucket> | cut -c32- | xargs -d '\n' -n 1 dirname | uniq > "$file"

# Upload an empty .s3keep placeholder into each directory.
touch .s3keep
while IFS= read -r folder; do
    aws s3 cp .s3keep "s3://<bucket>/$folder/.s3keep"
done < "$file"
```
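For comparison, a rough Python equivalent using s3fs; the bucket name is a placeholder, and like the shell version it only marks prefixes that directly contain at least one object:

```python
import s3fs

fs = s3fs.S3FileSystem()

# fs.find() returns full keys like "my-bucket/a/b/file.txt";
# collect the unique directory prefixes.
dirs = {key.rsplit("/", 1)[0] for key in fs.find("my-bucket")}

for d in sorted(dirs):
    # Create an empty .s3keep placeholder so the directory shows up.
    fs.touch(f"{d}/.s3keep")
```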
Hey,
Thanks for your work on this library; I've been using it for a while and it's really nice.
Recently I ran into some issues with long load times for large S3 folders. I believe this is the result of repeated synchronous calls to the abstract `lstat` method. I have done some testing and found that making these calls with asyncio, using the `s3fs._info` method instead, really speeds things up (roughly 20x faster on large folders).

I'm currently using a fork I made with these changes, and it works great. I opened a PR for you to consider: #139
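A minimal sketch of the async pattern described above (not the actual PR code; the function name and paths are placeholders):

```python
import asyncio
import s3fs

async def stat_many(paths):
    # Async s3fs instance; the underscore methods (_info, _ls, ...) are
    # the coroutine variants of their synchronous counterparts.
    fs = s3fs.S3FileSystem(asynchronous=True)
    session = await fs.set_session()
    try:
        # Fire all metadata lookups concurrently instead of one by one.
        return await asyncio.gather(*(fs._info(p) for p in paths))
    finally:
        await session.close()

# Example: fetch metadata for many keys in one event loop run.
# infos = asyncio.run(stat_many(["my-bucket/a.txt", "my-bucket/b.txt"]))
```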
I use this library quite a bit, and would be happy to put in the work to get this change merged.
Thanks again!
Joe