Azure concurrent writes #2069
Comments
Good news - Azure storage supports concurrent writes out of the box.
Apologies for the necro. I saw that the Azure implementation of the LogStore is basically the default, whereas the AWS one has a specific LogStore implementation based on DynamoDB locking. Thanks!
I think I have answered myself by trawling through various issues and PRs. To answer myself: yes, the difference is that the Azure object store implements a simple Whereas the AWS implementation depends on the locking client
- It was unclear to me that concurrent writing was available by default for non-S3 backends, so I am making the language clearer.
- I have also added an extra section showing that R2 and maybe MinIO can enable concurrent writing.
- Fixed a couple of unrelated formatting issues in the page I edited.

Closes #2556. #2069 also had the same confusion.
Description
Looking for a bit of advice on concurrent writes against a table in Azure. I'm familiar with the requirement to have a DynamoDB table to handle table locks in AWS, but after reading the docs (as someone not overly familiar with Azure), I'm unsure whether concurrent writes against a delta table in blob/adls2 in Azure could lead to data loss, or whether the storage in Azure provides 'put-if-absent' type guarantees by default that protect against data loss from multi-app/cluster writes.
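To make the 'put-if-absent' idea concrete, here is a minimal sketch of the guarantee being asked about. It is only a local-filesystem analogy: the helper names `try_commit` and `race` are hypothetical, and it uses `os.open` with `O_EXCL` in place of any real object-store call, so it does not show how delta-rs or Azure implement this. Several writers race to create the same zero-padded commit file and exactly one creation succeeds.

```python
import os
import tempfile
import threading


def try_commit(log_dir, version, payload):
    """Create the commit file only if it does not exist yet (put-if-absent).

    Hypothetical helper for illustration; returns True if this writer won.
    """
    path = os.path.join(log_dir, f"{version:020d}.json")
    try:
        # O_EXCL makes creation fail if another writer already created the file.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    return True


def race(n_writers=8):
    """Let n_writers try to commit version 1 at the same moment."""
    log_dir = tempfile.mkdtemp()
    results = [False] * n_writers
    barrier = threading.Barrier(n_writers)

    def writer(i):
        barrier.wait()  # release all writers together
        results[i] = try_commit(log_dir, 1, f"writer-{i}".encode())

    threads = [threading.Thread(target=writer, args=(i,)) for i in range(n_writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    with open(os.path.join(log_dir, f"{1:020d}.json"), "rb") as f:
        content = f.read()
    return results, content


if __name__ == "__main__":
    results, content = race()
    print(sum(results), "writer(s) won; file contains", content)
```

With a guarantee like this, a writer that loses the race is told so instead of silently overwriting the winner's entry, and can then retry against the next version.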
Use Case
I would like to utilise `write_deltalake` to write to a table on Azure (blob storage and/or adls2) from a multi-replica Python app, and I want to guarantee no data loss while creating a transaction in the `_delta_log` directory (i.e. each replica of my application creating a transaction under `_delta_log` with the same name at exactly the same point in time, leading to one replica overwriting the other replica's entry).
Related Issue(s)
None.