docs: how delta lake transactions work #2089
Delta Lake supports transactions, which provide necessary reliability guarantees for production data systems.
Data lakes don’t provide transactions, which can cause nasty bugs and a bad user experience. Let’s look at a couple of scenarios where the lack of transactions causes a poor user experience:
Maybe call it vanilla data lakes? Delta Lake, in my opinion, is just a data lake with a metadata layer.
Updated this. Delta Lakes def aren't data lakes 😱
See the Lakehouse paper. We usually call these "Lakehouse storage systems" or "open table formats".
Durability means that all transactions that are successfully completed will always remain persisted, even if there are service outages or program crashes.
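To make the durability guarantee concrete, here is a minimal sketch in plain Python. It is an illustration only, not Delta Lake's implementation: the function names (`commit`, `committed_versions`, `read_commit`) and the simplified action payloads are assumptions made for this example. The zero-padded, numbered JSON file names are loosely modeled on the commit files in a Delta table's `_delta_log` directory. The idea is that a commit counts as complete only once its file has been flushed to storage, so any later reader, including one started after a crash, still sees it.

```python
import json
import os
import tempfile


def commit(log_dir, version, actions):
    """Record one commit as a numbered JSON file (hypothetical, simplified layout)."""
    os.makedirs(log_dir, exist_ok=True)
    path = os.path.join(log_dir, f"{version:020d}.json")
    # Mode "x" raises FileExistsError if the version already exists,
    # so a committed version is never silently overwritten.
    with open(path, "x", encoding="utf-8") as f:
        json.dump(actions, f)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the bytes to stable storage
    return path


def committed_versions(log_dir):
    """List committed versions by scanning the log directory from scratch."""
    names = sorted(n for n in os.listdir(log_dir) if n.endswith(".json"))
    return [int(n[: -len(".json")]) for n in names]


def read_commit(log_dir, version):
    """Load the actions stored for one committed version."""
    path = os.path.join(log_dir, f"{version:020d}.json")
    with open(path, encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    log_dir = tempfile.mkdtemp()
    commit(log_dir, 0, [{"add": "part-0000.parquet"}])
    commit(log_dir, 1, [{"add": "part-0001.parquet"}])
    # A reader holds no in-memory state: it rebuilds everything from the files,
    # which is what a process restarted after a crash would do.
    print(committed_versions(log_dir))
```

The sketch leaves out what a real system needs for atomicity, such as an atomic put-if-absent on the object store, so a crash in the middle of a write could leave a partial file here. It only shows why commits that returned successfully survive a restart.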
Suppose you have a Delta table that’s persisted in Azure blog storage. The Delta table transactions that are committed will always remain available, even in these circumstances:
blog - blob
good catch, lol. Thanks for reviewing!!
Thanks @MrPowers!
@nkarpov and I collaborated on this Delta Lake transactions post. It's meant to give the basics on how transactions work and why they're a huge advantage of Delta Lakes.

@rtyler is giving a talk on transactions/concurrency soon. We're trying to set the stage with some foundational content first.

---------

Co-authored-by: Ion Koutsouris <[email protected]>