理解 LSM 树:一种适用于频繁写入的数据库的结构 #7795
Conversation
@lsvih Claiming this proofreading task
@chzh9311 OK~
@lsvih @SamYu2000 Proofreading complete
article/2020/lsm.md (Outdated)

# SSTables
LSM trees are persisted to disk using a **Sorted Strings Table (SSTable)** format. As indicated by the name, SSTables are a format for storing key-value pairs in which the keys are in sorted order. An SSTable will consist of multiple sorted files called **segments**. These segments are immutable once they are written to disk. A simple example could look like this:
LSM 树使用 **Sorted Strings Table (SSTable)** 格式持久化于磁盘中。顾名思义,SSTables 是一种存储 key-value 对的格式,其中 key 是经过排序的。一个 SSTable 是由若干已排序的文件组成的,这些文件称为 **segments**。这些 segments 一经写入磁盘,就处于不可变状态。我们来看一个简单的例子:
『SSTables 是一种存储 key-value 对的格式』=>『SSTable 是一种存储 key-value 对的格式』
(plural form -> singular form)
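The SSTable format discussed in this hunk, sorted keys in immutable segment files, can be sketched as follows. This is a minimal illustration; the function names and the `key:value` line format are assumptions, not from the article.

```python
# Minimal SSTable-segment sketch: keys are written in sorted order and
# the file is never modified afterwards. Names are illustrative only.

def write_segment(path, kv_pairs):
    """Persist a dict of key-value pairs as one sorted segment file."""
    with open(path, "w") as f:
        for key in sorted(kv_pairs):
            f.write(f"{key}:{kv_pairs[key]}\n")

def read_segment(path):
    """Read a segment back as (key, value) pairs, in on-disk order."""
    with open(path) as f:
        return [tuple(line.rstrip("\n").split(":", 1)) for line in f]
```

Because the keys come out sorted, later lookups can binary-search rather than scan the whole file.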
article/2020/lsm.md (Outdated)
Recall that LSM trees only perform sequential writes. You may be wondering how we sequentially write our data in a sorted format when values may be written in any order. This is solved by using an in-memory tree structure. This is frequently referred to as a **memtable**, but the underlying data structure is generally some form of a sorted tree like a [red-black tree](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree). As writes come in, the data is added to this red-black tree.
我们来回顾下,LSM 树只能处理顺序写入。您可能不知道如何在写入值是无序的情况下顺序写入数据。这个问题可以使用内存中的树结构来解决。它通常被称为 **内存表**,从本质上来看,它是一种经排序的树,类似于[红黑树](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree)。进行数据更新时,数据存入这个红黑树。
『您可能不知道如何在写入值是无序的情况下顺序写入数据。』=> 『您可能想知道如何在写入值是无序的情况下顺序写入数据。』
"wonder" can also carry the sense of 不知道 ("not knowing").
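The memtable described in this hunk can be sketched as below. Python's standard library has no red-black tree, so a plain dict stands in and sorted order is produced at read time; the class name and threshold are illustrative assumptions.

```python
# Memtable sketch. Real engines back this with a balanced tree such as a
# red-black tree; a dict plus sorting approximates the same interface.

class Memtable:
    def __init__(self, flush_threshold=4):
        self.entries = {}
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        # incoming writes land here first; a later write wins
        self.entries[key] = value

    def is_full(self):
        # signals that the memtable should be flushed to a segment
        return len(self.entries) >= self.flush_threshold

    def sorted_items(self):
        # keys in sorted order, ready for a sequential segment write
        return sorted(self.entries.items())
```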
article/2020/lsm.md (Outdated)

![](https://yetanotherdevblog.com/content/images/2020/06/output-onlinepngtools--4-.png)
Our writes get stored in this red-black tree until the tree reaches a predefined size. Once the red-black tree has enough entries, it is flushed to disk as a segment on disk in sorted order. This allows us to write the segment file as a single sequential write even though the inserts may occur in any order.
我们写入的数据存在红黑树中,直到树的大小达到某个预设的值为止。此时红黑树有了足够的数据元素,它就作为一个有序的片段转移到磁盘上。因此,我们就能以单个顺序写入的方式更新这个片段,即使插入的数据是无序的也可以实现。
『因此,我们就能以单个顺序写入的方式更新这个片段』=>『这样,我们就能以单次顺序写入的方式更新这个片段』
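The flush discussed in this hunk, unordered inserts becoming one sequential sorted write, can be sketched as follows; the `flush` helper and file format are hypothetical.

```python
# Flush sketch: however the inserts arrived, one pass over the sorted
# keys writes the whole segment in a single sequential write.

def flush(memtable, path):
    """Write an in-memory table to disk as one sorted segment."""
    with open(path, "w") as f:
        for key in sorted(memtable):
            f.write(f"{key}:{memtable[key]}\n")

# inserts arrive in arbitrary order...
memtable = {"zebra": "9", "apple": "1", "mango": "4"}
# ...but the segment file comes out sorted: apple, mango, zebra
```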
article/2020/lsm.md (Outdated)

![](https://yetanotherdevblog.com/content/images/2020/06/output-onlinepngtools--6-.png)
We can use this index to quickly find the offsets for values that would come before and after the key we want. Now we only have to scan a small portion of each segment file based on those bounds. For example, let's consider a scenario where we want to look up the key `dollar` in the segment above. We can perform a binary search on our sparse index to find that `dollar` comes between `dog` and `downgrade`. Now we only need to scan from offset 17208 to 19504 in order to find the value (or determine it is missing).
我们使用这样的索引,可以快速得到需要的 key 前后的值的偏移量。现在我们只需要对边界符合条件的 segment 进行扫描。例如,我们需要在上述的 segment 中查找名为 `dollar` 的key。我们可以在稀疏索引中进行二分搜索,结果发现 `dollar` 位于 `dog` 与 `downgrade` 之间。此时我们只需要在偏移量为 17208 与 19504 之间的数据进行扫描,就能找到需要的值。
『现在我们只需要对边界符合条件的 segment 进行扫描。』=>『现在只需要对每个 segment 中符合边界条件的一小部分进行扫描。』
Judging from the original text, this index applies to a single segment.
『例如,我们需要在上述的 segment 中查找名为 `dollar` 的key。』=>『例如,我们需要在上图所示的 segment 中查找名为 `dollar` 的 key。』
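The sparse-index lookup in this hunk can be sketched with a binary search; the keys and offsets mirror the article's `dollar` example, while the function name is an illustrative assumption.

```python
import bisect

# Sparse index sketch: one (key, byte-offset) entry per sampled key.
sparse_index = [("dog", 17208), ("downgrade", 19504), ("duck", 21500)]

def scan_bounds(index, key):
    """Binary-search the sparse index for the offsets bracketing `key`."""
    keys = [k for k, _ in index]
    i = bisect.bisect_right(keys, key)
    lo = index[i - 1][1] if i > 0 else 0          # start of scan
    hi = index[i][1] if i < len(index) else None  # None = end of file
    return lo, hi

# 'dollar' sorts between 'dog' and 'downgrade', so only bytes
# 17208..19504 of the segment need to be scanned.
```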
article/2020/lsm.md (Outdated)
This is a nice improvement, but what about looking up records that do not exist? We will still end up looping over all segment files and fail to find the key in each segment. This is something that a [bloom filter](https://yetanotherdevblog.com/bloom-filters/) can help us out with. A bloom filter is a space-efficient data structure that can tell us if a value is missing from our data. We can add entries to a bloom filter as they are written and check it at the beginning of reads in order to efficiently respond to requests for missing data.
这种优化方法很好,但如果查找不存在的记录会怎样呢?如果沿袭上述办法,我们仍然需要遍历所有的 segment 文件,才能得到查找目标不存在的结果。在此情况下,就需要使用[布隆过滤器](https://yetanotherdevblog.com/bloom-filters/)了。布隆过滤器是一种空间效率较高的数据结构,它用于检测数据中某个值素是否存在。我们可以把记录添加到布隆过滤器,这些记录写入后,布隆过滤器会在开始读取时进行检查,从而高效处理对不存在的数据的请求。
『我们可以把记录添加到布隆过滤器,这些记录写入后,布隆过滤器会在开始读取时进行检查,从而高效处理对不存在的数据的请求。』
=>
『在写入数据的同时,我们可以把记录添加到布隆过滤器;在开始读取时,布隆过滤器就会进行检查,从而高效处理对不存在的数据的请求。』
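A minimal bloom-filter sketch matching the description in this hunk; the bit-array size and the hashing scheme are illustrative choices, not from the article.

```python
import hashlib

# Bloom filter sketch: set k bit positions per key. It answers either
# "definitely absent" or "possibly present"; it never gives a false
# negative, so a miss lets a read skip every segment file.

class BloomFilter:
    def __init__(self, size=1024, num_hashes=3):
        self.size, self.num_hashes, self.bits = size, num_hashes, 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        # called as each entry is written
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        # checked at the beginning of every read
        return all(self.bits >> p & 1 for p in self._positions(key))
```

Checking the filter before touching any segment file is what lets reads for missing keys return quickly.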
article/2020/lsm.md (Outdated)

![](https://yetanotherdevblog.com/content/images/2020/06/output-onlinepngtools--7-.png)
You can see in the example above that segments 1 and 2 both have a value for the key `dog`. Newer segments contain the latest values written, so the value from segment 2 is what gets carried forward into the segment 4. Once the compaction process has written a new segment for the input segments, the old segment files are deleted.
在上述例子中,你可以看到,1 号 segment 与 2 号 segment 中, `dog` 键都有对应的值。新的 segment 包含最新写入的值,所以 2 号 segment 中的值是传入 4 号 segment 中的值。当压缩进程把加入的数据写入一个新的 segment 时,旧 segment 文件就被删除了。
『在上述例子中』=>『在上图所示的例子中』
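The merge behavior in this hunk, where segment 2's newer value for `dog` carries forward, can be sketched as follows; `compact` is a hypothetical helper over in-memory segments.

```python
# Compaction sketch: fold older segments into one new segment, letting
# newer segments overwrite older values for the same key.

def compact(segments):
    """`segments` is ordered oldest -> newest; returns the merged segment."""
    merged = {}
    for segment in segments:
        merged.update(segment)        # newer values win
    return dict(sorted(merged.items()))

segment1 = {"cat": "3", "dog": "52"}  # older
segment2 = {"dog": "84"}              # newer
# after compaction, segment1 and segment2 can be deleted
```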
article/2020/lsm.md (Outdated)

![](https://yetanotherdevblog.com/content/images/2020/06/output-onlinepngtools--8-.png)
The example above shows that the key `dog` had the value 52 at some point in the past, but now it has a tombstone marker. This indicates that if we receive a request for the key `dog` then we should return a response indicating that the key does not exist. This means that delete requests actually take up disk space initially which many developers may find surprising. Eventually, tombstones will get compacted away so that the value no longer exists on disk.
上述例子说明,名为 `dog` 的 key 原来对应的值是 52,现在打上了 tombstone 标记。这说明如果收到一个获取 key 为 `dog` 的数据的请求,我们应当得到的响应是数据不存在。这说明,删除请求起初占用的磁盘空间很大,令开发者感到吃惊。但最终,打上 tombstone 标记的数据被压缩了,因此相关的值就永远消失了。
『这说明,删除请求起初占用的磁盘空间很大,令开发者感到吃惊。』=>『这说明,删除请求起初其实是占用磁盘空间的,很多开发者可能对此感到吃惊。』
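Delete-as-write with a tombstone, as reviewed above, can be sketched like this; the `TOMBSTONE` sentinel and helper names are illustrative assumptions.

```python
# Tombstone sketch: a delete is just a write of a sentinel marker, so it
# initially takes up space; compaction removes it later.

TOMBSTONE = "<tombstone>"

def delete(table, key):
    # same code path as a normal write
    table[key] = TOMBSTONE

def get(table, key):
    value = table.get(key)
    # a tombstone means "this key was deleted": report it as missing
    return None if value == TOMBSTONE else value
```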
2. When this tree becomes too large it is flushed to disk with the keys in sorted order.
3. When a read comes in we check the bloom filter. If the bloom filter indicates that the value is not present then we tell the client that the key could not be found. If the bloom filter indicates that the value is present then we begin iterating over our segment files from newest to oldest.
4. For each segment file, we check a sparse index and scan the offsets where we expect the key to be found until we find the key. We'll return the value as soon as we find it in a segment file.
1. 写入的数据存储在内存中的树结构中(也可以称为内存表)。任何支持的数据结构(布隆过滤器和稀疏索引)都会在必要时更新。
『任何支持的数据结构』=>『任何辅助的数据结构』
I'm not sure why it should be 『辅助的数据结构』; perhaps 「任何支持的数据结构类型(...)都会在必要时更新」 would read better.
@lsvih Claiming this proofreading task
@Eminlin OK~
Well translated, thanks for the hard work; just a few suggestions.
PS.
Images in the article such as https://yetanotherdevblog.com/content/images/2020/06/output-onlinepngtools--5-.png seem to load quite slowly.
article/2020/lsm.md (Outdated)
Recall that LSM trees only perform sequential writes. You may be wondering how we sequentially write our data in a sorted format when values may be written in any order. This is solved by using an in-memory tree structure. This is frequently referred to as a **memtable**, but the underlying data structure is generally some form of a sorted tree like a [red-black tree](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree). As writes come in, the data is added to this red-black tree.
我们来回顾下,LSM 树只能处理顺序写入。您可能不知道如何在写入值是无序的情况下顺序写入数据。这个问题可以使用内存中的树结构来解决。它通常被称为 **内存表**,从本质上来看,它是一种经排序的树,类似于[红黑树](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree)。进行数据更新时,数据存入这个红黑树。
『进行数据更新时,数据存入这个红黑树。』=>『当进行数据更新时,将会存入这个红黑树。』
article/2020/lsm.md (Outdated)
This is a nice improvement, but what about looking up records that do not exist? We will still end up looping over all segment files and fail to find the key in each segment. This is something that a [bloom filter](https://yetanotherdevblog.com/bloom-filters/) can help us out with. A bloom filter is a space-efficient data structure that can tell us if a value is missing from our data. We can add entries to a bloom filter as they are written and check it at the beginning of reads in order to efficiently respond to requests for missing data.
这种优化方法很好,但如果查找不存在的记录会怎样呢?如果沿袭上述办法,我们仍然需要遍历所有的 segment 文件,才能得到查找目标不存在的结果。在此情况下,就需要使用[布隆过滤器](https://yetanotherdevblog.com/bloom-filters/)了。布隆过滤器是一种空间效率较高的数据结构,它用于检测数据中某个值素是否存在。我们可以把记录添加到布隆过滤器,这些记录写入后,布隆过滤器会在开始读取时进行检查,从而高效处理对不存在的数据的请求。
『这种优化方法很好』=>『这种方法有了很大的改进』
article/2020/lsm.md (Outdated)
Over time, this system will accumulate more segment files as it continues to run. These segment files need to be cleaned up and maintained in order to prevent the number of segment files from getting out of hand. This is the responsibility of a process called compaction. Compaction is a background process that is continuously combining old segments together into newer segments.
随着时间的推移,只要系统持续运行,会有越来越多的 segment 文件累计起来。为了防止 segment 文件数量失控,应当对这些 segment 文件进行清理和维护。压缩进程就是负责这些工作的。它是一个后台进程,会持续地把旧 segment 跟新 segment 进行结合。
『随着时间的推移,只要系统持续运行,会有越来越多的 segment 文件累计起来。为了防止 segment 文件数量失控,应当对这些 segment 文件进行清理和维护。』=>
『随着时间的推移,系统在运行过程中,会累计越来越多的 segment 文件。为了防止 segment 文件数量逐渐庞大直至失控,应当对这些 segment 文件进行清理和维护。』
article/2020/lsm.md (Outdated)
We've covered reading and writing data, but what about deleting data? How do you delete data from the SSTable when the segment files are considered immutable? Deletes actually follow the exact same path as writing data. Whenever a delete request is received, a unique marker called a **tombstone** is written for that key.
我们已经讨论了数据的读取和更新,那数据的删除呢?既然 segment 文件是不可变的,那如何把它从 SSTable 中删除呢?实际上,删除跟写入的过程是一样的。无论何时,只要收到删除请求,需要删除的那个 key 就打上了一个被称为 **tombstone** 的标记。
『需要删除的那个 key 就打上了一个被称为 tombstone 的标记』=>『需要删除的那个 key 就打上具有唯一标识的tombstone 标记』
"a unique marker" can be translated as 独特的标记 or 唯一的标记; "unique" does carry a specific meaning in computing. After checking related material, delete operations do indeed require a unique identifier.
article/2020/lsm.md (Outdated)

![](https://yetanotherdevblog.com/content/images/2020/06/output-onlinepngtools--8-.png)
The example above shows that the key `dog` had the value 52 at some point in the past, but now it has a tombstone marker. This indicates that if we receive a request for the key `dog` then we should return a response indicating that the key does not exist. This means that delete requests actually take up disk space initially which many developers may find surprising. Eventually, tombstones will get compacted away so that the value no longer exists on disk.
上述例子说明,名为 `dog` 的 key 原来对应的值是 52,现在打上了 tombstone 标记。这说明如果收到一个获取 key 为 `dog` 的数据的请求,我们应当得到的响应是数据不存在。这说明,删除请求起初占用的磁盘空间很大,令开发者感到吃惊。但最终,打上 tombstone 标记的数据被压缩了,因此相关的值就永远消失了。
「我们应当得到的响应是数据不存在」=> 「我们会收到一个数据不存在的响应」
2. When this tree becomes too large it is flushed to disk with the keys in sorted order.
3. When a read comes in we check the bloom filter. If the bloom filter indicates that the value is not present then we tell the client that the key could not be found. If the bloom filter indicates that the value is present then we begin iterating over our segment files from newest to oldest.
4. For each segment file, we check a sparse index and scan the offsets where we expect the key to be found until we find the key. We'll return the value as soon as we find it in a segment file.
1. 写入的数据存储在内存中的树结构中(也可以称为内存表)。任何支持的数据结构(布隆过滤器和稀疏索引)都会在必要时更新。
I'm not sure why it should be 『辅助的数据结构』; perhaps 「任何支持的数据结构类型(...)都会在必要时更新」 would read better.
3. When a read comes in we check the bloom filter. If the bloom filter indicates that the value is not present then we tell the client that the key could not be found. If the bloom filter indicates that the value is present then we begin iterating over our segment files from newest to oldest.
4. For each segment file, we check a sparse index and scan the offsets where we expect the key to be found until we find the key. We'll return the value as soon as we find it in a segment file.
1. 写入的数据存储在内存中的树结构中(也可以称为内存表)。任何支持的数据结构(布隆过滤器和稀疏索引)都会在必要时更新。
2. 当树结构太大时,会以一个有序的片段的形式转移到磁盘上。
「会以一个有序的片段的形式转移到磁盘上。」 => 「会以一个有序的片段的形式持久化到磁盘上。」
article/2020/lsm.md (Outdated)
4. For each segment file, we check a sparse index and scan the offsets where we expect the key to be found until we find the key. We'll return the value as soon as we find it in a segment file.
1. 写入的数据存储在内存中的树结构中(也可以称为内存表)。任何支持的数据结构(布隆过滤器和稀疏索引)都会在必要时更新。
2. 当树结构太大时,会以一个有序的片段的形式转移到磁盘上。
3. 读取数据时,我们先检查布隆过滤器。如果布隆过滤器找不到相应的值,就告诉客户端相应的 key 不存在。如果布隆过滤器找到了相应的值,我们就开始从新到旧遍历 segment 文件。
「我们就开始从新到旧遍历 segment 文件」=> 「就会按照从最新到旧的顺序遍历 segment 文件」
1. 写入的数据存储在内存中的树结构中(也可以称为内存表)。任何支持的数据结构(布隆过滤器和稀疏索引)都会在必要时更新。
2. 当树结构太大时,会以一个有序的片段的形式转移到磁盘上。
3. 读取数据时,我们先检查布隆过滤器。如果布隆过滤器找不到相应的值,就告诉客户端相应的 key 不存在。如果布隆过滤器找到了相应的值,我们就开始从新到旧遍历 segment 文件。
4. 对于每个 segment 文件,我们需要检查稀疏索引并在估计能查找到需要的 key 的位置扫描偏移量,直到我们找到了目标 key 为止。一经找到,就可以返回相应的值。
「我们需要检查稀疏索引并在估计能查找到需要的 key 的位置扫描偏移量」=>「我们需要检查稀疏索引,并扫描我们期望找到的 key 的偏移量」
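The numbered read-path steps reviewed above can be sketched end to end; here the bloom filter is stubbed with a plain set and the sparse-index scan with a dict lookup, purely for illustration.

```python
# Read-path sketch for the summary steps: bloom check first, then
# segments newest -> oldest; the first hit wins.

def read(key, bloom, segments):
    if key not in bloom:                # step 3: definitely absent
        return None
    for segment in reversed(segments):  # step 3: newest to oldest
        if key in segment:              # step 4: simplified index scan
            return segment[key]
    return None                         # bloom false positive

segments = [{"dog": "52"}, {"dog": "84", "cat": "3"}]  # oldest -> newest
bloom = {"dog", "cat"}
```

Because the newest segment is consulted first, the most recently written value for `dog` is the one returned.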
@lsvih @SamYu2000 @chzh9311 Proofreading complete
Revisions completed per the proofreading feedback
@SamYu2000 Merged~ Please publish it to Juejin soon and send me the link so points can be credited in time. The Juejin Translation Plan has its own Zhihu column, and you are welcome to submit there too; a handy plugin is recommended.
Got it!
Translation finished, resolve #7771