
理解 LSM 树:一种适用于频繁写入的数据库的结构 (Understanding LSM Trees: a structure for write-heavy databases) #7795

Merged: 35 commits, Jan 18, 2021

Conversation

cool-summer-021 (Contributor)

Translation complete; resolves #7771

cool-summer-021 and others added 30 commits September 1, 2020 16:24
Python optimization — the interning mechanism
Sync updates from the upstream project
Sync the upstream project's updated content
Notes on using Python List
Revised per proofreading feedback
Revised per proofreading feedback
Sync updates from the upstream repo
Sync updated content
Revisions per proofreading feedback complete
Why Deno is replacing Node.js everywhere today
Remove redundant content
Revisions per proofreading feedback complete
Update content from the upstream repo
Translation complete
Translation complete
@lsvih changed the title from "Translate/understanding lsm trees" to "理解 LSM 树:一种适用于频繁写入的数据库的结构" on Dec 29, 2020
chzh9311 (Contributor)

@lsvih Claiming the proofreading.

lsvih (Member) commented Jan 11, 2021

@chzh9311 Sounds good!

chzh9311 (Contributor) left a comment

@lsvih @SamYu2000 Proofreading complete.


# SSTables

LSM trees are persisted to disk using a **Sorted Strings Table (SSTable)** format. As indicated by the name, SSTables are a format for storing key-value pairs in which the keys are in sorted order. An SSTable will consist of multiple sorted files called **segments**. These segments are immutable once they are written to disk. A simple example could look like this:
LSM 树使用 **Sorted Strings Table (SSTable)** 格式持久化于磁盘中。顾名思义,SSTables 是一种存储 key-value 对的格式,其中 key 是经过排序的。一个 SSTable 是由若干已排序的文件组成的,这些文件称为 **segments**。这些 segments 一经写入磁盘,就处于不可变状态。我们来看一个简单的例子:
chzh9311 (Contributor):

『SSTables 是一种存储 key-value 对的格式』=>『SSTable 是一种存储 key-value 对的格式』
(plural form → singular form)
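To make the segment idea concrete, here is a minimal sketch of writing and reading one sorted segment file. The names and the line-per-pair text format are illustrative only; real SSTable formats are binary and considerably more involved.

```python
# Minimal sketch of an SSTable segment: key-value pairs written to an
# append-only file in sorted key order. Names and format are illustrative.

def write_segment(path, kv_pairs):
    """Write key-value pairs to a segment file in sorted key order."""
    with open(path, "w") as f:
        for key in sorted(kv_pairs):
            f.write(f"{key}:{kv_pairs[key]}\n")

def read_segment(path):
    """Read a segment file back into an ordered list of (key, value)."""
    with open(path) as f:
        return [tuple(line.rstrip("\n").split(":", 1)) for line in f]
```

Because the whole file is written once in sorted order, a segment never needs in-place updates, which matches the immutability described above.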


Recall that LSM trees only perform sequential writes. You may be wondering how we sequentially write our data in a sorted format when values may be written in any order. This is solved by using an in-memory tree structure. This is frequently referred to as a **memtable**, but the underlying data structure is generally some form of a sorted tree like a [red-black tree](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree). As writes come in, the data is added to this red-black tree.
我们来回顾下,LSM 树只能处理顺序写入。您可能不知道如何在写入值是无序的情况下顺序写入数据。这个问题可以使用内存中的树结构来解决。它通常被称为 **内存表**,从本质上来看,它是一种经排序的树,类似于[红黑树](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree)。进行数据更新时,数据存入这个红黑树。
chzh9311 (Contributor):

『您可能不知道如何在写入值是无序的情况下顺序写入数据。』=> 『您可能想知道如何在写入值是无序的情况下顺序写入数据。』

cool-summer-021 (Contributor, Author):

"wonder" can also mean 不知道 ("to not know").


![](https://yetanotherdevblog.com/content/images/2020/06/output-onlinepngtools--4-.png)

Our writes get stored in this red-black tree until the tree reaches a predefined size. Once the red-black tree has enough entries, it is flushed to disk as a segment on disk in sorted order. This allows us to write the segment file as a single sequential write even though the inserts may occur in any order.
我们写入的数据存在红黑树中,直到树的大小达到某个预设的值为止。此时红黑树有了足够的数据元素,它就作为一个有序的片段转移到磁盘上。因此,我们就能以单个顺序写入的方式更新这个片段,即使插入的数据是无序的也可以实现。
chzh9311 (Contributor):

『因此,我们就能以单个顺序写入的方式更新这个片段』=>『这样,我们就能以单次顺序写入的方式更新这个片段』
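The memtable-and-flush mechanism can be sketched roughly as follows. A sorted list stands in for the red-black tree the article describes, and the flush threshold is arbitrary; both are simplifications for illustration.

```python
import bisect

# Sketch of a memtable: writes land in a sorted in-memory structure
# (a sorted list here, standing in for a red-black tree), and once it
# is large enough the entries are emitted in sorted order as a segment.

class Memtable:
    def __init__(self, max_entries=4):
        self.keys = []            # kept sorted, like in-order tree traversal
        self.values = {}
        self.max_entries = max_entries

    def put(self, key, value):
        if key not in self.values:
            bisect.insort(self.keys, key)   # keep keys sorted on insert
        self.values[key] = value

    def is_full(self):
        return len(self.keys) >= self.max_entries

    def flush(self):
        """Emit all entries in sorted order: one sequential segment write."""
        segment = [(k, self.values[k]) for k in self.keys]
        self.keys, self.values = [], {}
        return segment
```

Even if `put` calls arrive in any order, `flush` produces the segment in sorted order, which is what allows the single sequential write.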


![](https://yetanotherdevblog.com/content/images/2020/06/output-onlinepngtools--6-.png)

We can use this index to quickly find the offsets for values that would come before and after the key we want. Now we only have to scan a small portion of each segment file based on those bounds. For example, let's consider a scenario where we want to look up the key `dollar` in the segment above. We can perform a binary search on our sparse index to find that `dollar` comes between `dog` and `downgrade`. Now we only need to scan from offset 17208 to 19504 in order to find the value (or determine it is missing).
我们使用这样的索引,可以快速得到需要的 key 前后的值的偏移量。现在我们只需要对边界符合条件的 segment 进行扫描。例如,我们需要在上述的 segment 中查找名为 `dollar` 的key。我们可以在稀疏索引中进行二分搜索,结果发现 `dollar` 位于 `dog` `downgrade` 之间。此时我们只需要在偏移量为 17208 19504 之间的数据进行扫描,就能找到需要的值。
chzh9311 (Contributor):

『现在我们只需要对边界符合条件的 segment 进行扫描。』=>『现在只需要对每个 segment 中符合边界条件的一小部分进行扫描。』
From the original text, this index applies within a single segment.

chzh9311 (Contributor):

『例如,我们需要在上述的 segment 中查找名为 dollar 的key。』=>『例如,我们需要在上图所示的 segment 中查找名为 dollar 的 key。』
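The `dollar` lookup can be sketched as a binary search over a sparse index. The offsets below mirror the article's example; `scan_bounds` is a hypothetical helper, not part of any real implementation.

```python
import bisect

# Sketch of a sparse-index lookup: binary-search the index for the keys
# that bound the target, then only the byte range between their offsets
# needs to be scanned in the segment file.

sparse_index = [("dog", 17208), ("downgrade", 19504)]  # (key, byte offset)

def scan_bounds(index, key):
    """Return the (lo, hi) byte offsets that must be scanned for `key`."""
    keys = [k for k, _ in index]
    i = bisect.bisect_right(keys, key)
    lo = index[i - 1][1] if i > 0 else 0
    hi = index[i][1] if i < len(index) else None  # None => scan to end of file
    return lo, hi
```

For `dollar`, which sorts between `dog` and `downgrade`, this yields the bounds (17208, 19504) from the example.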


This is a nice improvement, but what about looking up records that do not exist? We will still end up looping over all segment files and fail to find the key in each segment. This is something that a [bloom filter](https://yetanotherdevblog.com/bloom-filters/) can help us out with. A bloom filter is a space-efficient data structure that can tell us if a value is missing from our data. We can add entries to a bloom filter as they are written and check it at the beginning of reads in order to efficiently respond to requests for missing data.
这种优化方法很好,但如果查找不存在的记录会怎样呢?如果沿袭上述办法,我们仍然需要遍历所有的 segment 文件,才能得到查找目标不存在的结果。在此情况下,就需要使用[布隆过滤器](https://yetanotherdevblog.com/bloom-filters/)了。布隆过滤器是一种空间效率较高的数据结构,它用于检测数据中某个值素是否存在。我们可以把记录添加到布隆过滤器,这些记录写入后,布隆过滤器会在开始读取时进行检查,从而高效处理对不存在的数据的请求。
chzh9311 (Contributor):

『我们可以把记录添加到布隆过滤器,这些记录写入后,布隆过滤器会在开始读取时进行检查,从而高效处理对不存在的数据的请求。』
=>
『在写入数据的同时,我们可以把记录添加到布隆过滤器;在开始读取时,布隆过滤器就会进行检查,从而高效处理对不存在的数据的请求。』
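A minimal bloom-filter sketch follows; the bit-array size and the hashing scheme are illustrative, not tuned. The key property: it may answer "maybe present" for an absent key, but never "absent" for a present one.

```python
import hashlib

# Sketch of a bloom filter: a fixed bit array plus k hash functions
# (derived here by salting SHA-256). Size and hash count are arbitrary.

class BloomFilter:
    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, key):
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(h, 16) % self.size

    def add(self, key):               # called as each record is written
        for p in self._positions(key):
            self.bits[p] = True

    def might_contain(self, key):     # checked at the start of each read
        return all(self.bits[p] for p in self._positions(key))
```

A read for a missing key can usually be rejected here without touching any segment file at all.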


![](https://yetanotherdevblog.com/content/images/2020/06/output-onlinepngtools--7-.png)

You can see in the example above that segments 1 and 2 both have a value for the key `dog`. Newer segments contain the latest values written, so the value from segment 2 is what gets carried forward into the segment 4. Once the compaction process has written a new segment for the input segments, the old segment files are deleted.
在上述例子中,你可以看到,1 号 segment 与 2 号 segment 中, `dog` 键都有对应的值。新的 segment 包含最新写入的值,所以 2 号 segment 中的值是传入 4 号 segment 中的值。当压缩进程把加入的数据写入一个新的 segment 时,旧 segment 文件就被删除了。
chzh9311 (Contributor):

『在上述例子中』=>『在上图所示的例子中』
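The newest-value-wins merge that compaction performs can be sketched as below; segments are ordered oldest to newest, so later segments simply overwrite earlier values, as segment 2's value for `dog` does in the figure.

```python
# Sketch of compaction: merge several old segments into one new sorted
# segment, keeping only the newest value written for each key.

def compact(segments):
    """Merge segments (ordered oldest -> newest) into one sorted segment."""
    merged = {}
    for segment in segments:          # later segments overwrite earlier ones
        for key, value in segment:
            merged[key] = value
    return sorted(merged.items())
```

After the merged segment is written, the input segment files can be deleted, as the text describes.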


![](https://yetanotherdevblog.com/content/images/2020/06/output-onlinepngtools--8-.png)

The example above shows that the key `dog` had the value 52 at some point in the past, but now it has a tombstone marker. This indicates that if we receive a request for the key `dog` then we should return a response indicating that the key does not exist. This means that delete requests actually take up disk space initially which many developers may find surprising. Eventually, tombstones will get compacted away so that the value no longer exists on disk.
上述例子说明,名为 `dog` 的 key 原来对应的值是 52,现在打上了 tombstone 标记。这说明如果收到一个获取 key `dog` 的数据的请求,我们应当得到的响应是数据不存在。这说明,删除请求起初占用的磁盘空间很大,令开发者感到吃惊。但最终,打上 tombstone 标记的数据被压缩了,因此相关的值就永远消失了。
chzh9311 (Contributor) Jan 11, 2021:

『这说明,删除请求起初占用的磁盘空间很大,令开发者感到吃惊。』=>『这说明,删除请求起初其实是占用磁盘空间的,很多开发者可能对此感到吃惊。』
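Delete-as-write can be sketched by treating the tombstone as a sentinel value that compaction filters out. `TOMBSTONE` and both function names are made up for illustration.

```python
# Sketch of delete-as-write: a delete takes the same path as a write,
# recording a tombstone; compaction later drops tombstoned keys entirely.

TOMBSTONE = object()   # arbitrary sentinel standing in for a tombstone marker

def delete(memtable_put, key):
    """Deletes follow the write path: write a tombstone for the key."""
    memtable_put(key, TOMBSTONE)

def compact_with_tombstones(segments):
    """Newest value wins; keys whose newest value is a tombstone vanish."""
    merged = {}
    for segment in segments:                  # ordered oldest -> newest
        for key, value in segment:
            merged[key] = value
    return sorted((k, v) for k, v in merged.items() if v is not TOMBSTONE)
```

This shows both halves of the text's point: the tombstone initially occupies space like any write, and only compaction finally removes the key from disk.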

2. When this tree becomes too large it is flushed to disk with the keys in sorted order.
3. When a read comes in we check the bloom filter. If the bloom filter indicates that the value is not present then we tell the client that the key could not be found. If the bloom filter indicates that the value is present then we begin iterating over our segment files from newest to oldest.
4. For each segment file, we check a sparse index and scan the offsets where we expect the key to be found until we find the key. We'll return the value as soon as we find it in a segment file.
1. 写入的数据存储在内存中的树结构中(也可以称为内存表)。任何支持的数据结构(布隆过滤器和稀疏索引)都会在必要时更新。
chzh9311 (Contributor):

『任何支持的数据结构』=>『任何辅助的数据结构』

I'm not sure why it should be "auxiliary" (辅助的) data structures; perhaps 「任何支持的数据结构类型(...)都会在必要时更新」 would read better.

Eminlin commented Jan 17, 2021

@lsvih Claiming the proofreading.

lsvih (Member) commented Jan 17, 2021

@Eminlin Sounds good!

Eminlin left a comment

Nice translation, thanks for the hard work; just a few suggestions.

PS. Images in the article such as https://yetanotherdevblog.com/content/images/2020/06/output-onlinepngtools--5-.png seem to load very slowly.


Recall that LSM trees only perform sequential writes. You may be wondering how we sequentially write our data in a sorted format when values may be written in any order. This is solved by using an in-memory tree structure. This is frequently referred to as a **memtable**, but the underlying data structure is generally some form of a sorted tree like a [red-black tree](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree). As writes come in, the data is added to this red-black tree.
我们来回顾下,LSM 树只能处理顺序写入。您可能不知道如何在写入值是无序的情况下顺序写入数据。这个问题可以使用内存中的树结构来解决。它通常被称为 **内存表**,从本质上来看,它是一种经排序的树,类似于[红黑树](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree)。进行数据更新时,数据存入这个红黑树。
Eminlin Jan 17, 2021:

『进行数据更新时,数据存入这个红黑树。』=>『当进行数据更新时,将会存入这个红黑树。』


This is a nice improvement, but what about looking up records that do not exist? We will still end up looping over all segment files and fail to find the key in each segment. This is something that a [bloom filter](https://yetanotherdevblog.com/bloom-filters/) can help us out with. A bloom filter is a space-efficient data structure that can tell us if a value is missing from our data. We can add entries to a bloom filter as they are written and check it at the beginning of reads in order to efficiently respond to requests for missing data.
这种优化方法很好,但如果查找不存在的记录会怎样呢?如果沿袭上述办法,我们仍然需要遍历所有的 segment 文件,才能得到查找目标不存在的结果。在此情况下,就需要使用[布隆过滤器](https://yetanotherdevblog.com/bloom-filters/)了。布隆过滤器是一种空间效率较高的数据结构,它用于检测数据中某个值素是否存在。我们可以把记录添加到布隆过滤器,这些记录写入后,布隆过滤器会在开始读取时进行检查,从而高效处理对不存在的数据的请求。
Eminlin Jan 17, 2021:

『这种优化方法很好』=>『这种方法有了很大的改进』


Over time, this system will accumulate more segment files as it continues to run. These segment files need to be cleaned up and maintained in order to prevent the number of segment files from getting out of hand. This is the responsibility of a process called compaction. Compaction is a background process that is continuously combining old segments together into newer segments.
随着时间的推移,只要系统持续运行,会有越来越多的 segment 文件累计起来。为了防止 segment 文件数量失控,应当对这些 segment 文件进行清理和维护。压缩进程就是负责这些工作的。它是一个后台进程,会持续地把旧 segment 跟新 segment 进行结合。
Eminlin Jan 17, 2021:

『随着时间的推移,只要系统持续运行,会有越来越多的 segment 文件累计起来。为了防止 segment 文件数量失控,应当对这些 segment 文件进行清理和维护。』=>
『随着时间的推移,系统在运行过程中,会累计越来越多的 segment 文件。为了防止 segment 文件数量逐渐庞大直至失控,应当对这些 segment 文件进行清理和维护。』


We've covered reading and writing data, but what about deleting data? How do you delete data from the SSTable when the segment files are considered immutable? Deletes  actually follow the exact same path as writing data.  Whenever a delete request is received, a unique marker called a **tombstone** is written for that key.
我们已经讨论了数据的读取和更新,那数据的删除呢?既然 segment 文件是不可变的,那如何把它从 SSTable 中删除呢?实际上,删除跟写入的过程是一样的。无论何时,只要收到删除请求,需要删除的那个 key 就打上了一个被称为 **tombstone** 的标记。
Eminlin Jan 17, 2021:

『需要删除的那个 key 就打上了一个被称为 tombstone 的标记』=>『需要删除的那个 key 就打上具有唯一标识的 tombstone 标记』

"a unique marker" can be rendered as 独特的标记 or 唯一的标记; "unique" has a specific meaning in computing, and after checking the relevant material, delete operations do indeed need a unique identifier.


![](https://yetanotherdevblog.com/content/images/2020/06/output-onlinepngtools--8-.png)

The example above shows that the key `dog` had the value 52 at some point in the past, but now it has a tombstone marker. This indicates that if we receive a request for the key `dog` then we should return a response indicating that the key does not exist. This means that delete requests actually take up disk space initially which many developers may find surprising. Eventually, tombstones will get compacted away so that the value no longer exists on disk.
上述例子说明,名为 `dog` 的 key 原来对应的值是 52,现在打上了 tombstone 标记。这说明如果收到一个获取 key `dog` 的数据的请求,我们应当得到的响应是数据不存在。这说明,删除请求起初占用的磁盘空间很大,令开发者感到吃惊。但最终,打上 tombstone 标记的数据被压缩了,因此相关的值就永远消失了。
「我们应当得到的响应是数据不存在」=> 「我们会收到一个数据不存在的响应」


3. When a read comes in we check the bloom filter. If the bloom filter indicates that the value is not present then we tell the client that the key could not be found. If the bloom filter indicates that the value is present then we begin iterating over our segment files from newest to oldest.
4. For each segment file, we check a sparse index and scan the offsets where we expect the key to be found until we find the key. We'll return the value as soon as we find it in a segment file.
1. 写入的数据存储在内存中的树结构中(也可以称为内存表)。任何支持的数据结构(布隆过滤器和稀疏索引)都会在必要时更新。
2. 当树结构太大时,会以一个有序的片段的形式转移到磁盘上。
Eminlin Jan 17, 2021:

「会以一个有序的片段的形式转移到磁盘上。」 => 「会以一个有序的片段的形式持久化到磁盘上。」

4. For each segment file, we check a sparse index and scan the offsets where we expect the key to be found until we find the key. We'll return the value as soon as we find it in a segment file.
1. 写入的数据存储在内存中的树结构中(也可以称为内存表)。任何支持的数据结构(布隆过滤器和稀疏索引)都会在必要时更新。
2. 当树结构太大时,会以一个有序的片段的形式转移到磁盘上。
3. 读取数据时,我们先检查布隆过滤器。如果布隆过滤器找不到相应的值,就告诉客户端相应的 key 不存在。如果布隆过滤器找到了相应的值,我们就开始从新到旧遍历 segment 文件。
Eminlin Jan 17, 2021:

「我们就开始从新到旧遍历 segment 文件」=> 「就会按照从最新到旧的顺序遍历 segment 文件」

1. 写入的数据存储在内存中的树结构中(也可以称为内存表)。任何支持的数据结构(布隆过滤器和稀疏索引)都会在必要时更新。
2. 当树结构太大时,会以一个有序的片段的形式转移到磁盘上。
3. 读取数据时,我们先检查布隆过滤器。如果布隆过滤器找不到相应的值,就告诉客户端相应的 key 不存在。如果布隆过滤器找到了相应的值,我们就开始从新到旧遍历 segment 文件。
4. 对于每个 segment 文件,我们需要检查稀疏索引并在估计能查找到需要的 key 的位置扫描偏移量,直到我们找到了目标 key 为止。一经找到,就可以返回相应的值。
「我们需要检查稀疏索引并在估计能查找到需要的 key 的位置扫描偏移量」=>「我们需要检查稀疏索引,并扫描我们期望找到的 key 的偏移量」
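The numbered read-path steps being reviewed above can be sketched end to end. For brevity, sparse-index scanning is collapsed into a plain scan of each segment, and `bloom` is assumed to be any object with a `might_contain` method.

```python
# Sketch of the read path: check the bloom filter first, then search
# segments from newest to oldest and return the first (newest) match.

def read(key, bloom, segments):
    """`segments` are ordered oldest -> newest; search newest first."""
    if not bloom.might_contain(key):
        return None                      # definitely absent: answer immediately
    for segment in reversed(segments):   # iterate newest to oldest
        for k, v in segment:             # scan (sparse index omitted here)
            if k == key:
                return v                 # first hit is the newest value
    return None                          # bloom filter false positive
```

Returning on the first hit in the newest segment is what makes stale values in older segments harmless until compaction removes them.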

Eminlin commented Jan 17, 2021

@lsvih @SamYu2000 @chzh9311 Proofreading complete.

Revisions per proofreading feedback complete
@lsvih lsvih merged commit 843bb37 into xitu:master Jan 18, 2021
lsvih (Member) commented Jan 18, 2021

@SamYu2000 It's merged! Please publish it to Juejin soon and send me the link so your points can be added promptly.

The Juejin Translation Project has its own Zhihu column; you're welcome to submit there too, and a handy plugin is recommended.
Column: https://zhuanlan.zhihu.com/juejinfanyi

cool-summer-021 (Contributor, Author)

lsvih (Member) commented Jan 18, 2021

Got it!
