-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIFO Compaction with TTL #2480
FIFO Compaction with TTL #2480
Conversation
@sagar0 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
- Added a new TableProperty, creation_time, to keep track of when the SST file is created. - Creation_time: - On Flush: Set to the time of flush. - On Compaction: Set to the max creation_time of all the files involved in the compaction. - Added a new TTL option to FIFO compaction options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure how it works, please clarify or provide unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few more comments. Glad we'll have this feature soon!
@@ -62,6 +62,13 @@ struct CompactionOptionsFIFO { | |||
// Default: 1GB | |||
uint64_t max_table_files_size; | |||
|
|||
// Drop files older than TTL. TTL based deletion will take precedence over |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the semantics to me were a bit confusing -- consider:
- if we exceed size limit and TTL of last file expired: then we drop the last file and don't necessarily go under size limit
- if we exceed size limit and TTL not expired: we drop files from the end until we're under size limit
my preference would be in either case we pick a compaction such that, once it's finished, we satisfy all user-specified limits.
I'd also consider letting max_table_files_size == 0
disable the size-triggered file dropping now, similarly to what you did for ttl
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, and I too noticed it. I changed it now to fall back to size-based deletion when ttl-based deletion does not bring the total-size to be under the max_table_files_size
threshold.
I am deferring the work of allowing max_table_files_size
to be set to 0, for later. I will look into it in future, as that involves a lot more changes.
- Added unit tests. - Split PickCompaction into PickTTLCompaction and PickSizeCompaction. - Fixed a bug in BuildTable where time wasn't being passed to NewTableBuilder.
Also handled the return status of GetCurrentTime, so that a garbage value is not used when return statu is not ok. time is set to 0 instead.
with a default value of 0.
@sagar0 updated the pull request - view changes - changes since last import |
@sagar0 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@sagar0 updated the pull request - view changes - changes since last import |
@sagar0 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Also moved GetTotalFilesSize into anonymouse namespace from FIFOCompactionPicker class.
@sagar0 updated the pull request - view changes - changes since last import |
@sagar0 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Since there is a potential chance that we could discard the files picked in the main loop in PickTTLCompaction, move the log message to the right place, to be right before creating the Compaction object.
@sagar0 updated the pull request - view changes - changes since last import |
Benchmark results:
With TTL (a low one for testing) ::
Example Log lines:
SST Files remaining in the dbbench dir, after db_bench execution completed:
|
Support FIFO-compaction-with-TTL only when max_open_files=-1, as the creation_time embedded in every file's table properties need to be consulted to figure out the files that need to be deleted. table_reader embedded in a FileDescriptor could potentially get deleted if max_open_files is not set to -1. We could initialize the table_reader again to get the table properties again, but it would involve a performance penalty due to reading new blocks from disk. We could potentially support it in the future, but may be not in the first version.
@sagar0 updated the pull request - view changes - changes since last import |
@sagar0 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Sorry I wasn't able to get to reviewing today, I'll try again in the morning. |
1. It is only supported with max_open_files = -1. 2. It is only supported with Block based table format.
@sagar0 updated the pull request - view changes - changes since last import |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! let's ship it today :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry last comment - is there a way to implement the test without sleeping?
Re. you comment about sleep ... I found that there is a |
Addressing review comments: 1. Pick only continguous files for deletion in FIFO-with-TTL. 2. Add a check to make sure that GetTableProperties() does not return a null pointer, before accessing fields fruther.
@sagar0 updated the pull request - view changes - changes since last import |
@sagar0 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Summary: Introducing FIFO compactions with TTL. FIFO compaction is based on size only which makes it tricky to enable in production as use cases can have organic growth. A user requested an option to drop files based on the time of their creation instead of the total size. To address that request: - Added a new TTL option to FIFO compaction options. - Updated FIFO compaction score to take TTL into consideration. - Added a new table property, creation_time, to keep track of when the SST file is created. - Creation_time is set as below: - On Flush: Set to the time of flush. - On Compaction: Set to the max creation_time of all the files involved in the compaction. - On Repair and Recovery: Set to the time of repair/recovery. - Old files created prior to this code change will have a creation_time of 0. - FIFO compaction with TTL is enabled when ttl > 0. All files older than ttl will be deleted during compaction. i.e. `if (file.creation_time < (current_time - ttl)) then delete(file)`. This will enable cases where you might want to delete all files older than, say, 1 day. - FIFO compaction will fall back to the prior way of deleting files based on size if: - the creation_time of all files involved in compaction is 0. - the total size (of all SST files combined) does not drop below `compaction_options_fifo.max_table_files_size` even if the files older than ttl are deleted. This feature is not supported if max_open_files != -1 or with table formats other than Block-based. **Test Plan:** Added tests. **Benchmark results:** Base: FIFO with max size: 100MB :: ``` svemuri@dev15905 ~/rocksdb (fifo-compaction) $ TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=readwhilewriting --num=5000000 --threads=16 --compaction_style=2 --fifo_compaction_max_table_files_size_mb=100 readwhilewriting : 1.924 micros/op 519858 ops/sec; 13.6 MB/s (1176277 of 5000000 found) ``` With TTL (a low one for testing) :: ``` svemuri@dev15905 ~/rocksdb (fifo-compaction) $ TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=readwhilewriting --num=5000000 --threads=16 --compaction_style=2 --fifo_compaction_max_table_files_size_mb=100 --fifo_compaction_ttl=20 readwhilewriting : 1.902 micros/op 525817 ops/sec; 13.7 MB/s (1185057 of 5000000 found) ``` Example Log lines: ``` 2017/06/26-15:17:24.609249 7fd5a45ff700 (Original Log Time 2017/06/26-15:17:24.609177) [db/compaction_picker.cc:1471] [default] FIFO compaction: picking file 40 with creation time 1498515423 for deletion 2017/06/26-15:17:24.609255 7fd5a45ff700 (Original Log Time 2017/06/26-15:17:24.609234) [db/db_impl_compaction_flush.cc:1541] [default] Deleted 1 files ... 2017/06/26-15:17:25.553185 7fd5a61a5800 [DEBUG] [db/db_impl_files.cc:309] [JOB 0] Delete /dev/shm/dbbench/000040.sst type=2 #40 -- OK 2017/06/26-15:17:25.553205 7fd5a61a5800 EVENT_LOG_v1 {"time_micros": 1498515445553199, "job": 0, "event": "table_file_deletion", "file_number": 40} ``` SST Files remaining in the dbbench dir, after db_bench execution completed: ``` svemuri@dev15905 ~/rocksdb (fifo-compaction) $ ls -l /dev/shm//dbbench/*.sst -rw-r--r--. 1 svemuri users 30749887 Jun 26 15:17 /dev/shm//dbbench/000042.sst -rw-r--r--. 1 svemuri users 30768779 Jun 26 15:17 /dev/shm//dbbench/000044.sst -rw-r--r--. 1 svemuri users 30757481 Jun 26 15:17 /dev/shm//dbbench/000046.sst ``` Closes #2480 Differential Revision: D5305116 Pulled By: sagar0 fbshipit-source-id: 3e5cfcf5dd07ed2211b5b37492eb235b45139174
Summary: Valgrind had false positive complaints about the initialization pattern for `GetCurrentTime()`'s argument in #2480. We can instead have the client initialize the time variable before calling `GetCurrentTime()`, and have `GetCurrentTime()` promise to only overwrite it in success case. Closes #2526 Differential Revision: D5358689 Pulled By: ajkr fbshipit-source-id: 857b189f24c19196f6bb299216f3e23e7bc4be42
Summary: `GetCurrentTime()` is used to populate `creation_time` table property during flushes and compactions. It is safe to ignore `GetCurrentTime()` failures here but they should be logged. (Note that `creation_time` property was introduced as part of TTL-based FIFO compaction in #2480.) Tes Plan: `make check` Closes #3231 Differential Revision: D6501935 Pulled By: sagar0 fbshipit-source-id: 376adcf4ab801d3a43ec4453894b9a10909c8eb6
Introducing FIFO compactions with TTL.
FIFO compaction is based on size only which makes it tricky to enable in production as use cases can have organic growth. A user requested an option to drop files based on the time of their creation instead of the total size.
To address that request:
if (file.creation_time < (current_time - ttl)) then delete(file)
. This will enable cases where you might want to delete all files older than, say, 1 day.compaction_options_fifo.max_table_files_size
even if the files older than ttl are deleted.This feature is not supported if max_open_files != -1 or with table formats other than Block-based.
Test Plan:
Added tests.
Benchmark results:
Base: FIFO with max size: 100MB ::
With TTL (a low one for testing) ::
Example Log lines:
SST Files remaining in the dbbench dir, after db_bench execution completed: