Skip to content

Commit

Permalink
btrfs: zoned: count fresh BG region as zone unusable
Browse files Browse the repository at this point in the history
The naming of space_info->active_total_bytes is misleading. It counts
not only active block groups but also full ones which are previously
active but now inactive. That confusion results in a bug not counting
the full BGs into active_total_bytes on mount time.

For a background, there are three kinds of block groups in terms of
activation.

  1. Block groups never activated
  2. Block groups currently active
  3. Block groups previously active and currently inactive (due to fully
     written or zone finish)

What we really wanted to exclude from "total_bytes" is the total size of
BGs #1. They seem empty and allocatable but since they are not activated,
we cannot rely on them to do the space reservation.

And, since BGs #1 never get activated, they should have no "used",
"reserved" and "pinned" bytes.

OTOH, BGs Rust-for-Linux#3 can be counted in the "total", since they are already full
we cannot allocate from them anyway. For them, "total_bytes == used +
reserved + pinned + zone_unusable" should hold.

Tracking Rust-for-Linux#2 and Rust-for-Linux#3 as "active_total_bytes" (current implementation) is
confusing. And, tracking #1 and subtract that properly from "total_bytes"
every time you need space reservation is cumbersome.

Instead, we can count the whole region of a newly allocated block group as
zone_unusable. Then, once that block group is activated, release
[0 ..  zone_capacity] from the zone_unusable counters. With this, we can
eliminate the confusing ->active_total_bytes and the code will be common
among regular and the zoned mode. Also, no additional counter is needed
with this approach.

Fixes: 6a921de ("btrfs: zoned: introduce space_info->active_total_bytes")
CC: [email protected] # 6.1+
Signed-off-by: Naohiro Aota <[email protected]>
Signed-off-by: David Sterba <[email protected]>
  • Loading branch information
naota authored and kdave committed Mar 15, 2023
1 parent df384da commit fa2068d
Show file tree
Hide file tree
Showing 2 changed files with 26 additions and 6 deletions.
8 changes: 7 additions & 1 deletion fs/btrfs/free-space-cache.c
Original file line number Diff line number Diff line change
Expand Up @@ -2693,8 +2693,13 @@ static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group,
bg_reclaim_threshold = READ_ONCE(sinfo->bg_reclaim_threshold);

spin_lock(&ctl->tree_lock);
/* Count initial region as zone_unusable until it gets activated. */
if (!used)
to_free = size;
else if (initial &&
test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &block_group->fs_info->flags) &&
(block_group->flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM)))
to_free = 0;
else if (initial)
to_free = block_group->zone_capacity;
else if (offset >= block_group->alloc_offset)
Expand Down Expand Up @@ -2722,7 +2727,8 @@ static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group,
reclaimable_unusable = block_group->zone_unusable -
(block_group->length - block_group->zone_capacity);
/* All the region is now unusable. Mark it as unused and reclaim */
if (block_group->zone_unusable == block_group->length) {
if (block_group->zone_unusable == block_group->length &&
block_group->alloc_offset) {
btrfs_mark_bg_unused(block_group);
} else if (bg_reclaim_threshold &&
reclaimable_unusable >=
Expand Down
24 changes: 19 additions & 5 deletions fs/btrfs/zoned.c
Original file line number Diff line number Diff line change
Expand Up @@ -1580,9 +1580,19 @@ void btrfs_calc_zone_unusable(struct btrfs_block_group *cache)
return;

WARN_ON(cache->bytes_super != 0);
unusable = (cache->alloc_offset - cache->used) +
(cache->length - cache->zone_capacity);
free = cache->zone_capacity - cache->alloc_offset;

/* Check for block groups never get activated */
if (test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &cache->fs_info->flags) &&
cache->flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM) &&
!test_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags) &&
cache->alloc_offset == 0) {
unusable = cache->length;
free = 0;
} else {
unusable = (cache->alloc_offset - cache->used) +
(cache->length - cache->zone_capacity);
free = cache->zone_capacity - cache->alloc_offset;
}

/* We only need ->free_space in ALLOC_SEQ block groups */
cache->cached = BTRFS_CACHE_FINISHED;
Expand Down Expand Up @@ -1901,7 +1911,11 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)

/* Successfully activated all the zones */
set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &block_group->runtime_flags);
space_info->active_total_bytes += block_group->length;
WARN_ON(block_group->alloc_offset != 0);
if (block_group->zone_unusable == block_group->length) {
block_group->zone_unusable = block_group->length - block_group->zone_capacity;
space_info->bytes_zone_unusable -= block_group->zone_capacity;
}
spin_unlock(&block_group->lock);
btrfs_try_granting_tickets(fs_info, space_info);
spin_unlock(&space_info->lock);
Expand Down Expand Up @@ -2265,7 +2279,7 @@ int btrfs_zone_finish_one_bg(struct btrfs_fs_info *fs_info)
u64 avail;

spin_lock(&block_group->lock);
if (block_group->reserved ||
if (block_group->reserved || block_group->alloc_offset == 0 ||
(block_group->flags & BTRFS_BLOCK_GROUP_SYSTEM)) {
spin_unlock(&block_group->lock);
continue;
Expand Down

0 comments on commit fa2068d

Please sign in to comment.