Update info re mixed raid levels Web-UI compatibility #468
Updates main Pools doc entry re mixed raid profiles,
and newer supported profile info. Also moves mixed raid info
previously located in stable_kernel_backport.rst and updates
that page to reflect upstream parity raid read-write defaults.
Some minor rewording of existing content.

Incidental update re zstd compression addition.
phillxnet committed Dec 4, 2024
1 parent 997105c commit b57bb7e
Showing 2 changed files with 160 additions and 117 deletions.
79 changes: 22 additions & 57 deletions howtos/stable_kernel_backport.rst
@@ -6,10 +6,11 @@ Installing the Stable Kernel Backport
.. warning::

This How-to is intended for advanced users only.
Its contents are likely irrelevant unless you require capabilities beyond our default openSUSE base.
We include these instructions with the proviso that they will significantly modify your system from our upstream base.
As such you will be running a far less tested system, and consequently may face more system stability/reliability risks.
N.B. Pools created with this newer kernel have the newer free space tree (space_cache=v2).
As do Pools created using "Built on openSUSE" Leap 15.6 and newer instances.
Future imports require kernels which are equally new/capable (at least ideally).

If you are reporting issues on our `Community Forum <https://forum.rockstor.com/>`_
@@ -19,16 +20,16 @@ please indicate if you have applied these changes.
Install all updates before following these instructions.
And test your system again after a reboot to ensure that this procedure is still necessary.

As of writing, the Leap version used in our "Built on openSUSE" comes with many openSUSE/SuSE organised patches.
Almost all of these patches are backports from newer kernels,
applied to a designated 'base' kernel version.
As such the running kernel is newer than its 'base' version indicates.

Rockstor V4 "Built on openSUSE" no longer installs newer kernels than its upstream OS,
as it once did when based on CentOS 7 (Rockstor v3).
Rockstor "Built on openSUSE" does not install newer kernels than its upstream OS,
as it once did when based on CentOS 7 (Rockstor v3 and earlier).
That is primarily because it is no longer necessary:
our new upstream actively maintains btrfs and employs some of the key btrfs contributors.
As a result many relevant btrfs backports are already in our upstream default kernel.
But in some situations it may be desirable to enable a newer base kernel version.

.. _why_newer_kernel:
@@ -46,20 +47,29 @@ There are two main reasons:
Btrfs raid 5/6 read-only
------------------------

OpenSUSE's Leap 15.3/15.4/15.5 default kernels restrict the parity raid levels of 5 & 6 to read-only.
This decision was taken as the parity raid profiles are far younger than the non parity profiles of 0, 1, and 10.

From openSUSE Leap 15.6 this is no longer the case as a far newer kernel base was chosen.
As such our "Built on openSUSE" Leap 15.6 variant and newer have read-write parity btrfs raid out-of-the-box.

See :ref:`redundancyprofiles` for supported btrfs profiles per Rockstor version.

Pre "Built on openSUSE" Leap 15.6,
when creating a parity raid pool (volume in btrfs parlance), the following message appears in the system journal:

.. code-block:: console

   kernel: btrfs: RAID56 is supported read-only, load module with allow_unsupported=1
Rather than just *allowing unsupported*, it was proposed that one instead take advantage of a newer kernel.
In this case the upstream-of-openSUSE latest stable kernel version back-ported to openSUSE.
And in turn Rockstor.
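
As an illustrative aside (not a required step), the points above can be checked from a console; the commands shown are standard and their output will vary per install:

.. code-block:: console

   # Show the currently running kernel version.
   uname -r
   # On pre Leap 15.6 kernels, the RAID56 read-only restriction message appears
   # in the kernel log once a parity raid pool is created.
   journalctl -k | grep -i "raid56"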

.. warning::

As of Leap 15.6 there are far fewer btrfs-related reasons to use this Stable Kernel Backport approach.
Consider instead installing a later version of Rockstor,
or following the appropriate in-place "Distribution update from 15.* to 15.*" how-to.

.. _newer_kernel_repos:

@@ -118,48 +128,3 @@ In which case omit the "--no-recommends" option to also install these firmware
.. note::

A system reboot will be required for the above changes to take effect.

.. _raid1c3_raid1c4:

Btrfs raid1c3 raid1c4
---------------------

These raid levels are currently the newest available in btrfs.
As they are based on the far more mature btrfs raid1 they may be considered more mature than the parity raid levels.
They simply 'amplify' the number of copies stored across the same number of independent devices.

- **raid1c3** - 3 copies across 3 independent drives.
- **raid1c4** - 4 copies across 4 independent drives.

The above :ref:`stable_kernel_backport` procedure also enables the use of these even newer btrfs raid levels.
At least in the underlying operating system.

.. note::

Rockstor 'allows' these raid levels but is currently unaware of them.
As such, if any Pool modifications are enacted via the Web-UI,
e.g. :ref:`poolbalance` or :ref:`poolresize`, the Rockstor defaults will be reasserted.
See :ref:`dlbalance_re_raid` to reassert a custom raid profile.

.. _mixed_raid_levels:

Btrfs mixed raid levels
-----------------------

Btrfs, somewhat uniquely, can have one raid level for data and another for metadata.
One approach to alleviate the currently known design issues in the btrfs parity raid levels
is to use:

- **data** - btrfs raid5 or preferably raid6
- **metadata** - btrfs raid1c3 or preferably raid1c4

Note that with the preferred options above btrfs can have a 2 disk failure capability per pool.
This is of particular interest to those running pools consisting of many devices.

.. note::

As per the :ref:`raid1c3_raid1c4` note, Rockstor is unaware of some non-standard data/metadata mixes.
And likewise the Web-UI Pool operations of :ref:`poolbalance` or :ref:`poolresize`
will undo any custom pool data/metadata mixed raid setup and revert to Rockstor defaults.
See :ref:`dlbalance_re_raid` to re-assert a custom mixed raid arrangement.
All other operations, however, should function as normal.
198 changes: 138 additions & 60 deletions interface/storage/pools-btrfs.rst
@@ -3,10 +3,10 @@
Pools
=====

A Pool in Rockstor is a set of drives combined and represented as a single
volume. Pools have attributes such as redundancy profile and compression to
safeguard and store data efficiently. Pools can be expanded or shrunk by adding
or removing drives. In other words, a Pool is a single or multi device
BTRFS filesystem.

Pool related operations can be managed from the **Pools** screen listed under
@@ -17,7 +17,7 @@ the **Storage** tab of the Web-UI.
Creating a Pool
---------------

Whole (un-partitioned) drives are very much preferred as Rockstor Pool members.

See :ref:`import_data` to re-establish a prior Rockstor install's Pool.
It may also be possible to import similarly structured btrfs volumes, single or multi member.
@@ -31,66 +31,144 @@ There is a tooltip for each input field to help you choose appropriate parameter
Redundancy profiles
^^^^^^^^^^^^^^^^^^^

A broad abstraction of BTRFS raid profiles is available when creating a pool.

- Rockstor 4.1.0-0 and earlier supported btrfs raid single, 0, 1, 10, 5, and 6, with no mixed profile awareness.
- 4.6.0-0's Web-UI supports: single, single-dup, 0, 1, 10, 5, 6, raid1c3, raid1c4, raid1-1c3, raid1-1c4,
raid10-1c3, raid10-1c4, raid5-1, raid5-1c3, raid6-1c3, raid6-1c4.
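
As an aside, a mounted pool's current data and metadata profiles can be inspected from a console; the mount path below is a placeholder:

.. code-block:: console

   # Lists the Data, Metadata, and System block group profiles for the given mount.
   btrfs filesystem df /mnt2/mypool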

.. warning::
Please see :ref:`btrfsnature` to avoid some surprises regarding the way btrfs does raid.

.. note::
No btrfs raid profile requires that its member drives be matched in size.
But in the case of btrfs raid0 particularly,
the available space is maximised if they are similar, or ideally the same.

* **Single**: This profile offers no redundancy.
A single disk failure will result in the entire Pool being lost.
It is the only valid option for creating a Pool with a single disk drive.
It is also recommended if you have multiple disks of very different sizes,
yielding higher total capacity compared to **Raid0** in this setting.

* **Raid0**: This profile offers no redundancy.
A single disk failure will result in the entire Pool being lost.
Two or more disks are required for this profile.
It is recommended only when there is no need for redundancy,
and offers better performance than the **Single** btrfs-raid option.
Both data and metadata are striped across the disks.
It is recommended for same/similar size disks.
If you have very differently sized disks and no need for redundancy,
the **Single** profile provides higher capacity.

* **Raid1**: Two or more disks are required for this profile.
This profile can sustain **a maximum of one disk failure**.
Data and metadata are replicated on 2 independent devices,
irrespective of the total pool member count.

* **Raid5**: Two or more disks are required for this profile.
This profile can sustain **a maximum of one disk failure**.
Uses parity and striping.
The BTRFS community consensus is that btrfs-raid5 is not yet
fully stable and so is ***not recommended for production use***.

* **Raid6**: Three or more disks are required for this profile.
This profile can sustain **a maximum of two disk failures**.
Uses dual-parity and striping.
The BTRFS community consensus is that btrfs-raid6 is not yet
fully stable and so is ***not recommended for production use***.

* **Raid10**: Four or more disks are required for this profile.
This profile can sustain **a practical maximum of one disk failure**.
Uses a Raid0 (stripe) of Raid1 mirrors.
Btrfs-raid 10 offers the best overall performance with single disk redundancy.
.. _mixed_raid_levels:

Btrfs mixed raid levels
~~~~~~~~~~~~~~~~~~~~~~~

Btrfs, somewhat uniquely, can have one raid level for data and another for metadata.
One alternative to using btrfs parity raid levels for metadata, known to be slow, is to use:

- **data** - btrfs raid5 or preferably raid6
- **metadata** - btrfs raid1c3 or preferably raid1c4

I.e. a btrfs raid6-1c4 or raid6-1c3 pool will have a 2 disk failure capability.
This is of particular interest to those running pools with a higher device count.

.. note::
Rockstor from 4.6.0-0 onwards is required for Web-UI awareness of mixed profiles.
The naming convention adopted within the Web-UI is essentially:
"data-metadata" where the metadata profile is an abridged version, e.g. :ref:`raid5`-:ref:`1 <raid1>`.
I.e. short-hand for :ref:`raid5` data, with :ref:`raid1` metadata.
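
For orientation only: at the btrfs level a mixed profile such as raid6-1c4 corresponds to separate data and metadata conversion targets in a balance, roughly as sketched below (the mount path is a placeholder; the Web-UI performs the equivalent re-raid for you):

.. code-block:: console

   # Convert an existing pool's data to raid6 and its metadata to raid1c4.
   btrfs balance start -dconvert=raid6 -mconvert=raid1c4 /mnt2/mypool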

.. _single:

single
......

**No redundancy.**
A single disk failure will result in the entire Pool being lost.
Valid option for creating a Pool with a single drive.
It is also recommended if you have multiple drives of very different sizes,
yielding higher total capacity compared to :ref:`raid0` in this setting.

.. _single_dup:

single-dup
..........

**Minimal redundancy for metadata only.**
A single drive failure will result in the entire Pool being lost.
Valid option for creating a Pool with a single drive.
Uses :ref:`single` for data, with duplication of metadata.
Metadata duplication does NOT span devices: it provides only a second metadata copy,
possibly on the same device. This enables fall-back and repair on metadata read checksum failures.
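
As a rough illustration of this profile at the btrfs level (device name is a placeholder; Pools are normally created via the Web-UI):

.. code-block:: console

   # Single (unreplicated) data with duplicated (DUP) metadata on one device.
   mkfs.btrfs -d single -m dup /dev/sdX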

.. _raid0:

raid0
.....

**No redundancy.**
A single drive failure will result in the entire Pool being lost.
Two or more drives are required for this profile.
It is recommended only when there is no need for redundancy,
and offers better performance than the :ref:`single` profile.
Both data and metadata are striped across the drives.
It is recommended for same/similar size drives.
If you have very differently sized drives and no need for redundancy,
the :ref:`single` profile provides higher capacity.

.. _raid1:

raid1
.....

Can sustain **a maximum of one drive failure**.
Two or more drives are required.
Data and metadata are replicated on two independent devices,
irrespective of the total pool member count.

.. _raid5:

raid5
.....

Can sustain **a maximum of one drive failure**.
Two or more drives are required.
Uses parity and striping.
The BTRFS community consensus is that btrfs raid5 is ***not recommended for production/metadata use***.

.. _raid6:

raid6
.....

Can sustain **a maximum of two drive failures**.
Three or more drives are required.
Uses dual-parity and striping.
The BTRFS community consensus is that btrfs raid6 is ***not recommended for production/metadata use***.

.. _raid10:

raid10
......

Can sustain **a practical maximum of one drive failure**.
Four or more drives are required.
Uses a Raid0 (stripe) of Raid1 mirrors.
Btrfs raid 10 offers the best overall performance with single drive redundancy.

.. _raid1c3_raid1c4:

raid1c3 & raid1c4
.................

Can sustain **two or three drive failures respectively**.
Three or four drives are respectively required.
These raid profiles are a more recent addition to btrfs.
Based on the far more mature btrfs :ref:`raid1`,
they may be considered more mature than the parity raid levels of :ref:`raid5` and :ref:`raid6`.
They essentially 'amplify' the number of copies stored across the same number of independent devices:

- **raid1c3** - 3 copies across 3 independent drives.
- **raid1c4** - 4 copies across 4 independent drives.
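
As a rough illustration of raid1c3 at the btrfs level (device names are placeholders; Pools are normally created via the Web-UI):

.. code-block:: console

   # Three copies of data and metadata spread across three devices.
   mkfs.btrfs -d raid1c3 -m raid1c3 /dev/sdX /dev/sdY /dev/sdZ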

Please see the `btrfs docs <https://btrfs.readthedocs.io/en/latest/Introduction.html>`_
for up to date information on all btrfs matters.

For a BTRFS features stability status overview
see: `btrfs docs Status <https://btrfs.readthedocs.io/en/latest/Status.html>`_.

.. warning::

Rockstor "Built on openSUSE" before Leap 15.6 defaulted to read-only for :ref:`raid5` & :ref:`raid6`.
Write access can be enabled on older installs via :ref:`stable_kernel_backport`: **advanced users only**.
Preferably consider an in-place OS update via the appropriate "Distribution update from 15.* to 15.*" how-to.

Compression Options
^^^^^^^^^^^^^^^^^^^
@@ -103,13 +181,13 @@ Compression can also be set at the Share level. If you don't want to enable
compression for all Shares under a Pool, don't enable it at the Pool
level. Instead, selectively enable it on Shares.

Besides not enabling compression at all, there are three additional choices.
For more info see: `btrfs.readthedocs compression <https://btrfs.readthedocs.io/en/latest/Compression.html>`_

* **zlib**: Provides slower but higher compression ratio. Levels as yet unsupported.
* **lzo**: A faster compression algorithm but provides lower ratio compared to **zlib**.
* **zstd**: Comparable compression to **zlib** but faster. Levels as yet unsupported.
  Requires Rockstor 5.0.2-0 "Built on openSUSE" Leap 15.4 or newer.
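
For context only: Pool-wide compression corresponds to a btrfs compress mount option, roughly as sketched below (the mount path is a placeholder; Rockstor applies the chosen option for you):

.. code-block:: console

   # Example of the underlying mount option when zstd compression is selected.
   mount -o remount,compress=zstd /mnt2/mypool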

.. _poolmountoptions:

@@ -152,15 +230,15 @@ out more about each option from the `BTRFS documentation mount options section
Pool Resize/ReRaid
------------------

A convenience feature of btrfs Pool management is the ability to add or remove drives,
and change redundancy profiles, while still using the Pool.
The persistence of a pool's accessibility is otherwise known as its 'online' state.
And so these changes are referenced as its online capabilities.

A performance reduction is expected during any changes of this sort,
but depending on your hardware overhead, this can be unnoticeable.

**Note that increases in: drive count, percent usage, snapshots count, and Pool size can all impact the memory and CPU required,
and the time for any changes to be enacted.**

Pool Resize / ReRaid may be done for the following reasons.
@@ -184,7 +262,7 @@ You can change :ref:`redundancyprofiles` online with only a few restrictions.

1. The resulting pool must have sufficient space for the existing data.
2. The target drive count must be sufficient for the target btrfs raid profile.
3. Rockstor can simultaneously change btrfs raid levels while :ref:`pooladddisks`, but NOT while :ref:`poolremovedisks`.

Because of (3.) above, when removing a drive (attached or detached) from a pool which is already at the minimum drive count,
we have to first change the raid level of that pool.
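
As a rough sketch of the underlying btrfs operations (device names and mount path are placeholders; the Web-UI drives the equivalent for you):

.. code-block:: console

   # Add a drive to an existing pool, then convert both data and metadata
   # to the target raid profile in a single balance.
   btrfs device add /dev/sdX /mnt2/mypool
   btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt2/mypool
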
@@ -199,12 +277,12 @@ Pool has below minimum members

This situation is most common in non-industrial DIY setups where a pool will often have only the minimum number of disks.

In the following example we have a btrfs raid1 Pool (minimum 2 disks) that has a detached/missing member.
We have already refreshed our backups via the suggested ro,degraded mount,
from the Pool details maintenance section that appeared.
And we have then switched to a rw,degraded mount to allow for the Pool changes.

A degraded mount option is required when there is a detached/missing disk, irrespective of drive count and btrfs raid level.
Otherwise any mount operation is refused.
The intention of the obligatory 'degraded' option is to ensure conscious intervention during an enhanced data loss state.
And a Pool may well go read-only on its own, by design, shortly after losing access to one of its members.
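
For orientation only, the command-line counterpart of the above looks roughly like the following (device names, devid, and mount path are placeholders; the Web-UI performs these steps for you):

.. code-block:: console

   # Mount the degraded pool read-write using one of its remaining members.
   mount -o rw,degraded /dev/sdX /mnt2/mypool
   # With a replacement drive attached, replace the missing member
   # (its devid is shown by 'btrfs filesystem show').
   btrfs replace start <missing-devid> /dev/sdY /mnt2/mypool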
