retry device mapper and cryptsetup errors #1721

anmaxvl · 2023-04-07T18:17:30Z

Occasionally, /dev/sd* devices arrive late and are not yet available when verity or dm-crypt targets are created. This commit introduces a `CreateDevice` wrapper that can retry the operation on specific errors. cryptsetup is always retried once, but with a large retry timeout.

internal/guest/storage/devicemapper/devicemapper.go

BryceDFisher · 2023-04-07T20:17:18Z

internal/guest/storage/devicemapper/targets.go

 	if err != nil {
-		return "", errors.Wrapf(err, "failed to create dm-verity target. device=%s", devPath)
+		return "", fmt.Errorf("frailed to create dm-verity target for device=%s: %w", devPath, err)


Is this specific to container layer devices? If so, would it be possible to make the error message say something like "container layer for container [X] could not be mounted in time. Retrying may resolve this issue."? That way, the person consuming the error knows more specifically what happened and how to move forward.

Not necessarily. At this point, GCS isn't aware of whether a container layer is being mounted. The host, however, is. The host-side error message will look something like `failed to add LCOW layer: failed to add SCSI layer: failed to modify UVM with new SCSI mount: guest modify: guest RPC failure: frailed to create dm-verity target for device=/dev/sda: device-mapper table load: no such device: unknown`, so I don't think we need to word this differently.

BryceDFisher · 2023-04-07T20:18:41Z

internal/guest/storage/devicemapper/devicemapper.go

+		}
+		// check retry-able errors
+		for _, e := range errs {
+			if errors.Is(dmErr.Err, e) {


If a non-retriable error is encountered, does it make sense to fail fast?

That's what it does: we loop through the retriable errors, and if we hit a non-retriable error (one that matches none of them), we finish the loop and return the error on L255.

internal/guest/storage/devicemapper/devicemapper.go

Occasionally, /dev/sd* devices arrive late and are not yet available when verity or dm-crypt targets are created. This commit introduces a `CreateDevice` wrapper that can retry the operation on specific errors. cryptsetup is always retried once, but with a large retry timeout. Signed-off-by: Maksim An <[email protected]>

anmaxvl mentioned this pull request Apr 7, 2023

hack: add blanket retries on device-mapper failures with SCSI #1720

Merged

BryceDFisher reviewed Apr 7, 2023

internal/guest/storage/devicemapper/devicemapper.go (outdated)

BryceDFisher reviewed Apr 7, 2023

anmaxvl force-pushed the retry-dev-mapper-create-device branch 4 times, most recently from 2e176ac to 1d8d651 on April 10, 2023 23:51

anmaxvl marked this pull request as ready for review April 11, 2023 04:09

anmaxvl requested a review from a team as a code owner April 11, 2023 04:09

msscotb assigned msscotb and helsaawy Apr 12, 2023

helsaawy approved these changes Apr 12, 2023

msscotb reviewed May 2, 2023

internal/guest/storage/devicemapper/devicemapper.go

internal/guest/storage/devicemapper/devicemapper.go (outdated)

anmaxvl force-pushed the retry-dev-mapper-create-device branch from 1d8d651 to 9a4a019 on May 5, 2023 01:15

anmaxvl force-pushed the retry-dev-mapper-create-device branch from 9a4a019 to ee8f293 on June 6, 2023 00:45

msscotb approved these changes Jun 29, 2023

anmaxvl force-pushed the retry-dev-mapper-create-device branch from ee8f293 to 89e4738 on August 7, 2023 17:05

anmaxvl merged commit 0423eec into microsoft:main Aug 7, 2023

anmaxvl deleted the retry-dev-mapper-create-device branch August 7, 2023 17:44

anmaxvl commented Apr 7, 2023

BryceDFisher Apr 7, 2023

anmaxvl Apr 10, 2023

BryceDFisher Apr 7, 2023

anmaxvl Apr 7, 2023
