Fast USB DFU workflow #41921

Closed
hackwerken opened this issue Jan 18, 2022 · 14 comments
Labels
area: DFU Device Firmware Upgrade area: USB Universal Serial Bus Enhancement Changes/Updates/Additions to existing features

Comments

@hackwerken
Contributor

hackwerken commented Jan 18, 2022

Is your enhancement proposal related to a problem? Please describe.

The current implementation of USB DFU for Zephyr is not very streamlined. This makes it unusable for our product. See Additional context.

The current workflow (as far as I understand) is as follows:

  1. Download new binary to slot 1 using dfu-util
  2. Reset the device
    • There seems to be no solid way to do this automatically from the host. It could be done via the shell, but that adds two large dependencies. Also sometimes the CDC ACM endpoint seems to have trouble coming back up after the download.
  3. Bootloader swaps the slots
  4. Confirm the image
    • Again there seems to be no way to do this nicely without the shell.
    • It is possible to download an image that is already confirmed. However, this increases the image size to the maximum possible value, since the swap status is stored at the end of the image partition. This is quite significant on the nRF52840, since it has lots of flash.
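
To make the size penalty of a preconfirmed image concrete, here is a minimal Python sketch of the arithmetic. It assumes a preconfirmed image is padded to the full slot so that the trailer lands at the end of the partition. The slot and image sizes are made-up illustration values, not measurements from a real board.

```python
def bytes_to_transfer(image_size: int, slot_size: int, preconfirmed: bool) -> int:
    """Return how many bytes the host has to send for one download."""
    if image_size > slot_size:
        raise ValueError("image does not fit in the slot")
    # A preconfirmed image carries its trailer at the end of the slot,
    # so everything up to the end of the slot has to be transferred.
    return slot_size if preconfirmed else image_size


SLOT_SIZE = 480 * 1024   # hypothetical slot size in bytes
IMAGE_SIZE = 120 * 1024  # hypothetical application size in bytes

plain = bytes_to_transfer(IMAGE_SIZE, SLOT_SIZE, preconfirmed=False)
padded = bytes_to_transfer(IMAGE_SIZE, SLOT_SIZE, preconfirmed=True)
print(f"unconfirmed: {plain} bytes, preconfirmed: {padded} bytes, "
      f"ratio: {padded / plain:.1f}x")
```

The smaller the application is relative to the slot, the larger the penalty for downloading it preconfirmed.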

I am mainly opening this issue to see if there is interest from the community to fix this. Since we need this anyway for our company we are interested to know if something like this would be merged.

Describe the solution you'd like

The ideal (and fastest) workflow for us would be if developers could just run west flash --runner dfu and have the same experience as with a debugger (minus actual debugging, of course).

Keep in mind this flow is most suited to developers. For customer facing updates this may not be ideal.

Here are the steps that need to happen under the hood:

  1. dfu-util sends DFU_DETACH followed by a USB reset, as per the USB DFU standard. This causes the MCU to reset into the bootloader, where the actual download is performed.
    • The application somehow has to let the bootloader know that it should enter DFU mode instead of booting the application. The nRF52 has some registers that are retained over a software reset; those could be used for that. I'm not sure if something similar exists on other platforms.
    • For testing at our company we currently have a shell command that sets the GPREGRET registers and resets the MCU. This works fine, but needs the shell enabled.
  2. MCUboot enables USB and USB DFU
    • This is similar to the USB DFU recovery mode already present in MCUboot's Zephyr code. Only an additional check is needed to see if it should stay in DFU mode.
  3. dfu-util downloads the binary directly to slot 0
    • Because the binary is written from the bootloader, it is possible to write into slot 0. This eliminates both the swap and confirm operations, saving time and complexity.
    • This of course isn't as safe as using the complete test-confirm cycle, but if the upload fails the device can always be debricked with the DFU recovery mode.
  4. The bootloader directly boots the uploaded image!
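
The handshake in steps 1 and 2 can be sketched as a small Python model. This is not Zephyr or MCUboot code: the magic value, the `RetainedRegister` class, and the function names are hypothetical and only illustrate how a value that survives a software reset (such as GPREGRET on nRF52) could tell the bootloader to stay in DFU mode.

```python
from dataclasses import dataclass
from enum import Enum

DFU_REQUEST_MAGIC = 0xB1  # hypothetical marker value, not from any real header


class BootMode(Enum):
    DFU = "dfu"
    APPLICATION = "application"


@dataclass
class RetainedRegister:
    """Stands in for a register that keeps its value across a software reset."""
    value: int = 0


def application_request_dfu(reg: RetainedRegister) -> None:
    """Application side: called on DFU_DETACH, just before resetting the MCU."""
    reg.value = DFU_REQUEST_MAGIC


def bootloader_select_mode(reg: RetainedRegister, slot0_valid: bool) -> BootMode:
    """Bootloader side: decide whether to stay in DFU mode or boot slot 0."""
    requested = reg.value == DFU_REQUEST_MAGIC
    reg.value = 0  # clear the request so the next reset boots normally
    if requested or not slot0_valid:
        return BootMode.DFU
    return BootMode.APPLICATION


reg = RetainedRegister()
application_request_dfu(reg)
print(bootloader_select_mode(reg, slot0_valid=True))  # request honoured once
print(bootloader_select_mode(reg, slot0_valid=True))  # next reset boots the app
```

Clearing the flag inside the bootloader keeps a failed or aborted download from trapping the device in DFU mode on every subsequent reset. Falling back to DFU when slot 0 is invalid is an extra assumption that mirrors the recovery mode mentioned above.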

I have some proof-of-concept code for enabling step 3. The current USB DFU implementation (correctly) forbids writing to slot 0. I only need to add some Kconfig options so these checks can be enabled or disabled depending on whether the code is used from the bootloader or the application. I should be able to have a PR up by next week.

I suspect step 1 would be a bit more work. I guess usb_dfu.c would need to be split up somehow: one part for 'reset to bootloader', and another to do the actual download.

Describe alternatives you've considered

Going the USB DFU route seems the nicest for our use case, since it seems to be the fastest in terms of upload speed (given CONFIG_USB_REQUEST_BUFFER_SIZE is set to a high enough value).

We also experimented with mcumgr over USB, but that seems pretty slow with all the overhead involved. It also didn't seem to be very reliable.

Additional context

Our product (based on an nRF52840) has a closed housing with only a USB port exposed. For development we currently have to partly disassemble the product and hook up a debugger.

This works of course, but it is not very convenient for several reasons:

  • Quickly testing a change requires disassembly and lots of desk space.
  • Our product consists of several (up to 24) identical units that communicate with each other. Flashing them one by one via a debugger is tedious...
  • We have some developers that only work on the higher level side of the code. They don't need the debugging functionality anyway.

This is very specific to our product of course. But we figured improving this might be useful for more people.

Reference

https://usb.org/sites/default/files/DFU_1.1.pdf

@hackwerken hackwerken added the Enhancement Changes/Updates/Additions to existing features label Jan 18, 2022
@Laczen
Collaborator

Laczen commented Jan 19, 2022

Hi @hackwerken, it seems like you are willing to give up all the nice mcuboot properties (downgrade protection, encrypted software, ...) to simplify the development process. This seems like the wrong approach. Are you sure that your final product can go without these properties?

Just because mcuboot is described as a secure bootloader does not mean that you can't disable all the security.

@hackwerken
Contributor Author

Hi @Laczen, thanks for your response.

We indeed do not use the features you mentioned for various reasons. Image signing is the most important for us, which is still intact with the proposed changes. We are also not sure yet if we want to ship a bootloader containing this behavior to end customers.

In any case, it would be nice to be able to optionally enable writing to slot 0 from the bootloader. We could then flash our development units with a special version of the bootloader.

Do you know what the original use case for the USB DFU feature was? I ask because some pieces seem to be missing.

@jfischer-no
Collaborator

jfischer-no commented Jan 21, 2022

What you describe is how samples/subsys/usb/dfu works. Your application does not have to be like that; you can also use the USB DFU class from/with the bootloader only (which is the obvious option). IIRC there is example code in MCUboot itself (where MCUboot is just a Zephyr OS application) showing how to trigger an update, e.g. using a button or just a timeout.

Isn't CONFIG_SINGLE_APPLICATION_SLOT what you are looking for?
⬆️ @nvlsianpu

@jfischer-no jfischer-no added area: DFU Device Firmware Upgrade area: USB Universal Serial Bus labels Jan 21, 2022
@hackwerken
Contributor Author

Hi @jfischer-no, thank you for your suggestions.

You are right, the Zephyr MCUboot port has a recovery option using USB DFU. As you mentioned, this requires a button or some other kind of trigger to stay in the bootloader. This works great on the nRF52840 DK. But, apart from requiring extra user input, this also requires a spare button. Unfortunately our product only has one button, to turn it on/off. The ideal workflow would be to only have to run west flash --runner dfu and have the application running some seconds afterwards. Kind of the Arduino experience, so to say.

CONFIG_SINGLE_APPLICATION_SLOT is an interesting option I hadn't considered yet. It would simplify some things, as it does not require the patches to usb_dfu.c I mentioned. It does however remove the possibility to update via BLE. So then we would have to flash some units for USB and others for BLE.

Also, I realize I'm talking a lot in terms of 'we' and 'our product'. However, my goal for this issue is to find out if there is some common ground with other Zephyr users. If that is the case we are willing to put in the effort to implement it in such a way that it can be upstreamed.

@Laczen
Collaborator

Laczen commented Jan 21, 2022

@hackwerken, it would be good if there were a Zephyr application that fully supports usb-dfu. As far as I understand usb-dfu, this application should:
a. Allow moving from usb-run to usb-dfu,
b. Remain in usb-dfu during the upload,
c. Reboot and upgrade (if using the image0 and image1 solution),
d. After the upgrade, restart in usb-dfu to allow the manifestation (which should include image validation),
e. Switch to usb-run.

Step d. could use the state of the validation to decide whether an image should start in usb-run or usb-dfu. If the image upgrade fails for whatever reason, the running image would still be the old one, which would not start usb-dfu; the uploader could use that to error out.
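
The decision in step d. can be sketched in a few lines of Python. This is only a model of the idea: `UsbMode`, `startup_mode`, and `upload_succeeded` are hypothetical names, and "confirmed" stands for whatever validation state the bootloader exposes.

```python
from enum import Enum


class UsbMode(Enum):
    RUN = "usb-run"
    DFU = "usb-dfu"


def startup_mode(image_confirmed: bool) -> UsbMode:
    """Device side: a freshly upgraded, still unconfirmed image starts in
    usb-dfu so the host can finish the manifestation; a confirmed image
    (including the old image after a failed upgrade) starts in usb-run."""
    return UsbMode.RUN if image_confirmed else UsbMode.DFU


def upload_succeeded(mode_after_reboot: UsbMode) -> bool:
    """Host side: the uploader treats 'device did not come back in usb-dfu'
    as a failed upgrade."""
    return mode_after_reboot is UsbMode.DFU


print(upload_succeeded(startup_mode(image_confirmed=False)))  # new image booted
print(upload_succeeded(startup_mode(image_confirmed=True)))   # old image booted
```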

@hackwerken
Contributor Author

@Laczen right!

With D, do you mean to 'confirm' the image (in mcuboot terms)?

That is a good point. Because C and D seem to be the missing pieces of the puzzle.

I'll look more into the manifestation phase next week.

@Laczen
Collaborator

Laczen commented Jan 21, 2022

Yes, by D I mean the image confirmation. However, you could also skip the confirmation in the manifestation and instead assign a special DFU location to which you would write "OK" in order to validate the image. The running image could be asked to go to DFU mode, allow upgrades only if the image is confirmed, and allow writing to the special "OK" location.

@jfischer-no
Collaborator

> You are right, the Zephyr MCUboot port has a recovery option using USB DFU. As you mentioned, this requires a button or some other kind of trigger to stay in the bootloader. This works great on the nRF52840 DK. But, apart from requiring extra user input, this also requires a spare button. Unfortunately our product only has one button, to turn it on/off. The ideal workflow would be to only have to run west flash --runner dfu and have the application running some seconds afterwards. Kind of the Arduino experience, so to say.

IIRC there is also a timeout option, but it would not make it faster. There is no way without disadvantages: either USB DFU in the bootloader with user interaction, or issues/limitations in USB communication in application mode, since USB DFU combined with other classes usually does not work, on Windows anyway.

> CONFIG_SINGLE_APPLICATION_SLOT is an interesting option I hadn't considered yet. It would simplify some things, as it does not require the patches to usb_dfu.c I mentioned. It does however remove the possibility to update via BLE. So then we would have to flash some units for USB and others for BLE.

What kind of product is it that is updated with west flash --runner dfu or via BLE? Does it happen in the field?
Have you considered using mcumgr and DFU over CDC ACM UART?

@hackwerken
Contributor Author

@Laczen

> However, you could also skip the confirmation in the manifestation and instead assign a special DFU location to which you would write "OK" in order to validate the image.

I'm not sure what you mean by this. Such a location already exists in the image trailer. Since the trailer is at the end of an image partition, a 'preconfirmed' image is very large (as large as it can be).

Meanwhile I dived a bit deeper into the USB DFU specification. I might be mistaken, but as far as I can tell the manifestation phase works differently from what you describe. Because in this case the device cannot communicate during manifestation (i.e. the swap), it has to set bitManifestationTolerant=0 and bitWillDetach=1. After the update is complete, the new image is booted in DFU app mode.

I made this diagram to show how that could work for Zephyr/Mcuboot.
[Image: diagram of the 'DFU in app' flow for Zephyr/MCUboot]

This only leaves image confirmation. I see a few points where this could happen:

  1. Uploading a preconfirmed image
    • Makes the image size very large
    • Requires extra scripts to generate the image via the sign tool
  2. Confirm from the application before swapping (at 6 in the diagram)
  3. Confirm on boot of the newly flashed image
    • How does the app know it was updated successfully and not reverted?
    • How does the app know it was updated via USB DFU and not some other way (i.e. BLE)?
    • Does not protect against uploading an image without USB DFU...
    • Could use something like GPREGRET on nRF52, but that would not be platform agnostic.
  4. ??

To me, option 2 seems the best to start with. All three bypass the test-confirm mechanism, but manual recovery could be enabled via the bootloader.

@hackwerken
Contributor Author

@jfischer-no

> IIRC there is also a timeout option, but it would not make it faster. There is no way without disadvantages: either USB DFU in the bootloader with user interaction, or issues/limitations in USB communication in application mode, since USB DFU combined with other classes usually does not work, on Windows anyway.

Entering the bootloader without user interaction is possible. We currently have a proof of concept shell command that does this. But as I wrote in the original issue, this uses the nRF52 GPREGRET register. That could be a drawback if nothing similar is available on other platforms.

For completeness I also made a diagram for the 'DFU in bootloader' solution:
[Image: diagram of the 'DFU in bootloader' flow]

This has the same issues with confirmation as the 'DFU in app' solution. It seems like USB DFU isn't really intended for such a mechanism. But when updating from the bootloader this isn't a big deal, since manual recovery is always possible.

> What kind of product is it that is updated with west flash --runner dfu or via BLE? Does it happen in the field?

We make an interactive toy to encourage kids to play outside. There are already thousands being used in the field (hopefully many more will follow :)). Our customers use BLE to update their devices.
Since this product requires many devices to run the same firmware, it would be nice to have an update mechanism that is faster than BLE. This won't (yet) be used by customers, but it will be invaluable for development and user tests. So to answer your question more directly: it would be nice if it could happen in the field :). Having to keep different sets around that can either be updated via BLE or via USB would be a pain. Our previous generation (not based on Zephyr) could also do this.

> Have you considered using mcumgr and DFU over CDC ACM UART?

Yes, I have tried that before. It works, but it is really slow. I flashed the SMP example to my nRF DK this morning, and got around 3KiB/s. That would be more than 10 minutes for a 1MiB image... This is probably because the data is packetized before being sent over CDC ACM.
From past experience it also seems to be unreliable when it has to share the CDC ACM pipe with logging.
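
As a rough check of the transfer-time arithmetic, here is a small Python sketch. It assumes a perfectly sustained rate with no retries or protocol stalls, which real transfers will not achieve. At exactly 3 KiB/s a 1 MiB image works out to a bit under six minutes, so a transfer taking ten minutes or more would imply a lower effective rate or extra overhead.

```python
KIB = 1024
MIB = 1024 * KIB


def transfer_seconds(image_bytes: int, rate_bytes_per_s: float) -> float:
    """Time needed to move image_bytes at a constant rate."""
    return image_bytes / rate_bytes_per_s


def required_rate(image_bytes: int, seconds: float) -> float:
    """Constant rate (bytes/s) needed to finish within the given time."""
    return image_bytes / seconds


t = transfer_seconds(1 * MIB, 3 * KIB)
print(f"1 MiB at 3 KiB/s: {t:.0f} s ({t / 60:.1f} min)")

rate = required_rate(1 * MIB, 10 * 60)
print(f"rate matching a 10 minute transfer: {rate / KIB:.2f} KiB/s")
```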

@Laczen
Collaborator

Laczen commented Jan 25, 2022

@hackwerken, nice diagrams. One small remark regarding this diagram, the manifestation phase could be after the upgrade (and I think usb-dfu intends it to be after the upgrade). If the upgrade fails the manifestation phase would never be reached and dfu-util would time-out. To start the manifestation at first boot you could detect the image verification state (but this would probably ruin support for preconfirmed images). If the image is not confirmed the application boots into dfu-mode and in the manifestation state.

Regarding the image confirmation: in dfu-util you pass the location to write to and the data to write. You could define a "fake" location (e.g. location 0x0) and use it to pass the confirmation message (e.g. a message containing "OK"). dfu-util would then be called twice, once for the update and once for the confirmation. The application catches this second "upgrade" by its fake location and, instead of writing an image, confirms the image.
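
The "fake location" idea can be modelled as a small dispatcher. The following Python sketch is not real DFU class code: the `Device` class, `IMAGE_LOCATION`, `CONFIRM_LOCATION`, and the exact "OK" payload are assumptions taken from the description above, and how dfu-util would address such a location is left open.

```python
IMAGE_LOCATION = 0x1     # hypothetical: normal download target (slot 1)
CONFIRM_LOCATION = 0x0   # hypothetical "fake" location used only for confirming
CONFIRM_MESSAGE = b"OK"


class Device:
    def __init__(self) -> None:
        self.slot1 = b""
        self.confirmed = False

    def handle_download(self, location: int, data: bytes) -> str:
        """Route a completed download based on the location it targets."""
        if location == CONFIRM_LOCATION:
            # Second dfu-util call: nothing is written, the image is confirmed.
            if data.strip() != CONFIRM_MESSAGE:
                raise ValueError("unexpected confirmation payload")
            self.confirmed = True
            return "image confirmed"
        if location == IMAGE_LOCATION:
            # First dfu-util call: store the new image, not yet confirmed.
            self.slot1 = bytes(data)
            self.confirmed = False
            return "image written"
        raise ValueError(f"unknown location {location:#x}")


dev = Device()
print(dev.handle_download(IMAGE_LOCATION, b"\x3d\xb8\xf3\x96 firmware bytes"))
print(dev.handle_download(CONFIRM_LOCATION, b"OK\n"))
```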

@Laczen
Collaborator

Laczen commented Jan 26, 2022

@hackwerken (goedemorgen Martijn), the above comments are two separate solutions, the first one to confirm during manifestation, the second one to use dfu-util to confirm the image. The last solution would be the simplest and most versatile.

@hackwerken
Contributor Author

@Laczen (goedemiddag :))

As far as I understand the manifestation phase is meant to do things like the swap. In fact, it can't happen after the swap/update, because resetting to the bootloader causes a detach event on the bus. At least in the case of the 'DFU via app' approach.

Taken from the USB Device Firmware Upgrade Specification, Revision 1.1, page 26:

After the zero length DFU_DNLOAD request terminates the Transfer phase, the device is ready to
manifest the new firmware. As described previously, some devices may accumulate the firmware image
and perform the entire reprogramming operation at one time. Others may have only a small amount
remaining to be reprogrammed, and still others may have none. Regardless, the device enters the
dfuMANIFEST-SYNC state and awaits the solicitation of the status report by the host. Upon receipt of
the anticipated DFU_GETSTATUS, the device enters the dfuMANIFEST state, where it completes its
reprogramming operations.
Following a successful reprogramming, the device enters one of two states: dfuMANIFEST-SYNC or
dfuMANIFEST-WAIT-RESET, depending on whether or not it is still capable of communicating via
USB. The host is aware of which state the device will enter by virtue of the bmAttributes bit
bitManifestationTolerant. If the device enters dfuMANIFEST-SYNC (bitMainfestationTolerant = 1),
then the host issues the DFU_GETSTATUS request, and the device enters the dfuIDLE state. At that
point, the host can perform another download, solicit an upload, or issue a USB reset to return the
device to application run-time mode. If, however, the device enters the dfuMANIFEST-WAIT-RESET
state (bitManifestationTolerant = 0), then if bitWillDetach = 1 the device generates a detach-attach
sequence on the bus, otherwise (bitWillDetach = 0) the host must issue a USB reset to the device.
After the bus reset the device will evaluate the firmware status and enter the appropriate mode.

When the appropriate bits are set (bitManifestationTolerant = 0 and bitWillDetach = 1) dfu-util shouldn't error out.
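
The branch described in the quoted passage can be written down as a small decision function. This Python sketch only encodes the three outcomes named in the quote; the enum and function names are mine, and nothing here talks to real USB hardware.

```python
from enum import Enum


class DfuState(Enum):
    MANIFEST_SYNC = "dfuMANIFEST-SYNC"
    MANIFEST_WAIT_RESET = "dfuMANIFEST-WAIT-RESET"


class ResetBy(Enum):
    NOBODY = "no reset needed, host polls DFU_GETSTATUS and device goes to dfuIDLE"
    DEVICE = "device generates a detach-attach sequence"
    HOST = "host must issue a USB reset"


def after_manifest(manifestation_tolerant: bool, will_detach: bool):
    """State entered after successful reprogramming, and who triggers the
    following reset, based on the two bmAttributes bits."""
    if manifestation_tolerant:
        return DfuState.MANIFEST_SYNC, ResetBy.NOBODY
    if will_detach:
        return DfuState.MANIFEST_WAIT_RESET, ResetBy.DEVICE
    return DfuState.MANIFEST_WAIT_RESET, ResetBy.HOST


# The 'DFU via app' case discussed here: the device cannot talk during the swap.
state, reset_by = after_manifest(manifestation_tolerant=False, will_detach=True)
print(state.value, "/", reset_by.value)
```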

If upgrading/swapping failed, according to the spec (page 37) the firmware should remain in DFU mode after reset. This does not seem to be possible for us though, since the application has no way of knowing if the swap was successful. Not really a big deal I think.

Or am I missing something?

I understand your idea for confirming via the 'fake location'. It wouldn't be strictly USB DFU compliant, but it could work I guess.

Anyway, for now I want to start implementing the 'DFU via app' approach, and confirm from the application after the upload completes. This would be the simplest to implement. More advanced/robust approaches could be added later.

I still think the 'DFU via bootloader' is the best approach in terms of functionality (option to upload directly to slot0, good error reporting in the manifestation phase, good recovery mode). But it is also the most difficult to implement correctly (especially the 'enter bootloader' bit). So that's why I'll start work on the other approach.

@Laczen
Collaborator

Laczen commented Jan 26, 2022

The 'DFU via bootloader' approach might be a closer match for USB, but not for BLE. As you add more and more communication methods to the bootloader, its size and complexity increase. I prefer a bootloader that does almost nothing, but does it well.

Anyhow I am sure that you will come up with a workable solution for usb-dfu. You are nicely trying to map the usb-dfu chart to what can be done in zephyr.

hackwerken added a commit to hackwerken/zephyr that referenced this issue Mar 29, 2022
This commit adds the USB_DFU_PERMANENT_DOWNLOAD and USB_DFU_REBOOT
symbols. When the permanent download symbol is enabled, slot 1 will be
marked as confirmed. With the reboot symbol enabled, the device
automatically reboots after the download is completed.

The functionality is split into two symbols to allow the automatic
reboot without confirming the image. This enables image confirmation via
another channel. For example via the shell’s Mcuboot commands.
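
How the two symbols combine can be illustrated with a tiny Python model. It only restates the behaviour described in this commit message; the function name and the action strings are illustrative, and what happens to an unconfirmed image on the next boot is left to the bootloader.

```python
from typing import List


def post_download_actions(permanent_download: bool, reboot: bool) -> List[str]:
    """List what the device does once a DFU download has completed."""
    actions = []
    if permanent_download:
        # USB_DFU_PERMANENT_DOWNLOAD: no separate confirm step is needed.
        actions.append("mark slot 1 as confirmed")
    else:
        actions.append("leave confirmation to another channel")
    if reboot:
        # USB_DFU_REBOOT: no manual reset is needed.
        actions.append("reboot automatically")
    return actions


for permanent in (False, True):
    for reboot in (False, True):
        print(permanent, reboot, post_download_actions(permanent, reboot))
```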

This functionality allows downloading an image to the device without the
user having to interact with the device. It is useful in cases where
ease of use is more important than safety, for example when using USB
download for daily development. This is especially applicable for
devices with a closed case.

The changes were tested on an nrf52840dk. The following line can be used
to build the USB DFU example with the symbols enabled.

west build -b nrf52840dk_nrf52840 zephyr/samples/subsys/usb/dfu \
-d build-dfu -- -DCONFIG_BOOTLOADER_MCUBOOT=y \
-DOVERLAY_CONFIG=overlay-reboot-permanent.conf \
-DCONFIG_MCUBOOT_SIGNATURE_KEY_FILE\
=\"bootloader/mcuboot/root-rsa-2048.pem\"

Fixes zephyrproject-rtos#41921

Signed-off-by: Martijn Stommels <[email protected]>
jeffrizzo pushed a commit to Hakkei-Co/zephyr that referenced this issue Sep 4, 2022