Encryption key not escrowed after macOS/Windows host is decrypted and re-encrypted #25723

Open
getvictor opened this issue Jan 23, 2025 · 6 comments
Labels
~backend Backend-related issue. bug Something isn't working as documented :demo #g-mdm MDM product group P1 Prioritize as critical :release Ready to write code. Scheduled in a release. See "Making changes" in handbook. ~released bug This bug was found in a stable release.

@getvictor
Member

Currently, this is a placeholder for issues I've seen testing the key recovery flow for P1 #25609

@PezHub, please try to reproduce these issues without the key recovery flow. To summarize:

  1. Move offline host to unencrypted team. (Make sure key is not retrievable.)
  2. Move offline host back to encrypted team.
  3. Bring host online and see if it escrows new key.
@getvictor getvictor added #g-mdm MDM product group :release Ready to write code. Scheduled in a release. See "Making changes" in handbook. :reproduce Involves documenting reproduction steps in the issue bug Something isn't working as documented ~backend Backend-related issue. ~released bug This bug was found in a stable release. labels Jan 23, 2025
@lukeheath lukeheath added the P1 Prioritize as critical label Jan 23, 2025
@lukeheath
Member

@getvictor I'm going to put a P1 on any bugs found related to us losing the encryption key so they are easy to spot.

@georgekarrv georgekarrv added this to the 4.64.0-tentative milestone Jan 23, 2025
@gillespi314 gillespi314 self-assigned this Jan 23, 2025
@getvictor
Member Author

@gillespi314 For macOS stuck in Action required (pending), I noticed that /var/db/FileVaultPRK.dat was empty. I ran

sudo fdesetup changerecovery -personal

and that populated that file, and osquery was able to return the recovery key.
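
A quick way to check for the symptom described above, as a minimal sketch: it only reports whether the file at a given path is missing or zero bytes long. The helper name `prk_file_is_empty` is hypothetical, and reading the real file under /var/db will usually need root (a permission error is deliberately not swallowed).

```python
import os

# Path mentioned in the comment above; reading it usually requires root.
FILEVAULT_PRK_PATH = "/var/db/FileVaultPRK.dat"


def prk_file_is_empty(path: str = FILEVAULT_PRK_PATH) -> bool:
    """Return True if the file is missing or has zero bytes (hypothetical helper)."""
    try:
        return os.path.getsize(path) == 0
    except FileNotFoundError:
        # A missing file is treated the same as an empty one.
        return True
```

If this returns True on a host stuck in Action required (pending), the host matches the case described in this comment.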

@gillespi314
Contributor

@getvictor, does this bug require the host to be offline when the team transfers happen?

@georgekarrv georgekarrv removed the :reproduce Involves documenting reproduction steps in the issue label Jan 24, 2025
@PezHub
Contributor

PezHub commented Jan 24, 2025

Testing so far shows that Windows and macOS hosts successfully re-encrypt (the key gets rotated and escrowed) when moved between teams, provided you wait at least an hour. This was tested with the hosts remaining online.

Linux LUKS encryption is working as well but the user experience is different. The rotation and escrow of the key happens on the My Device page with modals showing progress and success messages throughout.

I still need to run through everything again while the host is offline.

tracking here

@gillespi314
Contributor

gillespi314 commented Jan 28, 2025

Findings

macOS
Works as expected although the process depends on a series of events that can take time to play out. For reference, here’s the flow I tested:

  • mdm enrolled host with disk encryption enforced and verified
  • take host offline
  • transfer to disk encryption off (hdek record deleted)
  • transfer to disk encryption on
  • bring host online
  • action pending (no record in hdek)
  • trigger cleanups (empty row in hdek)
  • refetch host status (no change)
  • restart host
  • refetch host status (host populates hdek)
  • verifying
  • trigger cleanups
  • verified
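
The status transitions in the flow above can be summarized as a small model. This is a simplified sketch for following the steps, not Fleet's implementation: the `KeyRecord` class, its field names, and the `disk_encryption_status` function are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeyRecord:
    """Simplified stand-in for a host's row in hdek (field names are assumptions)."""
    encrypted_key: str = ""   # empty string models the "empty row in hdek" step
    verified: bool = False    # set once a cleanup run has confirmed the key


def disk_encryption_status(record: Optional[KeyRecord]) -> str:
    """Map the key record to the status names used in the flow above."""
    if record is None or not record.encrypted_key:
        # No row, or an empty row: the host still has to report a key.
        return "action required (pending)"
    if record.verified:
        return "verified"
    # Key reported by the host but not yet confirmed by a cleanup run.
    return "verifying"
```

Walking the flow: no record and an empty row both map to the pending state, a populated row maps to verifying, and the row moves to verified only after the second cleanup run.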

Windows
Key is re-escrowed as expected. Process can take over an hour to complete because we limit the frequency of disk encryption attempts by fleetd.
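
To illustrate why the wait can exceed an hour, here is a minimal sketch of an attempt throttle. The one-hour interval and the names `MIN_RETRY_INTERVAL` and `should_attempt_encryption` are assumptions for illustration; the thread only says that attempts are rate limited, not how fleetd implements it.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed value; the thread only says the process can take over an hour.
MIN_RETRY_INTERVAL = timedelta(hours=1)


def should_attempt_encryption(
    last_attempt: Optional[datetime],
    now: datetime,
    min_interval: timedelta = MIN_RETRY_INTERVAL,
) -> bool:
    """Return True if no attempt was made yet or the interval has elapsed."""
    if last_attempt is None:
        return True
    return now - last_attempt >= min_interval
```

Under this model, a host that made an attempt just before the team transfer would not retry until the interval has passed, even if it is online the whole time.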

I did find a case where a host can get stuck in the verifying state after the key is re-escrowed. In that case, the state persists as verifying until the host_disks table is updated (e.g., when available disk space changes by at least 0.01 GB). We can avoid this by manually setting host_disks.updated_at every time the detail query is reported.
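
The stuck state and the proposed fix can be sketched as follows. This is a model under stated assumptions, not Fleet's code: it assumes a host counts as verified once its disk row is at least as new as the escrowed key, and the `HostDisk` class, `report_disk`, `is_verified`, and the `always_touch` flag are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class HostDisk:
    """Simplified stand-in for a host_disks row (only the fields needed here)."""
    gigs_available: float
    updated_at: datetime


def report_disk(row: HostDisk, gigs_available: float, now: datetime,
                always_touch: bool) -> None:
    """Apply a detail-query report to the row.

    Without always_touch, updated_at only moves when the stored value
    changes by at least 0.01 GB, which is the situation described above.
    """
    changed = abs(gigs_available - row.gigs_available) >= 0.01
    if changed:
        row.gigs_available = gigs_available
    if changed or always_touch:
        row.updated_at = now


def is_verified(key_escrowed_at: datetime, row: HostDisk) -> bool:
    """Assumed rule: verified once the disk row is at least as new as the key."""
    return row.updated_at >= key_escrowed_at
```

With `always_touch=False` and unchanged disk space, a report that arrives after the key is escrowed leaves `updated_at` in the past, so the host stays in verifying. With `always_touch=True`, the same report moves the timestamp forward and the host can be marked verified.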

@PezHub
Contributor

PezHub commented Feb 10, 2025

QA Test results after fix -

macOS:
(Tested the workflow with the host both online and offline.) After moving the host from an encrypted team to an unencrypted team and back to an encrypted team, the host was re-encrypted and the key was stored in the DB. This occurred after a logoff/restart plus refetch, per the instructions in the banner on the My Device page. The host went from pending --> verifying --> verified within a reasonable amount of time, as expected.

Windows:
Without forcing a restart, the host eventually went from pending --> verifying --> verified after an hour.

I was not able to reproduce a host getting stuck in a verifying or pending state, as we could before the fix was applied. I tested the workflow with hosts online and offline during the team transfer.
