Update fleetd to use uninstall script to rollback after a failed postInstall #22081

Closed
getvictor opened this issue Sep 13, 2024 · 11 comments
Labels
~agent Related to Fleet's osquery runtime and agent autoupdater (Orbit) ~backend Backend-related issue. bug Something isn't working as documented #g-endpoint-ops Endpoint ops product group :release Ready to write code. Scheduled in a release. See "Making changes" in handbook. ~unreleased bug This bug was found in an unreleased version of Fleet.

@getvictor
Member

getvictor commented Sep 13, 2024

When the postInstall script fails, fleetd should run the user-visible uninstall script instead of the secret remove script.

We also need to update the docs to describe this behavior.
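
The requested behavior can be sketched roughly as follows. This is an illustrative Python sketch only, not fleetd's actual code; `ScriptResult`, `InstallOutcome`, `install_with_rollback`, and the injected `run_script` callable are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ScriptResult:
    exit_code: int
    output: str


@dataclass
class InstallOutcome:
    installed: bool
    post_install: ScriptResult
    rollback: Optional[ScriptResult] = None


def install_with_rollback(
    run_script: Callable[[str], ScriptResult],
    post_install_script: str,
    uninstall_script: str,
) -> InstallOutcome:
    """Run the post-install script; on failure, roll back with the uninstall script.

    `run_script` is a hypothetical callable that executes a script body and
    returns its exit code and output.
    """
    post = run_script(post_install_script)
    if post.exit_code == 0:
        return InstallOutcome(installed=True, post_install=post)

    # Post-install failed: roll back with the same uninstall script the
    # admin can see and edit, rather than a separate hidden remove script.
    rollback = run_script(uninstall_script)
    return InstallOutcome(installed=False, post_install=post, rollback=rollback)
```

Passing the script runner in as a parameter keeps the sketch independent of any platform-specific execution details (shell on macOS and Linux, PowerShell on Windows).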

Demo

@getvictor getvictor added #g-endpoint-ops Endpoint ops product group :release Ready to write code. Scheduled in a release. See "Making changes" in handbook. bug Something isn't working as documented ~agent Related to Fleet's osquery runtime and agent autoupdater (Orbit) ~backend Backend-related issue. ~unreleased bug This bug was found in an unreleased version of Fleet. labels Sep 13, 2024
@getvictor getvictor self-assigned this Sep 13, 2024
@getvictor getvictor added this to the 4.57.0-tentative milestone Sep 13, 2024
@sharon-fdm sharon-fdm modified the milestones: 4.57.0, Fleetd-1.31.0 Sep 13, 2024
@lucasmrod
Member

lucasmrod commented Sep 17, 2024

I've started QAing this issue.

I tried uninstalling Tailscale on Windows 10 and 11 VMs (VMware) and got the following error in the uninstall script output:

Start-Process : This command cannot be run due to the error: The system cannot find the file specified.
At C:\Windows\TEMP\fleet-436bbb32-ce1b-49f5-b75d-d2e047cc63f9-236699952\script.ps1:46 char:20
+         $process = Start-Process @processOptions
+                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Start-Process], InvalidOperationException
    + FullyQualifiedErrorId : InvalidOperationException,Microsoft.PowerShell.Commands.StartProcessCommand

Uninstall exit code: 

@getvictor
Member Author

@lucasmrod Yes, I'm aware of this issue. I'm fixing it as part of #20000; the EXE flow is currently buggy.

@lucasmrod
Member

I tried installing and then uninstalling a Firefox MSI (downloaded from here) and got the following error on uninstall. Let me know if it's expected or known:
This action is only valid for products that are currently installed

Screenshot 2024-09-17 at 10 57 20 AM

I'm trying to get a successful uninstall before proceeding to test the edge case here.
I'm happy to skip the happy path if we already know about these issues.

@getvictor
Member Author

For the specific EXE failure, the fix is here: #22164

@getvictor
Member Author

The MSI uninstall issue is new.

@lucasmrod
Member

QA Update:

I was able to verify the fix for macOS. Installing the Microsoft Teams pkg (with an exit 1 post-install script) does run the uninstall script. (I had to build fleetd from main, of course.)

I'm having some trouble with install scripts that start with #!/bin/sh on Linux. It's most likely a bug, but not related to this issue.

So Linux and Windows are still TODO.

@lucasmrod
Member

QA Update:

I was able to verify Windows using 7-zip's MSI installer (exit 1 in the post-install script, plus Write-Host "Hello" added to the uninstall script).

TODO: Linux (I'm having issues with the newlines; it might be a bug on my VM, investigating...)

@lucasmrod
Member

QA Update:

Installing the Firefox deb with a post-install script that runs exit 1, I get the following output:

Post-install script output:

Running script...
Exit code: 1 (Failed)
Hello

Attempting rollback by running uninstall script...
Uninstall script exit code: 100
/tmp/2512911600/rollback-script.sh: 2: 
: not found
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package firefox

I also got the following error while trying to install again:
Screenshot 2024-09-18 at 9 26 45 AM

(I didn't change the package itself, and it installed successfully in previous attempts...)
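
A note on the `rollback-script.sh: 2:` / `: not found` pair in the output above: this is the pattern `sh` typically prints when a script line contains a stray carriage return (Windows-style CRLF line endings). This thread does not say what #22196 actually changed, so the following is only a general illustration of normalizing line endings before a script is written to disk, not the fix that was applied. `normalize_newlines` is a hypothetical helper name.

```python
def normalize_newlines(script: str) -> str:
    """Convert CRLF and lone CR line endings to LF.

    Hypothetical helper: POSIX shells treat a trailing carriage return as part
    of the command, so scripts with CRLF endings can fail with ": not found".
    """
    return script.replace("\r\n", "\n").replace("\r", "\n")


if __name__ == "__main__":
    crlf_script = "#!/bin/sh\r\n\r\napt-get remove -y firefox\r\n"
    # repr() makes the line endings visible in the printed result.
    print(repr(normalize_newlines(crlf_script)))
```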

@lucasmrod
Member

I finally was able to root cause the issue for Ubuntu packages:
#22196

@lucasmrod
Member

#22196 is now fixed, which allowed me to verify this fix for Linux. g2g QA-wise.

getvictor added a commit that referenced this issue Sep 18, 2024
Cherry-pick #22081 fixes from main.
@fleet-release
Contributor

Uninstall script glows,
In clouds, a gentle safety,
Docs reflect the change.

This was referenced Oct 4, 2024
5 participants