Application health assessment failed with "ComparisonError: json: cannot unmarshal array into Go value of type" #5423

Closed
ymmt2005 opened this issue Feb 5, 2021 · 9 comments
Labels
bug/in-triage This issue needs further triage to be correctly classified bug Something isn't working
@ymmt2005

ymmt2005 commented Feb 5, 2021

Describe the bug

When I added a Lua script to our v1.8.3 server to restore the v1.7 Application health assessment behavior,
as described in the upgrade guide, argocd-application-controller stopped syncing Applications.

An Application had a status like this:

status:
    health:
      status: Missing
    operationState:
      finishedAt: "2021-02-02T08:39:12Z"
      message: 'ComparisonError: json: cannot unmarshal array into Go value of type
        health.HealthStatus'

argocd-application-controller logged messages like this:

time="2021-02-02T10:19:40Z" level=info msg="Normalized app spec: {\"status\":{\"conditions\":[{\"lastTransitionTime\":\"2021-02-02T10:19:40Z\",\"message\":\"json: cannot unmarshal array into Go value of type health.HealthStatus\",\"type\":\"ComparisonError\"}]}}" application=argocd-config

To Reproduce

We could not create a minimal reproduction.
Our app-of-apps repository is github.com/cybozu-go/neco-apps

When we added the Lua script from https://argoproj.github.io/argo-cd/operator-manual/upgrading/1.7-1.8/#health-assessement-of-argoprojioapplication-crd-has-been-removed to argocd-cm, Application synchronization stopped.

Expected behavior

The Lua script should work without errors.

Version

argocd: v1.8.3+0f9c684
  BuildDate: 2021-01-21T22:19:20Z
  GitCommit: 0f9c68427882bf4633d395cbfcd7c9271795fd9b
  GitTreeState: clean
  GoVersion: go1.14.12
  Compiler: gc
  Platform: linux/amd64
argocd-server: v1.8.3+unknown
  BuildDate: 2021-02-02T03:51:06Z
  GitCommit:
  GitTreeState: clean
  GoVersion: go1.15.7
  Compiler: gc
  Platform: linux/amd64
  Ksonnet Version: unable to determine ksonnet version: exec: "ks": executable file not found in $PATH
  Kustomize Version: v3.7.0 2020-07-04T19:15:46Z
  Helm Version: v3.5.1+g32c2223
  Kubectl Version: v1.19.7
  Jsonnet Version: v0.17.0
@ymmt2005 ymmt2005 added the bug Something isn't working label Feb 5, 2021
@jannfis
Member

jannfis commented Feb 5, 2021

Hi @ymmt2005, the error message suggests that there might have been a C&P or other error (such as indentation) when you integrated the Lua script into your ConfigMap. I have looked into the repository you mentioned, but could not find a change that has the health check actually integrated, so this is a wild guess.

However, I'm running several 1.8 instances with a C&P of the health check mentioned in the upgrade guide, and they all run fine :)

Do you happen to have a full argocd-cm ConfigMap which includes the health check and triggers the error you are seeing, and that you could share?

Thanks.

@jannfis jannfis added bug/in-triage This issue needs further triage to be correctly classified more-information-needed Further information is requested labels Feb 5, 2021
@ymmt2005
Author

ymmt2005 commented Feb 5, 2021

Hi @jannfis , thank you for the information.

I found the failed CI log and the argocd-cm as it was at that point: https://github.com/cybozu-go/neco-apps/blob/a7aeab296ab9315a5354d9bc10a8bc78829dadb1/argocd/base/configmap.yaml#L19-L30 .

We then gave up on adding the health check and switched to unordered Application initialization.

@no-response no-response bot removed the more-information-needed Further information is requested label Feb 5, 2021
@ionutleca

@ymmt2005, try this:

argoproj.io/Application:
  health.lua: |
    hs = {}
    hs.status = "Healthy"
    hs.message = ""
    if obj.status ~= nil then
      if obj.status.health ~= nil then
        hs.status = obj.status.health.status
        if obj.status.health.message ~= nil then
          hs.message = obj.status.health.message
        end
      end
    end
    return hs

@ymmt2005
Author

@ionutleca will try it, thank you!

@ymmt2005
Author

@ionutleca
Your script works like a charm. Thank you very much.
Should I leave this issue open?

@ymmt2005
Author

ymmt2005 commented May 31, 2021

We needed to change the initial hs.status value to Progressing too.
cf. https://github.com/argoproj/argo-cd/pull/6281/files#diff-87b2d32677e1a95b91e9e29e79176a237f87a7da58898423c0683e532aa1c750

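argoproj.io/Application:
  health.lua: |
    -- Default to Progressing until the child Application reports its own health.
    hs = {}
    hs.status = "Progressing"
    hs.message = ""
    if obj.status ~= nil then
      if obj.status.health ~= nil then
        -- Propagate the child Application's health status.
        hs.status = obj.status.health.status
        -- message is optional; keep the empty default when it is absent.
        if obj.status.health.message ~= nil then
          hs.message = obj.status.health.message
        end
      end
    end
    return hs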

@ionutleca

I had my doubts about starting with hs.status = "Healthy" as well. Glad it helped! As for leaving the issue open, I think an Argo CD maintainer should answer that.

@alexmt
Collaborator

alexmt commented Jun 3, 2021

Thanks a lot @ymmt2005 ! I totally agree - status should be "Progressing", not "Healthy". This explains why sync waves stopped working (#5146), which we were never able to reproduce.
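
To make the difference between the two defaults concrete, here is a rough Go port of the script's logic. It is only an illustration: assess and HealthStatus are hypothetical names, not Argo CD APIs, and the real check runs as Lua inside the controller. The case of interest is a child Application that has no status yet, on the assumption that a parent waits for each sync wave to become healthy before moving on.

```go
package main

import "fmt"

// HealthStatus is a hypothetical stand-in for the table the Lua script returns.
type HealthStatus struct {
	Status  string
	Message string
}

// assess mirrors the Lua script: start from a default status, then copy
// status.health.status and (when present) status.health.message.
func assess(obj map[string]interface{}, defaultStatus string) HealthStatus {
	hs := HealthStatus{Status: defaultStatus}
	status, ok := obj["status"].(map[string]interface{})
	if !ok {
		return hs // no status yet: the default decides the outcome
	}
	health, ok := status["health"].(map[string]interface{})
	if !ok {
		return hs
	}
	// The Lua script assigns the status unconditionally.
	s, _ := health["status"].(string)
	hs.Status = s
	// The message is copied only when it is set.
	if m, ok := health["message"].(string); ok {
		hs.Message = m
	}
	return hs
}

func main() {
	// A freshly created child Application with no status at all.
	fresh := map[string]interface{}{}
	fmt.Println(assess(fresh, "Healthy").Status)     // Healthy
	fmt.Println(assess(fresh, "Progressing").Status) // Progressing

	// A child Application that already reports its own health.
	degraded := map[string]interface{}{
		"status": map[string]interface{}{
			"health": map[string]interface{}{"status": "Degraded"},
		},
	}
	fmt.Println(assess(degraded, "Progressing").Status) // Degraded
}
```

With the "Healthy" default, the fresh Application is reported healthy before the controller has assessed it, which would let a parent proceed to the next wave too early. With "Progressing" the parent keeps waiting until the child reports real health.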

@alexmt alexmt added this to the v2.1 milestone Jun 3, 2021
@alexmt
Collaborator

alexmt commented Jun 3, 2021

Closing since #6281 is merged.

@alexmt alexmt closed this as completed Jun 3, 2021