On dogfooding, sometimes cannot open workspace, redirected to dashboard instead #22288

Closed
dkwon17 opened this issue Jun 13, 2023 · 8 comments · Fixed by eclipse-che/che-operator#1709
Labels
area/dogfooding: Using Eclipse Che to code, test and build Eclipse Che
kind/bug: Outline of a bug - must adhere to the bug report template.
severity/P1: Has a major impact to usage or development of the system.

Comments

@dkwon17
Contributor

dkwon17 commented Jun 13, 2023

Describe the bug

Sometimes, when working on my private repository on the dogfooding cluster, the workspace enters the RUNNING state, but the editor does not open and I am redirected to the dashboard instead:

output.mp4

When accessing the editor, the user should be redirected to the dashboard only if the workspace mainUrl returns a 5xx error. Since the workspace is in the RUNNING state, the health check must have returned a 2xx or 4xx code, see [1].

Yet despite the health check returning 2xx/4xx, Traefik seems to receive a 5xx error when it tries to access the mainUrl, which causes an unintended redirect to the dashboard.

[1] https://github.com/devfile/devworkspace-operator/blob/61bd5d1888bfa686b01b9b744a3bfcb955e38a8b/controllers/workspace/status.go#L211-L217
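The two status-code rules described above can be written down as a minimal sketch. The function names are hypothetical and are not taken from the DevWorkspace operator or the che-gateway; only the 2xx/4xx and 5xx ranges come from the description.

```python
def health_check_passes(status: int) -> bool:
    """Per the description: a 2xx or 4xx from the health check lets the workspace reach RUNNING."""
    return 200 <= status < 300 or 400 <= status < 500


def should_redirect_to_dashboard(status: int) -> bool:
    """Per the description: only a 5xx from mainUrl should trigger the dashboard redirect."""
    return 500 <= status < 600


# The puzzling combination from this report: the health check saw a passing
# status, yet the gateway saw a 5xx for what should be the same URL.
health_status, gateway_status = 200, 502
print(health_check_passes(health_status), should_redirect_to_dashboard(gateway_status))
```

If both checks hit the same backend, these two conditions should not hold at the same time for a healthy workspace, which suggests the two requests are not reaching the same place.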

Che version

next (development version)

Steps to reproduce

Unfortunately, I'm not able to reproduce this issue reliably. The GitHub repository I used to create the workspace was private.

Expected behavior

After the workspace is in the RUNNING state, the user should be directed to the editor.

Runtime

OpenShift

Screenshots

No response

Installation method

OperatorHub

Environment

macOS, other (please specify in additional context)

Eclipse Che Logs

No response

Additional context

No response

@dmytro-ndp
Contributor

I had the same problem with workspaces created from a GitHub repo.

@dkwon17
Contributor Author

dkwon17 commented Jun 13, 2023

I am now seeing this problem for private and public GitHub repositories

@dkwon17
Contributor Author

dkwon17 commented Jun 14, 2023

@dmytro-ndp do you remember if this happened when starting a workspace while another workspace with a similar name existed?

For example, did you have these workspaces:

quarkus-api-example
quarkus-api-example-abc1

and then did the problem happen when you tried to start quarkus-api-example-abc1?

@dkwon17
Contributor Author

dkwon17 commented Jun 14, 2023

After more investigation, I am convinced this is a routing issue.

When the error occurs, the che-gateway log contains entries such as:

time="2023-06-14T17:41:38Z" level=debug msg="'502 Bad Gateway' caused by: dial tcp 172.30.109.187:3030: connect: connection refused"
time="2023-06-14T17:41:38Z" level=debug msg="Caught HTTP Status Code 502, returning error page" middlewareType=customError middlewareName=workspace46c5dca769194c5f-errors@file

The che-gateway is trying to reach the wrong workspace service. In the log above, it tries to reach 172.30.109.187, which is not the service of the workspace being opened.
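To check several failed attempts at once, the upstream address can be pulled out of the log with a short script. This is a rough sketch that assumes the log lines keep the format shown above; the regular expression and the function name are illustrative and not part of any Che tooling.

```python
import re

# The two che-gateway lines quoted above.
LOG_LINES = [
    'time="2023-06-14T17:41:38Z" level=debug msg="\'502 Bad Gateway\' caused by: '
    'dial tcp 172.30.109.187:3030: connect: connection refused"',
    'time="2023-06-14T17:41:38Z" level=debug msg="Caught HTTP Status Code 502, '
    'returning error page" middlewareType=customError '
    'middlewareName=workspace46c5dca769194c5f-errors@file',
]

# Matches the "dial tcp <ip>:<port>" fragment of a connection error.
DIAL_RE = re.compile(r"dial tcp (?P<ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d+)")


def refused_upstreams(lines):
    """Return (ip, port) pairs for every 502 caused by a failed dial."""
    hits = []
    for line in lines:
        if "502 Bad Gateway" not in line:
            continue
        match = DIAL_RE.search(line)
        if match:
            hits.append((match.group("ip"), int(match.group("port"))))
    return hits


print(refused_upstreams(LOG_LINES))
```

Comparing the extracted address with the cluster IP of the service that belongs to the workspace being opened shows whether the gateway picked the wrong backend.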

@dkwon17
Contributor Author

dkwon17 commented Jun 14, 2023

cc @ibuziuk @olexii4 @akurinnoy

I would like to investigate a fix on the che-gateway side

@dmytro-ndp
Contributor

dmytro-ndp commented Jun 15, 2023

@dkwon17 :

@dmytro-ndp do you remember if this happened when starting a workspace while another workspace with a similar name existed?

Yes, that was the case.
I tried to create a second workspace from the eclipse/che repo factory, and then I could enter neither the old workspace nor the new one.

@RomanNikitenko
Member

RomanNikitenko commented Jun 15, 2023

I faced the same issue.
I had a few workspaces from the same repo, but they were stopped; only one was running.
I was not able to open the IDE for that workspace: while the workspace was running, I was constantly redirected to the dashboard.

I also noticed a 504 error in the browser console.

@dkwon17
Contributor Author

dkwon17 commented Jun 19, 2023

I have a PR for a fix here: eclipse-che/che-operator#1709

This issue seems to happen when I have two workspaces with similar names. For example, it sometimes happens if I have two workspaces named as follows:

try-in-web-ide
try-in-web-ide-cgo7

The redirect to the dashboard happens because Traefik routes the user to a different workspace's service. In the Traefik config, the PathPrefix rules used to direct traffic to the workspace services overlap, and they have the same priority of 100:

    http:
    ...
      routers:
        workspace3e9774e5963b472a:
          ...
          priority: 100
          rule: PathPrefix(`/cluster-admin/try-in-web-ide-cgo7`)
          service: workspace3e9774e5963b472a
    http:
    ...
      routers:
        workspace999e5375ff7f40d6:
          ...
          priority: 100
          rule: PathPrefix(`/cluster-admin/try-in-web-ide`)
          service: workspace999e5375ff7f40d6

In the config above, prefix /cluster-admin/try-in-web-ide overlaps with prefix /cluster-admin/try-in-web-ide-cgo7.

Therefore, if the user starts and opens the try-in-web-ide-cgo7 workspace, Traefik should route the request for {CHE-HOST}/cluster-admin/try-in-web-ide-cgo7 to that workspace's service, but it may sometimes pick the router for {CHE-HOST}/cluster-admin/try-in-web-ide instead.

As a result, the user gets redirected to the dashboard by the errors middleware: eclipse-che/che-operator#1392
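The overlap can be reproduced with a small model of prefix routing. This is a simplified sketch, not Traefik's implementation: it treats PathPrefix as a plain string prefix test and picks the highest-priority match. The `Router` class and helper function are hypothetical. The second half shows one possible way to break the tie, deriving the priority from the prefix length, which is not necessarily what eclipse-che/che-operator#1709 does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    name: str
    prefix: str
    priority: int


def top_candidates(routers, path):
    """Return every router that matches the path at the highest priority."""
    hits = [r for r in routers if path.startswith(r.prefix)]
    if not hits:
        return []
    best = max(r.priority for r in hits)
    return [r for r in hits if r.priority == best]


# The two routers from the config above, both with priority 100.
routers = [
    Router("workspace3e9774e5963b472a", "/cluster-admin/try-in-web-ide-cgo7", 100),
    Router("workspace999e5375ff7f40d6", "/cluster-admin/try-in-web-ide", 100),
]

path = "/cluster-admin/try-in-web-ide-cgo7/"

# Both prefixes match and the priorities tie, so the winner is ambiguous.
tied = top_candidates(routers, path)
print([r.name for r in tied])

# Assumption for illustration: longer prefixes get a higher priority.
by_length = [Router(r.name, r.prefix, len(r.prefix)) for r in routers]
unique = top_candidates(by_length, path)
print([r.name for r in unique])
```

With equal priorities, both routers remain candidates for the longer path. Once the more specific prefix outranks the shorter one, only the intended workspace service is left.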
