Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Tenant ID Not Propagating Correctly in Multi-Tenancy Setup When Triggering Workflows #6129

Closed
sfmskywalker opened this issue Nov 19, 2024 · 4 comments · Fixed by #6142
Assignees
Labels
bug Something isn't working
Milestone

Comments

@sfmskywalker
Copy link
Member

When setting up Elsa.Server.Web with MultiTenancy support, workflows are successfully created and stored in the database with the correct Tenant ID populated. The API and UI correctly pass the Tenant ID during workflow creation. However, an issue arises when manually triggering a workflow from the UI.

The problem appears to occur within the ProtoActor runtime. Specifically, the Tenant ID is not propagated correctly during the call to CreateAndRun. As a result, the Tenant ID is NULL in the SQL command executed by Entity Framework, which leads to an error indicating that the workflow definition cannot be found.

Analysis suggests that a new IServiceScope may be created during execution by IServiceScopeFactory, causing the TenantAccessor to reset and lose the Tenant ID context. Without the Tenant ID, the database query cannot locate the appropriate workflow definition for the tenant.

This issue is observed after integrating the new MultiTenancy packages (currently in preview). Previously, the same workflow execution process functioned as expected.

Error Details

The specific error encountered is as follows:

System.Exception: One or more errors occurred. (Workflow DefinitionVersionId: <ID from database> not found.)
    at Elsa.Workflows.Runtime.ProtoActor.ProtoBuf.WorkflowInstanceClient.CreateAndRun(CreateAndRunWorkflowInstanceRequest request, IDictionary`2 headers, CancellationToken ct)

Steps to Reproduce

  1. Configure Elsa.Server.Web with MultiTenancy support.
  2. Set up the API and UI to pass the Tenant ID correctly.
  3. Create a workflow through the UI, ensuring the Tenant ID is populated in the database.
  4. Attempt to manually trigger the workflow from the UI.

Expected Behavior

The Tenant ID should propagate through the Actor message headers to ensure the correct tenant's workflow instance is located and executed.

Actual Behavior

The Tenant ID is not propagated correctly, resulting in the following:

  • The SQL query is executed with a NULL Tenant ID.
  • The workflow definition cannot be found, leading to the error described above.

Possible Cause

The issue may stem from a new IServiceScope being created during the CreateAndRun call, resetting the TenantAccessor and losing the Tenant ID context. Consequently, the Tenant ID becomes NULL when the database query is executed.

Suggested Next Steps

  1. Investigate whether the IServiceScopeFactory or any internal scope creation during workflow execution is improperly resetting the TenantAccessor.
  2. Verify that the Tenant ID is correctly included in the Actor message headers.
  3. Ensure that the Tenant ID is preserved across all service scopes during the workflow execution flow.
  4. Debug the MultiTenancy packages locally to trace where the Tenant ID context might be lost.
@sfmskywalker sfmskywalker added the bug Something isn't working label Nov 19, 2024
@sfmskywalker sfmskywalker added this to the Elsa 3.3 milestone Nov 19, 2024
@sfmskywalker sfmskywalker moved this to In Progress in ELSA 3 Nov 19, 2024
@sfmskywalker sfmskywalker self-assigned this Nov 19, 2024
@CODEdire
Copy link

I was able to debug down to the creation of the scope, but it now looks like the ITenantAccessor wasn't initialized before the scope creation and therefore does not have a Tenant set before or after the scope creation. This may be further upstream than just the scope creation. Could the actor be running in a different service scope from the message received event?

@CODEdire
Copy link

I think the issue is with the DefaultTenantContextInitializer and how AsyncLocal works on the TenantAccessor. I ran a quick test taking out the AsyncLocal from the TenantAccessor just to see if it worked and messages are now containing the Tenant ID. I think this might be an issue with how AsyncLocal works against nested async methods and since it sets it under a nested async, this is causing it to lose "scope" after InitializeAsync and so it gets reset to NULL.

@CODEdire
Copy link

I think the issue is with the DefaultTenantContextInitializer and how AsyncLocal works on the TenantAccessor. I ran a quick test taking out the AsyncLocal from the TenantAccessor just to see if it worked and messages are now containing the Tenant ID. I think this might be an issue with how AsyncLocal works against nested async methods and since it sets it under a nested async, this is causing it to lose "scope" after InitializeAsync and so it gets reset to NULL.

Following this up with an additional test. I took the lines that were in the DefaultTenantContextInitializer out of that class and used the ITenantFinder and ITenantAccessor directly in the receive middleware. Results show that that this is working correctly with the Tenant ID being passed around. This leads me to conclude that the InitializeAsync is having an issue with the AsyncLocal since it would naturally want to reset the variable once outside of the InitializeAsync.

@sfmskywalker
Copy link
Member Author

I fixed it by creating a tenant scope (which is essentially a wrapper around a child service scope) and temporarily configuring ProtoActor's DI service provider with the service provider from said child scope. The same technique is applied to the multitenant ASP.NET middleware component, where the HttpContext.RequestServices object is temporarily replaced with a tenant scoped provider for the duration of the next middleware. Seems to work well.

Another option I considered was propagating the tenant from one service scope to the next by providing a custom IServiceScopeFactory, but the current fix seems to be enough.

Let's see if we can do something similar for MT.

@github-project-automation github-project-automation bot moved this from In Progress to Done in ELSA 3 Nov 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants