Import of standalone tasks causes NPEs due to missing flownodeID #4735

PHWaechtler · 2024-10-21T05:25:54Z

Environment (Required on creation)

Optimize 7

Description (Required on creation; please attach any relevant screenshots, stacktraces, log files, etc. to the ticket)

During the import of flownode data for standalone tasks, the Optimize importer throws an exception because the flownodeID is null where a non-null value is expected. For more details, please refer to the support ticket.

Steps to reproduce (Required on creation)

Start Optimize 7 environment
Start standalone tasks
Wait for Optimize to import data
Observe Optimize log

Observed Behavior (Required on creation)

Exception during import:

13:08:28.108 [EngineImportScheduler-1] ERROR o.c.o.s.i.e.m.CompletedUserTaskEngineImportMediator - Was not able to import next page, retrying after sleeping for 5063ms.
java.lang.NullPointerException: flowNodeId is marked non-null but is null
at org.camunda.optimize.dto.optimize.query.event.process.FlowNodeInstanceDto.<init>(FlowNodeInstanceDto.java:98)
at org.camunda.optimize.service.importing.engine.service.CompletedUserTaskInstanceImportService.mapEngineEntityToOptimizeEntity(CompletedUserTaskInstanceImportService.java:99)

Expected behavior (Required on creation)

No exception during import.

Root Cause (Required on prioritization)

Standalone tasks have no flownodeID (activityInstanceID) in the engine, but the Optimize constructor has a NonNull annotation on this field.
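The failure mode can be reproduced in isolation. The following is a minimal sketch, not the actual Optimize code: `FlowNodeInstanceSketch` is a hypothetical stand-in for `FlowNodeInstanceDto`, and it uses plain `Objects.requireNonNull` to imitate a non-null constructor check so that it runs without extra dependencies.

```java
import java.util.Objects;

// Hypothetical stand-in for Optimize's FlowNodeInstanceDto; only the null check is modeled.
final class FlowNodeInstanceSketch {
  private final String flowNodeId;
  private final String userTaskInstanceId;

  FlowNodeInstanceSketch(String flowNodeId, String userTaskInstanceId) {
    // Imitates a non-null constructor check: fail fast when the ID is missing.
    this.flowNodeId =
        Objects.requireNonNull(flowNodeId, "flowNodeId is marked non-null but is null");
    this.userTaskInstanceId = userTaskInstanceId;
  }

  String getFlowNodeId() {
    return flowNodeId;
  }

  String getUserTaskInstanceId() {
    return userTaskInstanceId;
  }

  public static void main(String[] args) {
    // A task that belongs to a process instance carries an activity instance ID.
    FlowNodeInstanceSketch regular = new FlowNodeInstanceSketch("activity-1", "task-1");
    System.out.println("Imported flow node " + regular.getFlowNodeId());

    try {
      // A standalone task has no activity instance ID, so the constructor throws.
      new FlowNodeInstanceSketch(null, "task-2");
    } catch (NullPointerException e) {
      System.out.println("Import failed: " + e.getMessage());
    }
  }
}
```

Because the exception is thrown while mapping a page of engine entities, the mediator keeps retrying the same page, which matches the retry message in the log above.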

Solution Ideas

For now, let's just focus on removing the exception rather than adding the ability for Optimize to import standalone tasks in a way that makes them usable for report analysis.

Some potential approaches are detailed here. Specifically, we could consider these two options (or a suitable alternative):

Option 1: Optimize removes the @NonNull constraint on the flownode ID and imports this data regardless
This would avoid the exception and the need for the manual workaround. However, Optimize would then keep flownode data that is not very useful for reporting: since the ID is missing, Optimize cannot, for example, aggregate this data for flownode reports. I would also need to have a closer look at all our flownode import pipelines to determine whether Optimize would keep multiple entries or overwrite a single entry with each standalone usertask import. In general, I think we should avoid importing data that will not be useful for report analysis, but it could be a quick "fix".

Option 2: Optimize keeps the @NonNull constraint on the flownode ID but skips importing flownodes with no ID
Since flownode data without an ID is of limited use to Optimize reporting, we could also consider skipping the import of usertask data that comes from the engine without an ID. This would mean there is no data in Optimize for standalone tasks (unless other related data, like identity link logs, is imported). As with Option 1, standalone tasks would not be available for report analysis, but at least we would avoid importing unnecessary data.
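Option 2 could look roughly like the sketch below. It is only an illustration under stated assumptions: `EngineUserTask`, `FlowNodeInstance`, and `mapSkippingStandaloneTasks` are hypothetical names, not the real Optimize classes or methods, and the filter assumes that a null activity instance ID is what identifies a standalone task.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

final class StandaloneTaskFilterSketch {

  // Hypothetical engine-side DTO; activityInstanceId is null for standalone tasks.
  static final class EngineUserTask {
    final String id;
    final String activityInstanceId;

    EngineUserTask(String id, String activityInstanceId) {
      this.id = id;
      this.activityInstanceId = activityInstanceId;
    }
  }

  // Hypothetical Optimize-side entity that keeps its non-null constraint.
  static final class FlowNodeInstance {
    final String flowNodeId;
    final String userTaskInstanceId;

    FlowNodeInstance(String flowNodeId, String userTaskInstanceId) {
      this.flowNodeId =
          Objects.requireNonNull(flowNodeId, "flowNodeId is marked non-null but is null");
      this.userTaskInstanceId = userTaskInstanceId;
    }
  }

  // Drops tasks without an activity instance ID before the constructor can reject them.
  static List<FlowNodeInstance> mapSkippingStandaloneTasks(List<EngineUserTask> page) {
    return page.stream()
        .filter(task -> task.activityInstanceId != null)
        .map(task -> new FlowNodeInstance(task.activityInstanceId, task.id))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<EngineUserTask> page = Arrays.asList(
        new EngineUserTask("task-1", "activity-1"),
        new EngineUserTask("task-2", null)); // standalone task

    List<FlowNodeInstance> imported = mapSkippingStandaloneTasks(page);
    System.out.println("Imported " + imported.size() + " of " + page.size() + " tasks");
  }
}
```

The filter runs before the mapping step, so the non-null check stays in place for all other flownode data and only ID-less tasks are skipped.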

Current tendency: Option 2, as it avoids unnecessary data import. However, we need to double-check whether there are use cases where Option 1 would be preferred or where Option 2 doesn't work.

After discussion and getting more context on what standalone tasks are, we decided that there is no value in importing these to Optimize. Therefore, we should go for option 2 above or the following Option 3:

Option 3: Engine filters out standalone tasks on Optimize API
When Optimize fetches data for flownodes (/usertasks), the engine API only returns data for non-standalone tasks, so Optimize does not need to do any additional filtering during its import. This could potentially be a more performant solution.

Hints

Consider filtering out all null values, not only those related to task fields.

Workaround (test on lower environment first)

Workaround 1. Disable standalone tasks in the engine.
Workaround 2. Remove history related to standalone tasks
Workaround 3. Set a default placeholder (e.g. workaroundStandaloneTasks) for TASK_DEF_KEY_ and ACT_INST_ID_ in the ACT_HI_TASKINST table. If PROC_DEF_KEY_ and PROC_DEF_ID_ are null, populate them as well.

Links

https://jira.camunda.com/browse/SUPPORT-24021

Breakdown

Pull Requests

Dev2QA handover

Does this ticket need a QA test and the testing goals are not clear from the description? Add a Dev2QA handover comment

yanavasileva · 2024-11-22T17:39:58Z

Option 2 - fix on the Optimize side

pros
- easy fix by adding a filter when mapping engine entities to optimize entities
- good learning opportunity for onboarding
cons
- the engine still returns unnecessary data, which Optimize then filters a second time
- there's no out-of-the-box option to test the import of standalone tasks in IT

Option 3 - fix on the engine side

pros
- easy pick to change the mybatis query and test it in the engine (JUnit)
- query only the data that is needed for Optimize
- good learning opportunity for onboarding

Manual testing should be done regardless of the solution.
Fetchers are independent of each other. Since no other issue has been reported for null values related to standalone tasks (for example, in operation logs), it's safe to assume that the issue occurs only for tasks.

Backporting is straightforward for both options.

yanavasileva · 2024-12-03T08:20:49Z

Decision:

We will implement the fix on the engine side. The customer agrees to apply the patch to the engine instead of Optimize.

PHWaechtler added type:bug, group:support, DRI: Tassilo, scope:optimize labels Oct 21, 2024

tasso94 added potential:optimize 3.14.1 version:optimize 3.15.0 labels Oct 29, 2024

PHWaechtler assigned PHWaechtler and yanavasileva and unassigned PHWaechtler Nov 4, 2024

tasso94 added potential:optimize 3.14.2 and removed potential:optimize 3.14.1 labels Nov 5, 2024

yanavasileva assigned PHWaechtler and unassigned yanavasileva Dec 3, 2024

yanavasileva added version:7.23.0, potential:7.22.2, scope:core-api, potential:7.21.7 and removed version:optimize 3.15.0, potential:optimize 3.14.2, scope:optimize labels Dec 4, 2024
