Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix dag.clear() to set multiple dags to running when necessary #15382

Merged
merged 7 commits into from
May 29, 2021

Conversation

yuqian90
Copy link
Contributor

@yuqian90 yuqian90 commented Apr 15, 2021

closes: #14260
related: #9824


When clearing task across dags using ExternalTaskMarker the dag state of the external DagRun is not set to active. So cleared tasks in the external dag will not automatically start if the DagRun is a Failed or Succeeded state.
#9824 tried to fix a similar issue for subdag. But it did not fix ExternalTaskMarker. This PR fixes both.

Two changes are made to fix the issue:

  1. Make clear_task_instances set DagRuns' state to dag_run_state for all the affected DagRuns.
  2. The filter for DagRun in clear_task_instances is fixed too. Previously, it made an assumption that execution_dates for all the dag_ids are the same, which is not always correct.

test_external_task_marker_clear_activate is added to make sure the fix does the right thing.

@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:core-operators Operators, Sensors and hooks within Core Airflow labels Apr 15, 2021
@yuqian90 yuqian90 force-pushed the clear_activate_dagrun branch from 5d093f4 to 18bb10c Compare April 16, 2021 01:35
@yuqian90 yuqian90 added this to the Airflow 2.0.2 milestone Apr 19, 2021
@yuqian90 yuqian90 added the kind:bug This is a clearly a bug label Apr 19, 2021
@ashb ashb modified the milestones: Airflow 2.0.2, Airflow 2.0.3 Apr 22, 2021
@yuqian90 yuqian90 force-pushed the clear_activate_dagrun branch from 967c96e to 64b0b8d Compare April 26, 2021 13:51
@ashb ashb modified the milestones: Airflow 2.0.3, Airflow 2.1.1 May 7, 2021
@yuqian90 yuqian90 force-pushed the clear_activate_dagrun branch from 64b0b8d to 1afaf51 Compare May 22, 2021 02:38
@yuqian90 yuqian90 force-pushed the clear_activate_dagrun branch 3 times, most recently from 968f19b to 89f6ed6 Compare May 24, 2021 01:17
@potiuk
Copy link
Member

potiuk commented May 24, 2021

Closing/reopening to trigger build again.

@potiuk potiuk closed this May 24, 2021
@potiuk potiuk reopened this May 24, 2021
@yuqian90 yuqian90 force-pushed the clear_activate_dagrun branch from 89f6ed6 to 937c170 Compare May 25, 2021 04:54
@jhtimmins
Copy link
Contributor

What's the status of this? @yuqian90 are you just waiting for a final review?

@yuqian90
Copy link
Contributor Author

What's the status of this? @yuqian90 are you just waiting for a final review?

Yes. The PR has been tested and works well. Just waiting for reviews.

Copy link
Member

@ashb ashb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add back the removed fn (:sob:) then LGTM.

@yuqian90 yuqian90 force-pushed the clear_activate_dagrun branch from 937c170 to b251a1d Compare May 27, 2021 13:44
@yuqian90
Copy link
Contributor Author

Please add back the removed fn (😭) then LGTM.

Thanks. I put it back.

@yuqian90 yuqian90 requested a review from ashb May 27, 2021 14:12
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label May 27, 2021
airflow/models/dag.py Outdated Show resolved Hide resolved
@ashb
Copy link
Member

ashb commented May 28, 2021

Two random failures on MSSQL2019 only (where the DB didn't create the user -- /cc @potiuk @aneesh-joseph )

Merging this PR anyway

@potiuk
Copy link
Member

potiuk commented May 28, 2021

Yeah. Some teething problems with MSSQL and @aneesh-joseph created the #16134 that might help!

@yuqian90 yuqian90 merged commit 2bca8a5 into apache:master May 29, 2021
yuqian90 added a commit that referenced this pull request May 29, 2021
closes: #14260
related: #9824

When clearing task across dags using ExternalTaskMarker the dag state of the external DagRun is not set to active. So cleared tasks in the external dag will not automatically start if the DagRun is a Failed or Succeeded state.

Two changes are made to fix the issue:

Make clear_task_instances set DagRuns' state to dag_run_state for all the affected DagRuns.
The filter for DagRun in clear_task_instances is fixed too. Previously, it made an assumption that execution_dates for all the dag_ids are the same, which is not always correct.
test_external_task_marker_clear_activate is added to make sure the fix does the right thing.

(cherry picked from commit 2bca8a5)
jhtimmins pushed a commit to astronomer/airflow that referenced this pull request Jun 3, 2021
…e#15382)

closes: apache#14260
related: apache#9824

When clearing task across dags using ExternalTaskMarker the dag state of the external DagRun is not set to active. So cleared tasks in the external dag will not automatically start if the DagRun is a Failed or Succeeded state.
apache#9824 tried to fix a similar issue for subdag. But it did not fix ExternalTaskMarker. This PR fixes both.

Two changes are made to fix the issue:

Make clear_task_instances set DagRuns' state to dag_run_state for all the affected DagRuns.
The filter for DagRun in clear_task_instances is fixed too. Previously, it made an assumption that execution_dates for all the dag_ids are the same, which is not always correct.
test_external_task_marker_clear_activate is added to make sure the fix does the right thing.

(cherry picked from commit 2bca8a5)
ashb pushed a commit that referenced this pull request Jun 22, 2021
closes: #14260
related: #9824

When clearing task across dags using ExternalTaskMarker the dag state of the external DagRun is not set to active. So cleared tasks in the external dag will not automatically start if the DagRun is a Failed or Succeeded state.

Two changes are made to fix the issue:

Make clear_task_instances set DagRuns' state to dag_run_state for all the affected DagRuns.
The filter for DagRun in clear_task_instances is fixed too. Previously, it made an assumption that execution_dates for all the dag_ids are the same, which is not always correct.
test_external_task_marker_clear_activate is added to make sure the fix does the right thing.

(cherry picked from commit 2bca8a5)
@skabbass1
Copy link

Do we have an idea of when this fix will be released and made available via PyPI?

@potiuk
Copy link
Member

potiuk commented Jun 24, 2021

Do we have an idea of when this fix will be released and made available via PyPI?

It's cherry-picked to 2.1 branch and we are just about to release 2.1.1 with it (we are in feature-freeze/final test mode for it). Expect final release in ~ 1 week if everything goes well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:API Airflow's REST/HTTP API area:core-operators Operators, Sensors and hooks within Core Airflow full tests needed We need to run full set of tests for this PR to merge kind:bug This is a clearly a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clearing using ExternalTaskMarker will not activate external DagRuns
6 participants