This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

Releases: flyteorg/flytepropeller

Discretization of Statemachine fixes

02 Jan 23:36
a79c03f

Fixes a few issues uncovered during the investigation of the statemachine inconsistency issues last week. Specifically:

  • Ensure each node can progress at most once per round (IsDirty flag)
  • Remove ParentTaskID and DataDir from the NodeStatus field (they caused the workflow's etcd object size to bloat)
  • Add the parent RetryAttempt to the generated hierarchical name of dynamic sub-nodes to ensure retries do not reuse an existing sub-node status.
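
The first and third fixes can be illustrated with a minimal Go sketch. It is not the flytepropeller implementation: the `nodeStatus` type, its `advance` and `resetDirty` methods, and the `subNodeName` helper (including its `parent-attempt-child` name format) are hypothetical names chosen for illustration only.

```go
package main

import "fmt"

// nodeStatus is a hypothetical, simplified stand-in for a node's status.
type nodeStatus struct {
	phase int
	dirty bool // set once the node has progressed in the current round
}

// advance moves the node one phase forward unless it has already
// progressed during the current round. It reports whether it progressed.
func (s *nodeStatus) advance() bool {
	if s.dirty {
		return false
	}
	s.phase++
	s.dirty = true
	return true
}

// resetDirty clears the flag; in this sketch it is called at the start of each round.
func (s *nodeStatus) resetDirty() { s.dirty = false }

// subNodeName builds a hierarchical sub-node name that includes the parent's
// retry attempt, so a retried parent does not map onto an old sub-node status.
// The exact format is an assumption, not the one used by flytepropeller.
func subNodeName(parentID string, parentAttempt uint32, childID string) string {
	return fmt.Sprintf("%s-%d-%s", parentID, parentAttempt, childID)
}

func main() {
	s := &nodeStatus{}
	first := s.advance()
	second := s.advance() // same round: the dirty flag blocks further progress
	fmt.Println(first, second)

	s.resetDirty() // a new round begins
	third := s.advance()
	fmt.Println(third, s.phase)

	fmt.Println(subNodeName("n0", 0, "t1"))
	fmt.Println(subNodeName("n0", 1, "t1"))
}
```

Because the retry attempt is part of the name, attempt 0 and attempt 1 of the same parent produce different sub-node names and therefore cannot share a stored status.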

Details: https://docs.google.com/document/d/1ISaxIZeYLcBaeapEmeTqb-g0x04pJbf5t3i30qMfk6U/edit?usp=sharing

Adding the support of cluster-namespaced resource management

02 Jan 19:46
71cdc77

Pulling in the changes from flyteplugins

Prefixing Allocation Tokens in Resource Manager

02 Jan 18:50
0ebaa33

This release adds prefixes to the allocation tokens used in the resource manager to simplify tracking and lineage tracing.

Node executor abort to call finalize even on error

02 Jan 18:14
66b40c7
Node executor abort to call finalize even on error (#51)

In cases where the abort call fails, we should still call finalize as this is the intended behavior of the finalize construct.

Adding Namespaced Resource-aware Exponential Backoff Handler

23 Dec 20:43
b3098cd

This release supports a per-namespace resource-aware backoff mechanism to guard pod creation.
Instead of keeping a separate backoff for each execution or a single backoff for the entire queue, this PR supports per-namespace backoff, which strikes a good balance in granularity because resource quotas are set per namespace.

This backoff mechanism is also resource-aware. Even while a creation request is blocked and the backoff is still active, the next incoming creation request is allowed to try if its resource requirement is strictly smaller than that of all previous attempts during the same backoff period. This prevents a single large creation request from unnecessarily blocking everything else coming from the same namespace.

IDL to 0.16.3

20 Dec 18:13
1bbee1f
Bump IDL to 0.16.3 (#46)

https://github.com/lyft/flyteidl/releases/tag/v0.16.3
which needs a bump to plugins to
https://github.com/lyft/flyteplugins/releases/tag/v0.2.4

Use flytestdlib contextutils.RevisionVersionKey instead of the locally defined one

18 Dec 23:59
f966f81
v0.1.19

Update flytestdlib to use proper contextutil key (#45)

Add a log field for resource version

16 Dec 19:58
3323990
v0.1.18

Add ResourceVersion to log fields (#44)

Implementation for Node timeout

11 Dec 00:47
96c9ee9
Implementation for node timeout (#42)

* Implementation for node timeout

* .

* adding some tests

* bogus change to retrigger travis

* cr feedback

Creating a separate state in the CRD for failure type so we can distinguish between user and system error. For now, we will use it for timeout failures.

* removing failure type and adding TimingOut as a separate phase

* updated mockery

* finalize to abort on timeout

* fixing NodePhaseTimedout

Removing flytekit version check for fixing array task interface

06 Dec 21:24
f6db135
Remove flytekit version check (#41)

* Remove flytekit version check

* lint