Skip to content
This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

Fix map task cache misses #363

Merged
merged 5 commits into from
Jun 27, 2023
Merged

Conversation

bstadlbauer
Copy link
Member

@bstadlbauer bstadlbauer commented Jun 16, 2023

TL;DR

Fixes an issue with cache misses on subsequent runs of map tasks

Checkout flyteorg/flyte#3787 for the full issue description.

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

At the moment, this adds the hashed inputs as suffix to the key. The hashed inputs implementation has been copied over from generateArtifactTagName (here) so in case we stick with this solution, we should probably refactor out into flyteplugins

This PR also refactors the LiteralMap hashing function from propeller into flyteplugins

Tracking Issue

fixes flyteorg/flyte#3787

@bstadlbauer bstadlbauer changed the title WIP: Show potential soluation WIP: Show potential solution Jun 16, 2023
@bstadlbauer bstadlbauer requested a review from hamersaw June 16, 2023 10:45
@hamersaw
Copy link
Contributor

@bstadlbauer thanks so much for finding this issue, and more for submitting a fix! I think this will work, two things:
(1) lets add some unit tests just to make sure subsequent submissions with different input values result in different workItemIDs.
(2) looks like the hashify function was taken from propeller catalog. not sure if we want to go through the hassle of refactoring this to live entirely in plugins (ie. move function here and call from propeller) rather than duplicating it. not sure how long until this code will be deprecated with the looming ArrayNode work.

Signed-off-by: Bernhard Stadlbauer <[email protected]>
Signed-off-by: Bernhard Stadlbauer <[email protected]>
@bstadlbauer bstadlbauer changed the title WIP: Show potential solution Fix map task cache misses Jun 17, 2023
@bstadlbauer
Copy link
Member Author

@hamersaw I've added a test to check that work item ids are different and refactored the literal map hashing out of propeller into here. The testing for the hash function has been copied and adapted from here.

@codecov
Copy link

codecov bot commented Jun 17, 2023

Codecov Report

Merging #363 (2d6e219) into master (2a7a6f9) will increase coverage by 1.56%.
The diff coverage is 94.85%.

❗ Current head 2d6e219 differs from pull request most recent head 7833836. Consider uploading reports for the commit 7833836 to get more accurate results

@@            Coverage Diff             @@
##           master     #363      +/-   ##
==========================================
+ Coverage   62.62%   64.18%   +1.56%     
==========================================
  Files         152      153       +1     
  Lines       12789    10441    -2348     
==========================================
- Hits         8009     6702    -1307     
+ Misses       4168     3125    -1043     
- Partials      612      614       +2     
Flag Coverage Δ
unittests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
go/tasks/plugins/webapi/bigquery/plugin.go 67.44% <50.00%> (+2.39%) ⬆️
...tasks/pluginmachinery/catalog/async_client_impl.go 44.68% <69.23%> (+7.30%) ⬆️
go/tasks/pluginmachinery/catalog/hashing.go 94.73% <94.73%> (ø)
go/tasks/logs/logging_utils.go 90.00% <100.00%> (+2.50%) ⬆️
go/tasks/pluginmachinery/tasklog/template.go 97.80% <100.00%> (+0.83%) ⬆️
.../plugins/k8s/kfoperators/common/common_operator.go 69.07% <100.00%> (+8.83%) ⬆️
go/tasks/plugins/k8s/kfoperators/mpi/mpi.go 75.00% <100.00%> (+2.42%) ⬆️
...o/tasks/plugins/k8s/kfoperators/pytorch/pytorch.go 79.86% <100.00%> (+3.32%) ⬆️
...s/plugins/k8s/kfoperators/tensorflow/tensorflow.go 77.77% <100.00%> (+2.05%) ⬆️

... and 125 files with indirect coverage changes

Signed-off-by: Bernhard Stadlbauer <[email protected]>
Copy link
Contributor

@hamersaw hamersaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My nit all over flyte repos is to alphabetize import statements. I've seen place where the same package is imported under two different names, this would help mitigate these scenarios. Do you mind updating the added imports?

)

const specialEncoderKey = "abcdefghijklmnopqrstuvwxyz123456"

var base32Encoder = base32.NewEncoding(specialEncoderKey).WithPadding(base32.NoPadding)
var emptyLiteralMap = core.LiteralMap{Literals: map[string]*core.Literal{}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be defined in the hashing.go file instead? I believe it's the only place it's being used right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting - moved it over 👍 30ca164

Signed-off-by: Bernhard Stadlbauer <[email protected]>
Signed-off-by: Bernhard Stadlbauer <[email protected]>
@bstadlbauer
Copy link
Member Author

bstadlbauer commented Jun 27, 2023

@hamersaw Thanks for reviewing!

Not being familiar with best practices in go, I've read up a bit on it, so let me know if the following conflicts with the default within Flyte.

My main information source has been the google style guide.

  1. Imports are organized in blocks separated by newlines
  2. The first block should be imports from stdlib
  3. The second block should be all other imports

The goimports tool only understands those two sections (unless started with -local then it could do a third block with local package imports)

Given the above, it seems like as long as there are only two blocks separated by one space, imports should be ordered automatically (or otherwise fail CI).

Does that sound about right? I've not run things assuming there -local flag

Looking at other files, it seems like it's a bit of a wild west out there - probably an easy fix could be to just remove all blank lines in all import() statements and then run goimports once to split them up into the two blocks mentioned above? 🤔

@bstadlbauer bstadlbauer requested a review from hamersaw June 27, 2023 08:04
@hamersaw
Copy link
Contributor

hamersaw commented Jun 27, 2023

@bstadlbauer re: imports - I try to stick with exactly as you have described.

Given the above, it seems like as long as there are only two blocks separated by one space, imports should be ordered automatically (or otherwise fail CI).

Exactly.

it seems like it's a bit of a wild west out there

And it drives me crazy! It's not a high priority item to clean all of this up (especially across repos), which is why it hasn't been done yet. I just try to keep it from getting worse.

@hamersaw hamersaw merged commit 5a5ad82 into flyteorg:master Jun 27, 2023
eapolinario pushed a commit that referenced this pull request Sep 6, 2023
* Add input hash to workItemID

Signed-off-by: Bernhard Stadlbauer <[email protected]>

* Add test to ensure different IDs

Signed-off-by: Bernhard Stadlbauer <[email protected]>

* Cleanup `make lint`

Signed-off-by: Bernhard Stadlbauer <[email protected]>

* Move `emptyLiteralMap` to `hashing.go`

Signed-off-by: Bernhard Stadlbauer <[email protected]>

* Fix import ordering

Signed-off-by: Bernhard Stadlbauer <[email protected]>

---------

Signed-off-by: Bernhard Stadlbauer <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Map (array) task cache miss
2 participants