Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1830: Remove concept of under/overloaded from TemperedWMin #1860

Draft
wants to merge 6 commits into
base: develop
Choose a base branch
from

Conversation

cz4rs
Copy link
Contributor

@cz4rs cz4rs commented Jun 27, 2022

fixes #1830

@github-actions
Copy link

github-actions bot commented Jun 27, 2022

Pipelines results

PR tests (gcc-6, ubuntu, mpich)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (clang-3.9, ubuntu, mpich)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (nvidia cuda 10.1, ubuntu, mpich)

Build for 3c8faeb

Compilation - successful

Testing - passed

Build log


PR tests (gcc-10, ubuntu, openmpi, no LB)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (gcc-9, ubuntu, mpich, zoltan)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (clang-5.0, ubuntu, mpich)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (nvidia cuda 11.0, ubuntu, mpich)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (gcc-7, ubuntu, mpich, trace runtime, LB)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (clang-9, ubuntu, mpich)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (clang-13, alpine, mpich)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (intel icpx, ubuntu, mpich)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (gcc-8, ubuntu, mpich, address sanitizer)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (clang-12, ubuntu, mpich)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (clang-11, ubuntu, mpich)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (clang-13, ubuntu, mpich)

Build for 3c8faeb

Compilation - successful

Testing - passed

Build log


PR tests (gcc-11, ubuntu, mpich)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (clang-14, ubuntu, mpich)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (gcc-12, ubuntu, mpich)

Build for cf8804f (2023-02-28 21:41:36 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (intel icpc, ubuntu, mpich)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (clang-10, ubuntu, mpich)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


PR tests (gcc-5, ubuntu, mpich)

Build for 6121bc8

Compilation - successful

Testing - passed

Build log


@codecov
Copy link

codecov bot commented Jun 27, 2022

Codecov Report

Merging #1860 (6121bc8) into develop (893ca59) will decrease coverage by 0.54%.
The diff coverage is 84.61%.

❗ Current head 6121bc8 differs from pull request most recent head cf8804f. Consider uploading reports for the commit cf8804f to get more accurate results

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1860      +/-   ##
===========================================
- Coverage    84.90%   84.37%   -0.54%     
===========================================
  Files          721      760      +39     
  Lines        25737    26811    +1074     
===========================================
+ Hits         21852    22621     +769     
- Misses        3885     4190     +305     
Impacted Files Coverage Δ
...vt/vrt/collection/balance/temperedlb/temperedlb.cc 71.02% <70.00%> (+0.20%) ⬆️
.../vt/vrt/collection/balance/temperedlb/temperedlb.h 100.00% <100.00%> (ø)
...rt/collection/balance/temperedwmin/temperedwmin.cc 82.85% <100.00%> (+3.54%) ⬆️
...vrt/collection/balance/temperedwmin/temperedwmin.h 100.00% <100.00%> (ø)
...c/vt/vrt/collection/balance/offlinelb/offlinelb.cc 0.00% <0.00%> (-100.00%) ⬇️
src/vt/scheduler/suspended_units.h 25.00% <0.00%> (-75.00%) ⬇️
...rc/vt/vrt/collection/balance/offlinelb/offlinelb.h 0.00% <0.00%> (-75.00%) ⬇️
src/vt/context/context_attorney.cc 33.33% <0.00%> (-66.67%) ⬇️
...vt/vrt/collection/balance/lb_data_restart_reader.h 40.00% <0.00%> (-45.72%) ⬇️
src/vt/context/runnable_context/collection.impl.h 60.00% <0.00%> (-40.00%) ⬇️
... and 249 more

@cz4rs cz4rs force-pushed the 1830-remove-under-over-loaded-concept branch 3 times, most recently from 6a85c17 to d863bd3 Compare June 29, 2022 16:56
Copy link
Collaborator

@nlslatt nlslatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Within propagateIncomingSync and propagateIncomingAsync there are conditionals on isUnderloaded() that I think should be (effectively) unconditional for TemperedWMin.

You might consider renaming underloaded_ to potential_recipients_ or something like that.

There are configurations possible for TemperedLB that are not valid for TemperedWMin, such as NormByMaxExcludeIneligible for the CMF. We'll need to validate the options at some point but, for now, know that makeSufficientlyUnderloaded should never be used with TemperedWMin.

@cz4rs cz4rs force-pushed the 1830-remove-under-over-loaded-concept branch 2 times, most recently from 3c8faeb to 2ae50dc Compare July 8, 2022 21:24
@cz4rs cz4rs force-pushed the 1830-remove-under-over-loaded-concept branch from f7b4908 to f363c13 Compare July 13, 2022 13:00
@cz4rs cz4rs marked this pull request as ready for review July 13, 2022 16:35
Copy link
Collaborator

@nlslatt nlslatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your changes to the algorithms to remove the concept of under/overloaded look good. When doing any testing, be sure to use the default CMF NormByMax because the others will not be valid for TemperedWMin.

A good test might be that, with knowledge set to Complete, informSync and informAsync correctly gain knowledge of loads for either all ranks or all underloaded ranks depending if TemperedWMin or TemperedLB is in use. I haven't thought through how to set up such a test yet.

@cz4rs cz4rs force-pushed the 1830-remove-under-over-loaded-concept branch from f363c13 to 6121bc8 Compare July 14, 2022 13:09
@cz4rs cz4rs marked this pull request as draft September 13, 2022 11:45
@cz4rs cz4rs force-pushed the 1830-remove-under-over-loaded-concept branch from 6121bc8 to 8ca421f Compare September 13, 2022 11:46
@cz4rs cz4rs force-pushed the 1830-remove-under-over-loaded-concept branch 3 times, most recently from 742d140 to 5f41170 Compare October 7, 2022 17:03
@cz4rs cz4rs force-pushed the 1830-remove-under-over-loaded-concept branch from 5f41170 to 7573b94 Compare October 13, 2022 18:07
.clang-format Outdated Show resolved Hide resolved
@cz4rs cz4rs force-pushed the 1830-remove-under-over-loaded-concept branch 2 times, most recently from 89ea20e to e2777a3 Compare October 20, 2022 19:20
@cz4rs cz4rs force-pushed the 1830-remove-under-over-loaded-concept branch from e2777a3 to 23608d2 Compare December 21, 2022 18:31
@cz4rs cz4rs force-pushed the 1830-remove-under-over-loaded-concept branch from 23608d2 to cf8804f Compare February 28, 2023 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove concept of under/overloaded from TemperedWMin
3 participants