Multiprocessing testgen runner #3347

Merged
merged 6 commits into from
May 18, 2023

@hwwhww hwwhww commented May 6, 2023

Background

Before this PR, when we ran the test generators, we could only set the number of processes at the makefile command level (-jN). For example, if there are 10 test generators and we set -j 3, each of the 3 processes starts by picking one test generator to run, e.g., process-1 picks the sanity test generator. Once process-1 finishes its first test generator, it picks another one from the queue.

However, the execution time of each test generator can differ significantly from the others. Even on a multi-core machine, the slowest test generator dominates the overall run time: the other cores sit idle, waiting for it to finish.

Therefore, it would be more beneficial to divide the jobs at the test cases level, rather than only at the test generator level. This approach would allow for a more efficient distribution of work among the available cores, enabling parallel execution of multiple test cases simultaneously and reducing the impact of slower test generators on the overall execution time.
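To make the argument above concrete, here is a toy scheduling model. It is only a sketch: the generator names, case counts, and durations are made up for illustration, and the greedy "next idle worker takes the next job" rule is an assumption that approximates how a job queue behaves.

```python
import heapq
from typing import Dict, List


def makespan(job_durations: List[float], workers: int) -> float:
    """Total wall time when each idle worker takes the next queued job."""
    # Each heap entry is the time at which one worker becomes free.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    for duration in job_durations:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + duration)
    return max(free_at)


# Hypothetical workload: one slow generator and three fast ones,
# where every test case takes 1 time unit.
generators: Dict[str, List[float]] = {
    "slow": [1.0] * 20,
    "fast_a": [1.0] * 2,
    "fast_b": [1.0] * 2,
    "fast_c": [1.0] * 2,
}

# Jobs split per generator: one job is the sum of its test cases.
generator_level = makespan([sum(cases) for cases in generators.values()], workers=3)

# Jobs split per test case: all cases go into one shared queue.
all_cases = [d for cases in generators.values() for d in cases]
case_level = makespan(all_cases, workers=3)

print(generator_level, case_level)
```

In this made-up workload the generator-level split is bounded by the slow generator, while the case-level split spreads its cases over all three workers.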

Refactoring

  1. Break down run_generator function into multiple functions
  2. Create Diagnostics dataclass to pass info on each test case

The goal is to extract the generate_test_vector job so that each process can execute it. It also improves readability.
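As a rough sketch of why a per-case result object helps: each worker can return its own Diagnostics instead of mutating shared counters, and the parent merges them afterwards. The field names, run_one_case, and merge_diagnostics below are hypothetical and are not taken from this PR's diff.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Diagnostics:
    # Field names are illustrative assumptions, not the PR's actual fields.
    generated_test_count: int = 0
    skipped_test_count: int = 0
    test_identifiers: List[str] = field(default_factory=list)


def run_one_case(test_id: str, case_fn: Callable[[], bool]) -> Diagnostics:
    """Hypothetical stand-in for a per-case job: run it and report the outcome."""
    if case_fn():
        return Diagnostics(generated_test_count=1, test_identifiers=[test_id])
    return Diagnostics(skipped_test_count=1)


def merge_diagnostics(results: List[Diagnostics]) -> Diagnostics:
    """Combine per-case results in the parent process."""
    total = Diagnostics()
    for result in results:
        total.generated_test_count += result.generated_test_count
        total.skipped_test_count += result.skipped_test_count
        total.test_identifiers.extend(result.test_identifiers)
    return total


results = [
    run_one_case("sanity/blocks/case_a", lambda: True),
    run_one_case("sanity/blocks/case_b", lambda: False),
    run_one_case("sanity/slots/case_c", lambda: True),
]
summary = merge_diagnostics(results)
print(summary)
```

Because every job returns a value rather than writing to shared state, the same function works whether the cases run in one process or in a pool.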

Multiprocessing

  1. Use pathos.multiprocessing.ProcessingPool to implement the pool: the arguments passed to multiprocessing.Pool must be picklable, but the way we pass case_fn is not. Therefore, I use the pathos library, which can serialize the arguments I pass.
  2. Add two generator modes:
    • MODE_SINGLE_PROCESS: the previous mode. Just one process per run_generator.
    • MODE_MULTIPROCESSING: use multiprocessing in run_generator. It first collects all test cases of the given test_providers into all_test_case_params, then spawns num_process processes to execute these test cases.
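The picklability constraint from item 1 can be reproduced with the standard library alone. This is a minimal sketch: make_case_fn is a hypothetical helper that builds a closure, loosely similar to a dynamically created case_fn. The standard pickle module serializes functions by reference to an importable name, so a function defined inside another function is rejected, whereas pathos relies on dill, which can serialize many such objects.

```python
import math
import pickle
from typing import Callable


def make_case_fn(case_name: str) -> Callable[[], str]:
    # Hypothetical example: a callable created at runtime inside another function.
    def case_fn() -> str:
        return f"ran {case_name}"
    return case_fn


def is_picklable(obj: object) -> bool:
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


closure_picklable = is_picklable(make_case_fn("sanity"))
importable_picklable = is_picklable(math.sqrt)  # importable by name

print(closure_picklable, importable_picklable)
```

The closure still runs fine in the parent process; it just cannot be sent to a worker through the default serializer.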

How run_generator is used

For the state tests, run_generator was called at each (runner_name, preset, fork) combination:

def run_state_test_generators(runner_name: str,
                              all_mods: Dict[str, Dict[str, str]],
                              presets: Iterable[PresetBaseName] = ALL_PRESETS,
                              forks: Iterable[SpecForkName] = TESTGEN_FORKS) -> None:
    """
    Generate all available state tests of `TESTGEN_FORKS` forks of `ALL_PRESETS` presets of the given runner.
    """
    for preset_name in presets:
        for fork_name in forks:
            if fork_name in all_mods:
                gen_runner.run_generator(runner_name, get_provider(
                    create_provider_fn=get_create_provider_fn(runner_name),
                    fork_name=fork_name,
                    preset_name=preset_name,
                    all_mods=all_mods,
                ))

For example, for the sanity tests, run_generator is called once for each of:

  • (sanity, minimal, phase0)
  • (sanity, minimal, altair)
  • (sanity, minimal, bellatrix)
  • (sanity, minimal, capella)
  • (sanity, minimal, deneb)
  • (sanity, minimal, eip6110)
  • (sanity, mainnet, phase0)
  • (sanity, mainnet, altair)
  • (sanity, mainnet, bellatrix)
  • (sanity, mainnet, capella)
  • (sanity, mainnet, deneb)
  • (sanity, mainnet, eip6110)

It's possible that the parent process collects 30 test cases from (sanity, mainnet, phase0), but only 3 of them are slow.

If we run make generate_tests with only one job at the makefile command level, cores sit idle whenever the number of remaining slow test cases is smaller than the number of cores.

Therefore, it is more efficient to set -jN with N > 1. Several generators then run in parallel, which keeps the cores busy while one generator is down to its last few slow test cases.

Performance

Platform

  • Apple M1 Max machine (10-core)
    • 8 high-performance cores
    • 2 energy-efficient cores
  • PyPy 3.9
  • Base commit: b617c62

Case 1: Run single test generator

The command is the same for both modes, since -j jobs do not help when running a single generator:

make gen_sanity

Case 2: Run all test generators

MODE_SINGLE_PROCESS

Command and setting
make -j 8 generate_tests
  • 8 jobs to run all the generators -> 8 processes are running at the start

MODE_MULTIPROCESSING

Command and setting
num_process = multiprocessing.cpu_count() // 2 - 1
  • 10 // 2 - 1 = 4 workers
make -j 2 generate_tests
  • 2 jobs to run all the generators -> 4 * 2 = 8 processes are running in the normal case
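The process-count arithmetic above can be written out as a small sketch. The function names are hypothetical, and the max(1, ...) floor is my own assumption so that a machine with very few cores still gets one worker; the quoted setting is just cpu_count() // 2 - 1.

```python
import multiprocessing


def workers_per_generator(cpu_count: int) -> int:
    """Worker count per run_generator call, following cpu_count // 2 - 1."""
    # The max(1, ...) floor is an added assumption, not part of the quoted setting.
    return max(1, cpu_count // 2 - 1)


def total_processes(cpu_count: int, make_jobs: int) -> int:
    """Approximate number of busy processes: workers per generator times -j jobs."""
    return workers_per_generator(cpu_count) * make_jobs


# Values for the 10-core machine used in the benchmark.
print(workers_per_generator(10))         # 4
print(total_processes(10, make_jobs=2))  # 8

# Value for whatever machine this runs on.
print(workers_per_generator(multiprocessing.cpu_count()))
```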

Execution time results

| Case   | MODE_SINGLE_PROCESS | MODE_MULTIPROCESSING |
|--------|---------------------|----------------------|
| Case 1 | 27m 19s             | 7m 14s               |
| Case 2 | 1h 26m 43s          | 40m 34s              |

As the results show, MODE_MULTIPROCESSING outperforms MODE_SINGLE_PROCESS in both cases. 🥹🎉
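The speedups implied by the reported timings can be checked with a few lines. This is only a convenience sketch; to_seconds is a hypothetical helper that parses the "1h 26m 43s" style used in the results.

```python
import re
from typing import Dict, Tuple


def to_seconds(text: str) -> int:
    """Parse durations such as '1h 26m 43s' into seconds."""
    units = {"h": 3600, "m": 60, "s": 1}
    return sum(int(n) * units[u] for n, u in re.findall(r"(\d+)\s*([hms])", text))


# (MODE_SINGLE_PROCESS, MODE_MULTIPROCESSING) timings as reported above.
timings: Dict[str, Tuple[str, str]] = {
    "Case 1": ("27m 19s", "7m 14s"),
    "Case 2": ("1h 26m 43s", "40m 34s"),
}

speedups = {
    case: to_seconds(single) / to_seconds(multi)
    for case, (single, multi) in timings.items()
}

for case, ratio in speedups.items():
    print(f"{case}: {ratio:.1f}x faster")
```

On these numbers, Case 1 comes out at roughly 3.8x and Case 2 at roughly 2.1x.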

@djrtwo djrtwo left a comment

nice refactor to start!

the approach looks good and correct from a high level. I haven't done a line-by-line review but am comfortable getting this merged to enjoy the improvement

@hwwhww hwwhww merged commit e18e974 into dev May 18, 2023
@hwwhww hwwhww deleted the testgen-refactor branch May 18, 2023 15:03