Multiprocessing testgen runner #3347

hwwhww · 2023-05-06T13:36:15Z

Background

Before this PR, when running the test generators, we could only set the number of processes at the Makefile command level (-jN). For example, with 10 test generators and -j 3, three processes each start by picking one test generator to run, e.g., process-1 picks the sanity test generator. Once process-1 finishes its test generator, it picks another one from the queue.

However, execution time differs significantly from one test generator to another. Even on a multi-core machine, the slowest test generator dominates the overall run time: the other cores sit idle while waiting for it to finish.

Therefore, it would be more beneficial to divide the jobs at the test cases level, rather than only at the test generator level. This approach would allow for a more efficient distribution of work among the available cores, enabling parallel execution of multiple test cases simultaneously and reducing the impact of slower test generators on the overall execution time.
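To illustrate the difference, here is a minimal scheduling sketch. The generator names are taken from the repository, but the per-test-case durations are invented for illustration, and the greedy queue model is a simplification, not the actual runner.

```python
import heapq

def makespan(job_durations, num_workers):
    """Greedy queue model: each free worker takes the next job in order.

    Returns the time at which the last worker finishes.
    """
    workers = [0.0] * num_workers  # time at which each worker becomes free
    heapq.heapify(workers)
    for duration in job_durations:
        free_at = heapq.heappop(workers)  # earliest available worker
        heapq.heappush(workers, free_at + duration)
    return max(workers)

# Hypothetical durations (arbitrary units) of the test cases in each generator.
generators = {
    "sanity": [10, 10, 10, 10],    # one slow generator
    "operations": [2, 2],
    "epoch_processing": [2, 2],
}

# Jobs split per generator: one worker is stuck with the whole slow generator.
by_generator = makespan([sum(cases) for cases in generators.values()], 3)

# Jobs split per test case: the slow generator's cases spread across workers.
by_case = makespan([d for cases in generators.values() for d in cases], 3)

print(by_generator, by_case)
```

With these made-up numbers, the per-generator split is bounded by the slowest generator, while the per-test-case split finishes in about half the time.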

Refactoring

Break down run_generator function into multiple functions
Create Diagnostics dataclass to pass info on each test case

The goal is to extract the generate_test_vector job so that each process can execute it. The refactoring also improves readability.
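A result dataclass of this kind could look like the sketch below. The field names and the merge helper are assumptions made for illustration; the PR description only says that a Diagnostics dataclass carries information about each test case.

```python
from dataclasses import dataclass, field
from typing import Iterable, List

@dataclass
class Diagnostics:
    # All field names here are hypothetical, not taken from the PR.
    collected_test_count: int = 0
    generated_test_count: int = 0
    skipped_test_count: int = 0
    test_identifiers: List[str] = field(default_factory=list)

def merge_diagnostics(results: Iterable[Diagnostics]) -> Diagnostics:
    """Combine per-test-case results into one summary (hypothetical helper)."""
    total = Diagnostics()
    for result in results:
        total.collected_test_count += result.collected_test_count
        total.generated_test_count += result.generated_test_count
        total.skipped_test_count += result.skipped_test_count
        total.test_identifiers.extend(result.test_identifiers)
    return total

summary = merge_diagnostics([
    Diagnostics(collected_test_count=1, generated_test_count=1,
                test_identifiers=["sanity::blocks::case_a"]),
    Diagnostics(collected_test_count=1, skipped_test_count=1),
])
print(summary)
```

A plain dataclass with simple fields is picklable, which makes it easy to return from worker processes and aggregate in the parent.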

Multiprocessing

Use pathos.multiprocessing.ProcessingPool to implement the pool: the arguments passed to multiprocessing.Pool must be picklable, but the way we pass case_fn is not. Therefore, I use the pathos library, which can serialize the arguments I pass.
Add two generator modes:
- MODE_SINGLE_PROCESS: the previous mode. Just one process per run_generator.
- MODE_MULTIPROCESSING: use multiprocessing in run_generator. It first collects all test cases of the given test_providers into all_test_case_params, then spawns num_process processes to execute these test cases.
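The picklability constraint mentioned in the first point can be reproduced with the standard library alone. In this sketch, make_case_fn is a hypothetical stand-in for how a closure such as case_fn might be built; it is not the code from the repository.

```python
import pickle

def is_picklable(obj) -> bool:
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type varies between Python versions.
        return False

def make_case_fn(name):
    # Hypothetical stand-in: a function defined inside another function.
    def case_fn():
        return name
    return case_fn

print(is_picklable(len))                  # built-in, pickled by reference
print(is_picklable(make_case_fn("foo")))  # nested closure
print(is_picklable(lambda: 1))            # lambda
```

The standard pickle module serializes functions by reference to an importable name, so nested functions and lambdas are rejected. pathos builds on dill, which can serialize such objects by value.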

How `run_generator` is used

For the state tests, run_generator was called at each (runner_name, preset, fork) combination:

consensus-specs/tests/core/pyspec/eth2spec/gen_helpers/gen_from_tests/gen.py

Lines 96 to 111 in b6df4b5

    
    def run_state_test_generators(runner_name: str,
                                  all_mods: Dict[str, Dict[str, str]],
                                  presets: Iterable[PresetBaseName] = ALL_PRESETS,
                                  forks: Iterable[SpecForkName] = TESTGEN_FORKS) -> None:
        """
        Generate all available state tests of `TESTGEN_FORKS` forks of `ALL_PRESETS` presets of the given runner.
        """
        for preset_name in presets:
            for fork_name in forks:
                if fork_name in all_mods:
                    gen_runner.run_generator(runner_name, get_provider(
                        create_provider_fn=get_create_provider_fn(runner_name),
                        fork_name=fork_name,
                        preset_name=preset_name,
                        all_mods=all_mods,
                    ))

For example, for the sanity tests, there are multiple run_generator calls, one for each of:

(sanity, minimal, phase0)
(sanity, minimal, altair)
(sanity, minimal, bellatrix)
(sanity, minimal, capella)
(sanity, minimal, deneb)
(sanity, minimal, eip6110)
(sanity, mainnet, phase0)
(sanity, mainnet, altair)
(sanity, mainnet, bellatrix)
(sanity, mainnet, capella)
(sanity, mainnet, deneb)
(sanity, mainnet, eip6110)
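The combinations listed above are the product of the presets and forks that the nested loops iterate over. A small sketch, with the preset and fork names hard-coded from the list rather than imported from the repository:

```python
from itertools import product

# Hard-coded from the list above; the real code uses ALL_PRESETS and TESTGEN_FORKS.
presets = ["minimal", "mainnet"]
forks = ["phase0", "altair", "bellatrix", "capella", "deneb", "eip6110"]

# Same order as the nested loops: presets in the outer loop, forks in the inner.
combinations = [("sanity", preset, fork) for preset, fork in product(presets, forks)]

for combination in combinations:
    print(combination)
```

Each tuple corresponds to one run_generator call, so the test cases are collected and distributed separately for each combination.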

It's possible that the parent process collects 30 test cases from (sanity, mainnet, phase0), but only 3 test cases are the slow test cases.

If we only assign one job to run make generate_tests at the Makefile command level, it will be inefficient when the number of remaining slow test cases is smaller than the number of cores.

Therefore, it is more efficient to set -jN with N > 1. Several generators then run in parallel, which keeps the cores busy when the number of slow test cases is small relative to the available cores.

Performance

Platform

Apple M1 Max machine (10-core)
8 high-performance cores
2 energy-efficient cores
PyPy 3.9
Base commit: b617c62

Case 1: Run single test generator

The same command is used for both modes, since -j jobs don't help when running a single generator:

make gen_sanity

Case 2: Run all test generators

MODE_SINGLE_PROCESS

Command and setting

make -j 8 generate_tests

8 jobs to run all the generators -> 8 processes are running at the start

MODE_MULTIPROCESSING

Command and setting

num_process = multiprocessing.cpu_count() // 2 - 1

10 // 2 - 1 = 4 workers

make -j 2 generate_tests

2 jobs to run all the generators -> 4 * 2 = 8 processes are running in normal case
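The arithmetic above can be checked with a short sketch. The helper names are hypothetical, and the clamp to at least one worker is an added assumption that is not part of the formula in this PR.

```python
import multiprocessing

def worker_count(cpu_count: int) -> int:
    """Workers per generator: cpu_count // 2 - 1, as in the setting above.

    The max(1, ...) clamp is an assumption added here so that machines
    with very few cores still get one worker.
    """
    return max(1, cpu_count // 2 - 1)

def total_processes(make_jobs: int, cpu_count: int) -> int:
    """Worker processes across all concurrently running generators."""
    return make_jobs * worker_count(cpu_count)

print(worker_count(10))        # the 10-core machine used in the benchmark
print(total_processes(2, 10))  # make -j 2 on that machine
print(worker_count(multiprocessing.cpu_count()))  # this machine
```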

Execution time results

Case	MODE_SINGLE_PROCESS	MODE_MULTIPROCESSING
Case 1	27m 19s	7m 14s
Case 2	1h 26m 43s	40m 34s

As the results show, MODE_MULTIPROCESSING outperforms MODE_SINGLE_PROCESS in both cases. 🥹🎉
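For reference, the speedup factors implied by the table can be recomputed as follows. The timings are copied from the table; the helper name is only for illustration.

```python
def to_seconds(hours: int = 0, minutes: int = 0, seconds: int = 0) -> int:
    """Convert a duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds

# (MODE_SINGLE_PROCESS, MODE_MULTIPROCESSING) timings from the table above.
timings = {
    "Case 1": (to_seconds(minutes=27, seconds=19), to_seconds(minutes=7, seconds=14)),
    "Case 2": (to_seconds(hours=1, minutes=26, seconds=43), to_seconds(minutes=40, seconds=34)),
}

speedups = {case: single / multi for case, (single, multi) in timings.items()}

for case, factor in speedups.items():
    print(f"{case}: {factor:.2f}x faster")
```

This gives a speedup of about 3.78x for a single generator and about 2.14x for the full run.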

djrtwo

nice refactor to start!

the approach looks good and correct from a high level. I haven't done a line-by-line review but am comfortable getting this merged to enjoy the improvement

hwwhww added 4 commits May 5, 2023 16:23

Refactor run_generator

9f5bb03

Try multiprocessing

aeccd20

Fix test_randomized_state and test_randomized_state_leaking

98d0ca4

Fix and set to MODE_MULTIPROCESSING

3ae4bf1

hwwhww added the scope:CI/tests/pyspec label May 6, 2023

hwwhww added 2 commits May 9, 2023 21:42

Merge branch 'dev' into testgen-refactor

d4be8f1

Add settings.py of testgen

1008714

djrtwo approved these changes May 16, 2023

hwwhww merged commit e18e974 into dev May 18, 2023

hwwhww deleted the testgen-refactor branch May 18, 2023 15:03
