Add benchcomp (model-checking#2274)
This PR adds benchcomp, a tool for comparing one or more suites of benchmarks using two or more 'variants' (command line arguments and environment variables).

benchcomp runs all combinations of suite x variant, parsing the unique output formats of each of these runs. benchcomp then combines the parsed outputs and writes them into a single file. benchcomp can post-process that combined file to create visualizations, exit if the results are not as expected, or perform other actions.
karkhaz authored Mar 17, 2023
1 parent 56e7f93 commit 12c343e
Showing 21 changed files with 894 additions and 0 deletions.
2 changes: 2 additions & 0 deletions tools/benchcomp/.gitignore
@@ -0,0 +1,2 @@
# the regression tests write result.yaml files into their directories
result.yaml
11 changes: 11 additions & 0 deletions tools/benchcomp/README.md
@@ -0,0 +1,11 @@
# Benchcomp

This directory contains `bin/benchcomp`, a tool for comparing one or
more suites of benchmarks using two or more 'variants' (command line
arguments and environment variables).

`benchcomp` runs all combinations of suite x variant, parsing the unique
output formats of each of these runs. `benchcomp` then combines the
parsed outputs and writes them into a single file. `benchcomp` can
post-process that combined file to create visualizations, exit if the
results are not as expected, or perform other actions.
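
Based on the config schema in `tools/benchcomp/benchcomp/__init__.py`, a minimal `benchcomp.yaml` might look like the following sketch. The suite name, variant names, commands, and the `parser` entry are all illustrative, not taken from this commit:

```yaml
variants:
  baseline:
    config:
      directory: /path/to/suite
      command_line: ./run-benchmarks.sh
      env: {}
  new-flag:
    config:
      directory: /path/to/suite
      command_line: ./run-benchmarks.sh
      env:
        EXTRA_FLAGS: "--new-flag"   # hypothetical environment variable

run:
  suites:
    my_suite:
      variants:
        - baseline
        - new-flag
      parser:
        module: my_parser   # hypothetical parser; the schema allows arbitrary keys here
```

Each variant runs the same suite with different command lines or environments; the parser turns each suite's native output into the common "suite.yaml" format.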
119 changes: 119 additions & 0 deletions tools/benchcomp/benchcomp/__init__.py
@@ -0,0 +1,119 @@
# Copyright Kani Contributors
# SPDX-License-Identifier: Apache-2.0 OR MIT
#
# Common utilities for benchcomp


import argparse
import collections
import contextlib
import dataclasses
import logging
import pathlib
import sys
import textwrap

import yaml


class ConfigFile(collections.UserDict):
    _schema: str = textwrap.dedent("""\
        variants:
          type: dict
          keysrules:
            type: string
          valuesrules:
            schema:
              config:
                type: dict
                keysrules:
                  type: string
                valuesrules:
                  allow_unknown: true
                  schema:
                    command_line:
                      type: string
                    directory:
                      type: string
                    env:
                      type: dict
                      keysrules:
                        type: string
                      valuesrules:
                        type: string
        run:
          type: dict
          keysrules:
            type: string
          schema:
            suites:
              type: dict
              keysrules:
                type: string
              valuesrules:
                schema:
                  variants:
                    type: list
                  parser:
                    type: dict
                    keysrules:
                      type: string
                    valuesrules:
                      anyof:
                        - schema:
                            type: {}
        filter: {}
        visualize: {}
        """)

    def __init__(self, path):
        super().__init__()

        try:
            with open(path, encoding="utf-8") as handle:
                data = yaml.safe_load(handle)
        except (FileNotFoundError, OSError) as exc:
            raise argparse.ArgumentTypeError(
                f"{path}: file not found") from exc

        schema = yaml.safe_load(self._schema)
        try:
            import cerberus
            validate = cerberus.Validator(schema)
            if not validate(data):
                for error in validate._errors:
                    doc_path = "/".join(error.document_path)
                    msg = (
                        f"config file '{path}': key "
                        f"'{doc_path}': expected "
                        f"{error.constraint}, got '{error.value}'")
                    if error.rule:
                        msg += f" (rule {error.rule})"
                    msg += f" while traversing {error.schema_path}"
                    logging.error(msg)
                logging.error(validate.document_error_tree["variants"])
                raise argparse.ArgumentTypeError(
                    "failed to validate configuration file")
        except ImportError:
            pass
        self.data = data


@dataclasses.dataclass
class Outfile:
    """Return a handle to a file on disk or stdout if given '-'"""

    path: str

    def __str__(self):
        return str(self.path)

    @contextlib.contextmanager
    def __call__(self):
        if self.path == "-":
            yield sys.stdout
            return
        path = pathlib.Path(self.path)
        path.parent.mkdir(exist_ok=True)
        with open(path, "w", encoding="utf-8") as handle:
            yield handle
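
The `Outfile` pattern (a dataclass whose `__call__` is a context manager, yielding either stdout or a real file) can be exercised like this. This is a self-contained sketch that inlines a copy of the class so it runs on its own; in benchcomp it is imported as `benchcomp.Outfile`:

```python
import contextlib
import dataclasses
import pathlib
import sys
import tempfile


@dataclasses.dataclass
class Outfile:
    """Mirror of the class above, inlined so the example is self-contained."""
    path: str

    @contextlib.contextmanager
    def __call__(self):
        if self.path == "-":
            # Yield stdout directly; no file is opened or closed.
            yield sys.stdout
            return
        path = pathlib.Path(self.path)
        path.parent.mkdir(exist_ok=True)
        with open(path, "w", encoding="utf-8") as handle:
            yield handle


# Usage: calling the instance produces a context manager.
with tempfile.TemporaryDirectory() as tmp:
    out = Outfile(str(pathlib.Path(tmp) / "result.yaml"))
    with out() as handle:
        handle.write("ok\n")
    text = (pathlib.Path(tmp) / "result.yaml").read_text()
print(text.strip())  # -> ok
```

This design lets argparse construct the object as a `type=` converter while deferring the actual `open()` until the output is written.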
225 changes: 225 additions & 0 deletions tools/benchcomp/benchcomp/cmd_args.py
@@ -0,0 +1,225 @@
# Copyright Kani Contributors
# SPDX-License-Identifier: Apache-2.0 OR MIT
#
# Command line argument processing


import argparse
import importlib
import pathlib
import re
import textwrap

import benchcomp
import benchcomp.entry.benchcomp
import benchcomp.entry.run


def _get_epilogs():
    epilogs = {
        "top_level": """\
            benchcomp can help you to understand the difference between two or
            more toolchains, by running benchmarks that use those toolchains and
            comparing the results.

            benchcomp runs two or more 'variants' of a set of benchmark suites,
            and compares and visualizes the results of these variants. This
            allows you to understand the differences between the variants,
            for example how they affect the benchmarks' performance or output,
            or even whether they pass at all.

            benchcomp is structured as a pipeline of several commands. Running
            `benchcomp` runs each of them sequentially. You can run the
            subcommands manually to dump the intermediate files if required.""",
        "run": """\
            The run command writes one YAML file for each (suite, variant) pair.
            These YAML files are in "suite.yaml" format. Typically, users
            should read the combined YAML file emitted by `benchcomp collate`
            rather than the multiple YAML files written by `benchcomp run`.

            The `run` command writes its output files into a directory, which
            `collate` then reads from. By default, `run` writes the files into a
            new directory with a common prefix on each invocation, meaning that
            all previous runs are preserved without the user needing to specify
            a different directory each time. Benchcomp also creates a symbolic
            link to the latest run. Thus, the directories after several runs
            will look something like this:

            /tmp/benchcomp/suites/2F0D3DC4-0D02-4E95-B887-4759F08FA90D
            /tmp/benchcomp/suites/119F11EB-9BC0-42D8-9EC1-47DFD661AC88
            /tmp/benchcomp/suites/A3E83FE8-CD42-4118-BED3-ED89EC88BFB0
            /tmp/benchcomp/suites/latest -> /tmp/benchcomp/suites/119F11EB...

            '/tmp/benchcomp/suites' is the "out-prefix"; the UUID is the
            "out-dir"; and '/tmp/benchcomp/suites/latest' is the "out-symlink".
            Users can set each of these manually by passing the corresponding
            flag, if needed.

            Passing `--out-symlink ./latest` will place the symbolic link in the
            current directory, while keeping all runs under /tmp to avoid
            clutter. If you wish to keep all previous runs in a local directory,
            you can do so with
            `--out-prefix ./output --out-symlink ./output/latest`""",
        "filter": "",  # TODO
        "visualize": "",  # TODO
        "collate": "",
    }

    wrapper = textwrap.TextWrapper()
    ret = {}
    for subcommand, epilog in epilogs.items():
        paragraphs = re.split(r"\n\s*\n", epilog)
        buf = []
        for p in paragraphs:
            p = textwrap.dedent(p)
            buf.extend(wrapper.wrap(p))
            buf.append("")
        ret[subcommand] = "\n".join(buf)
    return ret


def _existing_directory(arg):
    path = pathlib.Path(arg)
    if not path.exists():
        raise ValueError(f"directory '{arg}' must already exist")
    return path


def _get_args_dict():
    epilogs = _get_epilogs()
    ret = {
        "top_level": {
            "description":
                "Run and compare variants of a set of benchmark suites",
            "epilog": epilogs["top_level"],
            "formatter_class": argparse.RawDescriptionHelpFormatter,
        },
        "args": [],
        "subparsers": {
            "title": "benchcomp subcommands",
            "description":
                "You can invoke each stage of the benchcomp pipeline "
                "separately if required",
            "parsers": {
                "run": {
                    "help": "run all variants of all benchmark suites",
                    "args": [{
                        "flags": ["--out-prefix"],
                        "metavar": "D",
                        "type": pathlib.Path,
                        "default": benchcomp.entry.run.get_default_out_prefix(),
                        "help":
                            "write suite.yaml files to a new directory under D "
                            "(default: %(default)s)",
                    }, {
                        "flags": ["--out-dir"],
                        "metavar": "D",
                        "type": str,
                        "default": benchcomp.entry.run.get_default_out_dir(),
                        "help":
                            "write suite.yaml files to D relative to "
                            "--out-prefix (must not exist) "
                            "(default: %(default)s)",
                    }, {
                        "flags": ["--out-symlink"],
                        "metavar": "D",
                        "type": pathlib.Path,
                        "default":
                            benchcomp.entry.run.get_default_out_prefix() /
                            benchcomp.entry.run.get_default_out_symlink(),
                        "help":
                            "symbolically link D to the output directory "
                            "(default: %(default)s)",
                    }],
                },
                "collate": {
                    "args": [{
                        "flags": ["--suites-dir"],
                        "metavar": "D",
                        "type": _existing_directory,
                        "default":
                            benchcomp.entry.run.get_default_out_prefix() /
                            benchcomp.entry.run.get_default_out_symlink(),
                        "help":
                            "directory containing suite.yaml files "
                            "(default: %(default)s)"
                    }, {
                        "flags": ["--out-file"],
                        "metavar": "F",
                        "default": benchcomp.Outfile("result.yaml"),
                        "type": benchcomp.Outfile,
                        "help":
                            "write result to F instead of %(default)s. "
                            "'-' means print to stdout",
                    }],
                },
                "filter": {
                    "help": "transform a result by piping it through a program",
                    "args": [],
                },
                "visualize": {
                    "help": "render a result in various formats",
                    "args": [{
                        "flags": ["--result-file"],
                        "metavar": "F",
                        "default": pathlib.Path("result.yaml"),
                        "type": pathlib.Path,
                        "help":
                            "read result from F instead of %(default)s",
                    }],
                },
            }
        }
    }
    for subcommand, info in ret["subparsers"]["parsers"].items():
        info["epilog"] = epilogs[subcommand]
        info["formatter_class"] = argparse.RawDescriptionHelpFormatter
    return ret


def _get_global_args():
    return [{
        "flags": ["-c", "--config"],
        "default": "benchcomp.yaml",
        "type": benchcomp.ConfigFile,
        "metavar": "F",
        "help": "read configuration from file F (default: %(default)s)",
    }, {
        "flags": ["-v", "--verbose"],
        "action": "store_true",
        "help": "enable verbose output",
    }]


def get():
    ad = _get_args_dict()
    parser = argparse.ArgumentParser(**ad["top_level"])

    parser.set_defaults(func=benchcomp.entry.benchcomp.main)

    global_args = _get_global_args()

    ad["args"].extend(global_args)
    for arg in ad["args"]:
        flags = arg.pop("flags")
        parser.add_argument(*flags, **arg)

    subparsers = ad["subparsers"].pop("parsers")
    subs = parser.add_subparsers(**ad["subparsers"])
    for subcommand, info in subparsers.items():
        args = info.pop("args")
        subparser = subs.add_parser(name=subcommand, **info)

        # Set entrypoint to benchcomp.entry.visualize.main()
        # when user invokes `benchcomp visualize`, etc
        mod = importlib.import_module(f"benchcomp.entry.{subcommand}")
        subparser.set_defaults(func=mod.main)

        for arg in args:
            flags = arg.pop("flags")
            subparser.add_argument(*flags, **arg)
            if arg not in global_args:
                parser.add_argument(*flags, **arg)

    return parser.parse_args()
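
The dict-driven parser construction above can be illustrated with a stripped-down, self-contained sketch: argument specs are plain dicts, `"flags"` is popped off each one, and the remainder is forwarded to `add_argument()` as keyword arguments. The flag names and defaults here are illustrative, not benchcomp's:

```python
import argparse

# Declarative spec: one list of global args, one dict of subcommand args.
spec = {
    "args": [
        {"flags": ["-v", "--verbose"], "action": "store_true",
         "help": "enable verbose output"},
    ],
    "subcommands": {
        "run": [
            {"flags": ["--out-prefix"], "metavar": "D", "default": "/tmp/out",
             "help": "output directory prefix (default: %(default)s)"},
        ],
    },
}

parser = argparse.ArgumentParser()
for arg in spec["args"]:
    flags = arg.pop("flags")          # positional flag strings
    parser.add_argument(*flags, **arg)  # everything else is kwargs

subs = parser.add_subparsers(dest="command")
for name, args in spec["subcommands"].items():
    subparser = subs.add_parser(name)
    for arg in args:
        flags = arg.pop("flags")
        subparser.add_argument(*flags, **arg)

ns = parser.parse_args(["-v", "run", "--out-prefix", "./output"])
print(ns.verbose, ns.out_prefix)  # -> True ./output
```

Keeping the spec as data makes it easy to attach epilogs and formatter classes in a loop, as `_get_args_dict` does.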
3 changes: 3 additions & 0 deletions tools/benchcomp/benchcomp/entry/README.md
@@ -0,0 +1,3 @@
Each file X.py in this directory contains a `main` method, which
bin/benchcomp will call when you run `benchcomp X`. Running `benchcomp`
with no arguments will invoke the `main` method in `benchcomp.py`.
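
The dispatch convention described above can be sketched with `importlib`. This example substitutes the stdlib module `json.tool` for a real `benchcomp.entry.<subcommand>` module, so it runs anywhere; in benchcomp itself the lookup is `f"benchcomp.entry.{subcommand}"` and the attribute fetched is `main`:

```python
import importlib


def resolve(package: str, subcommand: str, attr: str):
    """Import `package.subcommand` and return its `attr` callable."""
    mod = importlib.import_module(f"{package}.{subcommand}")
    return getattr(mod, attr)


# Resolve json.tool's main() the same way benchcomp resolves
# benchcomp.entry.run.main for `benchcomp run`.
func = resolve("json", "tool", "main")
print(callable(func))  # -> True
```

Because the subcommand name doubles as the module name, adding a new pipeline stage only requires dropping a new `X.py` with a `main` into this directory.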
2 changes: 2 additions & 0 deletions tools/benchcomp/benchcomp/entry/__init__.py
@@ -0,0 +1,2 @@
# Copyright Kani Contributors
# SPDX-License-Identifier: Apache-2.0 OR MIT
17 changes: 17 additions & 0 deletions tools/benchcomp/benchcomp/entry/benchcomp.py
@@ -0,0 +1,17 @@
# Copyright Kani Contributors
# SPDX-License-Identifier: Apache-2.0 OR MIT
#
# Entrypoint when running `benchcomp` with no arguments. This runs the other
# subcommands in sequence, for a single-command way of running, comparing, and
# post-processing the suites from a single reproducible config file.


import benchcomp.entry.collate
import benchcomp.entry.run


def main(args):
    run_result = benchcomp.entry.run.main(args)

    args.suites_dir = run_result.out_prefix / run_result.out_symlink
    results = benchcomp.entry.collate.main(args)