The goal of our project is to develop optimized operators from the ONNX standard for the Snitch architecture.
- Either:
  - docker/podman
- or ("local"):
  - bender
  - verilator
  - RISC-V compiler
  - dependencies of `snitch/python-requirements.txt` and `snitch/apt-requirements.txt`
  - cmake and gcc
Execute in the project root directory:

```sh
git submodule update --init
```
After installing everything you can source `env.sh` inside the project root directory:

```sh
source ./scripts/env.sh
```
This sets the following variables and aliases:

- `proot` to change the directory to the project root
- `dbuild` to build for banshee using docker
- `pbuild` to build for banshee using podman
- `build` to build for banshee locally
- `dbuild_size SIZE` to build for banshee using docker with given input size for the benchmark
- `pbuild_size SIZE` to build for banshee using podman with given input size for the benchmark
- `build_size SIZE` to build for banshee locally with given input size for the benchmark
- `build_sim` to build for the simulator locally
  - Run `clean` before switching from a banshee build and vice versa!
- `build_sim_size SIZE` to build for the simulator locally with given input size for the benchmark
  - Run `clean` before switching from a banshee build and vice versa!
- `clean` to remove all temporary build files
- `run` to run using banshee (which must be on your PATH)
- `bench` to run all benchmarks (all binaries inside `build/` that start with `benchmark_`)
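For example, a typical banshee-based session using these aliases might look like the following (the chosen size and benchmark selection are purely illustrative):

```sh
source ./scripts/env.sh   # define the aliases
proot                     # change to the project root
dbuild_size 40            # build for banshee via docker with input size 40
bench                     # run all build/benchmark_* binaries
```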
- Only needed if you are not using `banshee`.

Make sure you do not forget to source `env.sh`. The following only needs to be done once:

```sh
cd snitch/hw/system/snitch_cluster/
make bin/snitch_cluster.vlt
```
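Once the verilated model has been built, applications compiled with `build_sim` can typically be run by passing the binary to it directly; the exact paths below are assumptions and may differ in your setup:

```sh
# run an application on the verilated snitch_cluster model (paths are assumptions)
./snitch/hw/system/snitch_cluster/bin/snitch_cluster.vlt ./build/hello_world
```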
If you want to install the environment yourself and do not want to use docker, you can install the RISC-V toolchain locally. This allows you to use `build` and `build_size`.

First define `$RISCV` to point to the directory you want the toolchain installed to:

```sh
export RISCV=$HOME/.local/llvm-riscv
mkdir -p $RISCV
curl -Ls --progress-bar -o riscv-llvm.tar.gz https://sourceforge.net/projects/pulp-llvm-project/files/nightly/riscv32-pulp-llvm-ubuntu2004.tar.gz/download
tar -C $RISCV -xf riscv-llvm.tar.gz --strip-components=1
# create riscv32-* aliases for the riscv64-* tools (binaries are assumed to live in $RISCV/bin)
cd $RISCV/bin
for file in riscv64-*; do ln -s $file $(echo "$file" | sed 's/^riscv64/riscv32/g'); done
```
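Afterwards you will probably want the toolchain on your `PATH`; a quick sanity check could look like this (the `riscv32-` prefix is an assumption based on the symlinks created above):

```sh
export PATH=$RISCV/bin:$PATH
ls $RISCV/bin | grep '^riscv32-' | head   # verify the freshly linked tools exist
```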
To build banshee you have to execute the following commands:

```sh
cd ./snitch/sw/banshee
cargo install --path .
```
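`cargo install` places the resulting binary in `$HOME/.cargo/bin` by default, so make sure that directory is on your `PATH` (the `run` alias expects `banshee` to be found on the `PATH`):

```sh
export PATH="$HOME/.cargo/bin:$PATH"
which banshee   # should print the path of the freshly installed binary
```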
- First run `source ./scripts/env.sh` if you have not done so already
- Run `dbuild` for docker (or `build`)
- The build files should then be in `./build`

Now you can simulate the built applications with:

```sh
run ./build/hello_world
```
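Benchmarks are ordinary binaries whose names start with `benchmark_`, so you can either run one directly or use the `bench` alias (the `benchmark_abs` name below is an illustrative assumption):

```sh
run ./build/benchmark_abs   # run a single benchmark under banshee
bench                       # run every build/benchmark_* binary
```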
- SSR
  - abs, acos, acosh, add, argmax, asinh, batchnorm, conv, conv2d, copy, cumsum, div, dot, dropout, gemm, masked_dropout, max, maxpool, maxpool2d, relu, sigmoid, sin, sum, transpose, unique
- FREP
  - abs, add, batchnorm, conv, conv2d, copy, cumsum, div, dot, dropout, gemm, masked_dropout, max, maxpool, maxpool2d, relu, sigmoid, sin, sum, transpose
- Parallelised (w/o any helpers except barriers)
  - abs, add, argmax, conv, gemm, sin, sum
- OMP
  - abs, add, gemm, sin, sum
Note: To use the python-benchmarker, you need banshee and python3 set up.
First install the python dependencies found in `plots/requirements.txt`. We recommend doing this in a virtual environment. To set up the virtual python environment, run the following commands from the project root:

```sh
python3 -m venv plots/.venv
source ./plots/.venv/bin/activate
pip install -r plots/requirements.txt
```
To run the benchmarks for a specific operator (e.g. the abs-operator), you can execute the following:

```sh
python3 plots/scraper.py -include abs
```

You can also specify up to which size we double (we start at 10 and double each run). For example:

```sh
python3 plots/scraper.py -include abs -builder 'dbuild_size 40'
```

- This will build and run the sizes 10, 20, and 40.

This script builds the project using docker (or any other build command from above, e.g. `-builder 'pbuild_size XYZ'`), runs the benchmark using banshee, and stores the measurements in a file for later use. Note that this might take a couple of minutes depending on the operator.
To view a runtime plot of the abs-operator which you have just benchmarked, run:

```sh
python3 plots/runtime_plot.py -include abs
```

If you want to exclude a plot line from the runtime plot, use the `-exclude` flag. For example:

```sh
python3 plots/runtime_plot.py -include abs -exclude frep
```

If you want to generate speedup plots for the different implementations, you can run:

```sh
python3 plots/speedup_plot.py -include abs -exclude frep
```
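Putting the pieces together, an end-to-end benchmarking and plotting session using the commands introduced above could look like this:

```sh
source ./scripts/env.sh                # build aliases
source ./plots/.venv/bin/activate      # python environment from above
python3 plots/scraper.py -include abs -builder 'dbuild_size 40'
python3 plots/runtime_plot.py -include abs
python3 plots/speedup_plot.py -include abs
```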
- GCC inline assembly
- ONNX operators
- LLVM fork
- SSR paper
- Snitch paper
- Snitch getting started
- RISC-V registers
- RISC-V spec
- Register saves
- plot some theoretical lower bounds
- calculate theoretical runtime
- parallelize
  - conv2d, maxpool, maxpool2d, transpose, max, unique
- colors/plotstyle consistent
- comparison of theoretical runtime of problem relative to runtime on x86
- look into bug of multiple SSR configurations
  - previous SSR configuration for writing was still active
- TEST core count configuration of SIMULATOR
- Use multiple cores
  - argmax
  - conv parallel
- fix batchnorm benchmark
  - manually created as SSR is configured three times and this causes problems
- bar plot
- barrier wait indefinitely; reduce has "undefined symbol: __kmpc_reduce_nowait" compile error
- SSR+FREP
  - abs, acos (no frep), acosh (no frep), add, argmax (no frep), asinh (no frep), batchnorm, copy, cumsum, div, dot, dropout, gemm, masked_dropout, max, maxpool, relu, sigmoid, sin, sum, transpose
- Parallel
  - abs, add, copy, sin (no frep), sum, gemm
- OMP:
  - abs, add, copy, sin (no frep), sum (broken due to SSR 'leaking' or wrong impl.), gemm
- Use memory start pointer instead of l1?
  - Use start pointer; implemented in lmq.c
  - L1 has a better latency than memory
- should we do it for all datatypes (uint8, uint16, float32, ...)?
  - for `float` for now
- ask if we need to accept n-dimensional input
  - Answer: Use vectors where possible
- Build for the simulator using a compile flag?
  - use `-DCLUSTER_SIM=1` when calling `cmake ..`, or simply use the alias `build_sim` (see the sketch after these notes)
- Have a command to output assembly
  - can be done if compiled as executable. sufficient?
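Regarding the `-DCLUSTER_SIM=1` note above, a manual simulator-targeted configure step could look roughly like this (the out-of-source `build` directory and the Makefile generator are assumptions; the `build_sim` alias wraps the same idea):

```sh
mkdir -p build && cd build
cmake -DCLUSTER_SIM=1 ..   # configure for the simulator instead of banshee
make                       # build the operators and benchmarks
```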
- Set up building
- Find project and TA
- Created repo
- Created scraper scripts that run benchmarks and dump their runtimes into JSON files
- Created scripts for automatically generating runtime plots