Skip to content

Latest commit

 

History

History
133 lines (96 loc) · 8.89 KB

Development.md

File metadata and controls

133 lines (96 loc) · 8.89 KB

Getting Started

In order to build and test lexical, only a modern Rust toolchain (1.65+) (1.61.0 to build, 1.65.0 to test) is required. However, for reasons described below, we highly recommend you install a recent (1.55+) nightly toolchain.

cargo +nightly build
cargo +nightly test

Code Structure

Lexical is broken up into compact, relatively isolated workspaces to separate functionality based on the numeric conversion, minimizing compile times and simplifying testing feature-dependent code. The workspaces are:

Functionality is generally made public to separate the tests from the implementation, although non-documented members is not stable, and any changes to this code is not considered a breaking change. Tests are separated from the actual implementation, and comprehensively test each individual component.

Furthermore, any unsafe code uses the following conventions:

  1. Each unsafe function must contain a # Safety section.
  2. Unsafe operations/calls in unsafe functions must be marked as unsafe, with their safety guarantees clearly documented via a // SAFETY: section.

Dependencies

In order to fully test and develop lexical, a recent, nightly compiler along with following Rust dependencies is required:

  • Clippy
  • Rustfmt
  • Miri
  • Valgrind
  • Tarpaulin
  • Fuzz
  • Count

These dependencies may be installed via:

rustup toolchain install nightly
rustup +nightly component add clippy
rustup +nightly component add rustfmt
rustup +nightly component add miri
cargo +nightly install cargo-valgrind
cargo +nightly install cargo-tarpaulin
cargo +nightly install cargo-fuzz

# This is only needed if checking the number of lines of unsafe code,
# which uses a deprecated binary that requires an old nightly to
# install.
cargo +nightly install cargo-count --git https://github.com/kbknapp/cargo-count --rev eebe6f8 --locked

# Only if editing doc comments. This requires a Linux or macOS install.
# On Ubuntu, the packages `libclang-dev` and `llvm` are required.
cargo install cargo-spellcheck

In addition, the following non-Rust dependencies must be installed:

  • Python3.6+
  • python-magic (python-magic-win64 on Windows)
  • Valgrind

In order to minimize build times, dependencies, and potential version conflicts, all tests with development dependencies are moved to a extras workspace. This allows testing from that directory without influencing the core crate. This is less than ideal but is required to simplify version and dependency management without affecting test coverage. Each keeps the identical workspace structure so we can perfectly replicate the original unittests.

Development Process

The scripts directory contains numerous scripts for testing, fuzzing, analyzing, and formatting code. Since many development features are nightly-only, this ensures the proper compiler features are used. This requires a recent version of a nightly compiler (1.65.0+) installed via Rustup, which can be invoked as cargo +nightly.

  • asm.sh: Emit assembly for numeric conversion routines, to identify performance regression.
  • bench.sh: Check the benchmarks compile and run.
  • check.sh: Check rustfmt and clippy without formatting any code.
  • fmt.sh: Run cargo fmt and cargo clippy in all projects and workspaces, on nightly.
  • fuzz.sh: Run fuzzer for a given target.
  • hooks.sh: Install formatting and lint hooks on commits.
  • link.sh: Rebuild all symbolic links.
  • size.py: Calculate lexical binary sizes.
  • test.sh: Run the test suite with Valgrind and Miri.
  • timings.py: Plot build times.
  • unsafe.sh: Count lines of code and metrics of unsafe code usage.

Please run fmt.sh before committing any code, ideally by installing the pre-commit hook via hooks.sh.

All PRs must pass the following checks:

# Check all safety sections and other features are properly documented.
RUSTFLAGS="--deny warnings" cargo +nightly build --features=lint
# Ensure all rustfmt and clippy checks pass.
scripts/check.sh
# Ensure all tests pass with common feature combinations.
# Miri is too slow, so skip those tests for most commits.
SKIP_MIRI=1 scripts/test.sh

Safety

In order to ensure memory safety even when using unsafe features, we have the following requirements.

  • All code with local unsafety must be marked as an unsafe function.
  • All unsafe macros must have a # Safety section in the documentation.
  • All unsafe functions must have a # Safety section in the documentation.
  • All code using unsafe functionality must have a // SAFETY: section on the previous line, and must contain an unsafe block, even in unsafe functions.
  • If multiple lines have similar safety guarantees, a // SAFETY: section can be used for a block or small segment of code.

In order to very that the safety guarantees are met, any changes to unsafe code must be fuzzed, the test suite must be run with Valgrind, and must pass the following commands:

# Ensure `unsafe` blocks are used within `unsafe` functions.
RUSTFLAGS="--deny warnings" cargo +nightly build --features=lint
# Ensure clippy checks pass for `# Safety` sections.
cargo +nightly clippy --all-features -- --deny warnings

Algorithm Changes

Each workspace has a "docs" directory containing detailed descriptions of algorithms and benchmarks. If you make any substantial changes to an algorithm, you should both update the algorithm description and the provided benchmarks.

Pitfalls

ALWAYS benchmark, even for trivial changes. I've been burned many times by #[cfg(...)] being way faster than if cfg!(), which youl would think both would be eliminated during optimization, just one during the first stage of compilation. It's better to confirm than assume. This is a nightmare development-wise because of how many features we support but there's not many alternatives: it seems it doesn't entirely remove code as if by tree-shaking which can majorly impact performance.

Inlining Hints

There have been major performance regressions and improvements between Rust versions due to changes in inlining. We also, with the formatting API, have many layers of code that are all removed at compile-time, but only sometimes if function inlining occurs. Due to the extra complexity from this configurability, the functions seem more complex than they actually are and forcing inlining can have dramatic performance benefits. For example, in parse and parse_number, ~350 lines of code reduces to less than 150 when most of the formatting options are removed, and can be reduced even shorter yet. This complexity affects inlining elsewhere and the many layers of abstractions to make our formatting API as performant as possible and changes to how aggressively Rust inlines version to version make manually providing hints required for consistent version-to-version performance.

Documentation

If making significant changes to the documentation, running the spellchecker can be useful. Remember these are guidelines and anything inside libm.rs should be ignored. To check the spelling, run cargo spellcheck check.