This repository has been archived by the owner on Feb 18, 2024. It is now read-only.
Commit
1 parent fa11752, commit dce372d. Showing 2 changed files with 81 additions and 58 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
# Developer guide to this crate | ||
|
||
This crate follows the standard for developing a Rust library via `cargo`. | ||
The CI is our "ground truth" for the state of the library. Check out the different parts of | ||
the CI to understand how to test the different parts of this library locally. | ||
|
||
## Merging | ||
|
||
We currently do not have maintenance versions and thus only open PRs against, and merge to, `main`. | ||
|
||
We use labels to build a changelog - it is very important to label issues and/or PRs | ||
accordingly. Because our changelog can contain both issues and PRs, when a PR closes | ||
an issue, we favor having the PR on the changelog, since it includes a reference to | ||
the author (credits). | ||
|
||
Summary: | ||
* Pull requests with both backward-incompatible changes and new | ||
features/enhancements MUST close at least one issue (the one | ||
documenting the backward-incompatible change) | ||
* Every other pull request MAY close one issue | ||
|
||
Issues are only used to document situations in which a single PR adds two entries to | ||
the changelog (e.g. a backward-incompatible change plus a new enhancement). | ||
|
||
Merging a PR to main has the following checklist: | ||
|
||
1. Does it close an issue? If yes, add the label `no-changelog` to the issue. | ||
2. Label the PR accordingly (`Testing`, `Documentation`, `Enchancement`, `Feature`, `Bug`) | ||
3. If the PR is backward incompatible: | ||
1. create a new issue labeled `backward-incompatible` with what changed and how to migrate | ||
from the old API to the new API | ||
2. Edit the PR's description with `Closes #...` | ||
4. Adjust the PR's title with a description suitable for the changelog | ||
* In the past tense | ||
* backtick code names (e.g. `MutableArray`, not MutableArray) | ||
* place "(-X%)" or "(Yx)" at the end if it is a performance improvement | ||
5. Press merge (squash merge) and adjust any items added by GitHub to your liking | ||
|
||
This will reduce the burden of a release, where we go through every item from the | ||
changelog and adjust titles, labels, etc. | ||
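
The title conventions in step 4 can be partly checked mechanically. The following is a minimal, std-only sketch, not tooling that exists in this repository: `Perf`, `perf_suffix` and `backticks_balanced` are hypothetical names, and the example title is made up. It extracts a trailing "(-X%)" or "(Yx)" marker and checks that backticks come in pairs; the past-tense rule is left to the reviewer.

```rust
/// Performance annotation at the end of a changelog title (hypothetical type).
#[derive(Debug, PartialEq)]
enum Perf {
    /// "(-X%)": run time reduced by X percent.
    PercentFaster(f64),
    /// "(Yx)": Y times faster.
    TimesFaster(f64),
}

/// Extracts a trailing "(-X%)" or "(Yx)" marker from a PR title, if present.
fn perf_suffix(title: &str) -> Option<Perf> {
    let title = title.trim_end();
    // Look only at the last parenthesized group, which must end the title.
    let open = title.rfind('(')?;
    let inner = title[open + 1..].strip_suffix(')')?;
    if let Some(pct) = inner.strip_prefix('-').and_then(|s| s.strip_suffix('%')) {
        pct.parse::<f64>().ok().map(Perf::PercentFaster)
    } else if let Some(times) = inner.strip_suffix('x') {
        times.parse::<f64>().ok().map(Perf::TimesFaster)
    } else {
        None
    }
}

/// True when backticks come in pairs, i.e. every opened code name is closed.
fn backticks_balanced(title: &str) -> bool {
    title.matches('`').count() % 2 == 0
}

fn main() {
    // Made-up title, used only for illustration.
    let title = "Made `MutableArray` faster (-15%)";
    println!("perf marker: {:?}", perf_suffix(title));
    println!("backticks balanced: {}", backticks_balanced(title));
}
```

A check like this could run before pressing merge, but it only covers the mechanical parts of the checklist; labels and issue links still need a human.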
|
||
<!--- | ||
To be completed. | ||
## Releases | ||
Releasing this library is done by the following steps: | ||
1. Identify or create the commit to release | ||
2. Identify the version to apply to it | ||
3. Create a changelog (see below) | ||
4. Verify that the version is consistent with the changelog | ||
5. Bump the version accordingly | ||
6. Commit the bump and changelog | ||
7. Tag the commit | ||
8. publish to [crates.io](https://crates.io) | ||
## 1. Identify or create commit to release | ||
If from the main branch, it is usually a minor release | ||
## How to generate the changelog | ||
```bash | ||
docker run -it --rm -v "$(pwd)":/usr/local/src/your-app githubchangeloggenerator/github-changelog-generator --user jorgecarleitao --project arrow2 --token TOKEN | ||
``` | ||
## How to publish | ||
```bash | ||
cargo publish --features full | ||
``` | ||
--> |
|
@@ -27,37 +27,30 @@ As such, it | |
|
||
Design documents about each of the parts of this repo are available on their respective READMEs. | ||
|
||
Check [DEVELOPMENT.md](DEVELOPMENT.md) for our development practices. | ||
|
||
## Tests | ||
|
||
The test suite is a _superset_ of all tests that the original implementation has against golden files from the arrow project. It includes both little and big endian files. | ||
|
||
Furthermore, the CI runs all integration tests against [apache/arrow@master](https://github.com/apache/arrow), demonstrating full interoperability with other implementations. | ||
|
||
Integration tests against parquet files created by pyarrow require generating parquet files. These tests run by default. | ||
To not run them, pass `ARROW2_IGNORE_PARQUET` to the tests (the tests will be marked as OK/PASS). | ||
|
||
```bash | ||
git clone git@github.com:jorgecarleitao/arrow2.git | ||
cd arrow2 | ||
git submodule update --init | ||
ARROW2_IGNORE_PARQUET= cargo test | ||
``` | ||
Finally, we have integration tests against `parquet` generated by `pyarrow` under different | ||
configurations, as well as integration tests against `pyspark` demonstrating compatibility with | ||
its `parquet` reader. | ||
|
||
To generate the necessary parquet files, run | ||
## Versioning | ||
|
||
```bash | ||
python3 -m venv venv | ||
venv/bin/pip install pyarrow==3 | ||
venv/bin/python parquet_integration/write_parquet.py | ||
``` | ||
We use SemVer 2.0, as used by Cargo and the rest of the Rust ecosystem. | ||
We also use `0.x.y` versioning, since we are still iterating on the API. | ||
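
To illustrate what `0.x.y` versioning implies for users: under Cargo's default (caret) requirements, the left-most non-zero component must match, so for a pre-1.0 crate a bump of `x` is treated as breaking and a bump of `y` as compatible. The sketch below is std-only and simplified (no pre-release or build metadata); `Version` and `caret_compatible` are hypothetical names, not part of this crate or of Cargo.

```rust
/// A minimal `major.minor.patch` version (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

impl Version {
    /// Parses "x.y.z"; returns `None` for any other shape.
    fn parse(s: &str) -> Option<Version> {
        let mut parts = s.split('.');
        let major: u64 = parts.next()?.parse().ok()?;
        let minor: u64 = parts.next()?.parse().ok()?;
        let patch: u64 = parts.next()?.parse().ok()?;
        if parts.next().is_some() {
            return None;
        }
        Some(Version { major, minor, patch })
    }
}

/// Whether `candidate` satisfies the caret requirement `^required`:
/// it must not be older, and the left-most non-zero component must match.
fn caret_compatible(required: Version, candidate: Version) -> bool {
    if candidate < required {
        return false;
    }
    if required.major > 0 {
        candidate.major == required.major
    } else if required.minor > 0 {
        candidate.major == 0 && candidate.minor == required.minor
    } else {
        candidate.major == 0 && candidate.minor == 0 && candidate.patch == required.patch
    }
}

fn main() {
    let required = Version::parse("0.4.1").unwrap();
    for s in &["0.4.0", "0.4.7", "0.5.0", "1.0.0"] {
        let candidate = Version::parse(s).unwrap();
        println!("^0.4.1 accepts {}: {}", s, caret_compatible(required, candidate));
    }
}
```

In practice this means a dependency declared as `0.4.1` picks up `0.4.7` automatically but never `0.5.0`, which is why API changes while iterating are released as bumps of `x`.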
|
||
## Features in this crate and not in the official | ||
|
||
### Safety and Security | ||
|
||
* safe by design (i.e. no transmutes, runtime type checking nor pointer casts) | ||
* Uses Rust's compiler whenever possible to prove that memory reads are sound | ||
* All non-IO components pass MIRI checks (MIRI and file systems are a bit funny atm) | ||
* All non-IO components pass MIRI checks (MIRI can't open files atm) | ||
|
||
### Arrow Format | ||
|
||
|
@@ -95,36 +88,6 @@ venv/bin/python parquet_integration/write_parquet.py | |
|
||
Too many to enumerate; e.g. nested dictionary arrays, map, nested parquet. | ||
|
||
## How to develop | ||
|
||
This is a normal Rust project. Clone and run tests with `cargo test`. | ||
|
||
### Tips for coverage reporting | ||
|
||
On a Linux machine, with VS-code: | ||
|
||
* install the extension `coverage gutters` | ||
* install `tarpaulin` | ||
* run the command `Coverage Gutters: Watch` and run | ||
|
||
```bash | ||
cargo tarpaulin --target-dir target-tarpaulin --lib --out Lcov | ||
``` | ||
|
||
This will cause tarpaulin to run all tests under coverage and show the coverage on VS-code. | ||
`--target-dir target-tarpaulin` is used to avoid collisions with rust-analyzer / Cargo, as `tarpaulin` | ||
uses different compilation flags. | ||
|
||
### How to improve coverage | ||
|
||
An overall goal of this repo is to have high coverage over all its code base. To achieve this, we recommend to run coverage against a single module of this project; e.g. | ||
|
||
```bash | ||
cargo tarpaulin --target-dir target-tarpaulin --lib --out Lcov -- buffer::immutable | ||
``` | ||
|
||
and evaluate coverage of that module alone. | ||
|
||
## FAQ | ||
|
||
### Why? | ||
|
@@ -220,15 +183,3 @@ at your option. | |
### Contribution | ||
|
||
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions. | ||
|
||
## How to generate the changelog | ||
|
||
```bash | ||
docker run -it --rm -v "$(pwd)":/usr/local/src/your-app githubchangeloggenerator/github-changelog-generator --user jorgecarleitao --project arrow2 --token TOKEN | ||
``` | ||
|
||
## How to publish | ||
|
||
```bash | ||
cargo publish --features full | ||
``` |