Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Binary serialisation format for hugr-model based on capnproto. #1557

Merged
merged 7 commits into from
Oct 11, 2024

Conversation

zrho
Copy link
Contributor

@zrho zrho commented Oct 7, 2024

A binary serialisation format for hugr-model based on capnproto.

  • Export from hugr-core to hugr-model now deduplicates terms on the fly.
  • Removed types on ports and instead used the signature. This allows for more deduplication and also provides a natural place for extension set annotations which was missing before.
  • Added the NodeId of the node that defines a parameter to the indexed variant of a LocalRef so that deduplication does not accidentally merge two local variables that refer to different parameters.

Serializing the same 2000 2-qubit gate + 1000 1-qubit gate circuit by going from core via the model takes 5.5682 ms per roundtrip on Apple M3. Meanwhile the JSON roundtrip baseline is 10.760 ms. A significant portion of the time is spent on needless copies of signatures since export and import have to interact with the hugr-core API (#1551), so there is still a lot of potential to make this faster yet.

Copy link

codecov bot commented Oct 8, 2024

Codecov Report

Attention: Patch coverage is 75.13228% with 235 lines in your changes missing coverage. Please review.

Project coverage is 85.68%. Comparing base (307937e) to head (9781b14).
Report is 9 commits behind head on main.

Files with missing lines Patch % Lines
hugr-model/src/v0/binary/read.rs 69.49% 36 Missing and 43 partials ⚠️
hugr-core/src/import.rs 73.11% 16 Missing and 34 partials ⚠️
hugr-core/src/export.rs 85.18% 33 Missing and 3 partials ⚠️
hugr-model/src/v0/binary/write.rs 79.35% 31 Missing and 1 partial ⚠️
hugr-model/src/v0/text/print.rs 57.44% 5 Missing and 15 partials ⚠️
hugr-model/src/v0/text/parse.rs 65.78% 2 Missing and 11 partials ⚠️
hugr-model/src/v0/mod.rs 70.58% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1557      +/-   ##
==========================================
- Coverage   85.72%   85.68%   -0.04%     
==========================================
  Files         132      134       +2     
  Lines       24017    24512     +495     
  Branches    21017    21512     +495     
==========================================
+ Hits        20588    21003     +415     
- Misses       2365     2411      +46     
- Partials     1064     1098      +34     
Flag Coverage Δ
python 92.46% <ø> (ø)
rust 84.73% <75.13%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@zrho
Copy link
Contributor Author

zrho commented Oct 8, 2024

The filesize for the capnp serialization is also better than the JSON baseline: the 2000 2-qubit gate + 1000 1-qubit gate circuit serializes to 295kb with binary serialization in contrast to 695K for JSON. This is not just due to being a binary format, but also due to term deduplication. Note that the JSON serialization also does a form of this by serializing the prelude.qubit type simply as "Q". But whereas this only works for qubits for the JSON serializer, the term deduplication performed by the export to the model works for all types.

The file size can be driven down further and the speed increased once we have custom operation declarations: Then nodes that use a custom operation don't have to repeat the operation's full name every time, but can simply refer to the declaring node by index.

This form of deduplication is not only useful to decrease file size; that could similarly be achieved by just compressing the output (as was suggested here: #1307). When reading a deduplicated file back in, we can use the sharing to also get an efficient in-memory representation, and to avoid redundantly validating the same term over and over. At this point, we do not make good use of this yet since the hugr-core data structures do not allow any sharing.

@zrho zrho marked this pull request as ready for review October 8, 2024 12:43
@zrho zrho requested a review from a team as a code owner October 8, 2024 12:43
@zrho zrho requested review from croyzor and ss2165 October 8, 2024 12:43

struct Operation {
union {
invalid @0 :Void;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the invalid for?

Copy link
Contributor Author

@zrho zrho Oct 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the docs on Operation::Invalid in the Rust code. It serves two purposes:

  • being able to represent modules whose nodes have non-contiguous indices without needing to compact
  • a placeholder operation for when you have partially constructed a module. This was useful to break a cycle in export after adding the defining node index in LocalRef.

Copy link
Member

@ss2165 ss2165 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, minor queries

hugr/Cargo.toml Show resolved Hide resolved
hugr/Cargo.toml Show resolved Hide resolved
hugr-model/capnp/hugr-v0.capnp Outdated Show resolved Hide resolved
hugr-model/src/v0/binary/read.rs Show resolved Hide resolved
hugr-model/src/v0/mod.rs Show resolved Hide resolved
hugr-core/src/export.rs Show resolved Hide resolved
hugr-core/src/export.rs Show resolved Hide resolved
hugr-core/src/export.rs Outdated Show resolved Hide resolved
hugr-core/Cargo.toml Outdated Show resolved Hide resolved
Copy link
Member

@ss2165 ss2165 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@zrho zrho added this pull request to the merge queue Oct 11, 2024
Merged via the queue into main with commit d03b91e Oct 11, 2024
20 of 21 checks passed
@zrho zrho deleted the zrho/model-binary branch October 11, 2024 11:26
This was referenced Oct 11, 2024
github-merge-queue bot pushed a commit that referenced this pull request Oct 14, 2024
## 🤖 New release
* `hugr`: 0.13.0 -> 0.13.1 (✓ API compatible changes)
* `hugr-core`: 0.10.0 -> 0.13.1 (✓ API compatible changes)
* `hugr-model`: 0.1.0 -> 0.13.1 (✓ API compatible changes)
* `hugr-passes`: 0.8.2 -> 0.13.1 (✓ API compatible changes)
* `hugr-cli`: 0.6.1 -> 0.13.1

<details><summary><i><b>Changelog</b></i></summary><p>

## `hugr`
<blockquote>

##
[0.13.1](hugr-v0.13.0...hugr-v0.13.1)
- 2024-10-14

### New Features

- return replaced ops from lowering
([#1568](#1568))
- Make `BuildHandle::num_value_outputs` public
([#1560](#1560))
- `FunctionBuilder::add_{in,out}put`
([#1570](#1570))
- Binary serialisation format for hugr-model based on capnproto.
([#1557](#1557))
</blockquote>

## `hugr-core`
<blockquote>

##
[0.13.1](hugr-core-v0.10.0...hugr-core-v0.13.1)
- 2024-10-14

### New Features

- Make `BuildHandle::num_value_outputs` public
([#1560](#1560))
- Binary serialisation format for hugr-model based on capnproto.
([#1557](#1557))
- `FunctionBuilder::add_{in,out}put`
([#1570](#1570))
</blockquote>

## `hugr-model`
<blockquote>

##
[0.13.1](hugr-model-v0.1.0...hugr-model-v0.13.1)
- 2024-10-14

### New Features

- Binary serialisation format for hugr-model based on capnproto.
([#1557](#1557))
</blockquote>

## `hugr-passes`
<blockquote>

##
[0.13.1](hugr-passes-v0.8.2...hugr-passes-v0.13.1)
- 2024-10-14

### New Features

- return replaced ops from lowering
([#1568](#1568))
</blockquote>

## `hugr-cli`
<blockquote>

## 0.6.0 (2024-09-04)

### Features

- [**breaking**] Allow registry specification in `run_dump`
([#1501](#1501))
- [**breaking**] Add `Package::validate` and return `ExtensionRegistry`
in helpers. ([#1507](#1507))
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/MarcoIeni/release-plz/).

---------

Co-authored-by: Agustín Borgna <[email protected]>
Co-authored-by: Agustín Borgna <[email protected]>
Co-authored-by: Alec Edgington <[email protected]>
This was referenced Oct 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants