Consideration for decode array to tuple by default option. #30

Closed
goodboy opened this issue Jun 11, 2021 · 3 comments

Comments

goodboy commented Jun 11, 2021

msgpack-python's unpacker has a use_list=False option that decodes arrays to tuples instead of lists.

I noticed in the docs that tuples are only used for array types when they appear as hashable keys.

Is there a reason msgspec doesn't either offer tuple-as-default through a manual flag or just decode to tuple by default, considering tuples are ostensibly more performant in Python than lists?

@goodboy goodboy changed the title Cosideration for decode array to tuple by default option. Consideration for decode array to tuple by default option. Jun 11, 2021
goodboy added a commit to goodboy/tractor that referenced this issue Jun 11, 2021
Add a `tractor._ipc.MsgspecStream` type which can be swapped in for
`msgspec` serialization transparently. A small msg-length-prefix framing
is implemented as part of the type and we use
`tricycle.BufferedReceiveStream` to handle buffering logic for the
underlying transport.

Notes:
- had to force cast a few more list -> tuple spots due to no native
  `tuple` decode-by-default in `msgspec`: jcrist/msgspec#30
- the framing can be understood by this protobuf walkthrough:
  https://eli.thegreenplace.net/2011/08/02/length-prefix-framing-for-protocol-buffers
- `tricycle` becomes a new dependency
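
For readers unfamiliar with the length-prefix framing mentioned in the commit message, here is a minimal sketch of the general idea using only the standard library. The 4-byte big-endian header and the `frame`/`unframe` helper names are assumptions made for illustration; the commit message does not say what header layout `tractor` actually uses.

```python
import struct
from typing import List, Tuple

# Assumption for illustration: a 4-byte big-endian unsigned length header.
HEADER = struct.Struct(">I")


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length so the receiver knows where it ends."""
    return HEADER.pack(len(payload)) + payload


def unframe(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Split a buffer into complete messages plus any leftover partial bytes."""
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # the last message has not fully arrived yet
        messages.append(buffer[offset + HEADER.size:end])
        offset = end
    return messages, buffer[offset:]


# Two complete messages followed by the first 6 bytes of a third one.
stream = frame(b"hello") + frame(b"world!") + frame(b"partial")[:6]
messages, rest = unframe(stream)
print(messages)  # [b'hello', b'world!']
print(rest)      # b'\x00\x00\x00\x07pa'
```

The leftover bytes are what a buffered receive stream would hold on to until more data arrives from the transport.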
Owner

jcrist commented Jun 27, 2021

Apologies for the delay here. Tuples aren't more performant than lists to create or use. If you read through the answers in the link above, you'd see that only constant tuples (e.g. (1, 2, 3)) are "faster" since they're built only once by the compiler. Both lists and tuples have similar representations in CPython, and take equivalent time to construct dynamically. A quick benchmark using msgspec:

In [5]: data = list(range(1000))

In [6]: dec_list = msgspec.Decoder(list)

In [7]: dec_tuple = msgspec.Decoder(tuple)

In [8]: buf = msgspec.encode(data)

In [9]: %timeit dec_list.decode(buf)
12.9 µs ± 20.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [10]: %timeit dec_tuple.decode(buf)
12.9 µs ± 17.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
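
The point about constant tuples can also be checked without msgspec. The sketch below relies on CPython implementation details (the `co_consts` attribute and constant folding), so treat it as an illustration rather than a language guarantee. The function names are made up for this example.

```python
def constant_tuple():
    # A literal of constants: CPython stores it once in the code object.
    return (1, 2, 3)


def dynamic_tuple(n):
    # Built at call time, so every call allocates a new tuple.
    return tuple(range(n))


# CPython detail: the literal lives among the function's constants...
print((1, 2, 3) in constant_tuple.__code__.co_consts)

# ...so repeated calls hand back the very same object.
print(constant_tuple() is constant_tuple())

# A dynamically built tuple is a fresh allocation each time, just like a list.
print(dynamic_tuple(3) is dynamic_tuple(3))
```

A decoder always falls into the second case, since the contents are only known at run time, which is why the constant-tuple advantage does not carry over to decoding.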

I'm not enthused about adding a use_list-like option. Lists are the natural default type for MessagePack's array type. If you want to use a different type for arrays then you likely have a schema you're following and I'd direct you to use msgspec's support for typed serialization.

Author

goodboy commented Jun 28, 2021

@jcrist learn something new every time I report something here 🏄🏼

You'd think I would have double-checked the tuple creation speed claim 🙄
Is it possible the .encode() step here is faster though?

Honestly, keeping it as-is works for me, as simpler is always better imo.
I can close this if no one else is going to have quiffs.

> and I'd direct you to use msgspec's support for typed serialization.

Yeah, I think focusing on a struct schema is really the right way to design things anyway 👍🏼

Owner

jcrist commented Jun 28, 2021

No problem, happy to help.

> Is it possible the .encode() step here is faster though?

Both store their data as an array of PyObject*, so I wouldn't expect a difference. Easy enough to benchmark though:

In [1]: import msgspec

In [2]: enc = msgspec.Encoder()

In [3]: msg_tuple = tuple(range(1000))

In [4]: msg_list = list(range(1000))

In [5]: %timeit enc.encode(msg_tuple)
10.3 µs ± 26.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [6]: %timeit enc.encode(msg_list)
10.3 µs ± 20.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
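
One rough way to see the "array of PyObject*" layout from Python is to measure how much a container grows when one element is added. This is a sketch that depends on CPython's `sys.getsizeof` accounting, and the `bytes_per_element` helper is a name invented here.

```python
import struct
import sys

# Size of a C pointer on this platform (8 on typical 64-bit builds).
POINTER_SIZE = struct.calcsize("P")


def bytes_per_element(factory, n=1000):
    """Growth in container size when going from n to n + 1 elements."""
    return sys.getsizeof(factory(n + 1)) - sys.getsizeof(factory(n))


print("pointer size:", POINTER_SIZE)
print("tuple growth:", bytes_per_element(lambda n: (None,) * n))
print("list growth: ", bytes_per_element(lambda n: [None] * n))
```

If both growth figures match the pointer size, each container is storing one pointer per element, which is consistent with an encoder walking either one at the same cost.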

> I can close this if no one else is going to have quiffs.

Closing!

@jcrist jcrist closed this as completed Jun 28, 2021
goodboy added a commit to goodboy/tractor that referenced this issue Sep 5, 2021
`msgspec` sends Python lists over the wire
(jcrist/msgspec#30), which is fine and dandy,
but we use them as lookup keys, so we need to be sure we tuple-cast
first.
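
The tuple-cast described in this commit message is needed because lists are unhashable and so cannot be used as `dict` keys. A minimal sketch follows. The `to_hashable` helper, the `registry` mapping and the address values are hypothetical and not taken from `tractor`.

```python
def to_hashable(value):
    """Recursively convert lists (as produced by a decoder) into tuples."""
    if isinstance(value, (list, tuple)):
        return tuple(to_hashable(item) for item in value)
    return value


# Hypothetical lookup table keyed by an (address, port) pair.
registry = {("127.0.0.1", 1616): "root-actor"}

# What an untyped decode hands back for an array: a list.
decoded_key = ["127.0.0.1", 1616]

try:
    registry[decoded_key]
except TypeError as err:
    print(f"lookup failed: {err}")

print(registry[to_hashable(decoded_key)])  # root-actor
```

Decoding against a typed schema, as suggested earlier in the thread, would avoid the need for this manual cast.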