v0.1.4
All dependencies related to hash function implementations are now optional, and are dynamically imported only when the hash functions using them are used for the first time. Running `pip install --upgrade multiformats` will not install any of them, but they can all be installed by running `pip install --upgrade multiformats[full]`. In particular, this closes #4.

Hash function implementations are loaded and registered transparently on first use, to reduce memory footprint and module loading times. Analogously, a number of multibases are created and registered transparently on first use.

All hash functions with a readily available, well-supported existing Python implementation are now supported.

Finally, closes #3.
sg495 committed Jul 21, 2022
1 parent b76a9ea commit 01046b8
Showing 41 changed files with 1,547 additions and 550 deletions.
6 changes: 4 additions & 2 deletions MULTIFORMATS-LICENSE → ADDITIONAL-LICENSES
@@ -1,8 +1,10 @@
The following items are subject to MIT License by Protocol Labs Inc:
The following items are subject to MIT License by Protocol Labs Inc, included below:

- multibase table, downloaded from https://github.com/multiformats/multibase/raw/master/multibase.csv
- multicodec table, downloaded from https://github.com/multiformats/multicodec/raw/master/table.csv
- test vectors for multihash, downloaded from https://github.com/multiformats/multihash/raw/master/tests/values/test_cases.csv on 14 Dec 2021
- the test vectors for multihash in multihash-test-str-vectors.csv, downloaded from https://github.com/multiformats/multihash/raw/master/tests/values/test_cases.csv on 14 Dec 2021

Test vectors for murmur3 hash are public domain, courtesy of Ian Boyd https://stackoverflow.com/questions/14747343/murmurhash3-test-vectors#31929528


The MIT License (MIT)
23 changes: 23 additions & 0 deletions README.rst
@@ -44,6 +44,27 @@ You can install the latest release from `PyPI <https://pypi.org/project/multifor
$ pip install --upgrade multiformats
The following are mandatory dependencies for this module:

- `typing-extensions <https://github.com/python/typing_extensions>`_, for backward compatibility of static typing.
- `typing-validation <https://github.com/hashberg-io/typing-validation>`_, for dynamic typechecking
- `bases <https://github.com/hashberg-io/bases>`_, for implementation of base encodings used by Multibase

The following are optional dependencies for this module:

- `pysha3 <https://github.com/tiran/pysha3>`_, for the ``keccak`` hash functions.
- `blake3 <https://github.com/oconnor663/blake3-py>`_, for the ``blake3`` hash function.
- `pyskein <https://pythonhosted.org/pyskein/>`_, for the ``skein`` hash functions.
- `mmh3 <https://github.com/hajimes/mmh3>`_, for the ``murmur3`` hash functions.
- `pycryptodomex <https://github.com/Legrandin/pycryptodome/>`_, for the ``ripemd-160`` hash function, \
the ``kangarootwelve`` hash function and the ``sha2-512-224``/``sha2-512-256`` hash functions.

You can install the latest release together with all optional dependencies as follows:

.. code-block:: console

    $ pip install --upgrade multiformats[full]

Usage
-----
@@ -311,3 +332,5 @@ License
-------

`MIT © Hashberg Ltd. <LICENSE>`_

See `additional Licenses <ADDITIONAL-LICENSES>`_ for licensing of the multicodec table, the multibase table and test vectors for multihashes.
3 changes: 2 additions & 1 deletion docs/api/multiformats.multihash.raw.rst
@@ -6,7 +6,8 @@ multiformats.multihash.raw
Hashfun
-------

.. autodata:: multiformats.multihash.raw.Hashfun
.. autoclass:: multiformats.multihash.raw.Hashfun
:members:

MultihashImpl
-------------
21 changes: 21 additions & 0 deletions docs/getting-started.rst
@@ -29,4 +29,25 @@ The above will import the following names:
The first five are modules implementing the homonymous specifications,
while :class:`~multiformats.cid.CID` is a class for Content IDentifiers.

The following are mandatory dependencies for this module:

- `typing-extensions <https://github.com/python/typing_extensions>`_, for backward compatibility of static typing.
- `typing-validation <https://github.com/hashberg-io/typing-validation>`_, for dynamic typechecking
- `bases <https://github.com/hashberg-io/bases>`_, for implementation of base encodings used by Multibase

The following are optional dependencies for this module:

- `pysha3 <https://github.com/tiran/pysha3>`_, for the ``keccak`` hash functions.
- `blake3 <https://github.com/oconnor663/blake3-py>`_, for the ``blake3`` hash function.
- `pyskein <https://pythonhosted.org/pyskein/>`_, for the ``skein`` hash functions.
- `mmh3 <https://github.com/hajimes/mmh3>`_, for the ``murmur3`` hash functions.
- `pycryptodomex <https://github.com/Legrandin/pycryptodome/>`_, for the ``ripemd-160`` hash function, \
the ``kangarootwelve`` hash function and the ``sha2-512-224``/``sha2-512-256`` hash functions.

You can install the latest release together with all optional dependencies as follows:

.. code-block:: console

    $ pip install --upgrade multiformats[full]

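
To see which of the optional dependencies listed above are present in an environment, something along these lines could be used. It is a minimal sketch: the mapping from PyPI distribution names to importable module names (`sha3`, `skein`, `Cryptodome` and so on) is an assumption based on how those packages are usually imported, and `optional_dependency_status` is a hypothetical helper, not part of ``multiformats``.

```python
import importlib.util
from typing import Dict

# Assumed mapping: PyPI distribution name -> top-level module name it installs.
OPTIONAL_DEPENDENCIES: Dict[str, str] = {
    "pysha3": "sha3",
    "blake3": "blake3",
    "pyskein": "skein",
    "mmh3": "mmh3",
    "pycryptodomex": "Cryptodome",
}

def optional_dependency_status() -> Dict[str, bool]:
    """Report, without importing them, which optional packages can be found."""
    return {
        dist: importlib.util.find_spec(module) is not None
        for dist, module in OPTIONAL_DEPENDENCIES.items()
    }

if __name__ == "__main__":
    for dist, installed in optional_dependency_status().items():
        print(f"{dist}: {'installed' if installed else 'missing'}")
```
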
GitHub repo: https://github.com/hashberg-io/multiformats
8 changes: 5 additions & 3 deletions docs/make-api.json
@@ -11,10 +11,12 @@
"exclude_members": {
"multiformats.multicodec": ["build_multicodec_tables"],
"multiformats.multibase": ["build_multibase_tables"],
"multiformats.multibase.raw": ["identity_raw_encoder", "identity_raw_decoder", "proquint_raw_encoder", "proquint_raw_decoder", "RawEncoder", "RawDecoder"],
"multiformats.multibase.raw": ["RawEncoder", "RawDecoder"],
"multiformats.cid": ["CIDVersionNumbers", "byteslike"],
"multiformats.multiaddr.raw": ["ip4_encoder", "ip4_decoder", "ip6_encoder", "ip6_decoder", "tcp_udp_encoder", "tcp_udp_decoder"]
},
"include_modules": [],
"exclude_modules": []
}
"exclude_modules": [
"multiformats.multihash._hashfuns"
]
}
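
Judging by the change to `docs/make-api.py` in this commit, entries of `exclude_modules` are matched as name prefixes rather than exact names, so submodules of an excluded module are skipped as well. A minimal sketch of that matching follows; `partition_modules` is a hypothetical helper and the module names in the usage lines are illustrative only.

```python
from typing import List, Tuple

def partition_modules(
    module_names: List[str], exclude_modules: List[str]
) -> Tuple[List[str], List[str]]:
    """Split module names into (kept, excluded) using prefix matching."""
    kept: List[str] = []
    excluded: List[str] = []
    for name in module_names:
        # A module is excluded if its dotted name starts with any excluded prefix.
        if any(name.startswith(prefix) for prefix in exclude_modules):
            excluded.append(name)
        else:
            kept.append(name)
    return kept, excluded

# Illustrative names; "multiformats.multihash._hashfuns.sha" is hypothetical.
kept, excluded = partition_modules(
    [
        "multiformats.multihash",
        "multiformats.multihash._hashfuns",
        "multiformats.multihash._hashfuns.sha",
    ],
    ["multiformats.multihash._hashfuns"],
)
```

Note that plain prefix matching also excludes any sibling module whose name merely begins with the same characters, which is harmless here but worth keeping in mind when adding entries.
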
10 changes: 9 additions & 1 deletion docs/make-api.py
@@ -87,8 +87,12 @@ def make_apidocs() -> None:
os.remove(apidoc_file)
print()

mod_name_to_del: List[str] = []

for mod_name, mod in modules_dict.items():
if mod_name in exclude_modules:
if any(mod_name.startswith(name) for name in exclude_modules):
# if mod_name in exclude_modules:
mod_name_to_del.append(mod_name)
continue
filename = f"{apidocs_folder}/{mod_name}.rst"
print(f"Writing API docfile {filename}")
@@ -164,6 +168,10 @@ def make_apidocs() -> None:
f.write("\n".join(lines))
print("")


for mod_name in mod_name_to_del:
del modules_dict[mod_name]

toctable_lines = [
".. toctree::",
" :maxdepth: 2",
1 change: 0 additions & 1 deletion docs/requirements.txt
@@ -4,4 +4,3 @@ sphinx_autodoc_typehints
bases
typing-extensions
typing-validation
pyskein
2 changes: 1 addition & 1 deletion multiformats/__init__.py
@@ -15,7 +15,7 @@
while :class:`~multiformats.cid.CID` is a class for Content IDentifiers.
"""

__version__ = "0.1.3"
__version__ = "0.1.4"

from . import varint
from . import multicodec
3 changes: 1 addition & 2 deletions multiformats/multiaddr/err.py
@@ -6,8 +6,7 @@

class MultiaddrKeyError(builtins.KeyError): # pylint: disable = redefined-builtin
""" Class for :mod:`~multiformats.multiaddr` key errors. """
...


class MultiaddrValueError(builtins.ValueError): # pylint: disable = redefined-builtin
""" Class for :mod:`~multiformats.multiaddr` value errors. """
...
18 changes: 10 additions & 8 deletions multiformats/multibase/__init__.py
@@ -39,7 +39,7 @@ class Multibase:
:param name: the multibase name
:type name: :obj:`str`
:param code: the multibase code, as single-char string or ``0xYZ`` hex-string of a byte
:param code: the multibase code, as single-char string or ``0x...`` hex-string of a non-empty bytestring
:type code: :obj:`str`
:param status: the multibase status
:type status: ``'draft'``, ``'candidate'`` or ``'default'``, *optional*
@@ -91,20 +91,20 @@ def validate_code(code: str) -> str:
MultibaseValueError: Multibase codes must be single-character strings
or the hex digits '0xYZ' of a single byte.
:param code: the multibase code, as single character or ``0xYZ`` hex-string of a single byte
:param code: the multibase code, as single character or ``0x...`` hex-string of a non-empty bytestring
:type code: :obj:`str`
:raises ValueError: if the code is invalid
"""
validate(code, str)
if re.match(r"^0x[0-9a-zA-Z][0-9a-zA-Z]$", code):
if re.match(r"^0x([0-9a-zA-Z][0-9a-zA-Z])+$", code):
ord_code = int(code, base=16)
if ord_code in range(0x20, 0x7F):
raise MultibaseValueError("Multibase codes in hex format cannot be printable ASCII characters.")
code = chr(ord_code)
elif len(code) != 1:
raise MultibaseValueError("Multibase codes must be single-character strings or the hex digits '0xYZ' of a single byte.")
if ord(code) not in range(0x00, 0x80):
raise MultibaseValueError("Multibase codes must be ASCII characters.")
raise MultibaseValueError("Multibase codes must be single-character strings or the hex digits '0x...' of a non-empty bytestring.")
return code

@staticmethod
@@ -145,7 +145,9 @@ def code_printable(self) -> str:
code = self.code
ord_code = ord(code)
if ord_code not in range(0x20, 0x7F):
return "0x"+base16.encode(bytes([ord_code]))
ord_code_num_bytes = max(1, math.ceil(ord_code.bit_length()/8))
ord_code_bytes = ord_code.to_bytes(ord_code_num_bytes, byteorder="big")
return "0x"+base16.encode(ord_code_bytes)
return code

@property
@@ -555,6 +557,6 @@ def build_multibase_tables(bases: Iterable[Multibase]) -> Tuple[Dict[str, Multib
# Create the global code->multibase and name->multibase mappings.
_code_table: Dict[str, Multibase]
_name_table: Dict[str, Multibase]
with importlib_resources.open_text("multiformats.multibase", "multibase-table.json") as _table_f:
with importlib_resources.open_text("multiformats.multibase", "multibase-table.json", encoding="utf8") as _table_f:
_table_json = json.load(_table_f)
_code_table, _name_table = build_multibase_tables(Multibase(**row) for row in _table_json)
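
The multi-byte hex codes introduced in this file (needed for entries such as `base256emoji`, whose code is `0x01F680` in the JSON table) can be illustrated with a standalone sketch. It follows the logic visible in the diff but is not the library's code: the function names are hypothetical, plain `ValueError` stands in for `MultibaseValueError`, the regex is restricted to hexadecimal digits, and `bytes.hex().upper()` stands in for `base16.encode` on the assumption that the latter produces uppercase hex.

```python
import math
import re

def code_from_string(code: str) -> str:
    """Return the one-character code for a literal character or a '0x...' hex string."""
    if re.match(r"^0x([0-9a-fA-F]{2})+$", code):
        ord_code = int(code, base=16)
        if ord_code in range(0x20, 0x7F):
            raise ValueError("Codes in hex format cannot be printable ASCII characters.")
        return chr(ord_code)
    if len(code) != 1:
        raise ValueError("Codes must be single characters or '0x...' hex strings.")
    return code

def code_printable(code: str) -> str:
    """Return printable ASCII codes unchanged and all other codes as '0x...' hex."""
    ord_code = ord(code)
    if ord_code in range(0x20, 0x7F):
        return code
    # Use as many bytes as the code point needs, with a minimum of one.
    num_bytes = max(1, math.ceil(ord_code.bit_length() / 8))
    return "0x" + ord_code.to_bytes(num_bytes, byteorder="big").hex().upper()

rocket = code_from_string("0x01F680")  # the single character U+1F680
print(code_printable(rocket))          # 0x01F680
print(code_printable("\x00"))          # 0x00
print(code_printable("z"))             # z
```
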
2 changes: 0 additions & 2 deletions multiformats/multibase/err.py
@@ -6,8 +6,6 @@

class MultibaseKeyError(builtins.KeyError): # pylint: disable = redefined-builtin
""" Class for :mod:`~multiformats.multibase` key errors. """
...

class MultibaseValueError(builtins.ValueError): # pylint: disable = redefined-builtin
""" Class :mod:`~multiformats.multibase` value errors. """
...
51 changes: 26 additions & 25 deletions multiformats/multibase/multibase-table.csv
@@ -1,25 +1,26 @@
encoding, code, description, status
identity, 0x00, 8-bit binary (encoder and decoder keeps data unmodified), default
base2, 0, binary (01010101), candidate
base8, 7, octal, draft
base10, 9, decimal, draft
base16, f, hexadecimal, default
base16upper, F, hexadecimal, default
base32hex, v, rfc4648 case-insensitive - no padding - highest char, candidate
base32hexupper, V, rfc4648 case-insensitive - no padding - highest char, candidate
base32hexpad, t, rfc4648 case-insensitive - with padding, candidate
base32hexpadupper, T, rfc4648 case-insensitive - with padding, candidate
base32, b, rfc4648 case-insensitive - no padding, default
base32upper, B, rfc4648 case-insensitive - no padding, default
base32pad, c, rfc4648 case-insensitive - with padding, candidate
base32padupper, C, rfc4648 case-insensitive - with padding, candidate
base32z, h, z-base-32 (used by Tahoe-LAFS), draft
base36, k, base36 [0-9a-z] case-insensitive - no padding, draft
base36upper, K, base36 [0-9a-z] case-insensitive - no padding, draft
base58btc, z, base58 bitcoin, default
base58flickr, Z, base58 flicker, candidate
base64, m, rfc4648 no padding, default
base64pad, M, rfc4648 with padding - MIME encoding, candidate
base64url, u, rfc4648 no padding, default
base64urlpad, U, rfc4648 with padding, default
proquint, p, PRO-QUINT https://arxiv.org/html/0901.4016, draft
encoding, code, description, status
identity, 0x00, 8-bit binary (encoder and decoder keeps data unmodified), default
base2, 0, binary (01010101), candidate
base8, 7, octal, draft
base10, 9, decimal, draft
base16, f, hexadecimal, default
base16upper, F, hexadecimal, default
base32hex, v, rfc4648 case-insensitive - no padding - highest char, candidate
base32hexupper, V, rfc4648 case-insensitive - no padding - highest char, candidate
base32hexpad, t, rfc4648 case-insensitive - with padding, candidate
base32hexpadupper, T, rfc4648 case-insensitive - with padding, candidate
base32, b, rfc4648 case-insensitive - no padding, default
base32upper, B, rfc4648 case-insensitive - no padding, default
base32pad, c, rfc4648 case-insensitive - with padding, candidate
base32padupper, C, rfc4648 case-insensitive - with padding, candidate
base32z, h, z-base-32 (used by Tahoe-LAFS), draft
base36, k, base36 [0-9a-z] case-insensitive - no padding, draft
base36upper, K, base36 [0-9a-z] case-insensitive - no padding, draft
base58btc, z, base58 bitcoin, default
base58flickr, Z, base58 flicker, candidate
base64, m, rfc4648 no padding, default
base64pad, M, rfc4648 with padding - MIME encoding, candidate
base64url, u, rfc4648 no padding, default
base64urlpad, U, rfc4648 with padding, default
proquint, p, PRO-QUINT https://arxiv.org/html/0901.4016, draft
base256emoji, 🚀, base256 with custom alphabet using variable-sized-codepoints, draft
6 changes: 6 additions & 0 deletions multiformats/multibase/multibase-table.json
@@ -142,5 +142,11 @@
"code": "z",
"status": "default",
"description": "base58 bitcoin"
},
{
"name": "base256emoji",
"code": "0x01F680",
"status": "draft",
"description": "base256 with custom alphabet using variable-sized-codepoints"
}
]