Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type stubs #194

Closed
wants to merge 19 commits into from
Closed

Add type stubs #194

wants to merge 19 commits into from

Conversation

kylebarron
Copy link
Contributor

@kylebarron kylebarron commented Aug 10, 2021

Overview

This PR adds type stubs (.pyi files) to declare the types of the h3 bindings. This allows consumers of the h3 library to type check their interactions with it.

Having type stubs in separate .pyi files is beneficial because consumers of h3 can freely opt in to the typing ecosystem if they wish, while the h3 runtime code doesn't need to change at all. I.e. this should have no changes for Python 2.7; it's my understanding that .pyi files are completely ignored except by type checkers, linters, etc.

Examples

Here are some examples with using mypy on this PR. To test locally, install this branch, copy the Python snippet into a temporary file and call mypy on it (e.g. mypy example.py). (I had to use a local copy of mypy within the virtualenv, separated from my global mypy).

Basic int

from h3.api.basic_int import k_ring
from typing import List, Set

out: Set[int] = k_ring(1, 2)
# Passes

out2: Set[str] = k_ring(1, 2)
# error: Incompatible types in assignment (expression has type "Set[int]", variable has type "Set[str]")

out3: List[int] = k_ring(1, 2)
# Incompatible types in assignment (expression has type "Set[int]", variable has type "List[int]")

out4 = k_ring('a', 2)
# Argument 1 has incompatible type "str"; expected "int"

Basic str

from h3.api.basic_str import k_ring
from typing import List, Set

out: Set[int] = k_ring(1, 2)
# error: Incompatible types in assignment (expression has type "Set[str]", variable has type "Set[int]")
# error: Argument 1 has incompatible type "int"; expected "str"

out2: Set[str] = k_ring(1, 2)
# error: Argument 1 has incompatible type "int"; expected "str"

out3: List[int] = k_ring(1, 2)
# error: Incompatible types in assignment (expression has type "Set[str]", variable has type "List[int]")
# error: Argument 1 has incompatible type "int"; expected "str"

out4 = k_ring('a', 2)
# Passes

Numpy int

Recent versions of numpy have an optional typed API. Note that you need to enable the mypy Numpy plugin to run these tests.

from h3.api.numpy_int import k_ring
from typing import List, Set
import numpy as np
from numpy.typing import NDArray

out: Set[int] = k_ring(1, 2)
# error: Incompatible types in assignment (expression has type "ndarray[Any, dtype[unsignedinteger[_64Bit]]]", variable has type "Set[int]")
# error: Argument 1 has incompatible type "int"; expected "unsignedinteger[_64Bit]"

out2: Set[str] = k_ring(1, 2)
# error: Incompatible types in assignment (expression has type "ndarray[Any, dtype[unsignedinteger[_64Bit]]]", variable has type "Set[str]")
# error: Argument 1 has incompatible type "int"; expected "unsignedinteger[_64Bit]"

out3: List[int] = k_ring(1, 2)
# error: Incompatible types in assignment (expression has type "ndarray[Any, dtype[unsignedinteger[_64Bit]]]", variable has type "List[int]")
# error: Argument 1 has incompatible type "int"; expected "unsignedinteger[_64Bit]"

out4 = k_ring('a', 2)
# error: Argument 1 has incompatible type "str"; expected "unsignedinteger[_64Bit]"

int32 = np.int32(1)
out5: NDArray[np.bool_] = k_ring(int32, 2)
# error: Incompatible types in assignment (expression has type "ndarray[Any, dtype[unsignedinteger[_64Bit]]]", variable has type "ndarray[Any, dtype[bool_]]")
# error: Argument 1 has incompatible type "signedinteger[_32Bit]"; expected "unsignedinteger[_64Bit]"

uint64 = np.uint64(1)
out6: NDArray[np.uint64] = k_ring(uint64, 2)
# Passes

Note that here the input types are also quite restrictive. We could relax the type errors for numeric input if we split the H3Cell generic type into H3CellInput and H3CellOutput types. The input we could allow to be numeric-like but the output would always be of type np.uint64.

Todo:

  • Fix functions such as hex_range_distances that return Ordered[Unordered[H3Cell]]. I wasn't able to get return values of Generic[Generic] to work
  • Check return typing of h3_set_to_multi_polygon
  • Add polyfill types.
  • Check typing on Python <3.8. This PR uses the Literal type, which is only in the standard library from 3.8+. Someone on Python <3.8 who doesn't use type checking should not be affected. But I need to check the behavior for type checkers on Python 3.6 and 3.7. We might need to import Literal from typing_extensions. So it might be best to add a new extra to extras_require called typing that installs typing_extensions on Python <3.8.
  • Is it possible to add tests for pyi files?

Other Notes

  • I don't think memoryview has a generic typing API. So we could declare types that accept and return memoryviews, but I don't think require memoryviews with uint64 data.

@codecov
Copy link

codecov bot commented Aug 10, 2021

Codecov Report

Merging #194 (893fa27) into master (fd43513) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##            master      #194   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            9         9           
  Lines          213       213           
=========================================
  Hits           213       213           
Impacted Files Coverage Δ
src/h3/api/_api_template.py 100.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fd43513...893fa27. Read the comment docs.

@ajfriend
Copy link
Contributor

Cool, this is great! I'll try to take a look at this sometime next week.

@jongbinjung
Copy link
Collaborator

jongbinjung commented Aug 13, 2021

New to .pyi files here; but how to folks typically keep these stub files in-sync with the implementation code?

@kylebarron
Copy link
Contributor Author

but how to folks typically keep these stub files in-sync with the implementation code?

That's a good question.

  • In JavaScript projects like react-map-gl that export Typescript types via a .d.ts file (akin to a .pyi file for Typescript), I've seen occasional issues/PRs come up from the runtime API and typing API becoming out of sync.

  • It's my impression that the H3 API is quite stable, so hopefully the two becoming out of sync are less likely here than in a fast-moving repo. And there's no requirement that all runtime methods be included in .pyi files; typing can be added incrementally.

  • There are some discussions about this in Python, see https://stackoverflow.com/q/51716200 and Test stub files against real code python/mypy#5028. Mypy has an undocumented stubtest CLI that aims to check a runtime API against .pyi types. I tried it out but it doesn't seem to work for h3-py because of the Cython internals.

    > python -m mypy.stubtest h3.unstable.vect
    error: h3.unstable.vect failed to import: No module named 'h3._cy.cells'
    Stub: at line 1
    MypyFile:1(
    /Users/kyle/github/mapping/h3-py/h3/unstable/vect.pyi)
    Runtime:
    MISSING
    
    > python -m mypy.stubtest h3.api.basic_str
    error: not checking stubs due to mypy build errors:
    h3/_cy/__init__.py:16: error: Cannot find implementation or library stub for module named "h3._cy.cells"
    h3/_cy/__init__.py:16: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
    h3/_cy/__init__.py:41: error: Cannot find implementation or library stub for module named "h3._cy.edges"
    h3/_cy/__init__.py:53: error: Cannot find implementation or library stub for module named "h3._cy.geo"
    h3/_cy/__init__.py:64: error: Cannot find implementation or library stub for module named "h3._cy.to_multipoly"
    h3/_cy/__init__.py:68: error: Cannot find implementation or library stub for module named "h3._cy.util"
    h3/api/basic_int.pyi:17: error: Type application is only supported for generic classes
    h3/api/basic_int.pyi:18: error: Type application is only supported for generic classes
    ...
    

    Aside: I did some searching on the Type application is only supported for generic classes error, and I found mostly this: How do I type a generic factory function? python/mypy#6073 (comment).

    From mypy perspective it is totally fine...
    the first error is bogus, because it is actually supported in mypy, just not at runtime

    Essentially, what I do in each of these APIs for code reuse:
    https://github.com/kylebarron/h3-py/blob/136433ebac51606d89864e14c12a00fb00fd93c2/src/h3/api/basic_int.pyi#L17

    Is only valid at runtime for generic classes not generic functions. So that would be a syntax error if it were to be run through the Python interpreter... but it's not; within the .pyi file it's only run through Mypy or similar.

  • Maybe it would be beneficial to write tests that use mypy with the pyi files. I.e. it should be possible to put this logic into a test somehow:

    from h3.api.basic_int import k_ring
    from typing import Set
    
    out: Set[int] = k_ring(1, 2)
    # Passes
    
    out2: Set[str] = k_ring(1, 2)
    # error: Incompatible types in assignment (expression has type "Set[int]", variable has type "Set[str]")

@ajfriend
Copy link
Contributor

Did you want to land this as-is, or are you still planning on tackling some of the TODOs you listed above?

Also, do you any idea on the pros/cons of storing the stubs in https://github.com/python/typeshed instead of within the repo? (If I'm understanding things correctly that that is a possibility.)

@kylebarron
Copy link
Contributor Author

Oh cool you can see the commit comments inside this PR!

Did you want to land this as-is

No I think a couple things should be resolved first.

  • Is it possible to add tests for pyi files?

I learned how to do this using https://github.com/TypedDjango/pytest-mypy-plugins, see: https://github.com/kylebarron/types-affine/blob/master/tests/test_affine.yml. Would be good to add.

When I was originally hacking on these types, I never hit the h3/api/basic_int.pyi:17: error: Type application is only supported for generic classes error. When I added them in this repo, that started to come up and that might be something I actually need to fix somehow before merging.

do you any idea on the pros/cons of storing the stubs in python/typeshed instead of within the repo

Yes that is an alternative. I think that would be the place to go if h3-py didn't want to include type stubs itself. From Contributing.md:

We accept stubs for third-party packages into typeshed as long as:

  • the package is publicly available on the Python Package Index;
  • the package supports any Python version supported by typeshed; and
  • the package does not ship with its own stubs or type annotations.

I think including the types with the package itself is seen as more "official". If a package starts to include types, typeshed's policy is to remove its stubs 6-12 months later.

@kylebarron
Copy link
Contributor Author

Note I tried to incorporate the suggestion from python/mypy#6073 (comment), but that didn't work because __new__ messes with Mypy's conception of what the output should be, since __new__ should return an instance of the class usually I think.

I think the best option is to copy the text of _api.pyi into each API file. (e.g. c7c0b5d (#194)) It's not ideal because of the duplication, but nothing below

# Should be no changes in each API file below this line

should change in any file since we can just change the parameterized aliases at the top of the file. We could even write a tiny script for CI to verify that the content below that line is identical in each api file.

setup.py Outdated Show resolved Hide resolved
@kylebarron
Copy link
Contributor Author

@ajfriend It was a little annoying to try and get the test scaffolding set up, since the type tests are only run on python 3.6+.

Maybe it would actually be easier to separate tests into two CI steps, change the existing one to pytest tests/*.py and have a new pytest tests/*.yml for the type tests.

At this point it looks like what's left to do is:

  • Update numpy_int.pyi and remove _api.pyi.
  • Add tests (not sure how thorough these need to be; maybe I can model based on existing tests 🤷‍♂️ )
  • Script to test that each typing file has the same API below the comment line
  • Documentation

@ajfriend
Copy link
Contributor

ajfriend commented Sep 1, 2021

@ajfriend It was a little annoying to try and get the test scaffolding set up, since the type tests are only run on python 3.6+.

Maybe it would actually be easier to separate tests into two CI steps, change the existing one to pytest tests/*.py and have a new pytest tests/*.yml for the type tests.

Separating out tests would be nice, and I think it would be fine to run the tests for a subset of Python versions, or even just a single Python version. Having the a separate CI workflow would also make it easier to spot if a build issue was coming from the typing or normal test suite.

@kylebarron
Copy link
Contributor Author

I tried to separate out the tests. Do you know how to fix the Windows tests? I guess you can't use tests/*.py on Windows 😒 .

Copy link
Collaborator

@isaacbrodsky isaacbrodsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: ulonglong vs uint64: I would recommend uint64 when handling H3 indexes since it's better to be explicit that it is a fixed size and not platform dependent.

Re: Windows: This is probably an issue with how PowerShell resolves the *. Perhaps try specifying shell: bash.

Comment on lines +18 to +19
def string_to_h3(h: str) -> int: ...
def h3_to_string(x: int) -> str: ...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this accept H3Cell as input?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noting this was typed as int because h3_to_string always takes an int, no matter the API

x : int
Unsigned 64-bit integer

Comment on lines +19 to +23
def versions() -> Dict[str, str]: ...
def string_to_h3(h: str) -> int: ...
def h3_to_string(x: int) -> str: ...
def num_hexagons(resolution: Resolution) -> int: ...
def hex_area(resolution: Resolution, unit: AreaUnits = 'km^2') -> float: ...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to avoid repeating the entire API definition after defining H3Cell each time? (As below in the PR?)

@pfw
Copy link

pfw commented Nov 2, 2021

I tried to separate out the tests. Do you know how to fix the Windows tests? I guess you can't use tests/*.py on Windows 😒 .

I've done some poking around as I'm keen to having typing available as well. I believe the change to make for Windows is just to surround arguments with quotes, eg.

      - name: Tests
        run: |
          pytest --cov=h3 --full-trace "tests/*.py"

I created a fork and tried that out and it appears to do the job.

@kylebarron
Copy link
Contributor Author

Hmm I tried to make that change but I'm now getting INTERNALERROR

@pfw
Copy link

pfw commented Nov 2, 2021

I think the quotes are only required on the tests, not the typing tests.

@pfw
Copy link

pfw commented Nov 2, 2021

pfw@7aa011b This was the change I had success with. Just on tests as the quotes blew up on the typing tests which only ran on Ubuntu.

@pfw
Copy link

pfw commented Nov 2, 2021

At least using bash allows for the config to be common across Windows and Linux.

@jmccartin
Copy link

Is this PR still being actively worked on? Am curious if the changes in version 4 will disrupt much of the stub work done here.

@kylebarron
Copy link
Contributor Author

kylebarron commented Feb 17, 2022

I looked back at this PR and had a new (maybe probably crazy) idea. Right now h3-py dynamically loads modules for each API by modifying the globals object. This is hacky and breaks linting tools #80, #180. Might it be useful to explore having a class-based approach? This seems (at first glance, though there are probably issues I haven't thought of yet 😅 ) to be a win-win-win of

  1. Support for linting tools like pylint
  2. Type hints in code (Potentially even supporting Python 2.7 with type comments)
  3. Backwards compatible imports

Here's an example. Replace _api_functions() with a class:

# _api_template.py
from h3 import _cy
from typing import TypeVar, Generic

HexType = TypeVar("HexType")


class _API_FUNCTIONS(Generic[HexType]):
    def __init__(
        self,
        _in_scalar,
        _out_scalar,
        _in_collection,
        _out_unordered,
        _out_ordered,
    ) -> None:
        self._in_scalar = _in_scalar
        self._out_scalar = _out_scalar
        self._in_collection = _in_collection
        self._out_unordered = _out_unordered
        self._out_ordered = _out_ordered

    def geo_to_h3(self, lat: float, lng: float, resolution: int) -> HexType:
        """
        Return the cell containing the (lat, lng) point
        for a given resolution.

        Returns
        -------
        H3Cell

        """
        return self._out_scalar(_cy.geo_to_h3(lat, lng, resolution))

Then create an instance of this class in __init__.py:

# __init__.py

from . import _cy
from ._api_template import _API_FUNCTIONS
from .basic_int import _id as _basic_int_id

basic_int = _API_FUNCTIONS[int](_out_scalar=_basic_int_id, ...)
basic_str = _API_FUNCTIONS[str](_out_scalar=__cy.int2hex, ...)

Then existing code that imports basic_int or basic_str as a module will still work. To keep backwards compatibility of function imports, we could still modify globals within each api file

# basic_int.py
basic_int = _API_FUNCTIONS[int](_out_scalar=_basic_int_id, ...)

methods = {
    name: getattr(basic_int, name)
    for name in dir(basic_int)
    if callable(getattr(basic_int, name)) and not name.startswith("_")
}

globals().update(methods)

But I think pylint and other linting tools should be able to see the methods inside the basic_int class instance, and this works for mypy type inference:

reveal_type(basic_int.geo_to_h3(0, 0, 1))
reveal_type(basic_str.geo_to_h3(0, 0, 1))
# temp.py:44: note: Revealed type is "builtins.int*"
# temp.py:45: note: Revealed type is "builtins.str*"

@kylebarron
Copy link
Contributor Author

Closing in favor of #213, which is a better approach

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants