Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix namespace argument regression in argparse.parse_args #10387

Merged
merged 4 commits into from
Jul 19, 2023

Conversation

Jakob-Stadler
Copy link
Contributor

The previous commit ( #10307 ) explicitly added Namespace to the first overload, which causes a false-postive when the used namespace= argument is a subclass of Namespace.

As far as I can tell, the explicit Namespace type is unnecessary to begin with, since it's already covered in the TypeVar _N.

Example code, flagged both in pyright and mypy (with most recent typeshed):

import argparse
import sys

# Explicit subclass of Namepsace
# not strictly necessary, but it conveys intent better than a standalone class
class ExampleNamespace(argparse.Namespace):
  example_argument: str

parser = argparse.ArgumentParser()
parser.add_argument('example_argument')


ns = parser.parse_args(sys.argv, ExampleNamespace())
reveal_type(ns)  # Revealed type is "Namespace"


# error: Incompatible types in assignment (expression has type "Namespace", variable has type "ExampleNamespace")  [assignment]
ens: ExampleNamespace = parser.parse_args(sys.argv, ExampleNamespace())
reveal_type(ens)  # Revealed type is "ExampleNamespace"

The previous commit [1] explicitly added `Namespace` to the first overload, which causes a false-postive when the used `namespace=` argument is a subclass of  `Namespace`.

Example code, flagged both in pyright and mypy (with most recent typeshed):

```python
import argparse
import sys

# Explicit subclass of Namepsace
# not strictly necessary, but it conveys intent better than a standalone class
class ExampleNamespace(argparse.Namespace):
  example_argument: str

parser = argparse.ArgumentParser()
parser.add_argument('example_argument')


ns = parser.parse_args(sys.argv, ExampleNamespace())
reveal_type(ns)  # Revealed type is "Namespace"


# error: Incompatible types in assignment (expression has type "Namespace", variable has type "ExampleNamespace")  [assignment]
ens: ExampleNamespace = parser.parse_args(sys.argv, ExampleNamespace())
reveal_type(ens)  # Revealed type is "ExampleNamespace"
```

[1] python#10307
@github-actions

This comment has been minimized.

@srittau
Copy link
Collaborator

srittau commented Jun 30, 2023

Thanks for the analysis!

The new pytest primer hit is because pytest passes an argument annotated as Namespace | None to parse_known_args:

https://github.com/pytest-dev/pytest/blob/81cfb3fc874d5a4127dcc8f11e6bb4ae424fb094/src/_pytest/config/argparsing.py#L433

This used to trigger the first overload, now it triggers the second one. Having a Not type (python/typing#801) would probably be the easiest solution here. I wonder whether using (namespace: _N | None) -> _N would help in the second overload.

In any case, due to the complexity of the situation we should have test cases for the various combinations of types.

@Jakob-Stadler
Copy link
Contributor Author

I wonder whether using (namespace: _N | None) -> _N would help in the second overload.

I made another small analysis based on that train of thought.

Instead of replacing the second overload, I tried injected it before the second overload.
I noticed that limiting the return type to _N does not always get the correct result, so I did another pass with _N | Namespace (see additional_test3 for why that matters)

'''
Additional tests with another injected overload

    @overload
    def parse_known_args(self, args: Sequence[str] | None = None, namespace: None = None) -> tuple[Namespace, list[str]]: ...  # type: ignore[misc]
    @overload
--> def parse_known_args(self, args: Sequence[str] | None, namespace: _N | None) -> tuple[_N, list[str]]: ...
    @overload
    def parse_known_args(self, args: Sequence[str] | None, namespace: _N) -> tuple[_N, list[str]]: ...
    @overload
    def parse_known_args(self, *, namespace: _N) -> tuple[_N, list[str]]: ...

And a second time with a different return type: [_N | Namespace , list[str]]
'''

import argparse
from typing import Optional, Sequence, Dict, Any, reveal_type
FILE_OR_DIR = "file_or_dir"


# Explicit subclass of Namepsace
# not strictly necessary, but it conveys intent better than a standalone class
class ExplicitNamespace(argparse.Namespace):
    example_argument: str


class ImplicitNamespace:
    example_argument: str


class MyOptionParser(argparse.ArgumentParser):
    extra_info: Dict[str, Any]

    # from pytest:

    def parse_args(  # type: ignore
        self,
        args: Optional[Sequence[str]] = None,
        namespace: Optional[argparse.Namespace] = None,
    ) -> argparse.Namespace:
        """Allow splitting of positional arguments."""
        parsed, unrecognized = self.parse_known_args(args, namespace)
        if unrecognized:
            for arg in unrecognized:
                if arg and arg[0] == "-":
                    lines = ["unrecognized arguments: %s" % (" ".join(unrecognized))]
                    for k, v in sorted(self.extra_info.items()):
                        lines.append(f"  {k}: {v}")
                    self.error("\n".join(lines))
            getattr(parsed, FILE_OR_DIR).extend(unrecognized)
        return parsed
        # Result: all pass, for both variants

    # Some additional tests:

    def additional_test1(self, ns_in: argparse.Namespace | None) -> None:
        # Namespace -> Namespace
        # None -> Namespace
        ns_out, _ = self.parse_known_args(['some', 'args'], ns_in)
        reveal_type(ns_out)
        # Results: (all correct)
        # (_N | None) -> _N
        # in pyright/mypy:  Namespace
        #
        # (_N | None) -> _N | Namespace
        # in pyright/mypy:  Namespace


    def additional_test2(self, ns_in: ExplicitNamespace | None) -> None:
        # ExplicitNamespace -> ExplicitNamespace
        # None -> Namespace
        ns_out, _ = self.parse_known_args(['some', 'args'], ns_in)
        reveal_type(ns_out)
        # Results:
        # (_N | None) -> _N
        # in pyright/mypy:  ExplicitNamespace (incorrect result)
        #
        # (_N | None) -> _N | Namespace
        # in pyright:  ExplicitNamespace | Namespace (correct result)
        # in mypy:  Namespace (absorbed subclass? -> also correct)


    def additional_test3(self, ns_in: ImplicitNamespace | None) -> None:
        # ImplicitNamespace -> ImplicitNamespace 
        # None -> Namespace
        ns_out, _ = self.parse_known_args(['some', 'args'], ns_in)
        reveal_type(ns_out)
        # Results:
        # (_N | None) -> _N
        # in pyright/mypy:  ImplicitNamespace (incorrect result)
        #
        # (_N | None) -> _N | Namespace
        # in pyright/mypy:  ImplicitNamespace | Namespace (correct result)

In any case, due to the complexity of the situation we should have test cases for the various combinations of types.

Agreed. Sounds like a good idea to me.

Jakob-Stadler and others added 2 commits July 6, 2023 17:41
This solution may be overly verbose.

The real problem is that the TypeVar _N should be bound to "mutable objects" that allow adding attributes via setattr(), and ignore builtin immutable objects like None.
@github-actions

This comment has been minimized.

@github-actions
Copy link
Contributor

github-actions bot commented Jul 6, 2023

Diff from mypy_primer, showing the effect of this PR on open source code:

pytest (https://github.com/pytest-dev/pytest)
+ src/_pytest/config/argparsing.py:448: error: Incompatible return value type (got "Namespace | None", expected "Namespace")  [return-value]

materialize (https://github.com/MaterializeInc/materialize)
- misc/python/materialize/mzcompose/__init__.py:965: note:          def parse_known_args(self, args: Optional[Sequence[str]] = ..., namespace: Optional[Namespace] = ...) -> tuple[Namespace, list[str]]
+ misc/python/materialize/mzcompose/__init__.py:965: note:          def parse_known_args(self, args: Optional[Sequence[str]] = ..., namespace: None = ...) -> tuple[Namespace, list[str]]
- misc/python/materialize/cli/mzcompose.py:670: note:          def parse_known_args(self, args: Optional[Sequence[str]] = ..., namespace: Optional[Namespace] = ...) -> tuple[Namespace, list[str]]
+ misc/python/materialize/cli/mzcompose.py:670: note:          def parse_known_args(self, args: Optional[Sequence[str]] = ..., namespace: None = ...) -> tuple[Namespace, list[str]]
+ misc/python/materialize/cli/mzcompose.py:677: error: Incompatible return value type (got "tuple[Optional[Namespace], list[str]]", expected "tuple[Namespace, list[str]]")  [return-value]
- misc/python/materialize/cli/mzcompose.py:681: note:          def parse_known_args(self, args: Optional[Sequence[str]] = ..., namespace: Optional[Namespace] = ...) -> tuple[Namespace, list[str]]
+ misc/python/materialize/cli/mzcompose.py:681: note:          def parse_known_args(self, args: Optional[Sequence[str]] = ..., namespace: None = ...) -> tuple[Namespace, list[str]]
+ misc/python/materialize/cli/mzcompose.py:688: error: Incompatible return value type (got "tuple[Optional[Namespace], list[str]]", expected "tuple[Namespace, list[str]]")  [return-value]

Copy link
Collaborator

@srittau srittau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late reply. Your final solution looks not only correct to me, but is also simpler.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants