Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: closes #11343's [attr-defined] type errors #11345

Conversation

WarrenTheRabbit
Copy link
Contributor

@WarrenTheRabbit WarrenTheRabbit commented Aug 24, 2023

On Windows 10.0.19045, the mypy hook is failing on pre-commit because of two [attr-defined] errors, which prevents me from staging changes. The problem was first reported in #11343. This PR applies a fix by adding type: ignore comments to the erroring lines:

src\_pytest\compat.py:324: error: Module has no attribute "getuid"; maybe "getpid" or "getppid"?  [attr-defined]
testing\test_parseopt.py:294: error: Module has no attribute "getencoding"  [attr-defined]

WarrenTheRabbit and others added 2 commits August 24, 2023 09:38
On Windows 10.0.19045, the mypy hook is failing on pre-commit because of two [attr-defined] errors, which prevents me from staging stages. The problem was first reported in pytest-dev#11343. This PR solves the problem by adding `type ignore` comments to the erroring lines:

```
src\_pytest\compat.py:324: error: Module has no attribute "getuid"; maybe "getpid" or "getppid"?  [attr-defined]
testing\test_parseopt.py:294: error: Module has no attribute "getencoding"  [attr-defined]
```
@@ -321,7 +321,7 @@ def get_user_id() -> int | None:
return None
# getuid shouldn't fail, but cpython defines such a case.
# Let's hope for the best.
uid = os.getuid()
uid = os.getuid() # type: ignore[attr-defined]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this pass if the else block was used?

Copy link
Contributor Author

@WarrenTheRabbit WarrenTheRabbit Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this pass if the else block was used?

Hi, Ronny. Thanks for taking a look. I'll need a clarification, and I also thank you for your patience while I'm learning.

Are you asking
1. what happens if I remove the type: ignore and use an else block instead?
2. what happens when the current code is executed through the implied else block (lines 322 to 325)?
3. something else?

Responding to 1 (use an else block instead)

The Python docs state that os.getuid only has availability on Unix. Since my system is Windows, changing the code to

    if sys.platform in ("win32", "emscripten"):
        return None
    # getuid shouldn't fail, but cpython defines such a case.
    # Let's hope for the best.
    else:
        uid = os.getuid()
        return uid if uid != -1 else None

still fails the mypy hook:

pre-commit run mypy --files .\src\_pytest\compat.py
mypy.....................................................................Failed
- hook id: mypy
- exit code: 1

src\_pytest\compat.py:325: error: Module has no attribute "getuid"; maybe "getpid" or "getppid"?  [attr-defined]
Found 1 error in 1 file (checked 1 source file)

But either option (using type: ignore or an explicit if: else: block) passes when I run the hook on VS Code's WSL.

I'm still very new and I'm trying to understand the motivation behind the question. Are there advantages to having an else block instead of the type: ignore comment - such that you would prefer it, if possible?

Responding to 2 (execute through lines 321 - 325)

The function seems to work fine when executed on a Windows system (it halts at the guard on line 320 and returns None) and a Unix system (it completes execution at 325 and returns uid if uid != -1 else None.

If this is the heart of your question, would you like me to write a test that demonstrates that?

Responding to 3

I worry I have not understood the underlying concern and that my responses do not address them!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the response,

I was wondering about 1

My concerns are resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is someone able to give feedback on my communication? I've reflected and think this would have been easier for Ronny:

Hi, Ronny. Thanks for taking a look. I'm quite new and I'm not confident I have understood the question but I'll respond with the interpretation I think is most likely - let me know if I miss the mark.

(and then the 'Responding to 1' content as before)

I know everyone is busy. But I'd like to flag that I am always open to and welcome feedback. I'm still learning. Moreover, I want to be a very harmonious and easy-to-work-with contributor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're fine in my book 🙃 I didn't understand what Ronny meant exactly either. If he would have meant something else, your detailed answer would have prevented another round-trip.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WarrenTheRabbit it's always fine to ask for clarifications. Let me explain a bit more what (I think) @RonnyPfannschmidt was aiming for.

Some functions, like os.getuid, are only defined on certain platforms. This is reflected in the type definition of os.getuid in typeshed, the project which defines the typing for all of the standard library. See here:

https://github.com/python/typeshed/blob/a094aa09c2aa47721664d3fdef91eda4fac24ebb/stdlib/os/__init__.pyi#L460-L476

You can see that os.getuid is only defined when sys.platform != "win32".

Now, mypy is a static type checker, and such platform-dependence puts it in a conundrum - which platform should it use when type checking? This can cause meaningful differences, like in this example - when run on linux platform (like CI) there is no error, but on Windows there is. What mypy does is to type-check using the platform it is running on, but you can ask it use a different platform using for example mypy --platform win32 or mypy --platform linux.

So we see that mypy can give different results on different platforms, that's not great. But it turns out that mypy does provide a solution to this -- it understands if sys.platform == "..." checks and knows to ignore either the if or the else according to the platform it's using. See here for the mypy docs on this.

OK, if mypy does this then why doesn't work here? The reason is that while mypy is smart, it's not smart enough to understand if sys.platform in (...) checks. So to help it we need to write the code using or and else instead:

    if sys.platform == "win32" or sys.platform == "emscripten":
         return None
    else:
        # getuid shouldn't fail, but cpython defines such a case.
        # Let's hope for the best.
        uid = os.getuid()
        return uid if uid != -1 else None

Now when running with win32 platform, mypy ignores the else branch and everything is good :)

I recommend switching to this, as it removes the need for the type: ignore which is better avoided when possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in that case we might also want report a upstream bug for mypy
after all previously if/else made sense due to win32, now with emscripten, its much more likely to see those in checks

Copy link
Contributor Author

@WarrenTheRabbit WarrenTheRabbit Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you everyone for being so welcoming and supportive - either in this repo or on LinkedIn. 🙏
And thank you, @bluetech, for taking the time to elaborate and teach me. 🎓💛

bluetech: I recommend switching to {non-containment checks}, as it removes the need for the type: ignore which is better avoided when possible.

Perfect. I'll do that. I've done that.
I'll also I've added a comment that gives the reader more type checking context. But I can remove this if the information is obvious to all but the very inexperienced:

    # mypy follows the version and platform checking expectation of PEP 484:
    # https://mypy.readthedocs.io/en/stable/common_issues.html?highlight=platform#python-version-and-system-platform-checks
    # Containment checks are too complex for mypy v1.5.0 and cause failure.

Outstanding items

Ronny: in that case we might also want report a upstream bug for mypy
after all previously if/else made sense due to win32, now with emscripten, its much more likely to see those in checks

1

That sounds like a great idea, @RonnyPfannschmidt. I'll happily look into how to do that. I've posted to the python/typing gitter with my understanding of the motivation and asked how to officially discuss the code reachability issue. I'll report in when I have been successful or become stuck.

2

The in-code comments confused me.

 # getuid shouldn't fail, but cpython defines such a case.
 # Let's hope for the best.

I don't know what we are hoping for or if it is a good idea to! So I'll be looking into that.
At a minimum, I'd like to link to something that gives more context (unless that is a bad idea?).

3

I need to figure out how to add the discussed changes in a way that maintains continuity. Am unclear how to do

"Add more commits by pushing to the add-type-ignore-for-attr-defined-errors branch on WarrenTheRabbit/pytest."

as advised.

3

I need to understand and resolve the codecov/patch failure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The in-code comments confused me

I believe it means that the return from os.getuid() should always work, but it might return -1 so we account for that in the last line:

return uid if uid != -1 else None

We should move that comment to above that line. 👍

# win32 does not have a getuid() function.
# On Emscripten, getuid() is a stub that always returns 0.
if sys.platform in ("win32", "emscripten"):
def get_user_id(*, UNRELIABLE: int = -1) -> int | None:
Copy link
Member

@nicoddemus nicoddemus Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean this extra argument as some future-proofing?

I think this is the wrong approach: the caller should not need to know which is the value that represents an error from os.getuid. If it ever varies between platforms, it should be handled internally by get_user_id itself.

Now, if your intent is to improve reliability, instead of the hardcoded value we had before, you can move it to a constant:

UNRELIABLE = -1
return uid if uid != UNRELIABLE else None

(Also parameters should be camel case).

@WarrenTheRabbit
Copy link
Contributor Author

WarrenTheRabbit commented Aug 26, 2023

Thank you, @nicoddemus, for that feedback. It was very helpful.

@nicoddemus: Did you mean this extra argument as some future-proofing?

That was the idea. But with some awkwardness, I was also unclear on the current needs of the platforms ecosystem.
[Reflection 3 relates].

@nicoddemus: if your intent is to improve readability [use an inline constant]

That was the original intent. The future/present proofing was 'value added'.
An inline constant is nice and simple. Should we just leave it at that?
The open PR is fix: closes #11343's [attr-defined] type errors] not feature: future-proof getuid indicator errors.
[Reflection 2 relates]. I've also included some housekeeping tasks:

  • added a comment to prevent problematic refactorings of the platform-check logic (should this be done as a test instead?)
  • added comments and a variable to clarify the behaviour of os.getuid on different platforms

@nicoddemus: I think [the caller needing to know the error indicator] is the wrong approach . . . [and] it should be handled internally by get_user_id itself.

You are right! Mine is the wrong approach. 🙏
By Reflection 2 and 3, I think I won't attempt any future-proofing now.
Going forward with that decision, which is better?

  • Just use the constant UNRELIABLE (-1).
  • Make a call to a function stub, _get_uid_error_indicator, that returns -1.
    (Side note: I need to look through the guidelines on conventions around using _function_name over function_name.

@nicoddemus: parameters should be camel case

Thank you. I was unsure. Should it have been unreliable: int = -1 or without a space: unreliable: int=-1?
(The parameter will be removed from the implementation however.)
Also: camel case or snake case?

Reflection

1. I should provide succinct context and motivation in the message body.

I regret not providing a commit body. I am new to making commits, having a workflow and working with others but I can see how a short statement of the motivation/thinking behind the changes would have taken ownership for some of the thinking that was instead left to, for example, @nicoddemus.

2. I should not drift and I should not bloat.

This issue was originally opened to fix an error. I have a feeling that my workflow should be:

  • create an issue/PR that is well-named and attends to one thing only
  • If other things come up during that issue/PR, deal with them in a separate, well-named issue/PR.
    Then link the issues for context if useful.
3. I need to learn the repo's processes for noting/triaging future work.

Future-proofing might be a good idea. What communities should I talk to? What stops me from forgetting to follow up? Can I mark it as something to be researched? Those are the questions I'm asking myself.

The previous implementation required the caller
to have knowledge of its platform's `getuid` error
code. Although this had the benefit of replacing a
hard-coded error value with a descriptive variable,
the attempt at future proofing `get_user_id` was
flagged as a poor design decision by Nicoddemus:
to future proof the function, `get_user_id` should
determine the error flag internally.

However, future-proofing was not the reason
the issue was opened. The issue was opened to
resolve an `[attr-defined]` error on Windows
platforms caused by *mypy*'s handling of
code reachability through platform checks.

Over the course of the open issue, additional housekeeping
tasks were also undertaken:
- adding a comment that links to an explanation of
*mypy*'s platform and version checking requirements
- adding a variable name and comment to clarify
the reason for the conditional return `return uid if uid . . .`
@nicoddemus
Copy link
Member

(Side note: I need to look through the guidelines on conventions around using _function_name over function_name)

We don't have written guidelines, but basically everything without a _ is public so should be used carefully, however we are already in a private module, so in those cases we leave out the _, as it is internal only anyway.

Thank you. I was unsure. Should it have been unreliable: int = -1 or without a space: unreliable: int=-1?

Don't worry, black decides it for us. 😉

Also: camel case or snake case?

Parameters are snake case.

Just use the constant UNRELIABLE (-1).

I think the constant is simple, we do not need to over-engineer this. (Btw I think ERROR is more appropriate. UNRELIABLE conveys that it returned some user id, but cannot be relied upon, which is not the case).

  1. I should provide succinct context and motivation in the message body.

I like this post about commit messages: https://cbea.ms/git-commit/

I should not drift and I should not bloat.

In general that is the recommendation, however small refactorings/improvements in the same PR are fine and welcome, but better kept in separate commit messages for easier reviewing/backout if necessary.

For a master on this, see @bluetech's PRs.

  1. I need to learn the repo's processes for noting/triaging future work.

Indeed, but in my experience, repos tend to lean towards smaller PRs, specially from developers new to the codebase.

src/_pytest/compat.py Outdated Show resolved Hide resolved
src/_pytest/compat.py Outdated Show resolved Hide resolved
It is misleading to say that None is returned because
the uid is unreliable. That implies that a uid of uknown
integrity was retrieved. More accurately, and across all
platforms, None indicates that no such id was determined.
The previous constant name implied that a user id was returned
 and that it cannot be relied upon. That is not the case. -1 is an
 error flag and the constant should identify it as such.
Returning None does not mean that all future calls to the underlying
platform will result in a return value of None. It is a certainty for win32
and Emscripten platforms but only a possibility for other platforms.
The documentation should only state that the *current* call to the
underlying platform was unsuccessful.
Copy link
Member

@nicoddemus nicoddemus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @WarrenTheRabbit!

@nicoddemus nicoddemus merged commit ff23347 into pytest-dev:main Aug 27, 2023
23 checks passed
@WarrenTheRabbit
Copy link
Contributor Author

WarrenTheRabbit commented Aug 27, 2023

Thank you, @RonnyPfannschmidt , @The-Compiler , @bluetech and @nicoddemus for helping me to be involved in something I find very meaningful. 💕

I will be revisiting and reflecting on all the advice that was shared with me throughout this process too. 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants