Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PackageMetadata.__getitem__ returns None #371

Closed
saaketp opened this issue Mar 8, 2022 · 13 comments · Fixed by #416
Closed

PackageMetadata.__getitem__ returns None #371

saaketp opened this issue Mar 8, 2022 · 13 comments · Fixed by #416

Comments

@saaketp
Copy link
Contributor

saaketp commented Mar 8, 2022

(second problem moved to #374).

@jaraco jaraco changed the title Problems with PackageMetadata protocol PackageMetadata.__getitem__ returns None Mar 12, 2022
@jaraco
Copy link
Member

jaraco commented Mar 12, 2022

Hi @saaketp. Thanks for the report. I do believe this issue is a bug. I'm yet unsure if the proper fix is to update the protocol to match the implementation or to update the implementation to match the protocol. Historically, the metadata handling relied primarily on the interface of email.message.EmailMessage. That documentation indicates that None is returned for missing values, so maybe the protocol should be updated to match.

@jaraco
Copy link
Member

jaraco commented Mar 12, 2022

Honestly, I'm conflicted. My instinct is that the current protocol definition is right and that __getitem__ should raise a KeyError if a key is missing. I'm unsure why EmailMessage deviated from this typical dict behavior. @warsaw, do you know why EmailMessage does not raise a KeyError for missing headers? Do you have reason to think that importlib.metadata should continue that tradition or should align instead with dict and other typical behavior for __getitem__?

@jaraco
Copy link
Member

jaraco commented Mar 12, 2022

Separately, I'm disappointed that mypy was no help here. As much grief as mypy gives, nitpicking and requiring lots of redundant type hints, it's simultaneously failed here to identify the reported issue, despite the fact that the return type is declared but the returned object is a subclass of a class that violates that declaration. It's conceivable I've missed an important detail that would have allowed mypy to do this validation here.

Perhaps the issue is that typeshed underdefines the return type as Any. Perhaps someone would like to fix that.

@saaketp
Copy link
Contributor Author

saaketp commented Mar 12, 2022

Looking at the issue that resulted in the change of the return type of email.message.EmailMessage.__getitem__ to Any, looks like it can contain lot more types that just str and Header (python/typeshed#2863) and they chose to return Any instead of object for convenience, but it bites us here.

My vote would be to change the implementation to match the Protocol, I don't like __getitem__ returning None.
Luckily we have a subclass of EmailMessage already, so we can just override the __getitem__ to do an additional check and throw KeyError if the key doesn't exist.

@DMRobertson
Copy link

DMRobertson commented Mar 18, 2022

I've just filed bpo-47060 python/cpython#91216 (originally python/typeshed#7513) which I think might be dancing around the same root cause. Apologies for intruding; I just wanted to provide a cross-reference to BPO for completeness.

@jaraco
Copy link
Member

jaraco commented Mar 19, 2022

Spurred by considerations in that bpo issue, I'm inclined to say the implementation should be limited to raise a KeyError when an item is not defined in the metadata. I would like to confirm that this library itself does not rely heavily on that expectation.

This backward-incompatible change will likely cause some disruption, so an incremental approach is probably needed to deprecate the existing behavior (warn when None result is returned for a missing key).

@jaraco
Copy link
Member

jaraco commented Mar 19, 2022

In 614a522, I confirmed that at least internally, there's no issue with raising KeyError on missing keys.

@jaraco
Copy link
Member

jaraco commented May 21, 2022

Since this change is likely to create disruption, I'd like to reserve it for a new backward-incompatible release to only go into new versions of CPython (3.12 or later), but since it's a refinement of behavior toward the specified behavior, I don't think it needs a deprecation period in CPython's implementation.

@jaraco
Copy link
Member

jaraco commented Dec 18, 2022

Before the release could even hit PyPI, the backward incompatibility struck. The release process failed because twine is relying on __getitem__ returning None.

This finding leads me to believe there may need to be a deprecation period for this change.

rominf pushed a commit to rominf/sentry-python that referenced this issue Jul 5, 2023
In rare cases, `importlib.metadata` values may contain `None`, see
python/cpython#91216
and
python/importlib_metadata#371

The fix skips all distributions with incomplete metadata.
antonpirker added a commit to getsentry/sentry-python that referenced this issue Jul 12, 2023
In rare cases, `importlib.metadata` values may contain `None`, see python/cpython#91216 and python/importlib_metadata#371


Co-authored-by: Ivana Kellyerova <[email protected]>
Co-authored-by: Anton Pirker <[email protected]>
@encukou
Copy link
Member

encukou commented Oct 3, 2023

What's the proper way to write code that can handle a missing item?
If you wrap the getitem in a try/except, you still get a deprecation warning.

cc @hroncok

@int19h
Copy link

int19h commented Feb 23, 2024

I would also like some advice on how code should be structured for cases where one needs to query for metadata that may or may not be there, without triggering the warning.

@jaraco
Copy link
Member

jaraco commented Mar 20, 2024

What's the proper way to write code that can handle a missing item? If you wrap the getitem in a try/except, you still get a deprecation warning.

How about .get(<key>, None)?

Here's a helper you could use:

def future_getitem(metadata: PackageMetadata, key: str) -> str:
    """retrieve a key from metadata similar to a future implementation of PackageMetadata.__getitem__"""
    val = metadata.get(key, None)
    if val is None:
        raise KeyError(key)
    return val

Should importlib_metadata provide this helper?

@jaraco
Copy link
Member

jaraco commented Jun 23, 2024

Finally, importlib_metadata 8 provides the desired behavior, raising KeyError on a missing key. If you want the KeyError behavior, depend on and use importlib_metadata 8 or later (to be incorporated in Python 3.14).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants