Use, misuse and confusion of @check_messages decorator #6060
Comments
Let me give an example on this point, as I think it is the most problematic:
The checker I wanted to prepare has a good example for that. The relevant parts can be stripped down to this:

    class MessagesChecker(BaseChecker):
        """Checks if messages are handled correctly in checker classes."""

        __implements__ = (IAstroidChecker,)
        name = "messages_checker"

        def __init__(self, linter: PyLinter) -> None:
            super().__init__(linter)
            # List of all messages defined in the checker's msgs attribute
            self._defined_messages = None

        def visit_classdef(self, node: nodes.ClassDef) -> None:
            if not _is_checker_class(node):
                return
            self._defined_messages = _get_defined_messages(node)

        @check_messages("undefined-message")
        def visit_call(self, node: nodes.Call) -> None:
            ...  # omitted: check if this is a ``self.add_message`` call and extract the msgid
            if msgid not in self._defined_messages:
                self.add_message("undefined-message", node=node, args=(msgid,))

Now imagine a new message is added, for example "inconsistent-message-ids", which is checked from visit_classdef:

    class MessagesChecker(BaseChecker):
        """Checks if messages are handled correctly in checker classes."""

        __implements__ = (IAstroidChecker,)
        name = "messages_checker"

        def __init__(self, linter: PyLinter) -> None:
            super().__init__(linter)
            # List of all messages defined in the checker's msgs attribute
            self._defined_messages: Dict[str, nodes.Dict] = {}

        @check_messages("inconsistent-message-ids")  # <-- ADDED
        def visit_classdef(self, node: nodes.ClassDef) -> None:
            if not _is_checker_class(node):
                return
            self._defined_messages = _get_defined_messages(node)
            self._check_inconsistent_message_ids()  # <-- ADDED

        @check_messages("undefined-message")
        def visit_call(self, node: nodes.Call) -> None:
            ...  # omitted: check if this is a ``self.add_message`` call and extract the msgid
            if msgid not in self._defined_messages:
                self.add_message("undefined-message", node=node, args=(msgid,))

Looks good, doesn't it?

    TypeError: argument of type 'NoneType' is not iterable

So, what happened? By adding the
My proposal: We could do this in steps:
What do you think?
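A minimal, self-contained sketch of the failure in the example above. It assumes the behaviour described in the issue text: a visit_ callback is only invoked when at least one of the messages it declares is enabled. ToyMessagesChecker and run are hypothetical stand-ins rather than pylint code, and the attribute name checks_messages simply follows the wording of the issue.

```python
def check_messages(*messages):
    """Toy version of the decorator: only records the message names."""

    def store(func):
        func.checks_messages = messages
        return func

    return store


class ToyMessagesChecker:
    """Hypothetical checker with state shared between two callbacks."""

    def __init__(self):
        self._defined_messages = None

    @check_messages("inconsistent-message-ids")
    def visit_classdef(self, node):
        # Side effect that visit_call depends on.
        self._defined_messages = {"undefined-message"}

    @check_messages("undefined-message")
    def visit_call(self, msgid):
        return msgid not in self._defined_messages


def run(checker, enabled):
    """Call both callbacks in order, skipping those whose messages are all disabled."""
    result = None
    for name, arg in (("visit_classdef", "ClassDef"), ("visit_call", "some-msgid")):
        method = getattr(checker, name)
        if any(message in enabled for message in method.checks_messages):
            result = method(arg)
    return result


# Both messages enabled: visit_classdef runs first and populates the state.
both = run(ToyMessagesChecker(), {"inconsistent-message-ids", "undefined-message"})

# Only "undefined-message" enabled: visit_classdef is skipped, the state stays None.
try:
    run(ToyMessagesChecker(), {"undefined-message"})
    error = None
except TypeError as exc:
    error = str(exc)

print(both, error)
```

In this sketch the decorated visit_classdef is no longer a pure check: it also prepares state for another callback, so skipping it changes behaviour instead of only saving time.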
@DudeNr33 Thanks for this extensive write-up. I agree with most points here and I think this can be, and indeed is, problematic. I had one question though: have you checked how much of a performance benefit this actually gives? Even if we rename to
Yeah, that's very useful. Personally, I was using it without understanding it properly. It probably contributed to hard-to-debug issues along the way. I like the proposed steps. If the performance improvement is worth it, we could also create our own internal checker to verify that it's on a function when there is no side effect, and conversely that there are no side effects if it's on a function.
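A rough sketch of what such an internal check could look like, under loud assumptions: it uses the standard-library ast module instead of astroid, it treats "side effect" as nothing more than an assignment to an attribute of self, and the function names are hypothetical. A real pylint checker would be structured differently.

```python
import ast
from typing import List

SOURCE = '''
class ToyChecker:
    @check_messages("msg-a")
    def visit_classdef(self, node):
        self._state = node

    @check_messages("msg-b")
    def visit_call(self, node):
        self.add_message("msg-b", node=node)
'''


def _is_check_messages(decorator: ast.expr) -> bool:
    """True for ``@check_messages`` and ``@check_messages(...)``."""
    func = decorator.func if isinstance(decorator, ast.Call) else decorator
    return isinstance(func, ast.Name) and func.id == "check_messages"


def _assigns_to_self(func: ast.FunctionDef) -> bool:
    """True if the function body assigns to ``self.<attr>`` (simplified heuristic)."""
    for node in ast.walk(func):
        if isinstance(node, (ast.Assign, ast.AugAssign, ast.AnnAssign)):
            targets = node.targets if isinstance(node, ast.Assign) else [node.target]
            for target in targets:
                if (
                    isinstance(target, ast.Attribute)
                    and isinstance(target.value, ast.Name)
                    and target.value.id == "self"
                ):
                    return True
    return False


def find_decorated_callbacks_with_side_effects(source: str) -> List[str]:
    """Names of decorated visit_/leave_ methods that also mutate the checker."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.FunctionDef)
            and node.name.startswith(("visit_", "leave_"))
            and any(_is_check_messages(d) for d in node.decorator_list)
            and _assigns_to_self(node)
        ):
            found.append(node.name)
    return found


flagged = find_decorated_callbacks_with_side_effects(SOURCE)
print(flagged)
```

The heuristic misses tuple targets, mutation through method calls, and state set in helpers, so it would only be a starting point.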
No, I haven't. The question is what we should measure here.
With check_messages active:
With check_messages replaced with a dummy:
The reason we also have similar speedups in the second run is that if none of the messages a checker has is enabled, the whole checker will be ignored - this is done without relying on
We can see that the difference between the two runs is marginal. As stated above a lot of
If we want to aim for performance I guess it would be more promising to write the
For completeness, here is the quick'n'dirty script:

    import time
    from contextlib import suppress
    from pathlib import Path

    from pylint.lint import Run

    if __name__ == "__main__":
        categories = ["E", "F", "R", "C", "W", "I"]
        for category in categories:
            start = time.perf_counter()
            with suppress(SystemExit):
                Run(
                    [
                        str(Path(__file__).parent.parent / "pylint"),
                        "--enable=all",
                        f"--disable={category}",
                        "--output-format=pylint.testutils.MinimalTestReporter",
                    ]
                )
            end = time.perf_counter()
            print(f"Disabled category: {category} - Elapsed time: {end-start:0.3f}s")

Hmm, since the documentation project is likely to find all these issues it is very likely that all What about creating a
The speed-up when running with
Can you give an example for
This was mostly in relation to the above point. I think that in the
There might be a benefit to separating messages inside a checker to reduce this complexity. For example, the "basic" checker is split across multiple classes, so it seems to be possible. https://github.com/PyCQA/pylint/blob/main/pylint/checkers/base/basic_checker.py#L36 There is a benefit to doing checks only once instead of doing them in each class, but if it's possible to separate them, we're creating a mess by putting two independent checks in the same
A counterpoint to that is that many of these I explored passing a pre-inferred value to
I agree, bloating the This, of course, makes it harder to keep track of all messages that can be possibly emitted by this
I am confused on what the benefit of this
Apparently nothing else, I opened an MR to check with the full test suite (#6091).
Should we vote for the future of
Maybe also @cdce8p and @jacobtylerwalls want to cast their vote on this?
Are we sure the renaming (and thus
Oh, it's public? I didn't realize that. Is it documented anywhere? I didn't realize we were talking about deprecation warnings.
We need some sort of definition of the public API footprint. I wouldn't know where to get that info.
At this point I don't know, but it is a non-underscored function in
Alright. I'm changing my vote to "let's document this in the development guide somewhere"
I'm voting that as well. I do think the suggested name is much better, but I am not sure if the hassle of changing it is worth it.
The public API is/was undocumented so it's always a guess as to how much use there is in downstream libraries. For example confidences could be refactored to an ordered enum and save us a lot of trouble but I suppose it's used somewhere and we can't just do that. Creating such a document would be nice but we'll still need to guess what was public API because we could be wrong and pointing to this document when we break a widely used API (#4399) is not going to save us :)
The python3 checker is not based on pylint, it's pylint's code directly (we removed it from pylint a while ago). The docs do not talk about the decorator: https://pylint.pycqa.org/en/latest/how_tos/custom_checkers.html, nor do the examples use it: https://github.com/PyCQA/pylint/tree/main/examples. Of course it's still possible that someone at some point copy-pasted one of our checkers with the decorator in it, so we should be safe about it. The documentation project for each message has shown that there are a lot of issues with this decorator; maybe 25% of messages don't work alone if other messages are disabled. I'm not sure keeping the name is a good idea when it creates so many hard-to-debug issues in our own code (before @DudeNr33 had the good idea to check what it actually does 😄). It's also going to create problems in downstream code where it's used, imo. So I think we should rename it to
Sorry for being so indecisive! A self-documenting name is probably less effort than writing a hard-to-understand paragraph people might not find. So let's do the rename/deprecation warnings 👍🏻
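For illustration, a rename with a deprecation warning could follow the usual alias pattern sketched below. This is not the actual change: renamed_decorator is a placeholder, since the new name is not given in the text here, and the attribute name checks_messages follows the issue description.

```python
import warnings
from typing import Callable


def renamed_decorator(*messages: str) -> Callable:
    """Placeholder for the new, self-documenting name (hypothetical)."""

    def store_messages(func: Callable) -> Callable:
        func.checks_messages = messages
        return func

    return store_messages


def check_messages(*messages: str) -> Callable:
    """Old name kept as a thin alias that warns whenever it is used."""
    warnings.warn(
        "check_messages is deprecated; use renamed_decorator instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the decorated code, not at this alias
    )
    return renamed_decorator(*messages)


# Downstream code using the old name keeps working but gets a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    @check_messages("msg-a")
    def visit_call(node): ...


print(visit_call.checks_messages, [w.category.__name__ for w in caught])
```

Keeping the old name as an alias means third-party checkers that copied the decorator continue to behave the same while being nudged towards the new name.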
Current problem
The work on #5953 showed that specifically enabling just a single message is sometimes not working as expected.
This is due to either a missing message declaration in a checker's msgs attribute or the incorrect usage of the @check_messages decorator.
Ideally I would like to implement a new, optional checker aimed at Pylint core development and custom checker development.
This checker would focus on correct message declaration and handling.
I already started working on it, but noticed that there are quite a few problems, especially with the @check_messages decorator. I totally understand that, because while being extensively used across all checker classes, there is not a single word on its purpose and effect in the documentation. Its docstring is also not really helpful; you have to dig deeper into the codebase to understand what it does.
Desired solution
Before continuing my work on the new checker, I want to address and clarify some questions about the intended use of @check_messages.
After that this issue can be closed with a PR that extends the documentation, and if necessary some code changes.
A brief explanation on what @check_messages does:
It helps the ASTWalker decide whether a callback for the current node type should be called or not.
This is done by adding the list of messages passed in as a checks_messages attribute on the method object.
When a checker is added to the ASTWalker instance, it loops over all of its methods. Every method starting with visit_ or leave_ is now checked for the checks_messages attribute:
- if at least one of the messages in the checks_messages list is enabled, this method will be added to the list of callbacks
Essentially this means:
- The purpose of the @check_messages decorator is to improve performance
- Applying the @check_messages decorator on any method which is not a "top level callback" (methods starting with visit_ or leave_) does not have any effect
- Omitting the @check_messages decorator has no negative functional consequences, it only affects performance (which, of course, is always an issue)
What I want to gain a consensus on:
- The name check_messages: I don't think it really conveys the effect it has on the decorated method.
Looking over the code base there are quite a lot of places where the decorator is not used correctly:
- @check_messages(*MSGS)
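The registration rule described above can be sketched roughly as follows. This is a simplified model, not pylint's implementation: ToyWalker and ToyChecker are hypothetical, only the registration step is modelled, and the attribute name checks_messages is taken from the description in this issue.

```python
from typing import Callable, List, Set


def check_messages(*messages: str) -> Callable:
    """Store the given message names on the decorated function (sketch)."""

    def store_messages(func: Callable) -> Callable:
        func.checks_messages = messages  # attribute name as used in the text above
        return func

    return store_messages


class ToyWalker:
    """Hypothetical stand-in for ASTWalker; only callback registration is modelled."""

    def __init__(self, enabled_messages: Set[str]) -> None:
        self.enabled_messages = enabled_messages
        self.callbacks: List[str] = []

    def add_checker(self, checker: object) -> None:
        for name in dir(checker):
            if not name.startswith(("visit_", "leave_")):
                continue  # helpers are never inspected, decorated or not
            messages = getattr(getattr(checker, name), "checks_messages", None)
            # Undecorated callbacks are always registered; decorated ones only
            # if at least one of their messages is enabled.
            if messages is None or any(m in self.enabled_messages for m in messages):
                self.callbacks.append(name)


class ToyChecker:
    def visit_call(self, node): ...  # undecorated: always a callback

    @check_messages("msg-a")
    def visit_name(self, node): ...  # registered, "msg-a" is enabled below

    @check_messages("msg-b")
    def leave_module(self, node): ...  # skipped, "msg-b" is disabled below

    @check_messages("msg-b")
    def _helper(self, node): ...  # decorator has no effect on helpers


walker = ToyWalker(enabled_messages={"msg-a"})
walker.add_checker(ToyChecker())
print(walker.callbacks)
```

The sketch shows why the decorator on _helper is inert: the walker never looks at methods outside the visit_/leave_ naming scheme, so decorating them neither enables nor disables anything.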
Additional context
No response