Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot get pylint to register custom checker, is there a missing step in the docs? #7264

Closed
daogilvie opened this issue Aug 4, 2022 · 30 comments · Fixed by #7502
Closed

Comments

@daogilvie
Copy link
Contributor

daogilvie commented Aug 4, 2022

Question

Hi! First: Thanks for supporting and maintaining such a great ecosystem!

My question: I'm trying to make a simple checker around some custom formatting in log strings. I have followed every tutorial I can find from within the last few years, and naturally looked at the guide in the documentation.

However, I am finding that the register function, described as the way to get your checker actually loaded into pylint, simply is not being called. The module is being loaded, for sure — I can see breakpoints in the top level of the module being hit, but not in the register function itself. I've tried setting the --load-plugins argument, I've tried using the load_plugins setting in pylintrc.

Is there an additional step or constraint missing in the implementation of custom checkers from the docs? Or am I simply making a mistake?

Below is the code for my absolute bare-bones first attempt.

Stored in testchecker.py:

from astroid import nodes
from pylint.checkers import BaseChecker

breakpoint() # HIT!

class SimpleCheck(BaseChecker):
    """Simple"""
    name = "simple-test-check"
    msgs = {
            "W6900": ("Test", "test-check", "Just a test")
            }
    options = ( ("fake-option", {
                "default": False,
                "type": "yn", # EDIT: bool here was not allowed
                "metavar": "<y or n>",
                "help": "Some fake option"}
                ),
            )

    def visit_import(self, _: nodes.Import) -> None:
        breakpoint() # NOT HIT!

def register(linter):
    breakpoint() # NOT HIT!
    linter.register_checker(SimpleCheck(linter))

My file I am linting, lintme.py:

"""MODULE DOCSTRING"""
import logging

logger = logging.getLogger(__name__)

logger.info("hello world")

Both files are in the same directory, in which I execute:

pylint --load-plugins=testchecker lintme.py
$ pylint --version
pylint 2.14.5
astroid 2.11.7
Python 3.10.4 (main, May 19 2022, 08:51:27) [Clang 13.1.6 (clang-1316.0.21.2.5)]
$ uname -a
Darwin OVO4941L.ovoenergy.local 21.6.0 Darwin Kernel Version 21.6.0: Sat Jun 18 17:07:25 PDT 2022; root:xnu-8020.140.41~1/RELEASE_X86_64 x86_64

Thanks again!

Documentation for future user

I expect the guide to creating a custom checker to contain everything needed to do so, and to have a section highlighting common and easy mistakes (I'm assuming I have made such a mistake)

Additional context

I have searched stack overflow and pre-existing issues on this repo, and not found anyone reporting this.
There is https://stackoverflow.com/questions/71590677/cant-load-custom-pylint-module, but they report getting an error about not finding the module, then succeeding by placing the module in the pylint source. This is not what I am experiencing.

@daogilvie daogilvie added Documentation 📗 Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling Question labels Aug 4, 2022
@Pierre-Sassoulas Pierre-Sassoulas added Needs investigation 🔬 A bug or crash where it's not immediately obvious what is happenning and removed Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling labels Aug 4, 2022
@Pierre-Sassoulas
Copy link
Member

Thank you for opening the issue and the kind words. I don't see what you did wrong at first sight so we'll have to investigate, but I don't have the time right now. Here's a real world example in pylint_django to maybe unblock you without waiting for us: https://github.com/PyCQA/pylint-django/blob/master/pylint_django/checkers/__init__.py

@daogilvie
Copy link
Contributor Author

Hi @Pierre-Sassoulas, thank you for such a quick response! It seems they use a different top-level function name (register_checkers, not just register), but this doesn't seem to work in my case. I was hoping I'd simply made a silly mistake, and maybe I still have! If it's not obvious though, then I will spend some time tomorrow digging into the pylint internals and see if I can't debug what's happening directly, hopefully save you the time.

@daogilvie
Copy link
Contributor Author

daogilvie commented Aug 5, 2022

Update: I added a breakpoint to the dynamic module loader in pylinter.py and... now it does load my checker? I have changed literally nothing else 🤔 Or, at least, I'm almost certain I've changed nothing else.

Update Update: It calls register, but actually there is some other error that results in a stack trace. Continuing to investigate.

pylint/config/argument.py", line 240, in __init__
    self.type = _TYPE_TRANSFORMERS[arg_type]
KeyError: 'bool'

I suspect I've supplied something wrong with the option spec.

Updated update update: Yep, I misunderstood the docs and assumed the type for options would mirror python types, but this is not the case. Have updated the example to accomodate. HOWEVER, adding a breakpoint seems to make it work, and taking the breakpoint away... makes it not work. 🤯 I've not encountered that before, so this is a fun new learning experience for me.

@daogilvie
Copy link
Contributor Author

daogilvie commented Aug 5, 2022

Ok, right. @Pierre-Sassoulas I don't know if this is a bug, or just a quirk that needs documenting.

NOTE: Have tweaked the body below after doing some further research after eating something 😄

TL;DR: pylint registers custom plugins specified on command line without the current directory included in sys.path, and even if an init-hook does this later it never retries registration if it can't import a module first time round.

Detailed Description

When supplying custom plugin module names to load via the command line, this list is sent to the importer code before the current directory is added to sys.path. This means that, as in my case, it fails to import the custom checker as it cannot find it in the path.
This first attempt at importing, crucially, is the only time that pylint tries to call register. It will never attempt to register a plugin module of the same name twice, adding their names to a set of ones it has tried, it seems. This means that, even though having the module in a pylintrc file would work on its own, specifying it once in the file and once in commandline will not work.

The reason the module is always being imported though, is that later on, pylint loops through the set of plugin names and attempts to load those modules again. This time, it looks for the load_configuration function. I don't define this in my case, so nothing happens, but by this time the current directory has been added to the sys.path list, by some init-hook code I have for an unrelated reason, so the module is imported without any trouble by astroid.

Putting a breakpoint in makes this work in 2 ways:

  1. The act of putting a breakpoint in apparently adds the current working directory into the syspath, something I did not know.
  2. It seems that the init-hook runs in time for the sys path to be updated for when I cont/step out of the breakpoint ¯_(ツ)_/¯

Steps to Reproduce

  1. Create custom checker in current directory.
  2. Run pylint, supplying --load_plugins on command line
  3. Observe that your checker is imported but not registered.
  4. Add a pylintrc file, where you add your plugin to the load_plugins configuration item, and an init-hook to add the current directory to sys.path.
  5. Run pylint, with no command line plugin arguments
  6. Observe that everything works as you expect

Happy to Help!

I'm happy to investigate further, and to contribute documentation, or code, to work around this, but I'd want to make sure I was doing something that was Pylint Approved™️ rather than waste your time with a PR that "solves" the "problem" in a way that the maintainers would never go for. I don't do enough open source, so here is as good a place to start as any 😄

@Pierre-Sassoulas Pierre-Sassoulas added Bug 🪲 Configuration Related to configuration and removed Needs investigation 🔬 A bug or crash where it's not immediately obvious what is happenning labels Aug 5, 2022
@Pierre-Sassoulas
Copy link
Member

Thank you for the very detailed investigation ! It seems it's indeed a bug with the configuration loading vs CLI argument that might be done differently.

@daogilvie
Copy link
Contributor Author

Thank you for the very detailed investigation ! It seems it's indeed a bug with the configuration loading vs CLI argument that might be done differently.

Ok! If I can get the time over the next 3-4 days I'll try to open a PR as per the contribution guidelines, or I'll put a note here if that isn't going to happen.

@daogilvie
Copy link
Contributor Author

daogilvie commented Aug 8, 2022

EDIT: spoke too soon, please disregard this comment for the moment.

So it looks like the reason this is happening is that the pylint entrypoint attempts to process all provided command line arguments here in one go, (function defined config.utils#L230).
I'm not entirely sure what the right approach is, given the inter-relation between the arguments you can provide and what is in the file. It seems like quite an involved solution to retry CLI plugin loading if it failed and there is an init-hook defined in a config file. Perhaps the best balance is surfacing a more nuanced error if modules cannot be loaded from the CLI argument?
The right answer might simply be a documentation change to make people aware of this? I.e don't provide --load_plugins for a non-syspath location without also providing --init_hook for the path?

@daogilvie
Copy link
Contributor Author

daogilvie commented Aug 8, 2022

In the Run class, it looks like the CLI plugins are explicitly loaded before the config is initialised: lint/run.py#L150-L163.

If we raise an error at this stage, this will explicitly break any workflows that were previously silently failing, which is probably not desirable.
Alternatively, we can move the plugin load call to after the config, but I suspect it's this way round for a reason. This doesn't appear to have changed in 2 years. Are you aware of any particular reason the plugin CLI load is attempted first @Pierre-Sassoulas? I will move that around in my local and see if anything terrible happens 😂

Update: I moved the plugin load to after the _config_initialization call, and it worked! Nothing terrible happened in my case. Will put up a PR for this change for focused discussion.

@DanielNoord
Copy link
Collaborator

@daogilvie Thanks for the thorough investigation. I have one question about this though: why can't you install the plugin like a package? As far as I know, running plugins from the current working directory is not something we support. Plugins should be installed like a package (either from a packaging index like PyPi or locally using pip install .). That way plugins can be discovered in the order that the code is currently doing so.

Is there something that is limiting you from installing the plugin as a package?

@daogilvie
Copy link
Contributor Author

daogilvie commented Aug 8, 2022

@daogilvie Thanks for the thorough investigation. I have one question about this though: why can't you install the plugin like a package? As far as I know, running plugins from the current working directory is not something we support. Plugins should be installed like a package (either from a packaging index like PyPi or locally using pip install .). That way plugins can be discovered in the order that the code is currently doing so.

Is there something that is limiting you from installing the plugin as a package?

Hi @DanielNoord! Nope, there is nothing at all that prevents me from doing that; I just didn't quite realise how much of a strict line that was. The lint is for a specific project, and has little general value outside of it, so was to live in the monorepo for the project. I could adapt the CI toolchain, or install it as another module, or any number of things 😄

Re-reading the "How to Write a Custom Checker" page again, I see that under Debugging a Checker it does briefly mention installing it as a requirement for testing, and there is a note callout that says you can edit PYTHONPATH or put the file in the checkers folder in the pylint source.
Would it be an acceptable outcome to make this specific thing clearer in the docs? I can open a PR to suggest some wording?

To be clear, the confusion I have is that my init-hook adds the directory I'm using to the PYTHONPATH: that is literally what I use it for, and is the most common usage according to pylint's own docs. Given that, my expectation was that pylint should be able to load the plugin.
If this is explicitly not a combination that is supported it is not really a major issue for me; I opened this issue precisely because I thought I was missing something obvious! 😄

@DanielNoord
Copy link
Collaborator

Re-reading the "How to Write a Custom Checker" page again, I see that under Debugging a Checker it does briefly mention installing it as a requirement for testing, and there is a note callout that says you can edit PYTHONPATH or put the file in the checkers folder in the pylint source. Would it be an acceptable outcome to make this specific thing clearer in the docs? I can open a PR to suggest some wording?

Yeah! I think adding this explicitly to the docs is the way to go. Perhaps in the future we should actually support this if somebody finds a use case where installing isn't an option but for now I prefer improving the documentation vs. increasing the complexity of pylint.

I saw you already made a commit! Feel free to tag me in the PR and I'll make sure to review it quickly 😄

@daogilvie
Copy link
Contributor Author

daogilvie commented Aug 8, 2022

@DanielNoord @Pierre-Sassoulas Having submitted the documentation change I think there is a bug here, outside of my specific way of mis-using the configuration.

The Bug

It is possible to give pylint the instruction to load a custom checker in such a way that

  1. The checker is never registered so will never be used, BUT...
  2. The module itself is imported and so the run will not error with a failure-to-import message.

This means that a user could attempt to use a custom lint, see no errors and no messages, which gives the user a false sense that their checker found no issues. I'm fortunate, in that I was doing this to explicitly test a checker that I knew should find at least one thing.

If you think it merits it, I'm happy to raise another issue for that? Also fine if you think it's so unlikely to happen in real life that the complexity of catching it isn't worth the effort 😄

@daogilvie
Copy link
Contributor Author

Re-reading the "How to Write a Custom Checker" page again, I see that under Debugging a Checker it does briefly mention installing it as a requirement for testing, and there is a note callout that says you can edit PYTHONPATH or put the file in the checkers folder in the pylint source. Would it be an acceptable outcome to make this specific thing clearer in the docs? I can open a PR to suggest some wording?

Yeah! I think adding this explicitly to the docs is the way to go. Perhaps in the future we should actually support this if somebody finds a use case where installing isn't an option but for now I prefer improving the documentation vs. increasing the complexity of pylint.

I saw you already made a commit! Feel free to tag me in the PR and I'll make sure to review it quickly 😄

@DanielNoord I've just seen you commented here, Github UI didn't load it whilst I had the page open so my comment about the bug above was made without seeing this 😂 Sorry, wasn't trying to be rude!

Totally happy if you to make the call that there's no bug to be fixed! I'm just grateful pylint exists and has such responsive maintainers. This tool has saved me from myself many, many times...

@DanielNoord
Copy link
Collaborator

@daogilvie That does indeed sound like a bug.

To be honest, if pylint wasn't so widely used I would probably disallow loading plugins from the current working directory and only allow installed packages. However, since some users probably rely on some hacky method to get this to work, we probably can't suddenly disallow this.

I haven't looked at the relevant code myself, but can we reliably determine when this is the case? It seems like the main issue is when load-plugins is used on the command line for a plugin that is not installed but is available through init-hook or some other PYTHONPATH trick, right? Can we warn users about such cases when we identify them?

@daogilvie
Copy link
Contributor Author

daogilvie commented Aug 8, 2022

To be honest, if pylint wasn't so widely used I would probably disallow loading plugins from the current working directory and only allow installed packages. However, since some users probably rely on some hacky method to get this to work, we probably can't suddenly disallow this.

This is absolutely fair — I think the requirement for it to be installed is completely reasonable, but introducing it now does seem like prime ground for the "broken workflow" problem.

I haven't looked at the relevant code myself, but can we reliably determine when this is the case? It seems like the main issue is when load-plugins is used on the command line for a plugin that is not installed but is available through init-hook or some other PYTHONPATH trick, right? Can we warn users about such cases when we identify them?

TL;DR

Yes, I think it would be fairly straightforward to modify the load_plugin_modules and/or load_plugin_configuration steps in the linter to do this, with no other side-effects.

The Long Bit, Read if You Want

Yes, I think so. The call to load plugins from the command line argument is made just before the config initialisation is done and is relatively simple. The actual function of interest is on the linter class:

def load_plugin_modules(self, modnames: list[str]) -> None:
    """Check a list pylint plugins modules, load and register them."""
    for modname in modnames:
        if modname in self._dynamic_plugins:
            continue
        self._dynamic_plugins.add(modname)
        try:
            module = astroid.modutils.load_module_from_name(modname)
            module.register(self)
        except ModuleNotFoundError:
            pass

You can see here that it will just fail to register the module if it can't import it, but it will still add it to _dynamic_plugins; a set of module names it has tried to load already.
Just below that is the function that, as part of the config initialisation, looks at _dynamic_plugins and literally reimports them, but only to run the load_configuration if the module defines it:

def load_plugin_configuration(self) -> None:
    """Call the configuration hook for plugins.
    This walks through the list of plugins, grabs the "load_configuration"
    hook, if exposed, and calls it to allow plugins to configure specific
    settings.
    """
    for modname in self._dynamic_plugins:
        try:
            module = astroid.modutils.load_module_from_name(modname)
            if hasattr(module, "load_configuration"):
                module.load_configuration(self)
        except ModuleNotFoundError as e:
            self.add_message("bad-plugin-value", args=(modname, e), line=0)

This second function is the one that signals any actual issue with importing, and that can still be the case.

It would be, I think, pretty simple to add another internal state attribute, perhaps called _registered_plugins, which is updated after the register function is called successfully. Then, in the load_plugin_configuration function, we can add a message for any module that:

  • is in _dynamic_plugins,
  • is not in _registered_plugins
  • imports successfully

I think this would only happen in the case we are describing; where an init-hook has modified the PYTHONPATH and allowed a new, off-piste import.

If there is a better mechanism for tracking registered plugins that already exists, it would also work.
I have not explored the option of directly making load_plugin_modules emit a suitable message, so that might also be better.

@DanielNoord
Copy link
Collaborator

The actual function of interest is on the linter class:

Note that you can use Copy Permalink to get an interactive code sample like this:

https://github.com/PyCQA/pylint/blob/e0ecc35e93c4dc32b691cfb8a3ad526ebfafb086/pylint/lint/pylinter.py#L361-L371

It would be, I think, pretty simple to add another internal state attribute, perhaps called _registered_plugins, which is updated after the register function is called successfully. Then, in the load_plugin_configuration function, we can add a message for any module that:

What about changing _dynamic_plugins into a dictionary with type dict[str, bool] with the boolean representing whether it was successfully imported? That removes the need for additional attributes.

If there is a better mechanism for tracking registered plugins that already exists, it would also work. I have not explored the option of directly making load_plugin_modules emit a suitable message, so that might also be better.

I think it would be good if we could explore this. You can use self.add_message() to emit bad-plugin-value in the except of the importer. Like in:

https://github.com/PyCQA/pylint/blob/e0ecc35e93c4dc32b691cfb8a3ad526ebfafb086/pylint/lint/pylinter.py#L386

One thing that could be problematic is if PyLinter is not fully "setup" when we call load_plugin_modules but that is something we will find out when we implement this.

Is this something you would want to create a PR for?

@daogilvie
Copy link
Contributor Author

daogilvie commented Aug 9, 2022

Note that you can use Copy Permalink to get an interactive code sample

Thank you for the tip! I don't often discuss code like this on issues or PRs professionally so this is a goldmine 😄

I think it would be good if we could explore this. You can use self.add_message() to emit bad-plugin-value in the except of the importer.

Yes, my only thought is that the reason (I guess) that this doesn't happen, is that load_plugin_modules is only called after the config is loaded if there are plugins in the config to load.

https://github.com/PyCQA/pylint/blob/684a1d6aa0a6791e20078bc524f97c8906332390/pylint/config/config_initialization.py#L50-L52

It will always call load_plugin_configuration:

https://github.com/PyCQA/pylint/blob/684a1d6aa0a6791e20078bc524f97c8906332390/pylint/config/config_initialization.py#L101-L103

So we could just call load_plugin_modules twice, and stop gating it on that if? We would then need to move the message about imports from load_plugin_configuration, and put it into load_plugin_modules, but not necessarily fire it right away 🤔 Will explore this.

What about changing _dynamic_plugins into a dictionary with type dict[str, bool] with the boolean representing whether it was successfully imported? That removes the need for additional attributes.

That is fine, but I would want to change the name of the attribute to communicate that, perhaps something like _dynamic_plugin_is_loaded, or similar. I'll try a couple out when doing the PR today. Which is to say...

Is this something you would want to create a PR for?

Yes! Would you like a new bug raised to track this issue specifically, or shall I continue to do it under this one?

@DanielNoord
Copy link
Collaborator

So we could just call load_plugin_modules twice, and stop gating it on that if? We would then need to move the message about imports from load_plugin_configuration, and put it into load_plugin_modules, but not necessarily fire it right away 🤔 Will explore this.

Wouldn't that unnecessarily increase the amount of if statements? As in, if we can find a way to fix this without unnecessarily calling functions twice for the common use cases I think that is preferable.

That is fine, but I would want to change the name of the attribute to communicate that, perhaps something like _dynamic_plugin_is_loaded, or similar. I'll try a couple out when doing the PR today. Which is to say...

👍

Yes! Would you like a new bug raised to track this issue specifically, or shall I continue to do it under this one?

Let's continue here. It's still related to the original post.

@daogilvie
Copy link
Contributor Author

daogilvie commented Aug 9, 2022

Wouldn't that unnecessarily increase the amount of if statements? As in, if we can find a way to fix this without unnecessarily calling functions twice for the common use cases I think that is preferable.

Well, that's sort of what I'm saying is the slightly harder bit — we'll need to add some if somewhere, to stop emitting bad-plugin-value twice, even if it's just in load_plugin_configuration to make sure it doesn't re-import any module that isn't loaded. Calling the function twice is currently the only way I can think of to catch the various different cases reliably whilst not using load_plugin_configuration.

I may have misunderstood though; is the intent here for bad-plugin-value, which has a pretty clear message about the module being impossible to load, to be emitted in this case? Or for there to be a new message specifically warning against this sort of init-hook path magic?

What if, for example, someone has a pylintrc that also specifies the same plugin they supply on the CLI — I accidentally did this myself! It actually doesn't make a difference currently, because the module gets put in the set, but ironically if they supply the argument on the CLI it will actually prevent that same module in config from being loaded. Should we handle this?

In the simplest case, is it really just about emitting bad-plugin-value the first time pylint cannot load a module, then making sure not to retry that module, so people don't get the same confusion I did (where my checker doesn't run, but my module is clearly imported)?

Does this table accurately express what you would like to happen? ❗ indicates change in behaviour, just so they stand out a bit more. All headings refer to some given plugin I want to load, i.e "Passed on CLI" means I used ``--load-plugins=some_plugin```, etc. Hopefully it's clear 😄, please let me know if not!

# Passed on CLI Supplied in pylintrc Can be imported before init Can be imported after init Current Behaviour Desired Behaviour
1 🚫 🚫 🚫 Emit bad-plugin-value Emit bad-plugin-value
2 🚫 🚫 🚫 Emit bad-plugin-value Emit bad-plugin-value
3 🚫 🚫 Do not register, but do import config, no emit Emit bad-plugin-value
4 🚫 🚫 Load as normal, works fine, no warnings Emit bad-plugin-value
5 🚫 🚫 Emit bad-plugin-value Emit bad-plugin-value
6 🚫 Do not register, but do import config, no emit Emit bad-plugin-value

I suspect these are very niche cases, as the code has been this way for a long time. I just don't want to make any confusion worse! Will open a PR with what I think might work. The trouble I think would be case 4, which is absolutely definitely a significant change in behaviour, and would mean working against several commits I can see that explicitly put the load after the init hook to allow just this type of thing. Unless you are not worried about it?

@DanielNoord
Copy link
Collaborator

I may have misunderstood though; is the intent here for bad-plugin-value, which has a pretty clear message about the module being impossible to load, to be emitted in this case? Or for there to be a new message specifically warning against this sort of init-hook path magic?

No, I think raising the existing message makes the most sense.

What if, for example, someone has a pylintrc that also specifies the same plugin they supply on the CLI — I accidentally did this myself! It actually doesn't make a difference currently, because the module gets put in the set, but ironically if they supply the argument on the CLI it will actually prevent that same module in config from being loaded. Should we handle this?

What if we defer the plugin loading to load_plugin_modules and make _dynamic_plugins a dict[str, types.ModuleType | None]? The loading in load_plugin_configuration is only necessary to get the necessary attribute from that module. If we make sure to store the loaded module somewhere that can be re-accessed we remove the double loading of plugin modules and can focus on fixing any remaining issues in one single function.

In the simplest case, is it really just about emitting bad-plugin-value the first time pylint cannot load a module, then making sure not to retry that module, so people don't get the same confusion I did (where my checker doesn't run, but my module is clearly imported)?

See suggestion above 😄

The trouble I think would be case 4, which is absolutely definitely a significant change in behaviour, and would mean working against several commits I can see that explicitly put the load after the init hook to allow just this type of thing. Unless you are not worried about it?

Yeah I agree. Case 4 seems to be troublesome.

Could we make the changes you suggest here but "catch" case 4 and emit a DeprecationWarning or Warning for this? Indicating that this is a potentially broken way to load stuff? If we don't get any issue about this usefulness of this case we can decide to fully remove it in 3.0 and emit bad-plugin-value for that case as well.

@daogilvie
Copy link
Contributor Author

daogilvie commented Aug 10, 2022

What if we defer the plugin loading to load_plugin_modules and make _dynamic_plugins a dict[str, types.ModuleType | None]? The loading in load_plugin_configuration is only necessary to get the necessary attribute from that module. If we make sure to store the loaded module somewhere that can be re-accessed we remove the double loading of plugin modules and can focus on fixing any remaining issues in one single function.

So the code would only ever call load_plugin_modules once, after init? Would we need to union the CLI arguments together with the config ones?

There's another case by the way, that might be worth thinking about but also might be tricky: where init-hook is supplied on the CLI as well as the load-plugins arg. Currently this just works.

# Passed on CLI Supplied in pylintrc Can be imported before init Can be imported after init Current Behaviour Desired Behaviour
7 🚫 🟡 (because of cli init hook) Works perfectly Emit bad-plugin-value

I can implement these changes: the problem is that I am about to go on a holiday trip for the rest of the month (we've been planning this for a while) and I think my partner would be rather upset if I took my laptop with me 😄
I do want to contribute, but this means I'm not going to be able to code anything until the 28th/29th. If someone else fixes this is the meantime that is absolutely fine, but if you're happy to wait I would really appreciate the chance to contribute. I'll open a draft PR now — I managed to get as far as a failing test for case 3.

@DanielNoord
Copy link
Collaborator

So the code would only ever call load_plugin_modules once, after init? Would we need to union the CLI arguments together with the config ones?

Something like that yeah?

I can implement these changes: the problem is that I am about to go on a holiday trip for the rest of the month (we've been planning this for a while) and I think my partner would be rather upset if I took my laptop with me 😄 I do want to contribute, but this means I'm not going to be able to code anything until the 28th/29th. If someone else fixes this is the meantime that is absolutely fine, but if you're happy to wait I would really appreciate the chance to contribute. I'll open a draft PR now — I managed to get as far as a failing test for case 3.

No worries. I don't think anybody is likely to pick up, especially considering the amount of work/research you had to do to understand all of this.

@daogilvie
Copy link
Contributor Author

daogilvie commented Aug 29, 2022

Hello! Hope you folks are well 😄

  • I've fixed case 3 on Add more cases that emit bad-plugin-value #7284, but not implemented anything else yet (i.e the code still makes 2 calls to load_plugins).
  • I haven't found a change I feel comfortable making for the other cases, or for emitting the warning, but will keep looking over the coming days.

I have meddled with the history on my branch a bit; I didn't think it was an issue because it's a draft that hasn't been looked at by anyone, but if you would rather rebasing didn't happen at all once something was on GH, please let me know.

PS I completely missed that I'd not enabled auto-signing of my commits. Future ones in the PR will not show as "unverified", sorry for that.

@Pierre-Sassoulas
Copy link
Member

Thank you for the recap !

I completely missed that I'd not enabled auto-signing of my commits. Future ones in the PR will not show as "unverified", sorry for that.

Yeah, don't worry about that, I don't 😄

@daogilvie
Copy link
Contributor Author

Update

Once #7284 is merged, cases 3 and 6 will be fixed. The other cases look like they might be a bit more complicated to fix without unintended consequences. I will happily do that, but probably over multiple smaller PRs to make sure everything works well.

Current State

# Passed on CLI Supplied in pylintrc Can be imported before init Can be imported after init Current Behaviour Desired Behaviour Done
1 🚫 🚫 🚫 Emit bad-plugin-value Emit bad-plugin-value N/A
2 🚫 🚫 🚫 Emit bad-plugin-value Emit bad-plugin-value N/A
3 🚫 🚫 Do not register, but do import config, no emit Emit bad-plugin-value
4 🚫 🚫 Load as normal, works fine, no warnings Emit bad-plugin-value
5 🚫 🚫 Emit bad-plugin-value Emit bad-plugin-value N/A
6 🚫 Do not register, but do import config, no emit Emit bad-plugin-value
7 🚫 🟡 (because of cli init hook) 🚫 Works perfectly Emit bad-plugin-value

DanielNoord added a commit that referenced this issue Sep 7, 2022
* Add x-failing test for issue 7264 case 3

This is the case where a plugin can be imported only after the init-hook
is run, and the init hook is specified in a pylintrc file
We would expect the module to not load in any case, and cause the
emission of a bad-plugin-value message.

* _dynamic_plugins is a dict not a set

This is in preparation for caching the loaded modules, and for storing
other information that will help in identifying times loading is
dependent on init-hook magic.

* Use cached module objects for load_configuration

This fixes case 3 in #7264, and is technically a breaking change, in
that it now emits a message for what would have previously been a silent
failure.

* Interim review comment fixes

1. Remove the xfail that snuck back in after the rebases.
2. Now that fake_home yields a str, update the type hint.

* Add test for bad plugin with duplicated load

This is case 6 in issue #7264, and represents the other silent failure
that is being made non-silent by this change.

* Apply review feedback

- Add an ordering test for CLI arguments
- Add clarifying comments on affected functions
- Tidy up test assertions to be more pythonic

Co-authored-by: Daniël van Noord <[email protected]>
Co-authored-by: Pierre Sassoulas <[email protected]>
Pierre-Sassoulas added a commit to Pierre-Sassoulas/pylint that referenced this issue Sep 7, 2022
* Add x-failing test for issue 7264 case 3

This is the case where a plugin can be imported only after the init-hook
is run, and the init hook is specified in a pylintrc file
We would expect the module to not load in any case, and cause the
emission of a bad-plugin-value message.

* _dynamic_plugins is a dict not a set

This is in preparation for caching the loaded modules, and for storing
other information that will help in identifying times loading is
dependent on init-hook magic.

* Use cached module objects for load_configuration

This fixes case 3 in pylint-dev#7264, and is technically a breaking change, in
that it now emits a message for what would have previously been a silent
failure.

* Interim review comment fixes

1. Remove the xfail that snuck back in after the rebases.
2. Now that fake_home yields a str, update the type hint.

* Add test for bad plugin with duplicated load

This is case 6 in issue pylint-dev#7264, and represents the other silent failure
that is being made non-silent by this change.

* Apply review feedback

- Add an ordering test for CLI arguments
- Add clarifying comments on affected functions
- Tidy up test assertions to be more pythonic

Co-authored-by: Daniël van Noord <[email protected]>
Co-authored-by: Pierre Sassoulas <[email protected]>
Pierre-Sassoulas added a commit that referenced this issue Sep 7, 2022
* Add x-failing test for issue 7264 case 3

This is the case where a plugin can be imported only after the init-hook
is run, and the init hook is specified in a pylintrc file
We would expect the module to not load in any case, and cause the
emission of a bad-plugin-value message.

* _dynamic_plugins is a dict not a set

This is in preparation for caching the loaded modules, and for storing
other information that will help in identifying times loading is
dependent on init-hook magic.

* Use cached module objects for load_configuration

This fixes case 3 in #7264, and is technically a breaking change, in
that it now emits a message for what would have previously been a silent
failure.

* Interim review comment fixes

1. Remove the xfail that snuck back in after the rebases.
2. Now that fake_home yields a str, update the type hint.

* Add test for bad plugin with duplicated load

This is case 6 in issue #7264, and represents the other silent failure
that is being made non-silent by this change.

* Apply review feedback

- Add an ordering test for CLI arguments
- Add clarifying comments on affected functions
- Tidy up test assertions to be more pythonic

Co-authored-by: Daniël van Noord <[email protected]>
Co-authored-by: Pierre Sassoulas <[email protected]>
@daogilvie
Copy link
Contributor Author

Hello again! I've been pondering the next stage of this, and I can't help but think that "fixing" 4 and 7 is quite a big change in behaviour. Especially compared to what we did already, which is change a silent failure into a reported failure.

In fact, some git/github archaeology tells me that pylint working this way was a deliberate decision: #166.

I don't know if anything has changed particularly but this increases the risk that people might actually be using this behaviour; what do you think?

@DanielNoord
Copy link
Collaborator

Thanks for that investigation. I'm not sure I agree with the original decision but I don't think it's worth the trouble of a breaking change here.

I think the only thing left to do is make sure we have regression tests against all "sub-cases" within cases 4 and 7. So:
load-plugins in pylintrc, init-hook in pylintrc on a line before load-plugins
load-plugins in pylintrc, init-hook in pylintrc on a line after load-plugins
load-plugins on cli, init-hook on cli before load-plugins
load-plugins on cli, init-hook on cli after load-plugins

@daogilvie
Copy link
Contributor Author

daogilvie commented Sep 16, 2022

load-plugins in pylintrc, init-hook in pylintrc on a line before load-plugins
load-plugins in pylintrc, init-hook in pylintrc on a line after load-plugins
load-plugins on cli, init-hook on cli before load-plugins
load-plugins on cli, init-hook on cli after load-plugins

The bottom two are already present: https://github.com/PyCQA/pylint/blob/fb30fe09d74de3dad2472309fb24396b65b8b81d/tests/lint/unittest_lint.py#L665-L707

I will open a PR with the order-independence tests for the pylintrc case today.

EDIT: Also just seen this old CLI-focused test too: https://github.com/PyCQA/pylint/blob/fb30fe09d74de3dad2472309fb24396b65b8b81d/tests/lint/unittest_lint.py#L746-L752

@daogilvie
Copy link
Contributor Author

daogilvie commented Sep 16, 2022

Annoyingly, it looks like those 2 tests that I already added are not valid. For reasons I don't understand, and have just discovered by trying to add new tests, the path they are "adding" to the sys path is already in there, before the init-hook does anything. I did not think to check for this when I added those tests previously, as I largely copied from the other existing plugin loading tests. I'll rewrite them to use a path that doesn't exist, like my more involved ones for case 3 and 6.

@daogilvie
Copy link
Contributor Author

daogilvie commented Sep 16, 2022

Ok so I actually had to fix-up all the tests I had written, because I didn't quite understand how the checker names were actually used. All sorted in #7475. Between that and #7276 I think that's everything that this issue has thrown up.

Pierre-Sassoulas pushed a commit that referenced this issue Sep 16, 2022
* Add and fix tests for init-hook based plugin load

This changes 4 existing tests, due to a misunderstanding of the author
(past me) in a couple of the details of those tests.
Specifically, we now copy a single python file to use as our plugin, and
make sure to correctly check for the name as given in the checker
classes. We also make sure not to accidentally load the old copy of the
plugin, which apparently sits in a directory that is already on the
system path.
There is also a single new test, which covers the cases of loading a
plugin with ``init-hook`` magic, both specified in an rc file, but in
different orders.

This should be sufficient to close the issue around #7264.

Co-authored-by: Daniël van Noord <[email protected]>
DanielNoord added a commit that referenced this issue Sep 20, 2022
* Improve docs clarity on loading custom checkers

Whilst the docs on testing custom checkers do touch on the fact the
module has to be in your load path, there are some restrictions that
were not made clear before.
Specifically, the init-hook configuration item (which is often used
explicitly to change the sys.path) is not automatically sufficient. If
the init hook is in a file, but the load-plugins argument is passed in
the CLI, then the plugins are loaded before the init-hook has run, and
so will not be imported. In this case, the failure can also be silent,
so the outcome is a lint that will simply not contain any information
from the checker, as opposed to a run that will error.
This corner case may merit fixing in a separate PR, that is to be
confirmed.

Closes #7264

Co-authored-by: Daniël van Noord <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants