-
Notifications
You must be signed in to change notification settings - Fork 423
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
duplicate module names in mason packages #8470
Comments
Here is a "story" for why this issue prevents Mason packages from being reasonable "libraries": Suppose we have two libraries, in the form of mason packages, called SuperLib and NanoLib. Further suppose that these are developed by different groups that are unaware of each other. Suppose there is an application using both of these.
Further, suppose that each of these packages consists of a main module and a helper module:
OK, application can use both of these modules, everything can work. Now suppose that SuperLib developer adds a module called "Impl" to add some new component. Then, the application developer upgrades. Now we have
Now the application does code does not compile, because there are two Impl modules. But the SuperLib developer can't possibly know that some other library possibly used with SuperLib used that module name. |
Here is a sketch / straw-man of how a Python-package-like nested module strategy might look:
Right, I think it'd be reasonable for Mason to convince the compiler to include the package name and version in the "module" the compiler actually works with. But another interesting idea would be to allow one to treat a directory as a module, so that all files inside of that directory would be considered nested modules. But we still have a problem in that the module searching has to respect a particular order that will make sense to Mason - in particular, to look in the package containing the current module first, and then the dependencies (possibly in a particular order). E.g. in in the ProjA / ProjB example, what if we could "use" a directory?
Note 1: Inside of ProjB/Main.chpl, the compiler needs to know to look first for Core (or any other module) in ProjB/. It seems to me that this proposal adds flexibility (you can now do something - use a directory - you couldn't before) while leaving the old functionality alone. |
This is important to fix, whether through Mason or through the language. Does it make sense to "wrap all files" in a package with a package-level module?
|
@mppf: In your example, is there a compelling reason that |
Yep. Unless the language can enforce it, someone will do it. :) File-level modules also cause this behavior by simply creating a file called "Helper.chpl" without an explicit file-level module in it. |
But do you really want to be using that someone's packages? :D This feels like more of a mason issue than a language issue to me, so tagging @ben-albrecht and @Spartee to make sure they're aware of this (I'm pretty sure they are). |
There has been a lot of discussion in this issue and #7847, so I think I am mostly reiterating certain points / opinions that have previously been made.
I consider this a language issue that mason happens to be exposing. It seems plausible that any sufficiently large Chapel code base could run into this problem. The original issue #7847 is an example of a user running into this problem in their code outside of mason.
Breaking code up across files is generally considered a good practice for keeping a code base more readable and maintainable. Currently, all submodules must be contained within a single file, which means we'd be encouraging/enforcing bad coding practices for mason package developers if we required them to keep all code within submodules in their current form. I think allowing submodules to live outside of the top-level module file is the most appealing solution that has been discussed so far. Submodules already provide a clear solution to this problem, i.e. conflicting names can be disambiguated with fully qualified identifiers like For mason, we could enforce packages to only contain a single top-level module. Everything else would need to be a submodule, but we could still allow for users to break up their code across multiple files/directories. |
That's essentially what Michael's solution 1 in the issue proposes (and it's my preference too). It's also why I consider this issue a mason issue and tagged you on it. It may be that the compiler has a role in helping mason answer this question, but I don't consider this issue to be a language issue itself. (If it said "I'd like to be able to declare two variables or modules at the same scope and have that work", I'd agree it's a language issue and say "We're not going to do that" and close it).
I disagree. The intent of this issue seems to be "how do we prevent people from writing bad mason packages that will conflict with one another," which is why I think the issue is mason-specific (note that both of Michael's solutions refer to mason as well). The appropriate thing for the language to do when a user tries to define two conflicting symbols at the same scope is to complain about it, which we're already doing. For example, if I declare two variables at the same scope: var x: int;
var x: real; we appropriately say:
And when compiling a program that has two modules of the same name (which is what this issue is concerned about): module M {
writeln("In module M");
}
module M {
writeln("In module M again");
} we say:
And again, I think this is appropriate—I tried to compile a program with two modules with the same name at the same scope and that doesn't make sense. It's hard for me to imagine that the language should do any more than this. Since the issue here seems to be asking how to make composition of two independent mason packages completely bullet-proof, I think the solutions are mason-centric and come down to: (a) trust that authors of mason packages will not to do this (they'll only define a single top-level module), or My opinions: (a) is the status quo and this issue implies that it isn't acceptable (though I'm personally OK with it); (b) seems like it should be pretty easy to implement to me, and corresponds to Michael's option 1 (and you seem open to it as well); (c) feels crufty to me and not as obvious how much work it would be, but I'm not categorically opposed to it.
I agree that this is desirable, but I think it's a separate concern (and one that's being discussed in issues #10909 and #10796). |
This is my preference at the moment, but I think this is unreasonable to enforce without being able to separate submodules into separate files.
Constraining mason packages to contain all code into a single source file is not a feasible solution IMO. I wouldn't want to commit to enforcing single top-level modules until being able to split submodules into separate files was a release cycle away.
Thanks for the reference - I'll continue discussion of those strategies in those issues. |
Since I'm OK with the status quo, I'm also OK with dragging our feet on this issue. But as a thought experiment, I'm curious how many of our packages in I'm also curious whether you would consider having mason generate such a warning to constitute "enforcing", or whether you're thinking of something stricter when you use that term... (like "we'd preclude such packages from being added to the mason registry.") Thanks. |
Yeah, I wouldn't expect many package modules to be impacted by this. There has been somewhat of an unwritten rule to keep package/standard modules as a stand-alone monolithic file. Taking a look at a few packages written by users, but not yet published on the registry (mostly due to non-Chapel dependencies), this constraint would impact all of them:
I think we'd have to enforce it via some sort of tooling. Otherwise, this would be another source of package incompatibility, which will contribute to user confusion and frustration. |
I think it's reasonable for the user to want to store them in different files and to make them conceptually (sub)modules. I think we agree at this point that at end of the day, they need to be sub-modules somehow. The question is just - especially in the context of Mason packages - should there be an easy way to create submodules? If different files for submodules are desired, how will that work? I don't think #10909 is the best/only place for that discussion since it is about one approach ("include statements") where I still prefer another approach ("modules can be directories"). And #10796 seems to have gotten even more side-tracked than this issue... So I created #10946 to specifically propose a "modules can be directories" approach. Regarding the choice between approaches 1 and 2 in this issue description, it seems to me at this point we could more or less separately choose between
Note that if #10946 is to solve the problem, it doesn't obviously handle the |
@bradcray - I'm just noting that I think we should attend to this issue somehow soon - in other words I think we should move it out of Icebox. |
I think that #15279 resolves this issue - the top-level packages need to have unique names but a mason package can include submodules in different files. Closing. |
This issue is a spin-off of issue #7847.
The problem is that 2 mason packages can both use a module name (say,
Core
for example). This causes problems if both of these mason packages are used as libraries by a single application, or if one of the mason packages depends upon the other.For a complete mason example, consider:
https://github.com/mppf/ProjA
https://github.com/mppf/ProjB
I didn't actually publish projB to the Mason registry - ProjA README has instructions for how to pretend. ProjA depends on ProjB, and both ProjA and ProjB have a module named Core. You should be able to reproduce the issue by pasting the commands from ProjA README.
I know of two potentially reasonable solutions to this problem (and will add more here if people come up with them):
Do not allow mason packages to contain multiple top-level modules. Instead, a mason package must contain a single top-level module that matches the package name. Nested modules would be allowed. Use a file
include
capability to allow submodules to be stored in different files.Adjust the compiler to consider modules within a mason package more similarly to nested modules, where each module in a directory would be considered "nested" under a module with the directory name. Such a strategy might draw inspiration from how Python packages work.
These solutions rely on the fact that we are expecting mason packages to have globally-unique names (since there is just 1 official mason registry and it keys off of the package names).
The text was updated successfully, but these errors were encountered: