-
Notifications
You must be signed in to change notification settings - Fork 423
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mutually recursive modules and resolution order #19770
Comments
I'll have to look, but I could have sworn that something did talk about this being allowed and defining the order. Specifically, I remember writing some text at some point about using a graph DFS/BFS algorithm to establish module initialization order (which should define what happens in the event of cycles).
I think we've always hoped that the compiler would do a better job of resolving these than it currently does (where we imagined that, when it got stuck, it would put that down and work on other things until it figured out; or just actively go resolve the thing that it can't resolve before continuing), so this sounds like good news to me. Though of course there will always be cases that can't really be resolved, like: module M {
use N;
var x = y;
}
module N {
use M;
var y = x;
} (and you could imagine more complicated / realistic examples in which the initializations were using functions that referred to the variables in cyclically-dependent ways). But I think that's OK because such programs arguably aren't really sensical/well-formed. So it seems like we'd still need some way of saying where the line between what can or can't be resolved is. |
Yes, it describes module initialization order, but not module resolution order. And AFAIK the section defining module initialization order doesn't explicitly say anything about cycles.
One challenge we are facing with the dyno query framework is that a query calling the same query recursively (most likely through other functions/queries) is a fatal error. Basically there isn't a reasonable way to handle that in the query framework other than giving up with an error. Or at least that is what we are led to believe by the Rust compiler. So, at the moment such a (nonsense) program kills the dyno compiler with an error "Error: recursion encountered in query resolveModuleStmt". (It would halt, even if used in a library, because by then the program database is corrupted and we don't have a way to revert it). One of the things we have been wrestling with is how hard to work to improve this situation, where some ideas include:
Perhaps, the fatal error is good enough for now. |
Oh, good point, I was blurring those together in my mind. I'm guessing that the two are probably the same in the current compiler, though (?).
A challenge that's occurring to me belatedly now that you've clarified the above. Even if dyno can resolve your example 3, the modules can't be completely initialized in any specific order, right? I.e., if I do M first, it relies on N.y, which hasn't been set up yet, so that's an error. And if I do N first, it relies on M.x, which hasn't been set up yet. So if the implication is that the only way to initialize this program is to have fine-grain module initializers that are run in some interleaved manner... maybe that's more complicated than we want to worry about and an error really is the correct behavior for these initializations depending on mutual recursion. (note that I think mutual module uses should definitely still be supported when the initialization order doesn't result in these sorts of cycles, though).
It looks like it does to me. It talks about using a depth-first traversal, and not visiting modules that have already been visited. So in your example C, I think it would start with M as the main module, go to N via the |
That is so fascinating that in Following that train of thought, it seems like the issue here is not the ability to resolve the module all at once, but to run it all at once (where the former might follow trivially if the latter is true - such that it helps to simplify our implementation)? Can we just add a line to the spec saying something like "the statements in a module must be written such that all statements in a module initializer can be executed together in order"? |
I believe so, yes.
I agree with this. Would it be possible to unify the initialization and resolution orders? So that the language specification can describe both orders at once? What we have now is this from https://chapel-lang.org/docs/language/spec/modules.html#program-execution :
Some wrinkles with that:
Right, it describes an algorithm that can work with cycles, but doesn't say cycles are or are not allowed. I think it should say that they are allowed, somewhere, if that is our intention. |
I was wondering about this when the Arkouda team was asking questions about this a few weeks back. Specifically, I found myself hoping that any "named on command line" modules would be init'd after the others and in the command-line order, but wasn't sure whether that was the case.
That's correct due to the fact that, in today's compiler, we resolve use/import early (long before resolution) and don't do enough bookkeeping to back things out if something doesn't end up being used. I think it would be a definite improvement to the language and user experience if only those that were in "live" code resulted in initialization if it fell out due to the dyno resolution approach. If it didn't, I think I'd just stick with the current approach (where I always imagine we have some text somewhere that talks about "all uses/imports in the module's source in lexical order" or something like that, but would not be surprised if we didn't say anything at all).
That's fine with me. I also think it's definitely our intention. |
Can you say more about what you mean by "live" code, here? (In particular, what if the use/import is within a concrete function? What if it is within a generic function?) |
By "live" I meant code that was reachable through the resolution process, however that evolves. E.g., reachable through |
What about the order of resolving use/import statements themselves, in the face of recursive modules? As we know from Example A, the order matters. Here is an example that compiles and runs today (with the production compiler): module A {
public import C as CC;
public import D as DD;
use DD; // DD resolves through the above import
use CC; // CC resolves through the above import
var a1 = cvar;
var a2 = dvar;
proc testA() {
writeln("testA ", a1, a2);
}
proc main() {
import A;
import B;
A.testA();
B.testB();
}
}
module B {
public import D as DD;
public import C as CC;
use CC; // CC resolves through the above import
use DD; // DD resolves through the above import
var b1 = cvar;
var b2 = dvar;
proc testB() {
writeln("testB ", b1, b2);
}
}
module C {
var cvar: int = 1;
}
module D {
var dvar: int = 2;
} Here is a minor adjustment to it that does not compile with the production compiler (but that is due to issue #19799). Here I am trying to ask if it should compile: module A {
use B;
public import D as DD;
use DD; // this comes from the above import
use CC; // this comes from B's import through the 'use B'
var a1 = cvar;
var a2 = dvar;
proc testA() {
writeln("testA ", a1, a2);
}
proc main() {
import A;
import B;
A.testA();
B.testB();
}
}
module B {
use A;
public import C as CC;
use CC; // this comes from the above import
use DD; // this comes from A's import through the 'use A'
var b1 = cvar;
var b2 = dvar;
proc testB() {
writeln("testB ", b1, b2);
}
}
module C {
var cvar: int = 1;
}
module D {
var dvar: int = 2;
} Let's say it starts out by resolving the use/imports in A (since it is the main module). The compiler:
We could suppose that it should work with the current knowledge in 'A' (the partial resolution result). That is what I have tried to implement in dyno, but it comes at some cost (it is technically O(n*n) in the number of use/import statements). Should we instead require that there exists some order where we can visit each module in that order and correctly resolve the use/imports without stopping to work on a different module? (I am leaning towards doing so). |
Here is another (similar) example that doesn't compile today but that we could arguably make compile with sufficient effort: module M {
use N;
use NN;
use OO;
x;
z;
}
module N {
module NN {
var x: int;
}
public use O;
}
module O {
use M;
module OO {
var z: int;
}
} |
What I have cooking for now is to, during process of resolving module-level use/import statements, simply ignore the use/import statements within any module that we are already in the process of resolving (i.e. the recursive use). I suspect that is good enough to support mutually recursive modules for now. I think the result is better than the production compiler, at least (because the production compiler has the problem in issue #19799). |
dyno: automatically use ChapelStandard The main goal of this PR is to adjust dyno to support standard and internal modules. * Removes `partialResult` from QueryMapResult. It has been unused since #19765 and I had just forgotten to remove it there. * Adds support for setting the bundled (`modules/`) and internal module (`modules/internal`) search paths and the ability to query if an ID is in a bundled or internal module. Adjusted some of these module search path functions to return UniqueString rather than `std::string` since comparing them is useful. Added a helper function `setupModuleSearchPaths` that does the right setup calls with the configuration as arguments. * Adjusts the dyno use/import resolver to automatically use ChapelStandard in modules that aren't internal modules. To support that, Scope now includes a bool `autoUsesModules` which indicates if ChapelStandard should be used. * Once I did that, I noticed performance problems with the approach in #19741 (which can be O(n*n) in terms of use/import statements). So, this PR changes it to once again create a ResolvedVisibilityScope with a query and to ignore use/import statements in any module that has its use/import statements currently being resolved (by checking to see if `resolveVisibilityStmtsQuery` is already running). This situation (as well as some related issues) are discussed in issue #19770. * To support the above, added `Contex::isQueryRunning` to check to see if a query is currently running, in order to avoid recursion. We will see if this introduces other issues with incremental compilation but I haven't yet found a case where it causes problems when used simply to avoid recursion. * This PR also adds a helper, `AstNode::mayContainStatements`. I thought I would need this for some of the use/import statement processing but it didn't end up being necessary. However it seems like a useful property to have available so I have kept it in the PR. Reviewed by @dlongnecke-cray - thanks! - [x] full local testing
This issue asks language design questions about mutually recursive modules and the order of resolution of modules.
First, we have this example that demonstrates that the order of use statements does matter:
(This example comes from #13041 (comment) ).
Next, let's consider an example that compiles and runs today of mutually recursive modules:
The language specification doesn't explicitly say that this is or is not allowed. However it is currently supported.
Now what are the things that need to happen in order to resolve a module?
Now, at the moment, I am expecting the compiler will do step (1) in an in-order manner for a given module; and then do (2) in an in-order manner for a given module. This strategy would continue to allow mutually recursive modules.
For step 2, is it OK to require that it is possible to create an order for resolving the modules? The code in Example B meets this requirement because the compiler can completely resolve module N and then from there it can completely resolve module M.
Here is an example where that is not true:
The production compiler currently produces an error for this:
However, at the moment, the dyno compiler can resolve it. Should such patterns be allowed by the resolver?
The text was updated successfully, but these errors were encountered: