Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compilation of C programs is very slow for programs with large banks #635

Closed
cmnrd opened this issue Oct 19, 2021 · 11 comments
Closed

Compilation of C programs is very slow for programs with large banks #635

cmnrd opened this issue Oct 19, 2021 · 11 comments
Labels
c Related to C target compiler

Comments

@cmnrd
Copy link
Collaborator

cmnrd commented Oct 19, 2021

As pointed out by Edward in #481, compilation of larger programs such as Sleeping Barber with 5000 customers is very slow for the C target. On my machine, the C compiler needed almost 2 minutes to compile the program. This is likely due to the huge file size of the generated SleepingBarber.c (7.6MB!).

@cmnrd cmnrd added compiler c Related to C target labels Oct 19, 2021
@Soroosh129
Copy link
Contributor

Soroosh129 commented Oct 19, 2021

Would this issue be resolved by putting reactor definitions in separate files?

I would imagine that if banks are used, reactor definitions should not take a lot of space in the .c file.

What portion of the generated code do you think is the largest and/or has the most impact on this compile time?

@cmnrd
Copy link
Collaborator Author

cmnrd commented Oct 19, 2021

If I interpret the generated c code correctly, then the self struct of each of the 5000 customers is initialized individually. This is 30 lines of code for each customer. The C++ code generator simply creates a for loop to initialize banks, but guess this does not work well in conjunction with the static resolving of parameters and connections in C.

@Soroosh129
Copy link
Contributor

If I interpret the generated c code correctly, then the self struct of each of the 5000 customers is initialized individually.

That is correct.

The C++ code generator simply creates a for loop to initialize banks

Hmm, shouldn't an unrolled for loop be faster?

@cmnrd
Copy link
Collaborator Author

cmnrd commented Oct 19, 2021

In terms of execution time it can be faster. But there are many optimizations a compiler and processor can do for loops. By unrolling the loop manually, you take away room for optimization. So I wouldn't bet on which version is faster ;). In terms of compilation time, it is certainly slower to process 5000*30 lines compared to 30 lines plus a loop header. I was referring to the compilation time in this issue.

@Soroosh129
Copy link
Contributor

If we had top-level parameters that were global variables, we could probably also implement for loop initializations for banks in the C target. This would also help in enabling the overriding of these parameters on the command line.

@cmnrd
Copy link
Collaborator Author

cmnrd commented Oct 25, 2021

This is also becoming an issue for our benchmark tests. C tests run for more than an hour. Note that this is mostly compilation time, as the benchmarks are not fully run in our CI. (see #673)

@Soroosh129
Copy link
Contributor

Do you think the memory issue for gcc could be related to the --parallel #number_of_cores flag that we pass to CMake?

@edwardalee
Copy link
Collaborator

There are two items that I consider relatively low hanging fruit towards fixing this problem. First, ReactorInstance and PortInstance need to be made more efficient so that one instance of these classes can represent N reactor or port instances. This is a little tricky because it's not as simple as just having one Multiport representing all its channels. The channels may have different connectivity, so each multiport will need to be split into a set of instances, one for each unique connectivity.

A smaller improvement we could make is to construct the ReactorInstance graph only once instead of twice for each code generation operation. We currently construct it during validation (and visualization), then discard that version and construct it again in doGenerate(). Marten and I prototyped a reuse, but I've lost track of it in the flurry of branches.

Second, the C code generator currently inlines constructor invocation and connection code. This should be wrapped into loops so that the generated code is much smaller. I think this will be easiest to do after the above redesign is done.

I am willing to take these on and will try to start this afternoon, but if anyone wants to help, that would be great.

@Soroosh129
Copy link
Contributor

I would like to help resolve this issue. What can I do in parallel?

@cmnrd
Copy link
Collaborator Author

cmnrd commented Oct 28, 2021

Do you think the memory issue for gcc could be related to the --parallel #number_of_cores flag that we pass to CMake?

No, this will only let invoke cmake multiple instances of gcc to process independent source files. It will not parallelize gcc itself. Most of the time, only a single instance of cmake runs.

@cmnrd
Copy link
Collaborator Author

cmnrd commented Sep 27, 2023

I think this is fixed. But we have traded it for #2027...

@cmnrd cmnrd closed this as completed Sep 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c Related to C target compiler
Projects
None yet
Development

No branches or pull requests

3 participants