Compilation of C programs is very slow for programs with large banks #635

cmnrd · 2021-10-19T14:22:07Z

As pointed out by Edward in #481, compilation of larger programs such as Sleeping Barber with 5000 customers is very slow for the C target. On my machine, the C compiler needed almost 2 minutes to compile the program. This is likely due to the huge file size of the generated SleepingBarber.c (7.6MB!).

Soroosh129 · 2021-10-19T14:26:07Z

Would this issue be resolved by putting reactor definitions in separate files?

I would imagine that if banks are used, reactor definitions should not take a lot of space in the .c file.

What portion of the generated code do you think is the largest and/or has the most impact on this compile time?

cmnrd · 2021-10-19T15:19:25Z

If I interpret the generated c code correctly, then the self struct of each of the 5000 customers is initialized individually. This is 30 lines of code for each customer. The C++ code generator simply creates a for loop to initialize banks, but guess this does not work well in conjunction with the static resolving of parameters and connections in C.

Soroosh129 · 2021-10-19T16:01:15Z

If I interpret the generated c code correctly, then the self struct of each of the 5000 customers is initialized individually.

That is correct.

The C++ code generator simply creates a for loop to initialize banks

Hmm, shouldn't an unrolled for loop be faster?

cmnrd · 2021-10-19T17:01:24Z

In terms of execution time it can be faster. But there are many optimizations a compiler and processor can do for loops. By unrolling the loop manually, you take away room for optimization. So I wouldn't bet on which version is faster ;). In terms of compilation time, it is certainly slower to process 5000*30 lines compared to 30 lines plus a loop header. I was referring to the compilation time in this issue.

Soroosh129 · 2021-10-19T17:18:18Z

If we had top-level parameters that were global variables, we could probably also implement for loop initializations for banks in the C target. This would also help in enabling the overriding of these parameters on the command line.

cmnrd · 2021-10-25T11:56:06Z

This is also becoming an issue for our benchmark tests. C tests run for more than an hour. Note that this is mostly compilation time, as the benchmarks are not fully run in our CI. (see #673)

Soroosh129 · 2021-10-28T13:00:23Z

Do you think the memory issue for gcc could be related to the --parallel #number_of_cores flag that we pass to CMake?

edwardalee · 2021-10-28T13:52:30Z

There are two items that I consider relatively low hanging fruit towards fixing this problem. First, ReactorInstance and PortInstance need to be made more efficient so that one instance of these classes can represent N reactor or port instances. This is a little tricky because it's not as simple as just having one Multiport representing all its channels. The channels may have different connectivity, so each multiport will need to be split into a set of instances, one for each unique connectivity.

A smaller improvement we could make is to construct the ReactorInstance graph only once instead of twice for each code generation operation. We currently construct it during validation (and visualization), then discard that version and construct it again in doGenerate(). Marten and I prototyped a reuse, but I've lost track of it in the flurry of branches.

Second, the C code generator currently inlines constructor invocation and connection code. This should be wrapped into loops so that the generated code is much smaller. I think this will be easiest to do after the above redesign is done.

I am willing to take these on and will try to start this afternoon, but if anyone wants to help, that would be great.

Soroosh129 · 2021-10-28T14:01:52Z

I would like to help resolve this issue. What can I do in parallel?

cmnrd · 2021-10-28T14:09:54Z

Do you think the memory issue for gcc could be related to the --parallel #number_of_cores flag that we pass to CMake?

No, this will only let invoke cmake multiple instances of gcc to process independent source files. It will not parallelize gcc itself. Most of the time, only a single instance of cmake runs.

cmnrd · 2023-09-27T14:45:51Z

I think this is fixed. But we have traded it for #2027...

cmnrd added compiler c Related to C target labels Oct 19, 2021

cmnrd mentioned this issue Oct 19, 2021

Sleeping Barber benchmark slowing on large inputs #481

Closed

cmnrd mentioned this issue Oct 25, 2021

Speed up CI benchmark tests #673

Merged

cmnrd mentioned this issue Oct 28, 2021

Epoch and lfc become unusable for programs with large (>500) banks and multiports #433

Closed

cmnrd closed this as completed Sep 27, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compilation of C programs is very slow for programs with large banks #635

Compilation of C programs is very slow for programs with large banks #635

cmnrd commented Oct 19, 2021

Soroosh129 commented Oct 19, 2021 •

edited

Loading

cmnrd commented Oct 19, 2021

Soroosh129 commented Oct 19, 2021

cmnrd commented Oct 19, 2021

Soroosh129 commented Oct 19, 2021

cmnrd commented Oct 25, 2021

Soroosh129 commented Oct 28, 2021

edwardalee commented Oct 28, 2021

Soroosh129 commented Oct 28, 2021

cmnrd commented Oct 28, 2021

cmnrd commented Sep 27, 2023

Compilation of C programs is very slow for programs with large banks #635

Compilation of C programs is very slow for programs with large banks #635

Comments

cmnrd commented Oct 19, 2021

Soroosh129 commented Oct 19, 2021 • edited Loading

cmnrd commented Oct 19, 2021

Soroosh129 commented Oct 19, 2021

cmnrd commented Oct 19, 2021

Soroosh129 commented Oct 19, 2021

cmnrd commented Oct 25, 2021

Soroosh129 commented Oct 28, 2021

edwardalee commented Oct 28, 2021

Soroosh129 commented Oct 28, 2021

cmnrd commented Oct 28, 2021

cmnrd commented Sep 27, 2023

Soroosh129 commented Oct 19, 2021 •

edited

Loading