Kinda. I think a primary use case for modules is to help with out-of-control compile times.
But the specific problem of include-what-you-use will still be encountered if you include directly from C libraries like system headers or library dependencies.
This is not true. Compile times are usually much better with modules. They also don't inhibit parallelism, but perhaps you are referring to this paper (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p144...), which shows that, with compiler versions from 2019, it can indeed be slower to compile with modules if you have a large number of threads and the depth of the module dependency graph is large.
Do you have evidence that the situation has changed? Last I checked it still remains the case that modules inhibit parallelism and hence result in slower builds in most practical workloads. But of course if you have evidence to the contrary I'd be happy to see it.
I don't know of any newer benchmarks. However, I'm reading the results differently I guess, because the results show that with 128 threads, modules become slower only when the DAG depth is higher than 29, and that's quite a large depth! It also looks like each source file used in the benchmark only imports other modules and declares 300 variables, but nothing else. Practical workloads will have more interesting stuff in the source files, so I would expect the impact of module loading to be less, so more can be done in parallel.
> with 128 threads, modules become slower only when the DAG depth is higher than 29
yeah, but the same graph never shows modules being faster... it only ever shows them being the same or slower. If I'm going to put in all that work, the result should be *faster*
> This is not true. Compile times are usually much better with modules.
What significantly improves compile-times is Pre-Compiled Headers (PCH), which most compilers have supported for decades.
The study you mention does not show data for them.
Having ported one >1 million LOC C++ app to use modules in two compilers, the compile time improvement of modules over PCH was not distinguishable from noise.
Modules have many advantages, like better encapsulation, etc.
The main thing people seem to want from them is better compile times, which is the one thing they don't deliver, at least not over the PCH solutions that have existed for decades and are already supported by all build systems.
Compared to modules, PCHs are "zero-effort" and deliver performance instantaneously.
Off-topic, but is there a guide to best practices for portable pre-compiled headers out there somewhere? I'm under considerable pressure to add pre-compiled headers for Windows to my code, and it won't have any significant benefit for me unless I can also make it work on MacOS and Linux. So far my Googling has turned up little information for any platform other than Windows, and nothing that would suggest how to do it well for all three platforms. (Well, more to the point, Visual C++, clang, and g++.)
I'd just google "<your build system> pre-compiled headers" and see if there is a flag or option that you can enable.
You will definitely need quite a bit of fine-tuning for apps over 500k LOC or so, but if your app is under that, and you are splitting code between .h and .cpp files appropriately, just flipping a flag might get you 80% there.
The speed-ups you see people get from PCHs are something like 20-30% faster compile times. So they are more a "nice to have" feature than something that will solve your compile-time problems.
If your app is structured in such a way that it takes 20 min to compile, this can cut it to 15 min at most, but that would probably still suck. If you want more, then you'd need to consider other solutions like distributed build caches (sccache, etc.).
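As a rough starting point for the portable setup asked about above, here is a minimal sketch of a shared precompiled header. The file names (pch.h, pch.cpp, foo.cpp), the target name my_target, and the include list are all hypothetical. The flags in the comments are the usual ones for each toolchain, but check them against your compiler's documentation before relying on them.

```cpp
// pch.h: hypothetical shared precompiled header (name and contents are
// assumptions for illustration, not a recommendation for any given project).
//
// Usual ways to build and consume it:
//   g++:      g++ -x c++-header pch.h -o pch.h.gch
//             (picked up automatically when a source does #include "pch.h")
//   clang++:  clang++ -x c++-header pch.h -o pch.h.pch
//             clang++ -include-pch pch.h.pch -c foo.cpp
//   MSVC:     cl /c /Yc"pch.h" pch.cpp   (create)
//             cl /c /Yu"pch.h" foo.cpp   (use)
//   CMake 3.16+: target_precompile_headers(my_target PRIVATE pch.h)
#ifndef PROJECT_PCH_H
#define PROJECT_PCH_H

// Only headers that are large and rarely change belong here; anything that
// changes often forces the whole PCH to be rebuilt.
#include <algorithm>
#include <map>
#include <memory>
#include <string>
#include <vector>

#endif // PROJECT_PCH_H
```

Each source file then starts with #include "pch.h" as its first include, which is the one convention that all three compilers accept.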
My understanding is just the opposite: they will decrease compilation times, as "included files" are processed just once. We can see them as a better version of precompiled headers (although they are more than that).
Yes except that includes are usually not the performance bottleneck, it's the semantic analysis that consumes the bulk of the compile times.
Modules inhibit parallelism because modules are ordered along a DAG and must be compiled from the root of the DAG down to the leaves in order. So consider a traditional setup as follows:
A.cpp <- A.h <- B.h <- C.h <- D.h
B.cpp <- B.h <- C.h <- D.h
C.cpp <- C.h <- D.h
D.cpp <- D.h
All four of those cpp files can be built in parallel, even though you're right that all of the header files are being reparsed multiple times. My claim is that parsing header files is incredibly cheap; it's translating the .cpp files that's expensive, because cpp files are where the bulk of the semantic analysis and type checking is performed.
With modules, the same compilation model looks like this:
A.mxx <- B.mxx <- C.mxx <- D.mxx
There's no longer header/source and there's no longer redundancy, but I can't build this in parallel anymore. I have to first build D.mxx, then C.mxx, then B.mxx then A.mxx in serial.
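To make the ordering argument concrete, here is a minimal sketch that computes the minimum number of sequential compile steps for a dependency graph, assuming unlimited parallel jobs. The DepGraph type and the function names are hypothetical. The model only counts ordering constraints between units and says nothing about how long each step takes, so it ignores the cost of reparsing headers.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical representation: each unit maps to the units it must wait for.
using DepGraph = std::map<std::string, std::vector<std::string>>;

// Length of the longest dependency chain ending at `unit`, counting `unit`.
// Assumes the graph is acyclic, which module imports are required to be.
std::size_t chain_length(const DepGraph& g, const std::string& unit,
                         std::map<std::string, std::size_t>& memo) {
    auto cached = memo.find(unit);
    if (cached != memo.end()) return cached->second;
    std::size_t longest = 0;
    auto node = g.find(unit);
    if (node != g.end()) {
        for (const auto& dep : node->second)
            longest = std::max(longest, chain_length(g, dep, memo));
    }
    memo[unit] = longest + 1;
    return longest + 1;
}

// Minimum number of sequential compile steps with unlimited parallel jobs:
// the longest chain in the graph.
std::size_t serial_steps(const DepGraph& g) {
    std::map<std::string, std::size_t> memo;
    std::size_t steps = 0;
    for (const auto& entry : g)
        steps = std::max(steps, chain_length(g, entry.first, memo));
    return steps;
}

// The module chain from above: A imports B, B imports C, C imports D.
DepGraph module_chain() {
    return {{"A", {"B"}}, {"B", {"C"}}, {"C", {"D"}}, {"D", {}}};
}

// The header model: no .cpp file waits on the output of another .cpp file.
DepGraph header_model() {
    return {{"A.cpp", {}}, {"B.cpp", {}}, {"C.cpp", {}}, {"D.cpp", {}}};
}
```

Under these assumptions, serial_steps(module_chain()) is 4 and serial_steps(header_model()) is 1, which is the difference being described: the chain forces four steps in a row no matter how many cores are available.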
Parsing a single header file in isolation is cheap, but each header will include others, and templates mean many headers contain large amounts of code inline. For instance, just including <vector> results in the compiler having to look at almost 30kloc, on my system:
No it doesn't, cl.exe's compiler is an inherently single threaded application. Parallelism in VC++ is achieved by running multiple copies of cl.exe with one serving as the primary instance and the rest as followers. The primary instance forwards individual translation units to the followers and waits for the followers to complete compilation, then at the end the primary instance terminates and the linker is invoked.
That is a linker option, not a compiler option. Modules have no effect on linking one way or another as linking is fairly independent of the compilation process.
Then your comment is off-topic and creates confusion. My point was that modules inhibit the parallelism of the compilation process and therefore hurt compile times, not that they have any effect on link times.
Modules do not have any effect on the linker one way or another. They are independent of it.