The performance of ccache depends on a lot of factors, which makes it quite hard to predict the improvement for a given use case. This page contains some different performance measurements that try to give an idea about the potential speedup.
It should also be noted that if the expected hit rate is low, there may be a net performance loss when using ccache because of the overhead of cache misses (typically 5–20%). Also, if the build machine is short on memory compared to the amount of memory used by the build tools (compiler, linker, etc.), ccache could decrease performance because its cached files may flush other files from the OS's disk cache. See this mailing list post by Christopher Tate for a good write-up on this issue. To sum up: it is probably wise to perform some measurements with and without ccache for your typical use case before enabling it!
The following measurements were made on a fairly standard Linux-based desktop system: Intel Core i5-750, standard SATA disk, Ubuntu 10.04 with Linux 2.6.32 x86_64.
“ccache 3.0 direct” in the tables below means running ccache with direct mode enabled (which is the default), and “ccache 3.0 prepr.” means running ccache with direct mode disabled and preprocessor mode enabled. (A side note: the performance of ccache 2.4 is very close to that of ccache 3.0 in preprocessor mode.)

The first test measures CPU-intensive compilation of a single file: ccache.c (ccache's own main source file) was compiled repeatedly and timed.
| | Elapsed time | Percent | Speedup |
|---|---|---|---|
| Without ccache | 367.11 s | 100.00 % | 1.0000 x |
| ccache 3.0 direct, first time | 385.67 s | 105.06 % | 0.9519 x |
| ccache 3.0 direct, second time | 9.70 s | 2.64 % | 37.8464 x |
| ccache 3.0 prepr., first time | 382.26 s | 104.13 % | 0.9604 x |
| ccache 3.0 prepr., second time | 23.90 s | 6.51 % | 15.3603 x |
As can be seen above, cache hits in the direct mode are about 2.5 times faster than in preprocessor mode. The speedup compared to compiling without ccache is very large since the compilation costs a relatively large amount of CPU time. The overhead of cache misses can also be seen, though in this case, it's relatively small.
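For reference, the two modes compared above can be selected via ccache's environment variables. A sketch assuming ccache 3.x, where direct mode is the default:

```shell
# ccache 3.x reads configuration from environment variables.
# Direct mode is the default; setting CCACHE_NODIRECT disables it,
# which forces the preprocessor mode ("ccache 3.0 prepr." in the tables).
export CCACHE_NODIRECT=1   # preprocessor mode
unset CCACHE_NODIRECT      # back to the default direct mode
```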
This is a test that aims to measure preprocessor-intensive compilation. Here, c++_includes.cc (a file including nine common include files from the C++ standard library) was compiled 1000 times using perf.py with no special flags.
| | Elapsed time | Percent | Speedup |
|---|---|---|---|
| Without ccache | 189.17 s | 100.00 % | 1.0000 x |
| ccache 3.0 direct, first time | 225.86 s | 119.40 % | 0.8376 x |
| ccache 3.0 direct, second time | 9.05 s | 4.78 % | 20.9028 x |
| ccache 3.0 prepr., first time | 215.90 s | 114.13 % | 0.8762 x |
| ccache 3.0 prepr., second time | 40.86 s | 21.60 % | 4.6297 x |
The difference between direct and preprocessor mode is about a factor of 4.5, much higher than for the ccache.c test, because the preprocessor overhead is higher here.
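The actual contents of c++_includes.cc are not reproduced on this page, but a file of the kind described, nine common standard library headers and a trivial body, might look like this hypothetical sketch:

```shell
# Generate a hypothetical include-heavy translation unit; nearly all of
# its compile time would be spent preprocessing the standard headers.
cat > includes_sketch.cc <<'EOF'
#include <algorithm>
#include <fstream>
#include <iostream>
#include <map>
#include <memory>
#include <set>
#include <sstream>
#include <string>
#include <vector>

int main() { return 0; }
EOF
grep -c '^#include' includes_sketch.cc   # prints 9
```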
Here is a perhaps more realistic use case. First, the Samba 3.5.3 source code was unpacked and ./configure was run; then make (without -j) was run and timed. All tests were run eight times and only the best results were kept. Note that the figures also include linking and other things that aren't affected by ccache.
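The "run several times, keep the best result" step can be scripted. A sketch in plain shell, with a no-op placeholder standing in for the real build command:

```shell
# Time a build command N times and keep the best (lowest) wall-clock result.
build() { sleep 0; }   # placeholder: replace with the real "make" invocation

best=
for i in 1 2 3 4 5 6 7 8; do
  start=$(date +%s%N)            # GNU date: nanoseconds since the epoch
  build
  end=$(date +%s%N)
  ms=$(( (end - start) / 1000000 ))
  if [ -z "$best" ] || [ "$ms" -lt "$best" ]; then best=$ms; fi
done
echo "best of 8 runs: ${best} ms"
```

Taking the best rather than the average reduces noise from unrelated system activity, which is why it was used for the figures below.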
Warm disk cache
| | Elapsed time | Percent | Speedup |
|---|---|---|---|
| Without ccache | 316.23 s | 100.00 % | 1.0000 x |
| ccache 3.0 direct, first time | 375.16 s | 118.64 % | 0.8429 x |
| ccache 3.0 direct, second time | 32.09 s | 10.15 % | 9.8545 x |
| ccache 3.0 prepr., first time | 360.62 s | 114.04 % | 0.8769 x |
| ccache 3.0 prepr., second time | 161.44 s | 51.05 % | 1.9588 x |
| ccache 2.4, first time | 359.42 s | 113.66 % | 0.8798 x |
| ccache 2.4, second time | 159.31 s | 50.38 % | 1.9850 x |
The cost of cache misses is relatively high: 14% for the preprocessor mode and 19% for the direct mode. The direct mode has higher overhead than the preprocessor mode for cache misses, but is much faster for cache hits. In fact, the speedup in this case is so high that it becomes interesting to optimize how the Makefile is written. The rule for compiling a C file in Samba 3.5.3 looks like this:
```make
.c.o:
	@if (: >> $@ || : > $@) >/dev/null 2>&1; then rm -f $@; else \
	  dir=`echo $@ | sed 's,/[^/]*$$,,;s,^$$,.,'` $(MAKEDIR); fi
	@if test -n "$(CC_CHECKER)"; then \
	  echo "Checking $*.c with '$(CC_CHECKER)'";\
	  $(CHECK_CC); \
	fi
	@echo Compiling $*.c
	@$(COMPILE) && exit 0;\
	  echo "The following command failed:" 1>&2;\
	  echo "$(subst ",\",$(COMPILE_CC))" 1>&2;\
	  $(COMPILE_CC) >/dev/null 2>&1
```

By rewriting this to
```make
.c.o:
	@echo Compiling $*.c
	@$(COMPILE)
```
the build time goes down from 32 seconds to about 23 seconds!
Cold disk cache
To measure the speed when the disk cache is cold, the disk cache was dropped with `sync; echo 3 >/proc/sys/vm/drop_caches` before each timed run.
| | Elapsed time | Percent | Speedup |
|---|---|---|---|
| Without ccache | 324.08 s | 100.00 % | 1.0000 x |
| ccache 3.0 direct, first time | 382.24 s | 120.87 % | 0.8273 x |
| ccache 3.0 direct, second time | 44.59 s | 14.10 % | 7.0919 x |
| ccache 3.0 prepr., first time | 367.82 s | 116.31 % | 0.8597 x |
| ccache 3.0 prepr., second time | 172.34 s | 54.50 % | 1.8349 x |
| ccache 2.4, first time | 366.48 s | 115.89 % | 0.8629 x |
| ccache 2.4, second time | 169.67 s | 53.65 % | 1.8638 x |
A cold disk cache makes the performance gain on hits slightly lower because of the extra disk cache misses for files in the ccache directory.