Created attachment 356571 [details] Patch for the experiment During a git-bisect session, I noticed that building times were taking too long even though I was: - Using a compiler cache (CCache in case) - Doing many WebKit builds in sequence So I noticed that the building is stalled during UnifiedSources because the following may happen easily: 1) a new bundle (up to 8 source files) or more is added 2) all generated Unified Sources filenames (UnifiedSourceXYZ.cpp) after this bundle are shifted XYZ+N (N=number of new bundles) This implies ccache having lots of cache misses. To confirm that was the case, I did a little experiment (patch for the experiment attached): - Cleared CCache and compiled WebKit from clean build directory Time for a clean build (for comparison with build times below): 14973.56s user 1947.89s system 2514% cpu 11:12.97 total - After, added an source file containing only `#include "config.h"` in a separate folder in JavaScriptCore, WebCore and WebKit e.g TestUnified/TestUnified.cpp inside these folders - Added it to the beginning of each corresponding Sources.txt - It will cause a new bundle to be added because it only bundles together files in the same directory - The shifting will then occur - Built WebKit again without cleaning the build directory. Result: - Time to build: 7315.90s user 727.49s system 2456% cpu 5:27.38 total Half the time of clean-build with only 3 new "empty" source files. To improve this build time, I changed the naming of UnifiedSources to UnifiedSources-{hash}.cpp where the hash is calculated based on the UnifiedSource text content (only the #include lines with the cpp files). I then repeated the experiment: - Time to build: 27.14s user 5.44s system 226% cpu 14.405 total Just like building without changes. Advantages: - The Unified Source filename is based on its content rather than an arbitrary integer that may change frequently - When new bundles are added, only that new bundle and the ones with sources modified are recompiled. I think it should improve build time in the bots (if they do use compiler caching) and eventual git-bisect sessions in the future.
Created attachment 356572 [details] Test the bots
It looks like the bot building with Xcode are not happy :( Looking quickly at the build scripts, it seems like for the Xcode build it needs the ascending integers because the UnifiedSource filenames are not reported back after generation. The build just loops until the given number of bundles via the generate-unified-sources.sh (correct me if I'm wrong). I'll add an option '--use-hash-filename' to generate-unified-source-bundles.rb and use only in CMake builds to check if the bots are happy then, because I'm not familiar with the Xcode build to fix it there.
(In reply to Carlos Eduardo Ramalho from comment #2) > It looks like the bot building with Xcode are not happy :( the bots*
This change is actually relevant even when not using ccache. Shitfting the UnifiedSources will make CMake to detect the sources to rebuild already. Without this change, in the experiment CMake rebuilds about 690 UnifiedSources. After the change, only the unified sources with TestUnified.cpp are compiled.
Comment on attachment 356572 [details] Test the bots View in context: https://bugs.webkit.org/attachment.cgi?id=356572&action=review > Source/WTF/Scripts/generate-unified-source-bundles.rb:207 > + digest = Digest::SHA1.hexdigest(@currentBundleText)[0..12] Can we still use the number too, preferably before the hash for sorting reasons? I like having the number as it helps me know how far into the build I am. CMake doesn't really track the completion percentage all that well. So it would be "UnifiedSource-#{number}-#{digest}.#{extension}".
(In reply to Keith Miller from comment #5) > Can we still use the number too, preferably before the hash for sorting > reasons? I like having the number as it helps me know how far into the build > I am. CMake doesn't really track the completion percentage all that well. > > So it would be "UnifiedSource-#{number}-#{digest}.#{extension}". It would be nice indeed. But wouldn't the problem we're trying to solve happen again? I mean, with the number the shifting occurs and all unified sources are renamed regardless of the hash. Unless I am not seeing something. WDYT?
Wow, good idea. (In reply to Carlos Eduardo Ramalho from comment #6) > It would be nice indeed. But wouldn't the problem we're trying to solve > happen again? I mean, with the number the shifting occurs and all unified > sources are renamed regardless of the hash. Yeah... then again, adding a new file anywhere could also cause the contents of many subsequent unified source bundles to change.... (In reply to Keith Miller from comment #5) > CMake doesn't really track the completion percentage all that well. This problem is caused by using a separate build target for every single forwarding header. I think the build progress would run more naturally if that were not the case.
Created attachment 356658 [details] Test Xcode bots after adding --use-hash-filenames option
(In reply to Carlos Eduardo Ramalho from comment #8) > Created attachment 356658 [details] > Test Xcode bots after adding --use-hash-filenames option Bots are happy now.
Comment on attachment 356658 [details] Test Xcode bots after adding --use-hash-filenames option Attachment 356658 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: https://webkit-queues.webkit.org/results/10288269 New failing tests: imported/w3c/web-platform-tests/service-workers/service-worker/register-closed-window.https.html
Created attachment 356715 [details] Archive of layout-test-results from ews124 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews124 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.6
Comment on attachment 356658 [details] Test Xcode bots after adding --use-hash-filenames option This LGTM but I'm not going to give cq+ because I want to wait and see what Keith thinks.
IIUC, this change leaves stale generated unified sources and its object files. IIUC, source files are grouped by directories. What about using filenames UnifiedSource-{hash-of-dir-name}-{seq-num-within-the-dir}.cpp?
(In reply to Fujii Hironori from comment #13) > IIUC, this change leaves stale generated unified sources and its object > files. > IIUC, source files are grouped by directories. What about using filenames > UnifiedSource-{hash-of-dir-name}-{seq-num-within-the-dir}.cpp? Well noted! I'll try to do as per you recommendation and see. Although in this new way it'll be sub-optimal compared to the original strategy because we'll have a sequence of unified sources for a given folder which may be all changed after adding a new source. But it should still speed up recompilation a lot compared to the worst case scenario as described in the example.
Created attachment 358073 [details] Patch
Comment on attachment 358073 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=358073&action=review > Source/WTF/Scripts/generate-unified-source-bundles.rb:53 > + puts "--use-hash-filenames Use a hash in unified source filenames instead of ascending integers" This description should be updated now
Created attachment 358091 [details] Patch
(In reply to Fujii Hironori from comment #17) > Created attachment 358091 [details] > Patch Great! You found out that a new option is not needed after all! :)
Review, please.
Do you have a chance to review this patch, Keith?
Comment on attachment 358091 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=358091&action=review > Source/WTF/Scripts/generate-unified-source-bundles.rb:214 > + "-#{hash}-#{@bundleCountInDirectory}" It feels a bit strange that the bundleFileName(number) function by its signature would seem to be something that would generate a distinct filename when called with different numbers, but won't actually do so when maxCount isn't set without external help (like changing bundleCountInDirectory). It's not a big deal, but could be confusing.
Comment on attachment 358091 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=358091&action=review >> Source/WTF/Scripts/generate-unified-source-bundles.rb:214 >> + "-#{hash}-#{@bundleCountInDirectory}" > > It feels a bit strange that the bundleFileName(number) function by its signature would seem to be something that would generate a distinct filename when called with different numbers, but won't actually do so when maxCount isn't set without external help (like changing bundleCountInDirectory). It's not a big deal, but could be confusing. Good point. Will fix.
Created attachment 358576 [details] Patch
Comment on attachment 358576 [details] Patch r=me I think this is fine. I'm not sure if Keith is out of office but I'd probably prefer that he cq+ it.
Comment on attachment 358576 [details] Patch Clearing flags on attachment: 358576 Committed r239813: <https://trac.webkit.org/changeset/239813>
All reviewed patches have been landed. Closing bug.
<rdar://problem/47166490>