Bug 140914

Summary: [Win] Unable to reliably run tests in parallel
Product: WebKit Reporter: Brent Fulgham <bfulgham>
Component: Tools / TestsAssignee: Brent Fulgham <bfulgham>
Status: RESOLVED FIXED    
Severity: Normal CC: aroben, bfulgham, commit-queue, glenn, pvollan, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: PC   
OS: All   
Bug Depends on: 160597    
Bug Blocks:    
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch none

Description Brent Fulgham 2015-01-26 18:25:09 PST
Whenever I increase the number of shards on our Windows tests bots above one I encounter variation in test failures and results. There seems to be some kind of cross-talk between tests that prevent us from running more jobs in parallel.

We need to fix this to improve our test throughput.
Comment 1 Radar WebKit Bug Importer 2015-01-26 18:25:31 PST
<rdar://problem/19609178>
Comment 2 Adam Roben (:aroben) 2015-02-23 06:40:04 PST
Could the different test executables be stomping on each other's preferences? A number of tests change particular preferences while they run.
Comment 3 Brent Fulgham 2015-02-23 20:42:18 PST
(In reply to comment #2)
> Could the different test executables be stomping on each other's
> preferences? A number of tests change particular preferences while they run.

Definitely. However, I tried to account for this in "libraryPathForDumpRenderTree()", which is based on the Mac implementation. It looks for DUMPRENDERTREE_TEMP in the environment, and uses that as the basis for its local preference storage. This is supposed to be unique per running test instance.

I guess the first thing to check is that this is unique for each running test shard. If they are not, that might explain the problem.
Comment 4 Brent Fulgham 2015-02-23 22:20:25 PST
(In reply to comment #3)
> (In reply to comment #2)
> > Could the different test executables be stomping on each other's
> > preferences? A number of tests change particular preferences while they run.
> 
> Definitely. However, I tried to account for this in
> "libraryPathForDumpRenderTree()", which is based on the Mac implementation.
> It looks for DUMPRENDERTREE_TEMP in the environment, and uses that as the
> basis for its local preference storage. This is supposed to be unique per
> running test instance.
> 
> I guess the first thing to check is that this is unique for each running
> test shard. If they are not, that might explain the problem.

So I added a printf to the libraryPathForDumpRenderTree, and confirmed that each shard is getting its own temporary directory, which it uses to seed the following preferences:

    CFPreferencesSetAppValue(WebDatabaseDirectoryDefaultsKey, WebCore::pathByAppendingComponent(libraryPath, "Databases").createCFString().get(), kCFPreferencesCurrentApplication);
    CFPreferencesSetAppValue(WebStorageDirectoryDefaultsKey, WebCore::pathByAppendingComponent(libraryPath, "LocalStorage").createCFString().get(), kCFPreferencesCurrentApplication);
    CFPreferencesSetAppValue(WebKitLocalCacheDefaultsKey, WebCore::pathByAppendingComponent(libraryPath, "LocalCache").createCFString().get(), kCFPreferencesCurrentApplication);

The question, is whether CFPreferencesSetAppValue is sticking these values in the exact same place, causing all the different shards to see the same values.
Comment 5 Adam Roben (:aroben) 2015-02-24 06:53:51 PST
(In reply to comment #4)
> The question, is whether CFPreferencesSetAppValue is sticking these values
> in the exact same place, causing all the different shards to see the same
> values.

I'm pretty sure it does.

You'll probably also need to audit uses of localUserSpecificStorageDirectory() and roamingUserSpecificStorageDirectory().
Comment 6 Brent Fulgham 2015-02-24 20:22:06 PST
(In reply to comment #5)
> (In reply to comment #4)
> > The question, is whether CFPreferencesSetAppValue is sticking these values
> > in the exact same place, causing all the different shards to see the same
> > values.
> 
> I'm pretty sure it does.
> 
> You'll probably also need to audit uses of
> localUserSpecificStorageDirectory() and
> roamingUserSpecificStorageDirectory().

After talking with a few people, I think it may NOT be possible to do this on Windows with CoreFoundation. The CFPreference store is going to end up being shared between the different processes.
Comment 7 Adam Roben (:aroben) 2015-02-25 06:18:43 PST
Maybe you could give each process a unique CFBundle name? That should give them separate preferences.
Comment 8 Per Arne Vollan 2016-08-04 07:29:35 PDT
Created attachment 285319 [details]
Patch
Comment 9 Per Arne Vollan 2016-08-04 08:30:17 PDT
There might be additional issues we need to fix, but this patch fixes the failures I see when running the dom tests in parallel.
Comment 10 Brent Fulgham 2016-08-04 09:38:40 PDT
Comment on attachment 285319 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=285319&action=review

Fantastic! r=me

> Tools/Scripts/webkitpy/common/system/path.py:58
> +        path = re.sub('/cygdrive/c', 'c:', path)

I think this would be better if we could capture the drive letter and pass it to the substitution so that "/cygdrive/d' -> 'd:'.

This would be fine to do as a follow-up to this change.
Comment 11 WebKit Commit Bot 2016-08-04 10:04:03 PDT
Comment on attachment 285319 [details]
Patch

Clearing flags on attachment: 285319

Committed r204123: <http://trac.webkit.org/changeset/204123>
Comment 12 WebKit Commit Bot 2016-08-04 10:04:07 PDT
All reviewed patches have been landed.  Closing bug.
Comment 13 WebKit Commit Bot 2016-08-05 06:41:37 PDT
Re-opened since this is blocked by bug 160597
Comment 14 Per Arne Vollan 2016-08-05 09:03:55 PDT
Created attachment 285424 [details]
Patch
Comment 15 Brent Fulgham 2016-08-05 09:09:32 PDT
Comment on attachment 285424 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=285424&action=review

Thanks for improving the change! I'm not sure what the cause of the timeout was that you fixed this time around, but these changes look good on their own merits. If the tests pass please land it.

> Tools/Scripts/webkitpy/common/system/path.py:67
> +                fileSystemRoot = _CygPath.convert_using_singleton(match.group(1))

Is it possible that someone might run tools from one root (e.g., 'C:'), and also run separate tools from another root (e.g., 'D:') during the same tool run? I guess that's pretty unlikely.
Comment 16 Per Arne Vollan 2016-08-08 07:05:46 PDT
Created attachment 285564 [details]
Patch
Comment 17 Per Arne Vollan 2016-08-08 07:21:16 PDT
(In reply to comment #15)
> Comment on attachment 285424 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=285424&action=review
> 
> Thanks for improving the change! I'm not sure what the cause of the timeout
> was that you fixed this time around, but these changes look good on their
> own merits. If the tests pass please land it.
> 
> > Tools/Scripts/webkitpy/common/system/path.py:67
> > +                fileSystemRoot = _CygPath.convert_using_singleton(match.group(1))
> 
> Is it possible that someone might run tools from one root (e.g., 'C:'), and
> also run separate tools from another root (e.g., 'D:') during the same tool
> run? I guess that's pretty unlikely.

Thanks for reviewing :) I updated the patch to handle this case.
Comment 18 Per Arne Vollan 2016-08-11 04:37:44 PDT
Created attachment 285826 [details]
Patch
Comment 19 Brent Fulgham 2016-08-11 09:54:29 PDT
Comment on attachment 285826 [details]
Patch

Excellent! r=me.
Comment 20 WebKit Commit Bot 2016-08-11 10:44:00 PDT
Comment on attachment 285826 [details]
Patch

Clearing flags on attachment: 285826

Committed r204376: <http://trac.webkit.org/changeset/204376>
Comment 21 WebKit Commit Bot 2016-08-11 10:44:05 PDT
All reviewed patches have been landed.  Closing bug.