Bug 189724

Summary: REGRESSION: ( r231040 ) Layout Test http/tests/security/xss-DENIED-xsl-external-entity.xml is a flaky failure
Product: WebKit Reporter: Truitt Savell <tsavell>
Component: Tools / TestsAssignee: youenn fablet <youennf>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, cdumez, commit-queue, ggaren, lforschler, ryanhaddad, sroberts, webkit-bug-importer, youennf
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=189723
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch none

Description Truitt Savell 2018-09-18 16:35:43 PDT
The following layout test is failing on Mac Debug WK1 

http/tests/security/xss-DENIED-xsl-external-entity.xml

Probable cause:

Unknown cause. Test is newley flakey. Closest revision that made changes to xsl was https://trac.webkit.org/changeset/235935/webkit

Flakiness Dashboard:

https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=http%2Ftests%2Fsecurity%2Fxss-DENIED-xsl-external-entity.xml

Diff:
--- /Volumes/Data/slave/highsierra-debug-tests-wk1/build/layout-test-results/http/tests/security/xss-DENIED-xsl-external-entity-expected.txt
+++ /Volumes/Data/slave/highsierra-debug-tests-wk1/build/layout-test-results/http/tests/security/xss-DENIED-xsl-external-entity-actual.txt
@@ -1,6 +1,6 @@
-CONSOLE MESSAGE: Unsafe attempt to load URL http://localhost:8000/security/resources/target.xml from origin http://127.0.0.1:8000. Domains, protocols and ports must match.
+CONSOLE MESSAGE: line 2: Unsafe attempt to load URL http://localhost:8000/security/resources/target.xml from origin http://127.0.0.1:8000. Domains, protocols and ports must match.
 
-CONSOLE MESSAGE: Unsafe attempt to load URL http://localhost:8000/security/resources/target.xml from origin http://127.0.0.1:8000. Domains, protocols and ports must match.
+CONSOLE MESSAGE: line 2: Unsafe attempt to load URL http://localhost:8000/security/resources/target.xml from origin http://127.0.0.1:8000. Domains, protocols and ports must match.
 
 This test includes a cross-origin external entity. It passes if the load fails and thus there is no text below this line.
Comment 1 Shawn Roberts 2019-02-26 17:24:53 PST
Did some further testing on this bug.

Test is a flaky failure on Mac WK1

Probably cause:

Test was changed in revision 231040 https://trac.webkit.org/changeset/231040/webkit
Prior to this revision, test is a 100% failure

Reproduce with:

run-webkit-tests http/tests/security/xss-DENIED-xsl-external-entity.xml -iterations 50 -f

Same diff
Comment 2 Radar WebKit Bug Importer 2019-02-26 17:25:36 PST
<rdar://problem/48422520>
Comment 3 youenn fablet 2019-03-04 09:57:46 PST
Created attachment 363521 [details]
Patch
Comment 4 Alexey Proskuryakov 2019-03-04 10:15:49 PST
Comment on attachment 363521 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363521&action=review

> LayoutTests/http/tests/security/resources/xsl-using-external-entity.xsl.php:2
> +header("Cache-Control: no-cache, must-revalidate");

Why not "no-store"?  Do we fully support that?
Comment 5 Alexey Proskuryakov 2019-03-04 10:17:12 PST
Also, and more importantly, the original issue is not about repeating, but about running tests normally.

Is this subresource used in multiple tests? If not, then this fix will address the repro case, but not the original problem.
Comment 6 youenn fablet 2019-03-04 10:35:00 PST
(In reply to Alexey Proskuryakov from comment #4)
> Comment on attachment 363521 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=363521&action=review
> 
> > LayoutTests/http/tests/security/resources/xsl-using-external-entity.xsl.php:2
> > +header("Cache-Control: no-cache, must-revalidate");
> 
> Why not "no-store"?  Do we fully support that?

I would think so since this makes it pass when iterating for both WK1 and WK2.
no-store has the benefit of ensuring the same code path run after run which is better for stability though.

> Is this subresource used in multiple tests? If not, then this fix will
> address the repro case, but not the original problem.

AFAIK, this is not used elsewhere but the proposed change is anyway an improvement and will help further debugging.

The original issue is a WK1 only issue apparently.
There might be another case where we are flakily able to load and process the XSL resource synchronously. I would tend to fix the obvious test-specific case and see how bots react.
Comment 7 Alexey Proskuryakov 2019-03-04 10:49:32 PST
This patch looks like a good step for local debugging, but probably no need to land it. I verified that xsl-using-external-entity.xsl isn't used anywhere else, so caching is definitely not the reason why the test is flaky on bots.

Clearly, it has to be something timing related.
Comment 8 youenn fablet 2019-03-04 11:52:26 PST
> Clearly, it has to be something timing related.

It might be that we are finishing loading the XSL resource before finishing parsing of the XML main document.
Delaying the loading of the XSL resource should make the test stable.
Comment 9 youenn fablet 2019-03-04 11:56:54 PST
Created attachment 363535 [details]
Patch
Comment 10 Alexey Proskuryakov 2019-03-04 15:01:33 PST
I wonder if removing the newline in http/tests/security/xss-DENIED-xsl-external-entity.xml after "</xml>" will help. I think that this single character may be the only one that may not yet be processed when the stylesheet is applied.

Alternatively, why do we even have the difference in logging depending on timing? Perhaps logging code can be fixed to always emit the line number.
Comment 11 youenn fablet 2019-03-04 21:49:20 PST
(In reply to Alexey Proskuryakov from comment #10)
> I wonder if removing the newline in
> http/tests/security/xss-DENIED-xsl-external-entity.xml after "</xml>" will
> help. I think that this single character may be the only one that may not
> yet be processed when the stylesheet is applied.

Nope, not working.

> Alternatively, why do we even have the difference in logging depending on
> timing? Perhaps logging code can be fixed to always emit the line number.

I agree that we could improve network error console logging.

In that specific case, we could easily change it to never emit the line number.

We could also probably store the line number at the time we create the processing instruction and pass it along. But we would only use that info for a potential error reporting that probably rarely happen, and probably even more rarely since XSLT processing instructions code might not be written often these days.

This does not seem to be worth it, the flakiness is WK1 only and is not showing a real behavioral issue. A fix at the test level makes sense to me.
Comment 12 Alexey Proskuryakov 2019-03-05 11:07:55 PST
Comment on attachment 363535 [details]
Patch

r- because this can't possibly fix the problem. This patch is only useful to help investigate it locally.
Comment 13 Alexey Proskuryakov 2019-03-05 11:08:24 PST
Comment on attachment 363535 [details]
Patch

Wait, sorry, I missed the usleep that you added. Let me think...
Comment 14 Alexey Proskuryakov 2019-03-05 11:15:57 PST
Comment on attachment 363535 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363535&action=review

If this is WebKit1 only and we honestly don't care, then maybe the right thing is to keep the test marked as flaky. But a systemic problem with logging seems like it is worth fixing at the root.

> LayoutTests/http/tests/security/resources/xsl-using-external-entity.xsl.php:3
> +header("Cache-Control: no-cache, must-revalidate");
> +header("Pragma: no-store");

I don't think "Pragma: no-store" exists, or does it? My comment about no-store was with regards to Cache-Control. In any case, pragma should be unnecessary for any browser engine made in this millennium I think.

> LayoutTests/http/tests/security/resources/xsl-using-external-entity.xsl.php:5
> +usleep(100000);

Ugh. In general, adding 100 ms timers doesn't fix flakiness on tests, because there is so much CPU congestion that we can easily have multiple second delays.

> LayoutTests/http/tests/security/resources/xsl-using-external-entity.xsl.php:7
> +$contentType = $_GET['contentType'];

Where is this set?
Comment 15 youenn fablet 2019-03-05 11:59:44 PST
Created attachment 363666 [details]
Patch
Comment 16 youenn fablet 2019-03-05 21:33:07 PST
> If this is WebKit1 only and we honestly don't care, then maybe the right
> thing is to keep the test marked as flaky.

It is good though to make sure we do not regress the core functionality in WK1 which is to fail the load.

I copied the existing test and used DumpJSConsoleLogInStdErr so that in WK1 we no longer dump the error messages.

> But a systemic problem with
> logging seems like it is worth fixing at the root.

I don't think this particular case is the most important one and I am not sure that we could do to improve this particular case would improve other cases.
Comment 17 youenn fablet 2019-03-05 21:36:13 PST
Created attachment 363730 [details]
Patch
Comment 18 Geoffrey Garen 2019-03-07 10:07:23 PST
Comment on attachment 363730 [details]
Patch

r=me
Comment 19 WebKit Commit Bot 2019-03-07 14:17:45 PST
Comment on attachment 363730 [details]
Patch

Clearing flags on attachment: 363730

Committed r242612: <https://trac.webkit.org/changeset/242612>
Comment 20 WebKit Commit Bot 2019-03-07 14:17:47 PST
All reviewed patches have been landed.  Closing bug.