Bug 68089

Summary: WebKit doesn't parse "#" as delimiter for fragment identifier in data URIs
Product: WebKit Reporter: Daniel Holbert <dholbert>
Component: PlatformAssignee: Rob Buis <rbuis>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, aboxhall, achristensen, apinheiro, ap, ayg, benjamin, brettw, cdumez, cfleizach, cmarcelo, darin, dmazzoni, eoconnor, eric.carlson, ews-watchlist, gavin.sharp, glenn, ht, ian, japhet, jcraig, jdiggs, jer.noble, jfernandez, julian.reschke, mathias, m.goleb+bugzilla, mkwst, moz, philipj, rbuis, rego, rniwa, roc, samuel_white, sergio, shadow2531, smcgruer, svillar, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
URL: data:text/html,<html>no hash should be visible:#
See Also: https://bugs.webkit.org/show_bug.cgi?id=219156
Bug Depends on: 217239    
Bug Blocks: 91791    
Attachments:
Description Flags
Patch
ews-feeder: commit-queue-
Patch
ews-feeder: commit-queue-
Patch
none
Patch
ews-feeder: commit-queue-
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch none

Description Daniel Holbert 2011-09-14 10:32:55 PDT
STEPS TO REPRODUCE:
 1. Load this URI directly:
      data:text/html,<html>no hash should be visible:#

EXPECTED RESULTS:
  No # should be visible on the resulting rendered content.  The "#" should signify the end of the URI and the start of the fragment identifier.

ACTUAL RESULTS:
  The page renders with a "#" as part of the content.

Firefox versions > 6 as well as Opera 11.51 show EXPECTED RESULTS.

Chromium shows ACTUAL RESULTS
 version tested: 14.0.835.126 (Developer Build 99097 Linux) Ubuntu 11.10
Comment 1 Daniel Holbert 2011-09-14 10:54:35 PDT
ALTERNATE STEPS TO REPRODUCE (slightly trickier, but same underlying issue):
 Load this URI:
=========
data:application/xhtml+xml,<html xmlns='http://www.w3.org/1999/xhtml'><div style='height:3000px; width: 300px; background: red'/><div id='target' style='height: 3000px; width: 300px; background: green;'/></html>#target
=========

ACTUAL RESULTS:  Error page, for an XML parse error, saying:
>  This page contains the following errors:
>  error on line 1 at column 47: Extra content at the end of the document
>  Below is a rendering of the page up to the first error.
> [rendering of the page]

EXPECTED RESULTS:
 (a) No errors reported
 (b) tall green rect should be visible (and no red should be visible).  That is -- the initial load should seek past the red to the 'target' element.

Again, Opera & Firefox give EXPECTED RESULTS, and Chromium 14 gives ACTUAL RESULTS.
Comment 2 Daniel Holbert 2011-09-14 10:58:48 PDT
(note: in both comment 0 and comment 1, Opera only gives "expected results" if I view the data URIs *directly*, whereas it matches WebKit's behavior if I view it in an <img> / <embed>.  That inconsistency appears to be a bug in Opera.)
Comment 3 Alexey Proskuryakov 2011-09-14 14:53:12 PDT
Hmm, there is probably some history behind this that I don't know. Looks like Chrome and Safari might have disagreed on this (per bug 22882), but no longer do.

What does IE do?
Comment 4 Adam Barth 2011-09-14 14:59:33 PDT
There's a thread about this topic on whatwg currently.  This issue is subtle.
Comment 5 Alexey Proskuryakov 2011-09-14 15:44:50 PDT
Thanks, that is <http://www.mail-archive.com/whatwg@lists.whatwg.org/msg28146.html>.

As mentioned on the thread, one can expect consistency between data:, javascript: and possibly even mailto: in this regard.
Comment 6 Adam Barth 2011-09-14 15:49:32 PDT
The underlying issue is that there's a dream that all URIs should be parseable with a single algorithm, but I don't think we can realize that dream given compatibility constraints.  Compounded on that are tricky issues of using relative fragments in documents created from data URLs.  WebKit does something somewhat magical to make this "work".

It's very unclear to me what the best solution to all these constraints is.
Comment 7 Robert O'Callahan 2011-09-18 22:22:08 PDT
Ryosuke said "It seems like sticking to RFC is a cleaner option if possible." Is it possible for Webkit? If not, we need to change the spec.
Comment 8 Adam Barth 2011-09-18 22:26:26 PDT
> Is it possible for Webkit? If not, we need to change the spec.

I haven't finished reading the discussion on whatwg yet.  My understanding is that the specification is currently being revised in the IRI working group.  I'll try to catch up on the discussion on Monday.  Is there more to read besides this bug and the whatwg thread?
Comment 9 Robert O'Callahan 2011-09-18 22:33:04 PDT
Not that I know of.
Comment 10 Adam Barth 2011-09-19 17:47:22 PDT
Ok.  I've caught up on my reading.  I don't see a big reason to change behavior here.  The dream of having all URIs parse the same way is just a dream.  Some common URI schemes, such as javascript URIs, can't be done this way.  There are also obscure ones, like ed2k, that can't work this way either.

The main reason I see to change our parsing of data URLs is in connection with fragment references in SVG.  I'm not sure what the best solution there is, but using the RFC 3986 regexp on data URLs doesn't seem like the best approach, especially give that doing so has resulted in Firefox getting compat complaints.
Comment 11 Robert O'Callahan 2011-09-19 20:54:04 PDT
OK, can you jump back into the WHATWG thread and propose a suitable spec change? I expect some opposition and I don't want Daniel or I to be the man in the middle :-).
Comment 12 Ian 'Hixie' Hickson 2011-09-19 20:59:30 PDT
IMHO what Firefox does here makes much more sense. Otherwise, how do you do fragment identifiers with data: URLs? There's simply no good reason to not support fragment identifiers here as far as I can tell, and there's no reason not to — it's trivial to escape the # character.
Comment 13 Alexey Proskuryakov 2011-09-19 22:46:23 PDT
I agree with Adam.

If same document fragment references in data: URLs somehow become really important, we'll likely want to shortcut calculation by not completing the "reltative" fragment-only references for performance reasons anyway.
Comment 14 Julian Reschke 2011-09-20 00:48:03 PDT
(In reply to comment #10)
> Ok.  I've caught up on my reading.  I don't see a big reason to change behavior here.  The dream of having all URIs parse the same way is just a dream.  Some common URI schemes, such as javascript URIs, can't be done this way.  There are also obscure ones, like ed2k, that can't work this way either.
> 
> The main reason I see to change our parsing of data URLs is in connection with fragment references in SVG.  I'm not sure what the best solution there is, but using the RFC 3986 regexp on data URLs doesn't seem like the best approach, especially give that doing so has resulted in Firefox getting compat complaints.

Sounds like "(b) and (c) are weird as well, so we can't make (a) less weirder". 

ed2k, as you say, is obscure. Does it affect browsers?

javascript is indeed interesting. I still think these aren't proper identifiers at all, and only live in browsers, so browser people may want to reserve the scheme name, and just specify what they do with them as something being in use where URIs/IRIs appear, but not really belonging to this class of identifiers.

What makes data different from the other schemes you mentioned is:

- browser implementations differ (as far as I can tell)

- they are produced/consumed in other places (per spec)

- they actually denote a representation to which a fragment can be applied (I don't think this is the case for javascript or ed2k)
Comment 15 Adam Barth 2011-09-20 00:51:44 PDT
I guess I'm still not clear on what problem we're solving.  Is there a web site that would work better if we changed WebKit's behavior?
Comment 16 Robert O'Callahan 2011-09-20 01:25:25 PDT
The problem I want to solve here is interop. If you think Webkit's behavior is fine, please document it and propose it to the list.
Comment 17 Robert O'Callahan 2011-09-20 01:40:25 PDT
When you do that it would be good if you can describe how relative URIs should work in data: URIs.

Given
data:text/html,<a href="#foo">hello</a><div id="#foo" style="margin-top:10000px;">pick me!</div>
clicking on the link seems to navigate to about:blank in Chrome, which is unexpected.
Comment 18 Adam Barth 2011-09-20 10:39:54 PDT
> The problem I want to solve here is interop. If you think Webkit's behavior is fine, please document it and propose it to the list.

I'm sorry that I don't have time to do that at the moment.  Previously I was working a full specification of URL handling, but I've stopped working on that spec.  You can find my work-in-progress here:

https://github.com/abarth/url-spec/blob/master/drafts/url.xml

> When you do that it would be good if you can describe how relative URIs should work in data: URIs.

It's pretty simple:

1) If the relative URL is an absolute URL, then the resolved URL is that absolute URL (appropriately canonicalized).

2) Otherwise, the resolved URL is an invalid URL.

Separately, trying to navigate to an invalid URL results in a navigation to about:blank.

In the spec, this is marked as a TODO:

        <t>TODO: If base-url's scheme is not hierarchical, we can't resolve as
        a relative URL.  We'll probably want to return an invalid URL.  Check
        what happens when resolving an empty string as a relative URL with a
        non-hierarchical base.</t>

From the point of view of the spec, data URLs are not hierarchical, which essentially means we parse out the scheme and leave the remainder of the URL to be parsed by a scheme-specific parser.
Comment 19 Robert O'Callahan 2011-09-20 11:51:05 PDT
OK. The following HTML displays a green rectangle in Chrome (the #greenRect is allowed to resolve to an element in the data: document), so I guess you'd consider that a bug? Should I file it?

<img src="data:image/svg+xml,&lt;svg xmlns='http://www.w3.org/2000/svg' xmlns:xlink='http://www.w3.org/1999/xlink'&gt;&lt;defs&gt;&lt;rect id='greenRect' height='50' width='300' fill='green'/&gt;&lt;/defs&gt;&lt;text y='30' fill='red'&gt;fail&lt;/text&gt;&lt;use xlink:href='#greenRect'/&gt;&lt;/svg&gt;">
Comment 20 Adam Barth 2011-09-20 12:03:36 PDT
> OK. The following HTML displays a green rectangle in Chrome (the #greenRect is allowed to resolve to an element in the data: document), so I guess you'd consider that a bug? Should I file it?

We can look at the code, but I think SVG has hacks to avoid URL resolution.  I believe that was discussed in more detail on the whatwg thread.  I don't have any specific knowledge about how that works.
Comment 21 Robert O'Callahan 2011-09-20 12:26:36 PDT
(In reply to comment #20)
> We can look at the code, but I think SVG has hacks to avoid URL resolution.  I
> believe that was discussed in more detail on the whatwg thread.

I can't find that in the thread.

When you say "SVG has hacks", I assume you mean Webkit's SVG implementation? Given that changing <img src> to <iframe src> makes the #greenRect not resolve, it does look like something to do with SVG images specifically. I can't imagine that we really want SVG images to behave differently here ... do you agree?
Comment 22 Adam Barth 2011-09-20 12:44:47 PDT
> I can't find that in the thread.

It was mentioned in Comment #10 and Comment #13 in this bug.  In the thread, it was mentioned in Boris Zbarsky's message on Sat, Sep 10, 2011 at 5:32 PM.

> When you say "SVG has hacks", I assume you mean Webkit's SVG implementation?

Yes.

> Given that changing <img src> to <iframe src> makes the #greenRect not resolve, it does look like something to do with SVG images specifically. I can't imagine that we really want SVG images to behave differently here ... do you agree?

My perspective is that SVG image itself a giant hack itself (or, at least, the WebKit implementation thereof is a giant hack).  Our implementation is really very bad and doesn't have a path forward to not being full of bugs.

I'm sorry to not be more helpful about this topic, but we've reached the limits of my knowledge.  I can go read the code to understand what's going on in the SVG case, but I learned about it from Boris' email.  I've studied the URL resolution behavior in more detail, and it works as I've described.

If we'd had this discussion back in April, I would have been much more excited about digging in here and finishing the URL spec.  At the moment, I'd rather give the IRIbis working group some breathing room and a chance to succeed.  Once they've either succeeded or failed, we should talk more about writing a high-quality specification for URL handling.
Comment 23 Robert O'Callahan 2011-09-20 13:20:06 PDT
(In reply to comment #22)
> > I can't find that in the thread.
> 
> It was mentioned in Comment #10 and Comment #13 in this bug.  In the thread,
> it was mentioned in Boris Zbarsky's message on Sat, Sep 10, 2011 at 5:32 PM.

Thanks.

> If we'd had this discussion back in April, I would have been much more excited
> about digging in here and finishing the URL spec.  At the moment, I'd rather
> give the IRIbis working group some breathing room and a chance to succeed.
> Once they've either succeeded or failed, we should talk more about writing a
> high-quality specification for URL handling.

Fair enough, although the status quo is quite unsatisfactory.

Of the three options:
1) Current behavior of Firefox and Opera: allow data: URIs to have fragment references, parse everything after the first # as a fragment reference, and allow relative URIs consisting of fragment references to resolve relative to a data: URI.
2) Your suggestion in comment #18 applied consistently to SVG images: disallow fragment references in data: URIs, resolve all URIs relative to a data: URI to an invalid URI.
3) Your suggestion in comment #18, except with something magical happening for SVG (possibly only in an image context) (e.g., have SVG special-case fragment-only relative URIs to bypass normal relative URI resolution)
#2 would cripple data: URIs for SVG images. #3 seems unnecessarily complex, and limiting it to SVG would be difficult as we increase the integration of SVG with CSS and HTML. #1 seems best to me.
Comment 24 Adam Barth 2011-09-20 14:01:37 PDT
Doesn't option (1) have compatibility problems?  I thought that's why this discussion started.
Comment 25 Robert O'Callahan 2011-09-20 14:24:49 PDT
The issue arose on our side when one of our developers complained that an SVG image with a data: URI containing unescaped #s worked in Webkit but not in Firefox. I don't know whether or not Web content depends on this.
Comment 26 Adam Barth 2011-09-20 14:26:57 PDT
The net-net of this discussion seems to be that you're not interesting in changing your behavior at this time and we're not interested in changing our behavior at this time.  That means we'll probably need to revisit this topic in the future.
Comment 27 Robert O'Callahan 2011-09-20 14:38:00 PDT
Personally I wouldn't mind changing our behavior if there was some agreement on what to change it to. In fact, the WHATWG thread started with Daniel and I proposing a behavior change (which, unfortunately, everyone hated :-) ).
Comment 28 Adam Barth 2011-09-20 14:45:08 PDT
It doesn't seem like there is an agreement based on Comment #10 and Comment #13.  I agree that WebKit's current behavior isn't particularly aesthetic, but treating # in data URLs as the beginning of the fragment is also pretty unappealing.

I certainly don't have a strong design to strong-arm you into adopting WebKit's behavior.  I'm sorry that there doesn't appear to be a solution that's compelling enough for everyone to adopt.  As you know, I'm a big fan of interoperability.
Comment 29 Henry S. Thompson 2011-10-19 04:30:17 PDT
(In reply to comment #18)

> . . .

> > When you do that it would be good if you can describe how relative URIs should work in data: URIs.
> 
> It's pretty simple:
> 
> 1) If the relative URL is an absolute URL, then the resolved URL is that absolute URL (appropriately canonicalized).
> 
> 2) Otherwise, the resolved URL is an invalid URL.
> 
> Separately, trying to navigate to an invalid URL results in a navigation to about:blank.
> 
> In the spec, this is marked as a TODO:
> 
>         <t>TODO: If base-url's scheme is not hierarchical, we can't resolve as
>         a relative URL.  We'll probably want to return an invalid URL.  Check
>         what happens when resolving an empty string as a relative URL with a
>         non-hierarchical base.</t>
> 
> From the point of view of the spec, data URLs are not hierarchical, which essentially means we parse out the scheme and leave the remainder of the URL to be parsed by a scheme-specific parser.

I think we have to be careful here to distinguish relative URIs in general from same-document references.  I claim that same-document references (and all the examples of relative data: URIs we've seen so far _are_ same-document references) _may_ make sense for non-hierarchical URI schemes, and should not be treated as errors.

So, for example, clicking on the link in that appears at the beginning of the result of interpreting

 data:application/xhtml+xml,<html xmlns='http://www.w3.org/1999/xhtml'><a href='#target'>go green</a><div style='height:3000px; width: 300px; background: red'/><div id='target' style='height: 3000px; width: 300px; background: green;'/></html>

should work, IMO.
Comment 30 Daniel Holbert 2011-10-21 13:11:10 PDT
(In reply to comment #29)
> So, for example, clicking on the link in that appears at the beginning of the result of interpreting
> 
>  data:application/xhtml+xml,<html xmlns='http://www.w3.org/1999/xhtml'><a href='#target'>go green</a>  [etc]

I would agree with you if you replaced "#" with "%23".

But in the URI as you provided it, the "#" technically terminates the main portion of the URI and indicates the beginning of the fragment-identifier.  So your URI just contains this (invalid) XML:
> <html xmlns='http://www.w3.org/1999/xhtml'><a href='
and that's it.

This is how both Opera and Firefox handle that URI, when they load it directly.  I agree that this strict behavior is a bit unintuitive (see the whatwg thread I started about trying to make it better[1]), but the general consensus in that thread was that the strict behavior (that Firefox implements) is correct.

http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-September/033162.html
Comment 31 Dominic Mazzoni 2014-06-19 09:48:14 PDT
We've been discussing this on the Chromium bug tracker: http://crbug.com/123004

Does anyone have any new thoughts on this?
Comment 32 j.j. 2018-01-24 15:21:45 PST
Blink
Intent to Deprecate and Remove: Support for '#' in data URI body

https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/_g_HnAHkMPo
Comment 33 Stephen McGruer 2019-03-13 07:28:32 PDT
FYI (without any comment on what anyone should or shouldn't do!): Blink's (second) attempt at removing this landed in our M72 milestone, and has now survived an entire milestone in the field (as M73 has started rolling out). Chromium is now aligned with the spec and Firefox; '#' in data URIs mark the end of the content and the start of the fragment.
Comment 34 Alex Christensen 2020-08-25 13:27:38 PDT
*** Bug 215820 has been marked as a duplicate of this bug. ***
Comment 35 Rob Buis 2020-09-29 03:57:34 PDT
Created attachment 409982 [details]
Patch
Comment 36 Rob Buis 2020-09-29 04:09:37 PDT
Created attachment 409983 [details]
Patch
Comment 37 Rob Buis 2020-09-29 05:41:32 PDT
Created attachment 409989 [details]
Patch
Comment 38 Rob Buis 2020-09-29 06:46:15 PDT
Created attachment 409995 [details]
Patch
Comment 39 Rob Buis 2020-09-29 08:17:57 PDT
Created attachment 410000 [details]
Patch
Comment 40 Rob Buis 2020-09-29 10:06:39 PDT
Created attachment 410013 [details]
Patch
Comment 41 Darin Adler 2020-09-29 10:14:08 PDT
Comment on attachment 410013 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=410013&action=review

> LayoutTests/ChangeLog:8
> +        Adjusts tests thats use data URIs to escape the # character.

Seems likely that some tested-only-with-WebKit content, like content embedded in old iOS apps, could accidentally depend on this behavior the way these many tests do. We might find we need a quirk.
Comment 42 Darin Adler 2020-09-29 10:23:40 PDT
Keep in mind that when Blink decides what to do in their engine, they do not have this consideration.
Comment 43 Rob Buis 2020-09-29 10:53:42 PDT
(In reply to Darin Adler from comment #41)
> Comment on attachment 410013 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=410013&action=review
> 
> > LayoutTests/ChangeLog:8
> > +        Adjusts tests thats use data URIs to escape the # character.
> 
> Seems likely that some tested-only-with-WebKit content, like content
> embedded in old iOS apps, could accidentally depend on this behavior the way
> these many tests do. We might find we need a quirk.

Good point, let's see if that will happen.
Comment 44 Darin Adler 2020-09-29 10:55:08 PDT
(In reply to Rob Buis from comment #43)
> Good point, let's see if that will happen.

The trick here is to figure out how that testing occurs; it’s not something the WebKit project directly manages, but rather something that various people inside Apple arrange for. For example, often we learn about such things from Apple’s public beta releases of iOS.
Comment 45 Ryosuke Niwa 2020-09-29 10:56:31 PDT
We probably need a runtime flag for this, and preferably internal debug flag.
Comment 46 Darin Adler 2020-09-29 10:59:37 PDT
(In reply to Ryosuke Niwa from comment #45)
> We probably need a runtime flag for this, and preferably internal debug flag.

Maybe this is something we should document.

"What to do when you are making a web-exposed change for interoperability" so I don’t have to write "this could affect content in iOS apps" over and over again, and so people understand why there might need to be a flag for something even if other engines like Blink have just made an unconditional change.
Comment 47 Stephen McGruer 2020-09-29 11:30:50 PDT
> Keep in mind that when Blink decides what to do in their engine, they do not have this consideration.

Super irrelevant, but we do :). (Sorry, I was on the cc-list and couldn't help myself).

And yeah, this change for us led to a non-trivial breakage in Android (because I missed the potential impact this could have on Android Apps, see https://crbug.com/123004#c106 and https://crbug.com/929083), and ultimately we landed a quirk delaying the change in Android until a major release (https://chromium-review.googlesource.com/c/chromium/src/+/1456638).
Comment 48 Rob Buis 2020-09-29 11:50:57 PDT
Created attachment 410028 [details]
Patch
Comment 49 Darin Adler 2020-09-29 12:31:35 PDT
(In reply to Stephen McGruer from comment #47)
> Super irrelevant, but we do :). (Sorry, I was on the cc-list and couldn't
> help myself).
> 
> And yeah, this change for us led to a non-trivial breakage in Android
> (because I missed the potential impact this could have on Android Apps, see
> https://crbug.com/123004#c106 and https://crbug.com/929083), and ultimately
> we landed a quirk delaying the change in Android until a major release
> (https://chromium-review.googlesource.com/c/chromium/src/+/1456638).

Thanks for chiming in; good to know!
Comment 52 Alex Christensen 2020-09-29 13:09:06 PDT
Comment on attachment 410028 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=410028&action=review

>>> Source/WebCore/platform/network/DataURLDecoder.cpp:121
>>> +        RuntimeEnabledFeatures::sharedFeatures().stripFragmentIdentifierFromDataURL() ? url.stringWithoutFragmentIdentifier().toString() : url.string(),
>> 
>> Since we're trying to remain compatible with existing WebKit-only content, wouldn't linkedOnOrAfter be better than a runtime enabled feature that starts as always on?
> 
> I was thinking Ryosuke wanted a runtime enabled flag to easily disable this feature if we do run into WebKit-only content that relies on the old behavior? Seems linkedOnOrAfter is very static once in place. Anyway this is why I put up the review flag :)

I'm ok with this in the hopes that we don't find any compatibility issues.
Comment 53 Ryosuke Niwa 2020-09-29 13:13:31 PDT
(In reply to Alex Christensen from comment #52)
> Comment on attachment 410028 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=410028&action=review
> 
> >>> Source/WebCore/platform/network/DataURLDecoder.cpp:121
> >>> +        RuntimeEnabledFeatures::sharedFeatures().stripFragmentIdentifierFromDataURL() ? url.stringWithoutFragmentIdentifier().toString() : url.string(),
> >> 
> >> Since we're trying to remain compatible with existing WebKit-only content, wouldn't linkedOnOrAfter be better than a runtime enabled feature that starts as always on?
> > 
> > I was thinking Ryosuke wanted a runtime enabled flag to easily disable this feature if we do run into WebKit-only content that relies on the old behavior? Seems linkedOnOrAfter is very static once in place. Anyway this is why I put up the review flag :)
> 
> I'm ok with this in the hopes that we don't find any compatibility issues.

Yeah, I think the ideal end result is that we can remove the old behavior from our codebase completely so I think this is a good start. If we find any compatibility issues, we'd go use linked on or after check instead.
Comment 54 Darin Adler 2020-09-29 13:40:16 PDT
Comment on attachment 410028 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=410028&action=review

> Source/WebCore/page/RuntimeEnabledFeatures.h:269
> +    void setStripFragmentIdentifierFromDataURL(bool isEnabled) { m_stripFragmentIdentifierFromDataURL = isEnabled; }
> +    bool stripFragmentIdentifierFromDataURL() const { return m_stripFragmentIdentifierFromDataURL; }

Can we use Settings instead?
Comment 55 Rob Buis 2020-09-30 02:27:53 PDT
Created attachment 410102 [details]
Patch
Comment 56 Rob Buis 2020-09-30 09:43:13 PDT
(In reply to Darin Adler from comment #54)
> Comment on attachment 410028 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=410028&action=review
> 
> > Source/WebCore/page/RuntimeEnabledFeatures.h:269
> > +    void setStripFragmentIdentifierFromDataURL(bool isEnabled) { m_stripFragmentIdentifierFromDataURL = isEnabled; }
> > +    bool stripFragmentIdentifierFromDataURL() const { return m_stripFragmentIdentifierFromDataURL; }
> 
> Can we use Settings instead?

I think so, I made that change.
Comment 57 Darin Adler 2020-09-30 10:35:39 PDT
So here’s the stage this patch is at:

1) It’s not in unconditionally, which is good because we may find compatibility issues that require turning it on and off.

2) It’s a clean, easy to understand, code change.

3) It’s using Settings, not the deprecated RuntimeEnabledFeatures.

4) There’s no API or SPI exposed for tweaking the setting, so if we put it in a release of iOS or macOS, the new behavior will be unconditionally on for all clients, and the fact that there's a Setting inside WebCore won't affect that one way or the other.

(1), (2), and (3) are all excellent!

(4) is probably not OK for Apple, and because of that, checking this in would probably create a sort of "P1 bug" for WebKit at Apple.

To deal with (4) we probably need to come up with a strategy like:

A) "We're just going to try this and 'watch for breakage' and come back to it later before we release this in an iOS update."

B) Like A on Mac, but on iOS use a "linked on or after" check to keep old apps working indefinitely. This would be a check, not a setting.

C) Same as B, but do the same on macOS.

Or maybe something involving adding API to control this. That seems unlikely.

For Apple it’s not great to land this as is. Maybe we can make some small change so that it doesn’t yet do (4) above?
Comment 58 Darin Adler 2020-09-30 10:36:21 PDT
Another possibility is:

D) On Apple platforms, find way to change this only for "web browsers" and not other uses of WebKit.

Or a mix of D with those A-C options.
Comment 59 Alex Christensen 2020-09-30 15:24:40 PDT
I think D is a bad idea.  I think we want to minimize the differences in behavior between "web browsing" and "non-web-browsing" WebKit.  Ryosuke seemed to be in favor of A.  I'm in favor of B and C because I think there is a high chance that if we choose A we will find something and be forced to switch to B and C.  It's conservative, but it's also not a very complex place to have a linkedOnOrAfter check, so I think it would be OK to put here for 5 or so years (until most apps have updated) without causing much of a maintenance burden.
Comment 60 Darin Adler 2020-09-30 15:38:15 PDT
Like Alex, I think (C) is appealing.
Comment 61 Ryosuke Niwa 2020-09-30 20:38:13 PDT
(In reply to Darin Adler from comment #60)
> Like Alex, I think (C) is appealing.

Yeah, now that we've learned that Android faced app compatibility issues, it would be a fool's errands to try to ship this outright.
Comment 62 Alex Christensen 2020-10-01 16:04:38 PDT
Created attachment 410279 [details]
Patch
Comment 63 Alex Christensen 2020-10-01 18:27:18 PDT
Created attachment 410297 [details]
Patch
Comment 64 Alex Christensen 2020-10-01 23:24:37 PDT
Created attachment 410304 [details]
Patch
Comment 65 Darin Adler 2020-10-02 08:51:25 PDT
Comment on attachment 410304 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=410304&action=review

> Source/WebKit/UIProcess/Cocoa/WebProcessPoolCocoa.mm:309
> +static bool linkedOnOrAfterEverything()

Why are we doing this now? We’ve had linked-on-or-after checks for a long time. What changed to make this be needed?
Comment 66 Darin Adler 2020-10-02 08:52:07 PDT
Comment on attachment 410304 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=410304&action=review

> Source/WebKit/UIProcess/Cocoa/WebProcessPoolCocoa.mm:318
> +#if PLATFORM(MAC)
> +        value = WebCore::MacApplication::isSafari();
> +#else
> +        value = WebCore::IOSApplication::isMobileSafari();
> +#endif

Seems likely this is wrong for Safari View Controller, to cite one example.
Comment 67 Alex Christensen 2020-10-02 10:02:17 PDT
Comment on attachment 410304 [details]
Patch

WebKitTestRunner and DumpRenderTree use NSUserDefaults instead of SPI to test behavior of an app that is linked on or after everything, and Safari (even on older macOS) behaves as if it were linked on the most recent SDK.  This is not an issue for Safari View Controller, which is always actually linked with the most recent SDK.
The linkedOnOrAfter implementation is basically duplicate code in WebKitLegacy and WebKit that we need to use here in WebCore.  I was trying to make a minimal change to get that working, but I think it would be even better if I unified linkedOnOrAfter, used SPI instead of NSUserDefaults, and moved it to WebCore in a separate patch before this change.
Comment 68 Darin Adler 2020-10-02 10:11:15 PDT
(In reply to Alex Christensen from comment #67)
> I was trying
> to make a minimal change to get that working, but I think it would be even
> better if I unified linkedOnOrAfter, used SPI instead of NSUserDefaults, and
> moved it to WebCore in a separate patch before this change.

Yes. I think the function and method names involved should mention testing, because otherwise it’s such a bizarre feature.
Comment 69 Alex Christensen 2020-10-05 09:07:18 PDT
Created attachment 410526 [details]
Patch
Comment 70 Alex Christensen 2020-10-05 12:36:36 PDT
http://trac.webkit.org/r267995
Comment 71 Radar WebKit Bug Importer 2020-10-05 12:38:36 PDT
<rdar://problem/69966418>
Comment 72 Rob Buis 2020-10-05 13:07:55 PDT
(In reply to Alex Christensen from comment #69)
> Created attachment 410526 [details]
> Patch

Thanks for your work on this Alex!
Comment 73 Jer Noble 2020-12-07 11:48:37 PST
FWIW, this change broke playback on Starz.com in Safari, as they use a data URI which looks like: `data:application/x-mpegURL,#EXTM3U%0A#EXT-X-INDEPENDENT-SEGMENTS%0A...`.
Comment 74 Alex Christensen 2020-12-07 13:31:51 PST
Fixing in https://bugs.webkit.org/show_bug.cgi?id=219612