RESOLVED FIXED238576
Add ability to mask URLs from DOM and JavaScript
https://bugs.webkit.org/show_bug.cgi?id=238576
Summary Add ability to mask URLs from DOM and JavaScript
Timothy Hatcher
Reported 2022-03-30 15:00:30 PDT
We should have a way to block certain URLs from being returned to DOM getters and JavaScript Error stack traces. <rdar://81991245>
Attachments
Patch (45.23 KB, patch)
2022-04-03 22:23 PDT, Timothy Hatcher
no flags
Patch (50.95 KB, patch)
2022-04-04 09:49 PDT, Timothy Hatcher
ews-feeder: commit-queue-
Patch (51.24 KB, patch)
2022-04-04 11:14 PDT, Timothy Hatcher
ews-feeder: commit-queue-
Patch (51.24 KB, patch)
2022-04-04 11:44 PDT, Timothy Hatcher
no flags
Patch (52.42 KB, patch)
2022-04-04 18:46 PDT, Timothy Hatcher
no flags
Patch (50.76 KB, patch)
2022-04-11 11:23 PDT, Timothy Hatcher
no flags
Patch (50.45 KB, patch)
2022-04-12 14:40 PDT, Timothy Hatcher
no flags
Patch (50.68 KB, patch)
2022-04-12 18:17 PDT, Timothy Hatcher
ews-feeder: commit-queue-
Timothy Hatcher
Comment 1 2022-04-03 22:23:02 PDT Comment hidden (obsolete)
Alexey Proskuryakov
Comment 2 2022-04-04 09:28:04 PDT
Comment on attachment 456529 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=456529&action=review > Source/JavaScriptCore/ChangeLog:3 > + Add ability to mask URLs from DOM and JavaScript Is this what other browsers do? Closing every way in which JS code can see URLs seems like a losing game. Also, there is some hot code being touched here, so I'm not optimistic about perf impact.
Timothy Hatcher
Comment 3 2022-04-04 09:49:26 PDT Comment hidden (obsolete)
Timothy Hatcher
Comment 4 2022-04-04 09:52:47 PDT
(In reply to Alexey Proskuryakov from comment #2) > Comment on attachment 456529 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=456529&action=review > > > Source/JavaScriptCore/ChangeLog:3 > > + Add ability to mask URLs from DOM and JavaScript > > Is this what other browsers do? Other browsers don't do this, but should consider it! > Closing every way in which JS code can see URLs seems like a losing game. Yeah, I completely understand that. Should we just give up and jet this be a fingerprinting vector then? > Also, there is some hot code being touched here, so I'm not optimistic about > perf impact. I'm also aware of this. Any suggestions?
Alexey Proskuryakov
Comment 5 2022-04-04 10:09:24 PDT
One suggestion is to please A/B test this change before landing. > Yeah, I completely understand that. Should we just give up and jet this be a fingerprinting vector then? "[extension code]" is a fingerprinting giveaway too, as is checking where exactly those appear, as is the actual impact of what extensions do. I know that this is not what the report is about, but (1) I don't feel comfortable discussing the specific report here, and (2) not sure how useful it is to focus just on that. Do we have a forum where extension behaviors of this kind can be discussed with other browser engine vendors?
Timothy Hatcher
Comment 6 2022-04-04 10:44:08 PDT
(In reply to Alexey Proskuryakov from comment #5) > One suggestion is to please A/B test this change before landing. > > > Yeah, I completely understand that. Should we just give up and jet this be a fingerprinting vector then? > > "[extension code]" is a fingerprinting giveaway too, as is checking where > exactly those appear, as is the actual impact of what extensions do. I know > that this is not what the report is about, but (1) I don't feel comfortable > discussing the specific report here, and (2) not sure how useful it is to > focus just on that. If extensions are active, that is almost impossible to hide since they change the can change the page in uniquely identifiable ways. It is the URL we want to hide, since that is 100% unique. Hiding the fact that extensions are active is not a goal. > Do we have a forum where extension behaviors of this kind can be discussed > with other browser engine vendors? Yes, we have the Web Extensions Community Group and I plan to bring this up. But some of this problem are unique to Safari too.
Timothy Hatcher
Comment 7 2022-04-04 11:14:45 PDT Comment hidden (obsolete)
Timothy Hatcher
Comment 8 2022-04-04 11:44:16 PDT Comment hidden (obsolete)
Alex Christensen
Comment 9 2022-04-04 12:02:48 PDT
Comment on attachment 456598 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=456598&action=review > Source/JavaScriptCore/runtime/StackFrame.cpp:71 > + // FIXME: This should not be hardcoded here, but come from some API with what schemes to mask. > + if (sourceURL.startsWithIgnoringASCIICase("safari-"_s)) This might be fine for performance testing this patch, but I don't think we should land code like this. This is a layering violation and if it's not easy to fix now then it will be even harder once we've committed to this functionality being here.
Timothy Hatcher
Comment 10 2022-04-04 12:09:43 PDT
(In reply to Alex Christensen from comment #9) > Comment on attachment 456598 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=456598&action=review > > > Source/JavaScriptCore/runtime/StackFrame.cpp:71 > > + // FIXME: This should not be hardcoded here, but come from some API with what schemes to mask. > > + if (sourceURL.startsWithIgnoringASCIICase("safari-"_s)) > > This might be fine for performance testing this patch, but I don't think we > should land code like this. This is a layering violation and if it's not > easy to fix now then it will be even harder once we've committed to this > functionality being here. Yeah, I agree. I'm continuing to look at the best way to add a JSC API for this.
Alexey Proskuryakov
Comment 11 2022-04-04 13:10:59 PDT
> It is the URL we want to hide, since that is 100% unique. Why does it have to be unique enough to identify the user, but not random enough to prevent identifying the user? > Hiding the fact that extensions are active is not a goal. I understand that it is not a goal of this patch. I'm seeking a confirmation that this patch solves a problem that is worth solving - in other words, whether having extensions provides enough bits to identify the user with or without this patch.
Timothy Hatcher
Comment 12 2022-04-04 15:14:22 PDT
(In reply to Alexey Proskuryakov from comment #11) > > It is the URL we want to hide, since that is 100% unique. > > Why does it have to be unique enough to identify the user, but not random > enough to prevent identifying the user? > > > Hiding the fact that extensions are active is not a goal. > > I understand that it is not a goal of this patch. I'm seeking a confirmation > that this patch solves a problem that is worth solving - in other words, > whether having extensions provides enough bits to identify the user with or > without this patch. I’m not sure I fully follow. Extensions in Safari get a new UUID for the host part of their URL on launch. This UUID is the same for all tabs, private or otherwise. That UUID is what we need to hide to prevent that cross site/tab tracking with a 128-bit id. Without that UUID, only a single bit of knowing an extension, and maybe a specific extension, could be determined if they modify a page. The more popular the extension, like a password manager, the less useful that bit is.
Alexey Proskuryakov
Comment 13 2022-04-04 17:29:36 PDT
I do not know what the average bucket size is today, so I'm not sure if installing an extension that 1% (or say 0.1%) or users have means that the browser becomes unique in the bucket. This is what I was seeking for someone to confirm. My other question was about why the UUID behaved like that, shared across all tabs.
Alexey Proskuryakov
Comment 14 2022-04-04 17:30:18 PDT
(and why it's a UUID, can we just have incrementing numbers instead, for example)
Timothy Hatcher
Comment 15 2022-04-04 18:44:44 PDT
(In reply to Alexey Proskuryakov from comment #13) > I do not know what the average bucket size is today, so I'm not sure if > installing an extension that 1% (or say 0.1%) or users have means that the > browser becomes unique in the bucket. This is what I was seeking for someone > to confirm. A good percentage of users have at least one extension active. > My other question was about why the UUID behaved like that, shared across > all tabs. > > (and why it's a UUID, can we just have incrementing numbers instead, for example) We register the UUID with a scheme handler to handle loading resources for extensions. It is the host name essentially. Do we need to meet about this? I feel we could clear things up quicker in a meeting.
Timothy Hatcher
Comment 16 2022-04-04 18:46:18 PDT Comment hidden (obsolete)
Timothy Hatcher
Comment 17 2022-04-06 10:26:49 PDT
I was able to A/B test this patch: Speedometer: A: 84.07 pt ± 3.0 pt B: 84.39 pt ± 2.3 pt 0.38% better (insignificant) PLT (Mac): A: 532.9 ms ± 3.8 ms B: 532.4 ms ± 4.8 ms 0.10% better (insignificant) PLT (iPhone): A: 489.8 ms ± 6.7 ms B: 487.3 ms ± 7.2 ms 0.51% better (insignificant) I'm still checking Jetstream (waiting on a retry for results).
Timothy Hatcher
Comment 18 2022-04-06 15:01:32 PDT
JetStream 2: A: 114.6 pt ± 1.1 pt B: 114.2 pt ± 1.8 pt 0.35% worse (insignificant)
Alexey Proskuryakov
Comment 19 2022-04-06 17:10:59 PDT
Thank you for running the performance tests. My other concerns stand, but I don't think that I'm the right person to decide if this should be landed, or to work with you on an alternative solution in Safari. So Im deferring to others' judgement. Seems like api-ios EWS failure is relevant.
Chris Dumez
Comment 20 2022-04-06 17:38:30 PDT
(In reply to Alexey Proskuryakov from comment #19) > Thank you for running the performance tests. My other concerns stand, but I > don't think that I'm the right person to decide if this should be landed, or > to work with you on an alternative solution in Safari. So Im deferring to > others' judgement. > > Seems like api-ios EWS failure is relevant. I need to look more into this, this look overly intrusive to me. I bet there is a better way.
Timothy Hatcher
Comment 21 2022-04-11 11:23:26 PDT Comment hidden (obsolete)
Timothy Hatcher
Comment 22 2022-04-12 14:40:12 PDT Comment hidden (obsolete)
Timothy Hatcher
Comment 23 2022-04-12 18:17:37 PDT Comment hidden (obsolete)
Darin Adler
Comment 24 2022-04-12 18:21:23 PDT
Comment on attachment 457495 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=457495&action=review > Source/WebCore/page/Page.cpp:3737 > + return m_maskedURLSchemes.contains(url.protocol().toStringWithoutCopying()); This can be done without calling toString, using the StringView hash translator, which should be considerably more efficient.
Chris Dumez
Comment 25 2022-04-12 19:18:00 PDT
Comment on attachment 457495 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=457495&action=review Does the origin leak any information we don't want to leak or is it only about full URLs? Thinking about self.origin for example. And window.location.ancestorOrigins. Also, service workers can see the URL of their clients via ServiceWorkerClient.url. Not sure if that's covered and if it matters for your use cases. Also, HashChangeEvent contains the old and new URL when changing the hash of the URL. Wouldn't it be an issue here? > Source/WebCore/dom/Document.cpp:5635 > + static std::once_flag onceFlag; Why std::call_once()? I doubt this can be called from multiple threads given that this is on Document. Why not use a regular MainThreadNeverDestroyed? > Source/WebCore/dom/Document.h:723 > + bool shouldMaskURLForBindings(const URL&) const; Doesn't seem like this needs to be public. > Source/WebCore/dom/Document.h:727 > + const URL& maskedURLForBindings() const; ditto. > Source/WebCore/dom/Element.cpp:728 > + synchronizeAttribute(name); Can't this call getAttribute() and then do some post-processing based on resolveURLs? This would reduce code duplication. > Source/WebCore/dom/Element.cpp:1829 > +AtomString Element::getAttributeForBindings(const AtomString& localName, ResolveURLs resolveURLs) const localName -> qualifiedName, per spec. > Source/WebCore/dom/Element.cpp:1834 > + synchronizeAttribute(localName); Can't this call getAttribute() and then do some post-processing based on resolveURLs? This would reduce code duplication. > Source/WebCore/dom/Element.h:84 > +enum class ResolveURLs : uint8_t { No, NoExcludingURLsForPrivacy, Yes, YesExcludingURLsForPrivacy }; Maybe this should be 2 separate enums? One for resolvingURLs and one for excludingURLsForPrivacy? > Source/WebCore/dom/Element.h:144 > + WEBCORE_EXPORT const AtomString& getAttribute(const AtomString& localName) const; Please revert this change. The new name is wrong and the new name is correct. See the spec here: https://dom.spec.whatwg.org/#dom-element-getattribute This is a qualified name as a string (e.g. "ns:foo"). > Source/WebCore/dom/Element.h:147 > + AtomString getAttributeForBindings(const AtomString& localName, ResolveURLs = ResolveURLs::NoExcludingURLsForPrivacy) const; Should probably use qualifiedName too given that this is the equivalent function for bindings now. > Source/WebCore/dom/Element.idl:44 > DOMString? getAttributeNS(DOMString? namespaceURI, DOMString localName); I am a little surprised you didn't have to hijack getAttributeNS() too. After all, getAttribute() and getAttributeNS() are almost identical except that you can specify a namespace. As a matter of fact, I believe passing null as namespace will make it identical to getAttribute(). >> Source/WebCore/page/Page.cpp:3737 >> + return m_maskedURLSchemes.contains(url.protocol().toStringWithoutCopying()); > > This can be done without calling toString, using the StringView hash translator, which should be considerably more efficient. m_maskedURLSchemes.contains<StringViewHashTranslator>(url.protocol()); > Tools/TestWebKitAPI/Tests/WebKitCocoa/WKWebViewConfiguration.mm:236 > + We may want to add tests for window.location.href and probably window.location.protocol / port / host / pathname / search / hash, document.url, document.documentURI.
Darin Adler
Comment 26 2022-04-13 09:14:52 PDT
Based on Chris’s comments above we must add getAttributeNS test coverage.
Timothy Hatcher
Comment 27 2022-04-13 10:57:00 PDT
(In reply to Chris Dumez from comment #25) > Comment on attachment 457495 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=457495&action=review > > Does the origin leak any information we don't want to leak or is it only > about full URLs? Thinking about self.origin for example. And > window.location.ancestorOrigins. The origin is the important part to hide, but the full URL can contain info too (depending on the extension). However, these URLs only need masked when the web view is showing external pages that can have extension content injected. The main document will never be an extension loaded origin, so things like self.origin, etc are not a concern for this use-case. I agree if other clients needed this SPI, we would need to protect more. > Also, service workers can see the URL of their clients via > ServiceWorkerClient.url. Not sure if that's covered and if it matters for > your use cases. Similarly, service workers are not a concern. > Also, HashChangeEvent contains the old and new URL when changing the hash of > the URL. Wouldn't it be an issue here? Ditto.
Timothy Hatcher
Comment 28 2022-04-13 10:57:36 PDT
(In reply to Darin Adler from comment #26) > Based on Chris’s comments above we must add getAttributeNS test coverage. Yep! Good catch Chris.
Darin Adler
Comment 29 2022-04-13 11:01:36 PDT
We may need some adversarial testing to find places URLs might leak out.
Timothy Hatcher
Comment 30 2022-06-24 12:26:10 PDT
Comment on attachment 457495 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=457495&action=review >> Source/WebCore/dom/Document.cpp:5635 >> + static std::once_flag onceFlag; > > Why std::call_once()? I doubt this can be called from multiple threads given that this is on Document. Why not use a regular MainThreadNeverDestroyed? Fixed. >> Source/WebCore/dom/Document.h:723 >> + bool shouldMaskURLForBindings(const URL&) const; > > Doesn't seem like this needs to be public. It is called from Element. >> Source/WebCore/dom/Document.h:727 >> + const URL& maskedURLForBindings() const; > > ditto. Ditto. >> Source/WebCore/dom/Element.cpp:728 >> + synchronizeAttribute(name); > > Can't this call getAttribute() and then do some post-processing based on resolveURLs? This would reduce code duplication. I added an inline getAttributeInternal since we need the Attribute, and getAttribute returns an AtomString. >> Source/WebCore/dom/Element.cpp:1829 >> +AtomString Element::getAttributeForBindings(const AtomString& localName, ResolveURLs resolveURLs) const > > localName -> qualifiedName, per spec. Fixed. >> Source/WebCore/dom/Element.cpp:1834 >> + synchronizeAttribute(localName); > > Can't this call getAttribute() and then do some post-processing based on resolveURLs? This would reduce code duplication. See above. >> Source/WebCore/dom/Element.h:84 >> +enum class ResolveURLs : uint8_t { No, NoExcludingURLsForPrivacy, Yes, YesExcludingURLsForPrivacy }; > > Maybe this should be 2 separate enums? One for resolvingURLs and one for excludingURLsForPrivacy? That made the switches more unwieldy in my opinion. >> Source/WebCore/dom/Element.h:144 >> + WEBCORE_EXPORT const AtomString& getAttribute(const AtomString& localName) const; > > Please revert this change. The new name is wrong and the new name is correct. See the spec here: > https://dom.spec.whatwg.org/#dom-element-getattribute > > This is a qualified name as a string (e.g. "ns:foo"). Fixed. >> Source/WebCore/dom/Element.h:147 >> + AtomString getAttributeForBindings(const AtomString& localName, ResolveURLs = ResolveURLs::NoExcludingURLsForPrivacy) const; > > Should probably use qualifiedName too given that this is the equivalent function for bindings now. Fixed. >> Source/WebCore/dom/Element.idl:44 >> DOMString? getAttributeNS(DOMString? namespaceURI, DOMString localName); > > I am a little surprised you didn't have to hijack getAttributeNS() too. After all, getAttribute() and getAttributeNS() are almost identical except that you can specify a namespace. As a matter of fact, I believe passing null as namespace will make it identical to getAttribute(). Good catch! I did miss that one because an earlier version of my change just modified Element::getAttribute, which getAttributeNS uses. Once I switched to a getAttributeForBindings I needed to do similar for getAttributeNS. >>> Source/WebCore/page/Page.cpp:3737 >>> + return m_maskedURLSchemes.contains(url.protocol().toStringWithoutCopying()); >> >> This can be done without calling toString, using the StringView hash translator, which should be considerably more efficient. > > m_maskedURLSchemes.contains<StringViewHashTranslator>(url.protocol()); Awesome! >> Tools/TestWebKitAPI/Tests/WebKitCocoa/WKWebViewConfiguration.mm:236 >> + > > We may want to add tests for window.location.href and probably window.location.protocol / port / host / pathname / search / hash, document.url, document.documentURI. Those URLs are not covered by this API. The intent of this API is to hide sub resources and links, not the main resource URL.
Timothy Hatcher
Comment 31 2022-06-24 13:03:18 PDT
EWS
Comment 32 2022-06-29 14:14:53 PDT
Committed 251962@main (c0c652e4bd3c): <https://commits.webkit.org/251962@main> Reviewed commits have been landed. Closing PR #1783 and removing active labels.
Note You need to log in before you can comment on or make changes to this bug.