Bug 235070

Summary:

Make ServiceWorkerClient.id a UUID instead of a string derived from a ScriptExecutionContextIdentifier

Product:

WebKit

Reporter:

youenn fablet <youennf>

Component:

Service Workers

Assignee:

youenn fablet <youennf>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

achristensen, benjamin, cdumez, cmarcelo, darin, esprehn+autocc, ews-watchlist, falken, japhet, kangil.han, kondapallykalyan, webkit-bug-importer

Priority:

Keywords:

InRadar

Version:

WebKit Nightly Build

Hardware:

Unspecified

OS:

Unspecified

Bug Depends on:

Bug Blocks:

235107

Attachments:

Description	Flags
Patch	none
Patch	none
Patch	none
Patch	none
Patch	none
Patch	none
Patch for landing	none
Patch	ews-feeder: commit-queue-
Patch	ews-feeder: commit-queue-
Patch	none

youenn fablet

Reported 2022-01-11 08:35:50 PST

Make ServiceWorkerClient.id a UUID instead of a string derived from a ScriptExecutionContextIdentifier

Attachments
Patch (79.30 KB, patch) 2022-01-11 08:45 PST, youenn fablet	no flags	Details Formatted Diff Diff
Patch (82.73 KB, patch) 2022-01-13 02:54 PST, youenn fablet	no flags	Details Formatted Diff Diff
Patch (82.84 KB, patch) 2022-01-13 04:38 PST, youenn fablet	no flags	Details Formatted Diff Diff
Patch (91.99 KB, patch) 2022-01-13 09:28 PST, youenn fablet	no flags	Details Formatted Diff Diff
Patch (92.80 KB, patch) 2022-01-14 01:34 PST, youenn fablet	no flags	Details Formatted Diff Diff
Patch (92.10 KB, patch) 2022-01-14 04:33 PST, youenn fablet	no flags	Details Formatted Diff Diff
Patch for landing (89.27 KB, patch) 2022-01-16 23:42 PST, youenn fablet	no flags	Details Formatted Diff Diff
Patch (12.14 KB, patch) 2022-01-18 03:45 PST, youenn fablet	ews-feeder: commit-queue-	Details Formatted Diff Diff
Patch (12.29 KB, patch) 2022-01-18 03:55 PST, youenn fablet	ews-feeder: commit-queue-	Details Formatted Diff Diff
Patch (12.36 KB, patch) 2022-01-18 04:30 PST, youenn fablet	no flags	Details Formatted Diff Diff
Show Obsolete (9) View All Add attachment proposed patch, testcase, etc.

youenn fablet

Comment 1 2022-01-11 08:45:40 PST

Created attachment 448850 [details] Patch

Chris Dumez

Comment 2 2022-01-12 08:20:30 PST

Comment on attachment 448850 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=448850&action=review > Source/WebCore/ChangeLog:8 > + Make SWServer store the maping between internal ScriptExecutinoContextIdentifiers and visible identifiers. Typos: maping and ScriptExecutinoContextIdentifiers. The fact that we have to store a mapping is sad. It could for example get out of date. > Source/WebCore/ChangeLog:9 > + Visible identifiers are now generated as UUIDs, which is mapping what Chrome and Firefox are doing. Why though? The spec [1] merely says "An opaque string that uniquely identifies this environment." so I think we should use whatever is convenient. Note that there should not even be a concept of a ServiceWorkerClientIdentifier, per spec it has nothing specific to service workers and it is just an identifier for the environment (aka ScriptExecutionContext). As a matter of fact, the identifiers returned for the clients in the service worker API should match the identifiers returned for the clients in the Web Lock API. [1] https://html.spec.whatwg.org/multipage/webappapis.html#concept-environment-id > Source/WebCore/workers/service/server/SWServer.cpp:946 > + auto visibleId = createCanonicalUUIDString(); As said earlier, client ids returned both by the service worker API and the Web Lock API (and may others) need to match. This is because it is just an environment identifier (defined in the HTML spec as an opaque string), nothing specific to service workers. Therefore, when I see new logic in Service worker code to generate client identifiers, it makes me worry.

Chris Dumez

Comment 3 2022-01-12 08:26:13 PST

Note that if we really want to use a UUID for some reason, an alternative would also be to use a UUID for ScriptExecutionContextIdentifier instead of a ProcessIdentified<ObjectIdentifier>. I just don't think we should have logic specific to service workers here. I also worry of the fact that having to maintain a mapping between internal identifiers and visible identifiers could be error-prone and lead to new bugs.

Chris Dumez

Comment 4 2022-01-12 08:28:44 PST

(In reply to Chris Dumez from comment #3) > Note that if we really want to use a UUID for some reason, an alternative > would also be to use a UUID for ScriptExecutionContextIdentifier instead of > a ProcessIdentified<ObjectIdentifier>. Thinking about this more though, I think this would likely mean keeping some kind of mapping somewhere to match a given UUID to a particular process. Using a ProcessIdentified<ObjectIdentifier> makes it very convenient to find out which process it comes from. That's why I'd like to understand better the benefits of using a UUID compared to what we have now (which seems less complex / error-prone). > > I just don't think we should have logic specific to service workers here. I > also worry of the fact that having to maintain a mapping between internal > identifiers and visible identifiers could be error-prone and lead to new > bugs.

youenn fablet

Comment 5 2022-01-12 09:00:07 PST

(In reply to Chris Dumez from comment #4) > (In reply to Chris Dumez from comment #3) > > Note that if we really want to use a UUID for some reason, an alternative > > would also be to use a UUID for ScriptExecutionContextIdentifier instead of > > a ProcessIdentified<ObjectIdentifier>. > > Thinking about this more though, I think this would likely mean keeping some > kind of mapping somewhere to match a given UUID to a particular process. > Using a ProcessIdentified<ObjectIdentifier> makes it very convenient to find > out which process it comes from. That's why I'd like to understand better > the benefits of using a UUID compared to what we have now (which seems less > complex / error-prone). I can see a few reasons: - It makes it easier to handle the case of exposing an ID for a Document that is not yet created (see follow-up patch to implement resultingClientId). - The current IDs are exposing to JavaScript internal implementation details that could be leveraged. For instance whether two contexts are living in the same process. This may be used to correlate third-party iframes more easily for instance. If a service worker gets corrupted, it also allows a service worker to more easily post message to iframes that are not clients in other processes. - The spec says it should be an "opaque string", I do not think our current ID is matching the 'opaque' part of it. Firefox and Chrome are using some form of UUIDs. I did not know that other specs were using those IDs. I can rewrite the patch to externalise the map so that it can be used by other APIs. Hopefully, these APIs can use a network process based map.

Chris Dumez

Comment 6 2022-01-12 09:03:21 PST

(In reply to youenn fablet from comment #5) > (In reply to Chris Dumez from comment #4) > > (In reply to Chris Dumez from comment #3) > > > Note that if we really want to use a UUID for some reason, an alternative > > > would also be to use a UUID for ScriptExecutionContextIdentifier instead of > > > a ProcessIdentified<ObjectIdentifier>. > > > > Thinking about this more though, I think this would likely mean keeping some > > kind of mapping somewhere to match a given UUID to a particular process. > > Using a ProcessIdentified<ObjectIdentifier> makes it very convenient to find > > out which process it comes from. That's why I'd like to understand better > > the benefits of using a UUID compared to what we have now (which seems less > > complex / error-prone). > > I can see a few reasons: > - It makes it easier to handle the case of exposing an ID for a Document > that is not yet created (see follow-up patch to implement resultingClientId). > - The current IDs are exposing to JavaScript internal implementation details > that could be leveraged. For instance whether two contexts are living in the > same process. This may be used to correlate third-party iframes more easily > for instance. If a service worker gets corrupted, it also allows a service > worker to more easily post message to iframes that are not clients in other > processes. > - The spec says it should be an "opaque string", I do not think our current > ID is matching the 'opaque' part of it. Firefox and Chrome are using some > form of UUIDs. > > I did not know that other specs were using those IDs. > I can rewrite the patch to externalise the map so that it can be used by > other APIs. > Hopefully, these APIs can use a network process based map. LayoutTests/imported/w3c/web-platform-tests/web-locks/clientids.tentative.https.html is supposed to test that ids returned by the service worker API and the Web Lock API are consistent. Your patch doesn't seem to introduce new test failures though, not sure why that is.

youenn fablet

Comment 7 2022-01-13 02:54:08 PST

Created attachment 449039 [details] Patch

youenn fablet

Comment 8 2022-01-13 04:38:04 PST

Created attachment 449046 [details] Patch

youenn fablet

Comment 9 2022-01-13 09:28:35 PST

Created attachment 449071 [details] Patch

youenn fablet

Comment 10 2022-01-14 01:34:41 PST

Created attachment 449147 [details] Patch

youenn fablet

Comment 11 2022-01-14 04:33:11 PST

Created attachment 449161 [details] Patch

Chris Dumez

Comment 12 2022-01-14 08:11:39 PST

Comment on attachment 449161 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=449161&action=review > Source/WebKit/ChangeLog:8 > + Change findByClientIdentifier into findByVisibleClientIdentifieran and make it asn async reply. typo: asn > Source/WebCore/platform/ScriptExecutionContextIdentifier.h:35 > +class ProcessQualified<UUID> { I will note that I find it a bit weird to have a ProcessQualified<UUID> and a UUID is already globally unique (and not process local). > Source/WebCore/platform/ScriptExecutionContextIdentifier.h:98 > + String toVisibleString() const { return m_object.toString(); } It is related to the comment above but it is a bit confusing that we have 2 types of Strings now. It seems the Visible string (UUID) should be sufficient everywhere. > Source/WebCore/platform/ScriptExecutionContextIdentifier.h:121 > +using ScriptExecutionContextIdentifier = ProcessQualified<UUID>; Again, I don't really understand why the ScriptExecutionContextIdentifier needs to be ProcessQualified<> at all after your changes to use a UUID. > Source/WebCore/testing/Internals.cpp:2767 > + return document.identifier().object().toString(); document.identifier().toVisibleString() ?

youenn fablet

Comment 13 2022-01-14 08:41:36 PST

> I will note that I find it a bit weird to have a ProcessQualified<UUID> and > a UUID is already globally unique (and not process local). Internally, we often need to know the WebProcess ID where lives the corresponding context. We could go with a simple UUID (or maybe a typed UUID) but we would probably need each time to query a UUID -> ProcessID map either in UIProcess and/or NetworkProcess, potentially on various threads.

youenn fablet

Comment 14 2022-01-14 08:42:44 PST

> It is related to the comment above but it is a bit confusing that we have 2 > types of Strings now. It seems the Visible string (UUID) should be > sufficient everywhere. The other string is for debug purposes where it might be useful to know the process ID of a given context.

Chris Dumez

Comment 15 2022-01-14 08:43:42 PST

Chris Dumez

Comment 16 2022-01-14 08:48:51 PST

(In reply to youenn fablet from comment #13) > > I will note that I find it a bit weird to have a ProcessQualified<UUID> and > > a UUID is already globally unique (and not process local). > > Internally, we often need to know the WebProcess ID where lives the > corresponding context. We could go with a simple UUID (or maybe a typed > UUID) but we would probably need each time to query a UUID -> ProcessID map > either in UIProcess and/or NetworkProcess, potentially on various threads. I was hoping we wouldn't need such a map because both the network process and the UIProcess know which process the UUID came from (because they get it from a particular IPC connection). As a matter of fact, both the UIProcess and NetworkProcess cannot really trust the ProcessID sent by the WebProcess as part of this ScriptExecutionIdentifier and have to validate it already (since IPC from the WebProcess cannot be trusted). That said, my bet is that once the network process or UIProcess may need to know later on (not right when receiving the IPC) where the UUID came from and at that point the information would indeed be lost.

Chris Dumez

Comment 17 2022-01-14 09:24:21 PST

Comment on attachment 449161 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=449161&action=review r=me with some simplification ideas. I think the patch adds a bit more complexity than we really need. > Source/WebCore/platform/ProcessQualified.h:72 > + const T& objectHash() const Shouldn't be needed, see comment below. > Source/WebCore/platform/ProcessQualified.h:134 > + add(hasher, processQualified.objectHash()); I don't think we should need to add this objectHash(), we should probably just add this to UUID.h: inline void add(Hasher& hasher, const UUID& uuid) { add(hasher, uuid()); } > Source/WebCore/platform/ScriptExecutionContextIdentifier.h:70 > + unsigned objectHash() const Shouldn't be needed, see comment above. > Source/WebCore/platform/ScriptExecutionContextIdentifier.h:97 > + String toDebugString() const { return makeString(m_processIdentifier.toUInt64(), '-', m_object.toString()); } I personally don't think it is worth the extra mental complexity to have 2 different kinds of strings, just to print the process identifier in some cases. I think having only a single string (that is the "visible" string) would be better. > Source/WebCore/testing/Internals.idl:936 > + DOMString serviceWorkerClientInternalIdentifier(Document document); Why do we need to expose "internal" identifiers to the "Web" (I understand it is just tests). Why cannot tests rely on regular client identifiers (UUIDs)?

youenn fablet

Comment 18 2022-01-16 23:14:58 PST

(In reply to Chris Dumez from comment #17) > Comment on attachment 449161 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=449161&action=review > > r=me with some simplification ideas. I think the patch adds a bit more > complexity than we really need. > > > Source/WebCore/platform/ProcessQualified.h:72 > > + const T& objectHash() const > > Shouldn't be needed, see comment below. > > > Source/WebCore/platform/ProcessQualified.h:134 > > + add(hasher, processQualified.objectHash()); > > I don't think we should need to add this objectHash(), we should probably > just add this to UUID.h: > inline void add(Hasher& hasher, const UUID& uuid) > { > add(hasher, uuid()); > } > > > Source/WebCore/platform/ScriptExecutionContextIdentifier.h:70 > > + unsigned objectHash() const > > Shouldn't be needed, see comment above. Will simplify this. > > Source/WebCore/platform/ScriptExecutionContextIdentifier.h:97 > > + String toDebugString() const { return makeString(m_processIdentifier.toUInt64(), '-', m_object.toString()); } > > I personally don't think it is worth the extra mental complexity to have 2 > different kinds of strings, just to print the process identifier in some > cases. I think having only a single string (that is the "visible" string) > would be better. toDebugString is a ProcessQualified concept. I can see whether we can remove it but in general I prefer we keep methods defined in template.

youenn fablet

Comment 19 2022-01-16 23:42:43 PST

Created attachment 449311 [details] Patch for landing

EWS

Comment 20 2022-01-17 02:50:49 PST

Committed r288093 (246107@main): <https://commits.webkit.org/246107@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 449311 [details].

Radar WebKit Bug Importer

Comment 21 2022-01-17 02:51:20 PST

<rdar://problem/87673540>

Darin Adler

Comment 22 2022-01-17 09:48:05 PST

Comment on attachment 449311 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=449311&action=review I spotted a mistake and some ares we could improve coding style. I’d be willing to do this refinement if you are busy on other things. Just let me know if you prefer that > Source/WTF/wtf/UUID.cpp:55 > + // We sanitize the value so as not loosing any data when going to Version 4 UUIDs. "loosing" -> "losing", but also the grammar here is wrong. Grammatical version would be: // We sanitize the value so we do not lose any data when converting this to a Version 4 UUID. But also, the comment doesn’t seem right. Is this being converted to a Version 4 UUID? What exactly *is* this code doing? Changing it from a random 16 byte value into a valid random UUID? Why is clearing these 6 bits sufficient? > Source/WTF/wtf/UUID.cpp:102 > + unsigned y = value[19]; > + if (y == '8' || y == '9') > + y = y - '8'; > + else if (y == 'a' || y == 'A') > + y = 2; > + else if (y == 'b' || y == 'B') > + y = 3; > + else > + return { }; I think it’s a little strange to use "y" both for the character and for the number. Why do this operation "in place"? But also, why do we need "y" at all? Why not just parse the hex numbers, and do any masking or validity checks on the values after they are parsed, instead of working directly on the characters? > Source/WTF/wtf/UUID.cpp:104 > + auto firstValue = parseInteger<unsigned>(value.substring(0, 8), 16); This use, and the other 5 calls below are all wrong, because they will allow leading "+". Simplest solution is to write a wrapper that checks for the "+" and fails if it’s present. Fancier solution would be to add a variant of parseInteger that does not allow the "+", as mentioned in the FIXME In StringToIntegerConversion.h. Trickiest part is probably just deciding what to name such a function. > Source/WTF/wtf/UUID.cpp:134 > + auto* data = reinterpret_cast<unsigned*>(&uuidValue); > + data[0] = *firstValue; > + data[1] = (*secondValue << 16) | *thirdValue; > + data[2] = ((*fourthValue << 16) | *fifthValue) + (y << 30); > + data[3] = *sixthValue; This should be rewritten without the reinterpret_cast; just generally a bad practice to do it if not necessary. There’s no reason we can’t use uint64_t and shifting instead. Could just make a function that takes four uint32_t and assembles them into a UInt128 if the idiom seems too ugly written out in line. > Source/WTF/wtf/UUID.cpp:136 > + return UUID(WTFMove(uuidValue)); No reason to use WTFMove on a scalar. > Source/WTF/wtf/UUID.h:131 > + add(hasher, uuid.hash()); It’s not good to hash, and then hash the hash. Instead this function should add m_data to the Hasher. Some details will need to be worked out, but that’s the general best approach. > Source/WebCore/platform/ScriptExecutionContextIdentifier.h:40 > + : m_object(WTF::UInt128(0)) Would be better style to do this with a default value below and use "= default". Also, why do we need to use UInt128 directly to get a zeroed UUID? Can we change the UUID class so that’s easier to do? > Source/WebCore/platform/ScriptExecutionContextIdentifier.h:48 > + ProcessQualified(UUID&& object, ProcessIdentifier processIdentifier) > + : m_object(WTFMove(object)) > + , m_processIdentifier(processIdentifier) > + { > + } This isn’t valuable. Since a UUID is a scalar, there’s no need to move it. Should just drop this. > Tools/TestWebKitAPI/Tests/WTF/UUID.cpp:43 > + value = UUID::parse("12345678-9abc-5de0-89AB-0123456789ab"); We can add test cases here that show the "+" bug.

youenn fablet

Comment 23 2022-01-18 03:45:29 PST

Reopening to attach new patch.

youenn fablet

Comment 24 2022-01-18 03:45:33 PST

Created attachment 449374 [details] Patch

youenn fablet

Comment 25 2022-01-18 03:51:45 PST

(In reply to Darin Adler from comment #22) > Comment on attachment 449311 [details] > Patch for landing Thanks for the review, I updated the patch accordingly. Also, I made sure we cannot parse specific values like empty or deleted. We should probably add these same checks for the decoder routine as well. > // We sanitize the value so we do not lose any data when converting this > to a Version 4 UUID. > > But also, the comment doesn’t seem right. Is this being converted to a > Version 4 UUID? What exactly *is* this code doing? Changing it from a random > 16 byte value into a valid random UUID? Why is clearing these 6 bits > sufficient? When stringifying a UUID a, we loose the values of these 6 bits. When parsing a's stringified representation into a new UUID b, we want a == b. Either we update == to ignore these 6 bits or we make sure that these 6 bits have fixed values. This code is doing the latter.

youenn fablet

Comment 26 2022-01-18 03:55:50 PST

Created attachment 449376 [details] Patch

youenn fablet

Comment 27 2022-01-18 04:30:47 PST

Created attachment 449379 [details] Patch

EWS

Comment 28 2022-01-18 08:17:48 PST

Committed r288116 (246130@main): <https://commits.webkit.org/246130@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 449379 [details].

Darin Adler

Comment 29 2022-01-18 13:29:57 PST

(In reply to youenn fablet from comment #25) > When stringifying a UUID a, we loose the values of these 6 bits. > When parsing a's stringified representation into a new UUID b, we want a == b. > Either we update == to ignore these 6 bits or we make sure that these 6 bits > have fixed values. > This code is doing the latter. But is that enough to make a valid UUID. I had the impression there were some special requirements to just turn random bits into a valid UUID, rather than just clearing a few of the bits. Like a pattern that says "this is a random UUID" to guarantee no conflicts with non-random UUIDs.

youenn fablet

Comment 30 2022-01-18 14:02:44 PST

(In reply to Darin Adler from comment #29) > (In reply to youenn fablet from comment #25) > > When stringifying a UUID a, we loose the values of these 6 bits. > > When parsing a's stringified representation into a new UUID b, we want a == b. > > Either we update == to ignore these 6 bits or we make sure that these 6 bits > > have fixed values. > > This code is doing the latter. > > But is that enough to make a valid UUID. I had the impression there were > some special requirements to just turn random bits into a valid UUID, rather > than just clearing a few of the bits. Like a pattern that says "this is a > random UUID" to guarantee no conflicts with non-random UUIDs. You are probably right, I'll look at the RFC tomorrow.

youenn fablet

Comment 31 2022-01-19 00:52:54 PST

I looked at https://datatracker.ietf.org/doc/html/rfc4122#section-4.4 and what we are doing is okayish: we are setting randomly the whole UUID but 4+2 bits. The 4 bits generate the '4' and the two bits the '8', '9', 'a', or 'b'. We could change the sanitisation to actually set these 6 bits to 0100 and 10, instead of using 0000 and 00, which would simplify the stringification routine. The internal bit representation would perfectly match the stringified version and the spec. We could also make it clear UUID::create and UUID::parse are dealing with UUIDs v4 only.

Darin Adler

Comment 32 2022-01-19 09:55:48 PST

(In reply to youenn fablet from comment #31) > We could change the sanitisation to actually set these 6 bits to 0100 and > 10, instead of using 0000 and 00, which would simplify the stringification > routine. Yes, great idea. I think that would just be two more lines of code in UUID::create. > We could also make it clear UUID::create and UUID::parse are dealing with > UUIDs v4 only. Yes, I think we should rename them to be more specific. Renaming create to make it clear what kind of UUID it creates seems good. We can just call it "v4" or whatever term makes it clear that it’s the cryptographically-random type of UUID and it creates a new unique one. For UUID::parse, not sure why we wrote one that only works with v4 UUIDs. We might want a UUID::parse that can parse any valid UUID, but which will still refuse to parse ones that conflict with our empty and deleted values for hash tables, which I hope we carefully selected so they are not valid UUIDs. Or maybe we do want to reject non-v4 UUIDs. That’s OK with me, but I’m not sure why we would would want to do that.

youenn fablet

Comment 33 2022-01-19 10:02:55 PST

Some specs require UUID v4, like the mDNS ICE candidate spec so it is good to support parsing and creation of V4

Darin Adler

Comment 34 2022-01-19 10:05:53 PST

Another way to do this is we could use a different class to hold UUIDs that we assert are v4. I don’t think a class named UUID should have that invariant, though.

youenn fablet

Comment 35 2022-01-21 00:03:16 PST

Follow-up work at https://bugs.webkit.org/show_bug.cgi?id=235430 to clarify v4 use

Sam Sneddon [:gsnedders]

Comment 36 2022-06-06 06:38:32 PDT

*** Bug 190313 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.