RESOLVED FIXED 239585
Move global AtomStrings to a common header to promote reuse
https://bugs.webkit.org/show_bug.cgi?id=239585
Summary Move global AtomStrings to a common header to promote reuse
Chris Dumez
Reported 2022-04-20 21:06:58 PDT
Move global AtomStrings to a common header to promote reuse.
Attachments
Patch (125.67 KB, patch)
2022-04-20 21:15 PDT, Chris Dumez
ews-feeder: commit-queue-
Patch (125.68 KB, patch)
2022-04-20 21:25 PDT, Chris Dumez
no flags
Patch (125.45 KB, patch)
2022-04-21 08:24 PDT, Chris Dumez
ews-feeder: commit-queue-
Patch (123.74 KB, patch)
2022-04-21 08:38 PDT, Chris Dumez
ews-feeder: commit-queue-
Patch (122.22 KB, patch)
2022-04-21 10:37 PDT, Chris Dumez
no flags
Patch (123.02 KB, patch)
2022-04-21 15:16 PDT, Chris Dumez
ews-feeder: commit-queue-
Patch (124.75 KB, patch)
2022-04-21 16:57 PDT, Chris Dumez
no flags
Patch (119.90 KB, patch)
2022-04-22 17:45 PDT, Chris Dumez
no flags
Chris Dumez
Comment 1 2022-04-20 21:15:26 PDT
Chris Dumez
Comment 2 2022-04-20 21:25:52 PDT
Chris Dumez
Comment 3 2022-04-21 08:24:34 PDT
Chris Dumez
Comment 4 2022-04-21 08:38:35 PDT
Chris Dumez
Comment 5 2022-04-21 10:37:11 PDT
Chris Dumez
Comment 6 2022-04-21 15:16:48 PDT
Chris Dumez
Comment 7 2022-04-21 16:57:21 PDT
Geoffrey Garen
Comment 8 2022-04-22 10:54:10 PDT
Comment on attachment 458103 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=458103&action=review r=me > Source/WebKit/NetworkProcess/NetworkProcess.cpp:322 > + WebCore::initializeCommonAtomStrings(); As a reluctant expert in JSC::initialize(), WTF::initialize(), and WTF::initializeMainThread(), I am saddened to see yet another "make sure to call this initialization function at various entrypoints otherwise you will crash". Can we fold common atom string initialization into some other initialization function? Maybe just fold it into AtomString::init()? In the ideal world, the set of top-level initialization functions should shrink over time. Maybe we can even get down to just two, one for WebKit and one for JSC.
Chris Dumez
Comment 9 2022-04-22 10:56:24 PDT
(In reply to Geoffrey Garen from comment #8) > Comment on attachment 458103 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=458103&action=review > > r=me > > > Source/WebKit/NetworkProcess/NetworkProcess.cpp:322 > > + WebCore::initializeCommonAtomStrings(); > > As a reluctant expert in JSC::initialize(), WTF::initialize(), and > WTF::initializeMainThread(), I am saddened to see yet another "make sure to > call this initialization function at various entrypoints otherwise you will > crash". Can we fold common atom string initialization into some other > initialization function? Maybe just fold it into AtomString::init()? AtomString::init() is in WTF/. How could it do initialization for WebCore/? I agree with the sentiment but not sure how to address it.
Darin Adler
Comment 10 2022-04-22 10:58:48 PDT
Comment on attachment 458103 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=458103&action=review > Source/WebCore/contentextensions/ContentExtensionParser.cpp:208 > AtomString::init(); > + initializeCommonAtomStrings(); Seems like we could make this more foolproof by having one function that calls both AtomString::init and WebCore::initializeCommonAtomStrings instead of having both calls in each place. Could be as simple as having initializeCommonAtomStrings call AtomString::init. > Source/WebCore/html/HTMLFormElement.cpp:1016 > + return equalIgnoringASCIICase(attributeWithoutSynchronization(autocompleteAttr), "off") ? offAtom() : onAtom(); This can use equalLettersIgnoringASCIICase for slightly better performance. > Source/WebCore/platform/CommonAtomStrings.cpp:60 > + static std::once_flag initializeKey; > + std::call_once(initializeKey, [] { > + // Initialization is not thread safe, so this function must be called from the main thread first. > + ASSERT(isUIThread()); Use of std::call_once is a *little* confusing, since the AtomString table can only be used on one thread. I assume it’s specific to the iOS legacy "web thread" machinery; the comment doesn’t make that entirely clear. If it wasn't iOS, presumably we would not need the thread safety from std::call_once and could just assert isMainThread. Might even be nice to write it that way if there was a clean way to do so. I suggest moving the assertion out of the call_once, because we don’t want anyone to write code that calls initializeCommonAtomStrings on something that is not a UI thread, even if a particular call site happens to not to be the first caller at *this* time. > Source/WebCore/platform/CommonAtomStrings.h:33 > +extern MainThreadLazyNeverDestroyed<const AtomString> alternativeAtomData; I find it annoying to have these list repeated 4 times. Even if we use macros to achieve that, would be nice not to repeat the names 4 times, along with the strings. > Source/WebCore/platform/CommonAtomStrings.h:81 > +WEBCORE_EXPORT void initializeCommonAtomStrings(); Is it a good tradeoff to always initialize all of these? Eventually as the list gets long that might not be so great, instead of lazy initialization do it all up front, and have them all in the AtomString table. This has given me pause in past when creating things like EventNames.h. > Source/WebCore/svg/SVGAnimationElement.cpp:322 > +static const AtomString& sumAtom() You decided to factor this out, but not move it into CommonAtomStrings.h? > Source/WebCore/svg/animation/SVGSMILElement.cpp:71 > +static const AtomString& indefiniteAtom() Same pattern here too. But it’s not mentioned in the change log. Not sure how you decided when to do which.
Chris Dumez
Comment 11 2022-04-22 11:04:12 PDT
(In reply to Darin Adler from comment #10) > Comment on attachment 458103 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=458103&action=review > > > Source/WebCore/contentextensions/ContentExtensionParser.cpp:208 > > AtomString::init(); > > + initializeCommonAtomStrings(); > > Seems like we could make this more foolproof by having one function that > calls both AtomString::init and WebCore::initializeCommonAtomStrings instead > of having both calls in each place. Could be as simple as having > initializeCommonAtomStrings call AtomString::init. Good idea. > > Source/WebCore/html/HTMLFormElement.cpp:1016 > > + return equalIgnoringASCIICase(attributeWithoutSynchronization(autocompleteAttr), "off") ? offAtom() : onAtom(); > > This can use equalLettersIgnoringASCIICase for slightly better performance. > Ok. > > Source/WebCore/platform/CommonAtomStrings.cpp:60 > > + static std::once_flag initializeKey; > > + std::call_once(initializeKey, [] { > > + // Initialization is not thread safe, so this function must be called from the main thread first. > > + ASSERT(isUIThread()); > > Use of std::call_once is a *little* confusing, since the AtomString table > can only be used on one thread. I assume it’s specific to the iOS legacy > "web thread" machinery; the comment doesn’t make that entirely clear. > > If it wasn't iOS, presumably we would not need the thread safety from > std::call_once and could just assert isMainThread. Might even be nice to > write it that way if there was a clean way to do so. > I suggest moving the assertion out of the call_once, because we don’t want > anyone to write code that calls initializeCommonAtomStrings on something > that is not a UI thread, even if a particular call site happens to not to be > the first caller at *this* time. I was just copying the pattern in AtomString. > > Source/WebCore/platform/CommonAtomStrings.h:33 > > +extern MainThreadLazyNeverDestroyed<const AtomString> alternativeAtomData; > > I find it annoying to have these list repeated 4 times. Even if we use > macros to achieve that, would be nice not to repeat the names 4 times, along > with the strings. > > > Source/WebCore/platform/CommonAtomStrings.h:81 > > +WEBCORE_EXPORT void initializeCommonAtomStrings(); > > Is it a good tradeoff to always initialize all of these? Eventually as the > list gets long that might not be so great, instead of lazy initialization do > it all up front, and have them all in the AtomString table. This has given > me pause in past when creating things like EventNames.h. So maybe instead of an upfront initialization, I can move to functions with an out of line body that has a: MainThreadNeverDestroyed<const AtomString> alternativeAtomData("alternative"_s); Then there wouldn't be any regression compared to existing code, would there? > > Source/WebCore/svg/SVGAnimationElement.cpp:322 > > +static const AtomString& sumAtom() > > You decided to factor this out, but not move it into CommonAtomStrings.h? > > > Source/WebCore/svg/animation/SVGSMILElement.cpp:71 > > +static const AtomString& indefiniteAtom() > > Same pattern here too. But it’s not mentioned in the change log. Not sure > how you decided when to do which. I could. I didn't do it because the callers were all in the same cpp so they just needed to be file global, not project global.
Darin Adler
Comment 12 2022-04-22 11:17:59 PDT
Comment on attachment 458103 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=458103&action=review >>> Source/WebCore/platform/CommonAtomStrings.h:33 >>> +extern MainThreadLazyNeverDestroyed<const AtomString> alternativeAtomData; >> >> I find it annoying to have these list repeated 4 times. Even if we use macros to achieve that, would be nice not to repeat the names 4 times, along with the strings. > > So maybe instead of an upfront initialization, I can move to functions with an out of line body that has a: > MainThreadNeverDestroyed<const AtomString> alternativeAtomData("alternative"_s); > > Then there wouldn't be any regression compared to existing code, would there? Probably don’t need to make a change. It’s a tradeoff. Paying for a function call and a check if it’s already initialized each time the string is used vs. using speed, more memory, and more space in the AtomString table at the time WebCore is initialized. Honestly I don’t know which side of the tradeoff we fall on, but I do know it changes as we add more and more strings, especially if they aren’t all used depending on web content. There are possible optimizations for creating them too, some of which apply only if we do them all at once. Adding many strings to the AtomString table at once could likely be optimized; if nothing else, we could size the hash table before adding all the new entries instead of possibly doing multiple rehashes during the creation process, but maybe something even better. And with either approach, we could calculate the hashes at compile time instead of runtime for these strings. Not sure which of these techniques are important enough. It’s tempting to have something where we know that callers just fetch a global pointer and do no other work, I think that’s the kind of efficiency we have for emptyString() and emptyAtom for example, and you have that here. Good news is that since you are using function calls at each call site, we can change it later if we decide we got it wrong, without affecting the call sites.
Chris Dumez
Comment 13 2022-04-22 17:45:04 PDT
EWS
Comment 14 2022-04-22 21:22:40 PDT
Committed r293285 (249910@main): <https://commits.webkit.org/249910@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 458190 [details].
Radar WebKit Bug Importer
Comment 15 2022-04-22 21:23:15 PDT
Note You need to log in before you can comment on or make changes to this bug.