Bug 216172

Summary: [JSC] Use symbols as identifiers for class fields computed names storage
Product: WebKit Reporter: Xan Lopez <xan.lopez>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: benjamin, caitp, cdumez, cmarcelo, ews-watchlist, keith_miller, mark.lam, msaboff, saam, ticaiolima, tzagallo, webkit-bug-importer, xan.lopez, ysuzuki
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 194095    
Attachments:
Description Flags
Use symbols as identifiers
none
Use symbols as identifiers
ews-feeder: commit-queue-
Use symbols as identifiers, v3
none
Use symbols as identifiers, v4
none
Use symbols as identifiers, v5
ews-feeder: commit-queue-
Use symbols as identifiers, v6
none
Use symbols as identifiers, v7
none
Use symbols as identifiers, v8
none
Use symbols as identifiers, v9
none
Use symbols as identifiers, v10
none
Use symbols as identifiers, v11
ews-feeder: commit-queue-
Use symbols as identifiers, 12
none
Use symbols as identifiers, v13
none
Use symbols as identifiers, v14
none
Use symbols as identifiers, v15 none

Xan Lopez
Reported 2020-09-04 02:57:40 PDT
Right now we are using plain numeric identifiers to store the property keys for the field's computed names ("0" for the first one, "1" for the second, etc). Using symbols is cleaner, avoids potential collisions with other features storing data in the class scope, and will allow for a more sane implementation when we land static class fields.
Attachments
Use symbols as identifiers (6.84 KB, patch)
2020-09-04 08:27 PDT, Xan Lopez
no flags
Use symbols as identifiers (12.48 KB, patch)
2020-10-01 02:34 PDT, Xan Lopez
ews-feeder: commit-queue-
Use symbols as identifiers, v3 (12.49 KB, patch)
2020-10-01 06:16 PDT, Xan Lopez
no flags
Use symbols as identifiers, v4 (16.46 KB, patch)
2020-10-19 07:54 PDT, Xan Lopez
no flags
Use symbols as identifiers, v5 (17.63 KB, patch)
2020-10-21 04:27 PDT, Xan Lopez
ews-feeder: commit-queue-
Use symbols as identifiers, v6 (18.59 KB, patch)
2020-10-21 06:36 PDT, Xan Lopez
no flags
Use symbols as identifiers, v7 (21.02 KB, patch)
2020-10-27 05:02 PDT, Xan Lopez
no flags
Use symbols as identifiers, v8 (20.70 KB, patch)
2020-10-28 02:34 PDT, Xan Lopez
no flags
Use symbols as identifiers, v9 (20.85 KB, patch)
2020-11-09 02:12 PST, Xan Lopez
no flags
Use symbols as identifiers, v10 (20.69 KB, patch)
2020-11-09 03:34 PST, Xan Lopez
no flags
Use symbols as identifiers, v11 (20.87 KB, patch)
2020-11-10 01:25 PST, Xan Lopez
ews-feeder: commit-queue-
Use symbols as identifiers, 12 (20.89 KB, patch)
2020-11-10 01:42 PST, Xan Lopez
no flags
Use symbols as identifiers, v13 (20.87 KB, patch)
2020-11-10 02:14 PST, Xan Lopez
no flags
Use symbols as identifiers, v14 (22.47 KB, patch)
2020-11-11 01:15 PST, Xan Lopez
no flags
Use symbols as identifiers, v15 (22.53 KB, patch)
2020-11-11 04:40 PST, Xan Lopez
no flags
Xan Lopez
Comment 1 2020-09-04 08:27:39 PDT
Created attachment 407975 [details] Use symbols as identifiers v1
Caio Lima
Comment 2 2020-09-04 08:45:28 PDT
Comment on attachment 407975 [details] Use symbols as identifiers View in context: https://bugs.webkit.org/attachment.cgi?id=407975&action=review > Source/JavaScriptCore/parser/Parser.cpp:2851 > + Symbol* symbol = Symbol::create(vm, vm.symbolRegistry().symbolForKey(symbolName)); I think we'd like to use PrivateSymbols here instead. Also, I think it is necessary to use a separate SymbolRegistry for this internal usage, since we can have `Symbol.for("0")` from use code and I don't think it's safe to use something that can be retrieved by arbritrary code. Finally, I don't think we actually need to create a Symbol, since we are only interested into uid and `vm.symbolRegistry().symbolForKey(symbolName)` is already giving us that.
Radar WebKit Bug Importer
Comment 3 2020-09-11 02:58:14 PDT
Xan Lopez
Comment 4 2020-10-01 02:34:23 PDT
Created attachment 410212 [details] Use symbols as identifiers Use private symbols as suggested. I'm keeping the Symbol wrapper (created with Symbol::create) since it's needed to keep the UID alive in the SymbolRegistry. It will be garbage collected as needed, so it's not leaked. Also not using a parallel SymbolRegistry for now, we'll discuss this in a sync-up meeting.
Xan Lopez
Comment 5 2020-10-01 06:16:00 PDT
Created attachment 410226 [details] Use symbols as identifiers, v3 Fix a mistake with changed default flags for registered symbols. There's still a bug with the bytecode cache, but just to make sure that's the only missing thing.
Xan Lopez
Comment 6 2020-10-01 11:04:10 PDT
(In reply to Xan Lopez from comment #5) > Created attachment 410226 [details] > Use symbols as identifiers, v3 > > Fix a mistake with changed default flags for registered symbols. There's > still a bug with the bytecode cache, but just to make sure that's the only > missing thing. Will also update the patch to use a different SymbolRegistry, as discussed.
Xan Lopez
Comment 7 2020-10-19 07:54:38 PDT
Created attachment 411746 [details] Use symbols as identifiers, v4 v4, attempts to fix the bytecode cache bugs (basically by teaching it about registered symbols) and uses a different hashshet for private symbols in SymbolRegistry.
Caio Lima
Comment 8 2020-10-20 04:54:00 PDT
Comment on attachment 411746 [details] Use symbols as identifiers, v4 View in context: https://bugs.webkit.org/attachment.cgi?id=411746&action=review > Source/JavaScriptCore/parser/Parser.cpp:2849 > + auto symbol = Symbol::create(vm, vm.symbolRegistry().privateSymbolForKey(symbolName)); After thinking this in more depth, I'm uncomfortable with the fact we have a short lived heap object to return the uid here. This seems a bad choice for this API, since for a fresh created `SymbolImpl`, the only thing that is handling its ref is the Symbol we created. It means that if GC runs after `getOrCreateSymbolForComputedName` and before we ever reference SymbolImpl somewhere else, we will end up with an invalid pointer that can cause UAF issues. I think it's not correct relying that such scenario won't happen. Is there a particular reason to have such function as a primitive? It sounds more correct if we have a less fine grain API to create/retrieve private identifier on `identifierArena` called `makePrivateIdentifier(VM& vm, unsigned identifier)`. This way, we would be able to avoid a heap allocation (removing `Symbol::create`) and have full control of SymbolImpl lifetime. > Source/WTF/wtf/text/SymbolRegistry.h:91 > + HashSet<SymbolRegistryKey> m_privateTable; I don't think we need to include a `m_privateTable` on SymbolRegistry. I think this concept of private/public is more on the client side of this API than on SymbolRegistry itself. The alternative I'd propose here is to have a new SymbolRegistry on `VM` that store private symbols. This way we would be achieving the same goal we want with current `SymbolRegistry` API we have.
Xan Lopez
Comment 9 2020-10-20 07:42:07 PDT
(In reply to Caio Lima from comment #8) > It sounds more correct if we have a less fine grain API to create/retrieve > private identifier on `identifierArena` called `makePrivateIdentifier(VM& > vm, unsigned identifier)`. This way, we would be able to avoid a heap > allocation (removing `Symbol::create`) and have full control of SymbolImpl > lifetime. This makes a lot of sense, even more so because the pending bytecode cache failures are also related to this! So we need a better API than wrapping things with a heap Symbol. > > > Source/WTF/wtf/text/SymbolRegistry.h:91 > > + HashSet<SymbolRegistryKey> m_privateTable; > > I don't think we need to include a `m_privateTable` on SymbolRegistry. I > think this concept of private/public is more on the client side of this API > than on SymbolRegistry itself. The alternative I'd propose here is to have a > new SymbolRegistry on `VM` that store private symbols. This way we would be > achieving the same goal we want with current `SymbolRegistry` API we have. Cool. What do you think about adding some code in SymbolRegistry so that it can be configured to store public or private symbols, but never both?
Caio Lima
Comment 10 2020-10-20 08:12:35 PDT
(In reply to Xan Lopez from comment #9) > (In reply to Caio Lima from comment #8) > > > It sounds more correct if we have a less fine grain API to create/retrieve > > private identifier on `identifierArena` called `makePrivateIdentifier(VM& > > vm, unsigned identifier)`. This way, we would be able to avoid a heap > > allocation (removing `Symbol::create`) and have full control of SymbolImpl > > lifetime. > > This makes a lot of sense, even more so because the pending bytecode cache > failures are also related to this! So we need a better API than wrapping > things with a heap Symbol. > > > > > > Source/WTF/wtf/text/SymbolRegistry.h:91 > > > + HashSet<SymbolRegistryKey> m_privateTable; > > > > I don't think we need to include a `m_privateTable` on SymbolRegistry. I > > think this concept of private/public is more on the client side of this API > > than on SymbolRegistry itself. The alternative I'd propose here is to have a > > new SymbolRegistry on `VM` that store private symbols. This way we would be > > achieving the same goal we want with current `SymbolRegistry` API we have. > > Cool. What do you think about adding some code in SymbolRegistry so that it > can be configured to store public or private symbols, but never both? I personally would avoid including any info into SymbolRegistry for this change. We could definitely do that in the future, if we think it's necessary. I can see the advantage of this on cases where we don't want to mix Private and Public Symbols.
Xan Lopez
Comment 11 2020-10-21 01:54:17 PDT
(In reply to Caio Lima from comment #10) > > Cool. What do you think about adding some code in SymbolRegistry so that it > > can be configured to store public or private symbols, but never both? > > I personally would avoid including any info into SymbolRegistry for this > change. We could definitely do that in the future, if we think it's > necessary. I can see the advantage of this on cases where we don't want to > mix Private and Public Symbols. The thing is, SymbolRegistry is the one creating the symbols. If there's no way to configure it we need to subclass it, otherwise it will still create non-private symbols for all keys.
Xan Lopez
Comment 12 2020-10-21 04:27:30 PDT
Created attachment 411971 [details] Use symbols as identifiers, v5 v5, use makePrivateIdentifier to create private symbols as identifiers in the parser use a new symbolregistry in VM, changing the class to create private symbols when asked to (in symbolKey()) (hopefully) fix the bytecode cache bugs
Xan Lopez
Comment 13 2020-10-21 06:36:26 PDT
Created attachment 411979 [details] Use symbols as identifiers, v6 v6, since we need to make some changes in SymbolRegistry, set the type (public/private symbols) already in the constructor, that way we ensure we'll always store only one type of symbol and get rid of the extra parameter in symbolForKey()
Caio Lima
Comment 14 2020-10-21 09:41:10 PDT
Comment on attachment 411979 [details] Use symbols as identifiers, v6 View in context: https://bugs.webkit.org/attachment.cgi?id=411979&action=review LGTM. I left one comment regarding changes on `Source/WTF/wtf/text/SymbolRegistry.h` > Source/WTF/wtf/text/SymbolRegistry.h:94 > + CreateSymbol m_createSymbol { nullptr }; Do you mind measuring mem usage with this change to double check if it's perf neutral?
Xan Lopez
Comment 15 2020-10-22 09:59:58 PDT
I run RAMification with v6 and without it (ToT), assuming that would give me a worst case of what the patch causes. v6: Footprint geomean: 58974859 (56.243 MB) Peak Footprint geomean: 63023929 (60.104 MB) Score: 60965788 (58.142 MB) ToT: Footprint geomean: 59780486 (57.011 MB) Peak Footprint geomean: 63500091 (60.558 MB) Score: 61612225 (58.758 MB) Not sure if I understand how could this be, but I'm happy to run other tests if needed.
Xan Lopez
Comment 16 2020-10-22 10:12:29 PDT
(In reply to Xan Lopez from comment #15) > I run RAMification with v6 and without it (ToT), assuming that would give me > a worst case of what the patch causes. > > v6: > > Footprint geomean: 58974859 (56.243 MB) > Peak Footprint geomean: 63023929 (60.104 MB) > Score: 60965788 (58.142 MB) > > ToT: > > Footprint geomean: 59780486 (57.011 MB) > Peak Footprint geomean: 63500091 (60.558 MB) > Score: 61612225 (58.758 MB) > > Not sure if I understand how could this be, but I'm happy to run other tests > if needed. I'm told RAMification.py itself does not have super accurate averaging so just one execution is not enough, so I'll provide better numbers soon.
Xan Lopez
Comment 17 2020-10-26 08:40:53 PDT
(In reply to Xan Lopez from comment #16) > > I'm told RAMification.py itself does not have super accurate averaging so > just one execution is not enough, so I'll provide better numbers soon. Here comparing a series of several ToT runs (A) and v6 runs (B). Again it seems somehow the v6 patch is slightly lower than ToT, but in any case if there are no errors it would seem we are good: A Low Diff: 0.14497648057239587 A high Diff: -0.14497648057239587 B Low Diff: 0.12794076230601092 B high Diff: -0.12794076230601092 Are they different?: False Mean Baseline: 58.83728571428571 Mean Changes: 58.63028571428571 Diff: 1.0035305985205112
Xan Lopez
Comment 18 2020-10-27 05:02:41 PDT
Created attachment 412410 [details] Use symbols as identifiers, v7 v7, rebased fix a bug uncovered while rebasing static fields on top of this, we were using the temporary private id for the computed property key as the function name in some cases; will update test262 accordingly, since by chance this was not tested properly in the instance test suite
Caitlin Potter (:caitp)
Comment 19 2020-10-27 12:09:30 PDT
Comment on attachment 412410 [details] Use symbols as identifiers, v7 View in context: https://bugs.webkit.org/attachment.cgi?id=412410&action=review overall it looks ok to me > Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:4827 > + setFunctionName = true; it looks like this is only used by ComputedName, so we can probably get rid of it and just add `&& m_type != DefineFieldNode::Type::ComputedName` to the if condition > Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:4868 > + if (setFunctionName) and replace this with `if (generator.shouldSetFunctionName(m_assign)`
Caitlin Potter (:caitp)
Comment 20 2020-10-27 12:10:09 PDT
Comment on attachment 412410 [details] Use symbols as identifiers, v7 View in context: https://bugs.webkit.org/attachment.cgi?id=412410&action=review overall it looks ok to me >> Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:4827 >> + setFunctionName = true; > > it looks like this is only used by ComputedName, so we can probably get rid of it and just add `&& m_type != DefineFieldNode::Type::ComputedName` to the if condition it looks like this is only used by ComputedName, so we can probably get rid of it and just add `&& m_type != DefineFieldNode::Type::ComputedName` to the if condition >> Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:4868 >> + if (setFunctionName) > > and replace this with `if (generator.shouldSetFunctionName(m_assign)` and replace this with `if (generator.shouldSetFunctionName(m_assign)`
Xan Lopez
Comment 21 2020-10-28 02:34:56 PDT
Created attachment 412513 [details] Use symbols as identifiers, v8 v8, Hrm, what you suggest does not seem to work (new failures introduced); a bit odd, since generator.shouldSetFunctionName() seems to be read-only. I have rearranged the code a bit so that the boolean is used in both places and we keep a single call to the function. Seems a bit neater to me, let me know if that's fine.
Caio Lima
Comment 22 2020-11-06 10:14:12 PST
Comment on attachment 412513 [details] Use symbols as identifiers, v8 View in context: https://bugs.webkit.org/attachment.cgi?id=412513&action=review Patch LGTM. I left a comment below. > Source/WTF/wtf/text/SymbolRegistry.h:94 > + CreateSymbol m_createSymbol { nullptr }; I'm wondering if it is really worth it have those 2 new members on `SymbolRegistry`. I'm neutral on `m_symbolType` but I'm not sold on `m_createSymbol`, since we could call either `RegisteredSymbolImpl::createPrivate` or `RegisteredSymbolImpl::create` directly on `symbolForKey` based on `m_symbolType`.
Caio Lima
Comment 23 2020-11-06 10:17:54 PST
Comment on attachment 412513 [details] Use symbols as identifiers, v8 View in context: https://bugs.webkit.org/attachment.cgi?id=412513&action=review > Source/JavaScriptCore/ChangeLog:14 > + No new tests since no functionality was changed. I think we are also fixing bug on `emitSetFunctionName` for computed fields, right? If that's the case, I think we should include tests that this patch is fixing
Xan Lopez
Comment 24 2020-11-06 10:46:31 PST
(In reply to Caio Lima from comment #22) > Comment on attachment 412513 [details] > Use symbols as identifiers, v8 > > View in context: > https://bugs.webkit.org/attachment.cgi?id=412513&action=review > > Patch LGTM. I left a comment below. > > > Source/WTF/wtf/text/SymbolRegistry.h:94 > > + CreateSymbol m_createSymbol { nullptr }; > > I'm wondering if it is really worth it have those 2 new members on > `SymbolRegistry`. I'm neutral on `m_symbolType` but I'm not sold on > `m_createSymbol`, since we could call either > `RegisteredSymbolImpl::createPrivate` or `RegisteredSymbolImpl::create` > directly on `symbolForKey` based on `m_symbolType`. Yeah, I think a previous version did this. If memory is a concern we could do it again (also the opposite, just set the function once in the ctor and then forget the type). About the test, there is one in test262 for static fields that verifies this works, but it won't land with this patch. We should add a test262 test for instance fields too, but it also would not land with this patch, right? I'd say doing that is enough coverage, but I could add a test for it locally if you think that's important.
Xan Lopez
Comment 25 2020-11-09 02:12:44 PST
Created attachment 413571 [details] Use symbols as identifiers, v9 v9, do not store the symbol creation method in SymbolRegistry. We'll figure out the correct one from the symbol type each time, this function is not called very often so there's no need for the micro-optimization.
Yusuke Suzuki
Comment 26 2020-11-09 03:00:40 PST
Comment on attachment 413571 [details] Use symbols as identifiers, v9 View in context: https://bugs.webkit.org/attachment.cgi?id=413571&action=review > Source/WTF/wtf/text/SymbolRegistry.cpp:56 > + CreateSymbol m_createSymbol; > + > + if (m_symbolType == Type::PrivateSymbol) > + m_createSymbol = &RegisteredSymbolImpl::createPrivate; > + else > + m_createSymbol = &RegisteredSymbolImpl::create; > + > + auto symbol = m_createSymbol(*rep.impl(), *this); Let's just switch if (m_symbolType == Type::PrivateSymbol) symbol = RegisteredSymbolImpl::createPrivate(...); else symbol = RegisteredSymbolImpl::create(...); We don't need to make this a indirect call, and it is slower.
Xan Lopez
Comment 27 2020-11-09 03:34:07 PST
Created attachment 413577 [details] Use symbols as identifiers, v10 v10, do not use an indirect call to create symbols in the registry
Yusuke Suzuki
Comment 28 2020-11-09 17:45:21 PST
Comment on attachment 413577 [details] Use symbols as identifiers, v10 View in context: https://bugs.webkit.org/attachment.cgi?id=413577&action=review Looks good overall. Some nits. > Source/JavaScriptCore/parser/Parser.cpp:2840 > +#define INSTANCE_COMPUTED_NAME_PREFIX "instanceComputedName" Let's use `static constexpr ASCIILiteral instanceComputedNamePrefix { "instanceComputedName"_s };` > Source/JavaScriptCore/parser/ParserArena.cpp:110 > +const Identifier& IdentifierArena::makePrivateIdentifier(VM& vm, const char* prefix, unsigned identifier) Let's use `ASCIILiteral` instead of `const char*`. > Source/WTF/wtf/text/SymbolRegistry.h:81 > + enum class Type { PublicSymbol, PrivateSymbol }; Make it `enum class Type : uint8_t` > Source/WTF/wtf/text/SymbolRegistry.h:83 > + SymbolRegistry(Type type) { m_symbolType = type; }; Set it in initializer list instead of function body.
Xan Lopez
Comment 29 2020-11-10 01:25:39 PST
Created attachment 413678 [details] Use symbols as identifiers, v11 v11, fix style issues pointed out in previous comment
Xan Lopez
Comment 30 2020-11-10 01:42:32 PST
Created attachment 413679 [details] Use symbols as identifiers, 12 v12, try to make the wincairo build happy with WTF_EXPORT_PRIVATE
Xan Lopez
Comment 31 2020-11-10 02:14:14 PST
Created attachment 413681 [details] Use symbols as identifiers, v13 v13, and remove the now unnecessary function body type initialization, forgot about it
Yusuke Suzuki
Comment 32 2020-11-10 16:22:18 PST
Comment on attachment 413681 [details] Use symbols as identifiers, v13 View in context: https://bugs.webkit.org/attachment.cgi?id=413681&action=review > Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:4827 > + if (m_ident && shouldSetFunctionName && m_type != DefineFieldNode::Type::ComputedName) It seems that `m_type != DefineFieldNode::Type::ComputedName` is new behavior. Can you add a test for this?
Xan Lopez
Comment 33 2020-11-11 01:15:42 PST
Created attachment 413795 [details] Use symbols as identifiers, v14 v14, add a test for computed fields+setFunctionName (this will be also fully tested in test262 when class fields is feature complete in WebKit)
Caio Lima
Comment 34 2020-11-11 03:51:07 PST
(In reply to Xan Lopez from comment #24) > (In reply to Caio Lima from comment #22) > > About the test, there is one in test262 for static fields that verifies this > works, but it won't land with this patch. We should add a test262 test for > instance fields too, but it also would not land with this patch, right? I'd > say doing that is enough coverage, but I could add a test for it locally if > you think that's important. The problem with relying on Test262 coverage is that we don't run them on EWS yet. This makes our life harder to know when we break things not included on stress tests, since we will only figure out that they broke after landing it. I think it's a very good practice try to have stress test to increase coverage of every change we make on a patch, with a few exceptions of cases where it's hard to reproduce them.
Caio Lima
Comment 35 2020-11-11 04:05:35 PST
Comment on attachment 413795 [details] Use symbols as identifiers, v14 View in context: https://bugs.webkit.org/attachment.cgi?id=413795&action=review > JSTests/stress/class-fields-harmony.js:922 > +} As I commented above, I think it's valuable have tests asserting that we are setting function name properly. I think there are at least 2 new cases we'd like to include: 1. Computed field with some function expression on assignment. 2. Computed field with class expression on assignment. So we'd like to assert for those cases `assertSame(c[x].name, <x evaluation to function name>)`. I'd personally would include cases where `x` is not a trivial primitive, but also an object.
Xan Lopez
Comment 36 2020-11-11 04:40:08 PST
Created attachment 413811 [details] Use symbols as identifiers, v15 v15, add a few more explicit tests as suggested by Caio. The previous test was testing setFunctionName but implicitly.
Yusuke Suzuki
Comment 37 2020-11-13 14:00:21 PST
Comment on attachment 413811 [details] Use symbols as identifiers, v15 r=me
EWS
Comment 38 2020-11-13 14:32:24 PST
Committed r269801: <https://trac.webkit.org/changeset/269801> All reviewed patches have been landed. Closing bug and clearing flags on attachment 413811 [details].
Note You need to log in before you can comment on or make changes to this bug.