112336 – HTMLIdentifier uses 8-12 bytes when it only needs 4-8

RESOLVED CONFIGURATION CHANGED 112336

HTMLIdentifier uses 8-12 bytes when it only needs 4-8

https://bugs.webkit.org/show_bug.cgi?id=112336

Summary HTMLIdentifier uses 8-12 bytes when it only needs 4-8

Eric Seidel (no email)

Reported 2013-03-14 02:45:59 PDT

HTMLIdentifier uses 8-12 bytes when it only needs 4-8 http://trac.webkit.org/browser/trunk/Source/WebCore/html/parser/HTMLIdentifier.h#L80 // FIXME: This could be a union. unsigned m_index; String m_string; (I no longer believe that it needs to be a Union, but rather just a StringImpl* with one bit stolen -- perhaps even there is a PointerWithMask template class to help us with this?) The one masked bit would tell you which mode it was in. Either we could treat the "known identifier" mode as a StringImpl* to a const (non-reffed) AtomicStringImpl*, or we could use an index (like the current implementation does). Using a StringImpl* for both the owned and non-owned case is nice because then the nameForIndex function completely disappears! Basically we'd just be implementing a static version of the AtomicString hash table, which held a known set of AtomicStrings and only exposed StringImpl* access to them. (And magically acted like a String in the case where the character buffer in question did not match one of our known strings.) We've sorta implemented a "ThreadPortableAtomicString" of sorts. :) Making this smaller will also make the maximum memory consumed during speculation smaller, which may affect bug 112335.

Attachments
About what I had in mind (20.34 KB, patch) 2013-03-14 16:34 PDT, Eric Seidel (no email)	no flags	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Eric Seidel (no email)

Comment 1 2013-03-14 02:51:08 PDT

This sentence was confusing: > Either we could treat the "known identifier" mode as a StringImpl* to a const (non-reffed) > AtomicStringImpl*, or we could use an index (like the current implementation does). I've fully given up on the "index" based approach. It shoudl always be a pointer to a StringImpl*, teh question is just if we own (ref) that StringImpl or not. If it's a known identifier (AtomicStringImpl*) then we don't own it, and depend on something else to keep it alive (and in the AtomicStringTable on the main thread) forever. If we own it, then it's just a plain old StringImpl which we need to manually ref/deref since we can't steal a bit and still use RefPtr or String. :) This class would only ever provide String access to its storage from the main thread, but could provide const StringImpl* access on any thread.

Benjamin Poulain

Comment 2 2013-03-14 10:33:04 PDT

> I've fully given up on the "index" based approach. It shoudl always be a pointer to a StringImpl*, teh question is just if we own (ref) that StringImpl or not. If it's a known identifier (AtomicStringImpl*) then we don't own it, and depend on something else to keep it alive (and in the AtomicStringTable on the main thread) forever. If we own it, then it's just a plain old StringImpl which we need to manually ref/deref since we can't steal a bit and still use RefPtr or String. :) I am unfamiliar with the ownership here. Is it that if it is a AtomicStringImpl*, it must be a HTMLName, if not, it is a custom string?

Eric Seidel (no email)

Comment 3 2013-03-14 11:48:22 PDT

I wasn't very clear, sorry. I'll just code up an implementation and post it. :)

Eric Seidel (no email)

Comment 4 2013-03-14 16:34:52 PDT

Created attachment 193200 [details] About what I had in mind

Adam Barth

Comment 5 2013-03-15 12:18:06 PDT

Comment on attachment 193200 [details] About what I had in mind View in context: https://bugs.webkit.org/attachment.cgi?id=193200&action=review > Source/WebCore/html/parser/BackgroundHTMLParser.cpp:64 > -static void checkThatTokensAreSafeToSendToAnotherThread(const CompactHTMLTokenStream* tokens) > +static void checkThatTokensAreSafeToSendToAnotherThread(CompactHTMLTokenStream* tokens) Should we rename this function to make it clear that it has side-effects? I presume those side effects are only relevant in debug. > Source/WebCore/html/parser/CompactHTMLToken.h:61 > + void markForTransferToMainThread(); Can you make this a DEBUG only function? > Source/WebCore/html/parser/HTMLIdentifier.cpp:71 > + const_cast<StringImpl*>(m_pointerWithFlags.ptr())->deref(); Woah. > Source/WebCore/html/parser/HTMLIdentifier.h:45 > +// This is basically just a bitfield for a pointer. > +template <typename T> > +class PointerWithFlags { This should probably be in WTF. > Source/WebCore/html/parser/HTMLIdentifier.h:56 > + bool flag1() const { return maskedValue(kFlag1Mask); } > + void setFlag1(bool flag) { setMaskedValue(flag, kFlag1Mask); } > + > + bool flag2() const { return maskedValue(kFlag2Mask); } > + void setFlag2(bool flag) { setMaskedValue(flag, kFlag2Mask); } Should these methods be protected? Presumably the way to use this class is to subclass it and give these flags reasonable names. > Source/WebCore/html/parser/HTMLIdentifier.h:116 > +#ifndef NDEBUG > + bool isMarkedForTransferToMainThread() { return m_pointerWithFlags.flag2(); } > +#endif If this is debug-only, why not introduce debug-only storage for this value?

Anne van Kesteren

Comment 6 2023-04-01 00:53:15 PDT

This no longer exists.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution CONFIGURATION CHANGED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Unspecified

OS Unspecified

Product WebKit

Component New Bugs

Assignee

Eric Seidel (no email)

Reported

2013-03-14 02:45 PDT

Modified

2023-04-01 00:53 PDT History

CC List

4 users Show

URL

Keywords

Depends on

Blocks

111645

Dependencies

tree graph