Bug 107317

Summary: Make CompactHTMLToken a little more compact
Product: WebKit Reporter: Tony Gentilcore <tonyg>
Component: New BugsAssignee: Tony Gentilcore <tonyg>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, dglazkov, eric, hclam, ojan.autocc, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 106127    
Attachments:
Description Flags
Patch
none
Patch
none
An alternate approach
none
Updated to TOT
none
Patch for landing
none
Patch for landing
none
Patch for landing
none
Patch for landing none

Description Tony Gentilcore 2013-01-18 12:59:31 PST
Make CompactHTMLToken a little more compact
Comment 1 Tony Gentilcore 2013-01-18 12:59:54 PST
Created attachment 183533 [details]
Patch
Comment 2 Eric Seidel (no email) 2013-01-18 13:07:02 PST
Comment on attachment 183533 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=183533&action=review

> Source/WebCore/html/parser/BackgroundHTMLParser.cpp:129
> +        if (m_token.type() == HTMLTokenTypes::DOCTYPE)
> +            m_pendingTokens.append(DocTypeCompactHTMLToken(m_token));

How can this work?  Aren't we appending these by value, not pointers?
Comment 3 Tony Gentilcore 2013-01-18 13:50:09 PST
(In reply to comment #2)
> (From update of attachment 183533 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=183533&action=review
> 
> > Source/WebCore/html/parser/BackgroundHTMLParser.cpp:129
> > +        if (m_token.type() == HTMLTokenTypes::DOCTYPE)
> > +            m_pendingTokens.append(DocTypeCompactHTMLToken(m_token));
> 
> How can this work?  Aren't we appending these by value, not pointers?

Oops, I'll have to think of another way to go about this.
Comment 4 Adam Barth 2013-01-18 14:22:31 PST
Comment on attachment 183533 [details]
Patch

Maybe an OwnPtr to hold the DocType specific values?  Normally that will be 0, which is smaller and won't try to ref/deref twice.
Comment 5 Tony Gentilcore 2013-01-18 17:54:57 PST
Created attachment 183584 [details]
Patch
Comment 6 Eric Seidel (no email) 2013-01-19 04:01:35 PST
Created attachment 183619 [details]
An alternate approach
Comment 7 Eric Seidel (no email) 2013-01-19 04:02:13 PST
I hope you don't mind me hijacking your bug.  Please feel free to obsolete my patch and re-state yours for review if you like!
Comment 8 Adam Barth 2013-01-22 01:24:31 PST
Comment on attachment 183619 [details]
An alternate approach

That's awesome.
Comment 9 Adam Barth 2013-01-22 01:37:08 PST
Comment on attachment 183619 [details]
An alternate approach

View in context: https://bugs.webkit.org/attachment.cgi?id=183619&action=review

> Source/WebCore/html/parser/CompactHTMLToken.h:71
> +    const String& publicIdentifier() const { return m_attributes[0].name(); }
> +    const String& systemIdentifier() const { return m_attributes[0].value(); }

Can we ASSERT that m_attributes has at least size one?  Maybe Vector does that for us?

> Source/WebCore/html/parser/CompactHTMLToken.h:74
> +    unsigned m_type : 4;

Can we ASSERT that we have enough bits somehow?
Comment 10 Eric Seidel (no email) 2013-01-22 01:43:52 PST
Comment on attachment 183619 [details]
An alternate approach

View in context: https://bugs.webkit.org/attachment.cgi?id=183619&action=review

>> Source/WebCore/html/parser/CompactHTMLToken.h:71
>> +    const String& systemIdentifier() const { return m_attributes[0].value(); }
> 
> Can we ASSERT that m_attributes has at least size one?  Maybe Vector does that for us?

Yup:
http://trac.webkit.org/browser/trunk/Source/WTF/wtf/Vector.h#L553

>> Source/WebCore/html/parser/CompactHTMLToken.h:74
>> +    unsigned m_type : 4;
> 
> Can we ASSERT that we have enough bits somehow?

I believe the compiler does this for us (at least GCC) when we assign HTMLTokenTypes::Type to this value.  if it were too small it would warn.  At least that's my recollection.
Comment 11 Eric Seidel (no email) 2013-01-22 01:48:15 PST
Created attachment 183930 [details]
Updated to TOT
Comment 12 Darin Adler 2013-01-22 10:19:10 PST
Comment on attachment 183930 [details]
Updated to TOT

View in context: https://bugs.webkit.org/attachment.cgi?id=183930&action=review

I am not an expert on the parser, but the patch looks fine to me and probably does not require parser expertise.

> Source/WebCore/ChangeLog:13
> +        however when I tried to assert that it failed.  Presumably because
> +        Vector<T> is more than void* in size.

Yes, Vector<T> is much more than void* in size!

> Source/WebCore/html/parser/CompactHTMLToken.h:78
>      Vector<CompactAttribute> m_attributes;

What’s the typical size of this vector? If it’s typically empty, it could be OwnPtr<Vector>. If it typically has a small number of attributes, maybe it should have inline capacity.
Comment 13 Darin Adler 2013-01-22 10:20:29 PST
Comment on attachment 183930 [details]
Updated to TOT

View in context: https://bugs.webkit.org/attachment.cgi?id=183930&action=review

>> Source/WebCore/ChangeLog:13
>> +        Vector<T> is more than void* in size.
> 
> Yes, Vector<T> is much more than void* in size!

Roughly speaking, it’s got a pointer for the storage, a size, and a capacity. So we pay for the flexibility of being able to efficiently resize. For specialized uses we can use a different data structure that is more compact.
Comment 14 WebKit Review Bot 2013-01-22 10:32:28 PST
Comment on attachment 183930 [details]
Updated to TOT

Rejecting attachment 183930 [details] from commit-queue.

Failed to run "['/mnt/git/webkit-commit-queue/Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '-..." exit_code: 2 cwd: /mnt/git/webkit-commit-queue

Last 500 characters of output:
-fno-rtti -fno-threadsafe-statics -fvisibility-inlines-hidden -Wsign-compare  -c ../../Source/WebCore/html/parser/CompactHTMLToken.cpp -o obj/Source/WebCore/html/parser/webcore_html.CompactHTMLToken.o
../../Source/WebCore/html/parser/CompactHTMLToken.cpp: In constructor 'WebCore::CompactHTMLToken::CompactHTMLToken(const WebCore::HTMLToken&)':
../../Source/WebCore/html/parser/CompactHTMLToken.cpp:55: error: 'stringFromToken' was not declared in this scope
ninja: build stopped: subcommand failed.

Full output: http://queues.webkit.org/results/16039638
Comment 15 Eric Seidel (no email) 2013-01-22 11:03:20 PST
Comment on attachment 183930 [details]
Updated to TOT

View in context: https://bugs.webkit.org/attachment.cgi?id=183930&action=review

>>> Source/WebCore/ChangeLog:13
>>> +        Vector<T> is more than void* in size.
>> 
>> Yes, Vector<T> is much more than void* in size!
> 
> Roughly speaking, it’s got a pointer for the storage, a size, and a capacity. So we pay for the flexibility of being able to efficiently resize. For specialized uses we can use a different data structure that is more compact.

Yeah, I forgot about the necessary size member when writing that. :)  I didn't realize we had another one for capacity as well.
Comment 16 Eric Seidel (no email) 2013-01-22 11:04:09 PST
Comment on attachment 183930 [details]
Updated to TOT

View in context: https://bugs.webkit.org/attachment.cgi?id=183930&action=review

>> Source/WebCore/html/parser/CompactHTMLToken.h:78
>>      Vector<CompactAttribute> m_attributes;
> 
> What’s the typical size of this vector? If it’s typically empty, it could be OwnPtr<Vector>. If it typically has a small number of attributes, maybe it should have inline capacity.

It's only used for attribute tokens, which are less common than character tokens for sure.  We probably should make it OwnPtr<Vector>
Comment 17 Eric Seidel (no email) 2013-01-22 11:35:37 PST
Created attachment 184026 [details]
Patch for landing
Comment 18 WebKit Review Bot 2013-01-22 11:38:08 PST
Comment on attachment 184026 [details]
Patch for landing

Attachment 184026 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/16042706
Comment 19 Eric Seidel (no email) 2013-01-22 11:42:43 PST
Created attachment 184028 [details]
Patch for landing
Comment 20 Eric Seidel (no email) 2013-01-22 11:44:57 PST
Sorry, this was originally part of a fancier patch series.   An earlier patch never landed.
Comment 21 WebKit Review Bot 2013-01-22 11:47:26 PST
Comment on attachment 184028 [details]
Patch for landing

Rejecting attachment 184028 [details] from commit-queue.

Failed to run "['/mnt/git/webkit-commit-queue/Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '-..." exit_code: 2 cwd: /mnt/git/webkit-commit-queue

Last 500 characters of output:
r'
../../Source/WebCore/html/parser/CompactHTMLToken.cpp:55: error:   crosses initialization of 'WTF::String publicIdentifier'
../../Source/WebCore/html/parser/CompactHTMLToken.cpp:76: error: jump to case label
../../Source/WebCore/html/parser/CompactHTMLToken.cpp:56: error:   crosses initialization of 'WTF::String systemIdentifier'
../../Source/WebCore/html/parser/CompactHTMLToken.cpp:55: error:   crosses initialization of 'WTF::String publicIdentifier'
ninja: build stopped: subcommand failed.

Full output: http://queues.webkit.org/results/16036725
Comment 22 Eric Seidel (no email) 2013-01-22 12:32:27 PST
Created attachment 184035 [details]
Patch for landing
Comment 23 WebKit Review Bot 2013-01-22 13:03:11 PST
Comment on attachment 184035 [details]
Patch for landing

Clearing flags on attachment: 184035

Committed r140453: <http://trac.webkit.org/changeset/140453>
Comment 24 WebKit Review Bot 2013-01-22 13:03:16 PST
All reviewed patches have been landed.  Closing bug.
Comment 25 Eric Seidel (no email) 2013-01-22 13:54:19 PST
This broke the Windows builder because MSVC won't compact adjacent bitfields of differing types. I have to change "bool" to "unsigned" and it will work.
Comment 26 Eric Seidel (no email) 2013-01-22 13:54:55 PST
http://build.chromium.org/p/chromium.webkit/builders/WebKit%20Win%20Builder/builds/35188 is the log from the break.
Comment 27 Eric Seidel (no email) 2013-01-22 14:16:00 PST
Reopening to attach new patch.
Comment 28 Eric Seidel (no email) 2013-01-22 14:16:08 PST
Created attachment 184046 [details]
Patch for landing
Comment 29 Hin-Chung Lam 2013-01-22 14:47:59 PST
Comment on attachment 184046 [details]
Patch for landing

Clearing flags on attachment: 184046

Committed r140473: <http://trac.webkit.org/changeset/140473>
Comment 30 Hin-Chung Lam 2013-01-22 14:48:02 PST
All reviewed patches have been landed.  Closing bug.