Bug 107513 - AtomicMarkupTokenBase must die
Summary: AtomicMarkupTokenBase must die
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Adam Barth
URL:
Keywords:
Depends on:
Blocks: 107522
  Show dependency treegraph
 
Reported: 2013-01-22 00:35 PST by Adam Barth
Modified: 2013-01-22 02:27 PST (History)
6 users (show)

See Also:


Attachments
Patch (15.57 KB, patch)
2013-01-22 00:46 PST, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (18.42 KB, patch)
2013-01-22 01:25 PST, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (18.24 KB, patch)
2013-01-22 01:32 PST, Adam Barth
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Adam Barth 2013-01-22 00:35:48 PST
AtomicMarkupTokenBase must die
Comment 1 Eric Seidel (no email) 2013-01-22 00:36:32 PST
Amen, brother.  Amen.
Comment 2 Adam Barth 2013-01-22 00:46:07 PST
Created attachment 183911 [details]
Patch
Comment 3 Eric Seidel (no email) 2013-01-22 01:04:05 PST
Comment on attachment 183911 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=183911&action=review

Just a couple nits.  Most of them don't apply to this patch even. :)

> Source/WebCore/html/parser/HTMLToken.h:145
> +    bool isAll8BitData() const
> +    {
> +        return m_isAll8BitData;
> +    }

I think I would have written this function named "dataIsAll8Bit()" :) to make it clear that it's only talking about the data member. :)  Obviously that's a separate patch here.  Just idly commenting.

> Source/WebCore/html/parser/HTMLToken.h:158
> +    // FIXME: Distinguish between a missing public identifer and an empty one.
> +    WTF::Vector<UChar>& publicIdentifier() const
> +    {
> +        ASSERT(m_type == HTMLTokenTypes::DOCTYPE);
> +        return m_doctypeData->m_publicIdentifier;
> +    }

There is no reason to make these fancy vectors anyway.  There is 1 DOCTYPE node per document. :)  We can make our handling of it as inefficient as we like. :)  We're not saving anything by using fancy vector types.

> Source/WebCore/html/parser/HTMLToken.h:172
> +    void clearExternalCharacters()
> +    {
> +        m_externalCharacters = 0;
> +        m_externalCharactersLength = 0;
> +        m_isAll8BitData = false;
> +    }

Does isAll8bit not apply to just data then?

> Source/WebCore/html/parser/HTMLToken.h:270
> +        , m_isAll8BitData(false)
> +        , m_attributes(attributes)

This does not appear to set m_selfClosing...

> Source/WebCore/html/parser/HTMLToken.h:299
> +    // We don't want to copy the the characters out of the Token, so we
> +    // keep a pointer to its buffer instead. This buffer is owned by the
> +    // Token and causes a lifetime dependence between these objects.
> +    //
> +    // FIXME: Add a mechanism for "internalizing" the characters when the
> +    //        HTMLToken is destructed.
> +    const UChar* m_externalCharacters;
> +    size_t m_externalCharactersLength;

This is obviously wrong in the threaded-parser world.  At least as implemented.  But we'll fix that now that you've made the code fixable. :)
Comment 4 Eric Seidel (no email) 2013-01-22 01:04:29 PST
Did you want this marked for review?  Or EWSed?
Comment 5 Adam Barth 2013-01-22 01:10:39 PST
(In reply to comment #3)
> (From update of attachment 183911 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=183911&action=review
> 
> Just a couple nits.  Most of them don't apply to this patch even. :)
> 
> > Source/WebCore/html/parser/HTMLToken.h:145
> > +    bool isAll8BitData() const
> > +    {
> > +        return m_isAll8BitData;
> > +    }
> 
> I think I would have written this function named "dataIsAll8Bit()" :) to make it clear that it's only talking about the data member. :)  Obviously that's a separate patch here.  Just idly commenting.

-> Punt to future patch.

> > Source/WebCore/html/parser/HTMLToken.h:158
> > +    // FIXME: Distinguish between a missing public identifer and an empty one.
> > +    WTF::Vector<UChar>& publicIdentifier() const
> > +    {
> > +        ASSERT(m_type == HTMLTokenTypes::DOCTYPE);
> > +        return m_doctypeData->m_publicIdentifier;
> > +    }
> 
> There is no reason to make these fancy vectors anyway.  There is 1 DOCTYPE node per document. :)  We can make our handling of it as inefficient as we like. :)  We're not saving anything by using fancy vector types.

Agreed!  -> Punt to future patch.

> > Source/WebCore/html/parser/HTMLToken.h:172
> > +    void clearExternalCharacters()
> > +    {
> > +        m_externalCharacters = 0;
> > +        m_externalCharactersLength = 0;
> > +        m_isAll8BitData = false;
> > +    }
> 
> Does isAll8bit not apply to just data then?

We only track m_isAll8BitData for the HTMLToken's notion of data, which includes the characters (but not attributes, etc).

> > Source/WebCore/html/parser/HTMLToken.h:270
> > +        , m_isAll8BitData(false)
> > +        , m_attributes(attributes)
> 
> This does not appear to set m_selfClosing...

Fixed.

> > Source/WebCore/html/parser/HTMLToken.h:299
> > +    // We don't want to copy the the characters out of the Token, so we
> > +    // keep a pointer to its buffer instead. This buffer is owned by the
> > +    // Token and causes a lifetime dependence between these objects.
> > +    //
> > +    // FIXME: Add a mechanism for "internalizing" the characters when the
> > +    //        HTMLToken is destructed.
> > +    const UChar* m_externalCharacters;
> > +    size_t m_externalCharactersLength;
> 
> This is obviously wrong in the threaded-parser world.  At least as implemented.  But we'll fix that now that you've made the code fixable. :)

Agreed.  We need to be smarter about taking the whole character string rather than always copying in the threaded case.
Comment 6 Adam Barth 2013-01-22 01:11:02 PST
> Did you want this marked for review?  Or EWSed?

This patch doesn't compile yet.  I'm working through the compile errors now.
Comment 7 Adam Barth 2013-01-22 01:25:00 PST
Created attachment 183921 [details]
Patch
Comment 8 Early Warning System Bot 2013-01-22 01:30:40 PST
Comment on attachment 183921 [details]
Patch

Attachment 183921 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/16010886
Comment 9 WebKit Review Bot 2013-01-22 01:31:15 PST
Comment on attachment 183921 [details]
Patch

Attachment 183921 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/16035483
Comment 10 Early Warning System Bot 2013-01-22 01:31:29 PST
Comment on attachment 183921 [details]
Patch

Attachment 183921 [details] did not pass qt-wk2-ews (qt):
Output: http://queues.webkit.org/results/16036480
Comment 11 Adam Barth 2013-01-22 01:32:31 PST
Created attachment 183927 [details]
Patch
Comment 12 Eric Seidel (no email) 2013-01-22 01:34:17 PST
Comment on attachment 183927 [details]
Patch

LGTM.
Comment 13 WebKit Review Bot 2013-01-22 02:27:49 PST
Comment on attachment 183927 [details]
Patch

Clearing flags on attachment: 183927

Committed r140403: <http://trac.webkit.org/changeset/140403>
Comment 14 WebKit Review Bot 2013-01-22 02:27:54 PST
All reviewed patches have been landed.  Closing bug.