AtomicMarkupTokenBase must die
Amen, brother. Amen.
Created attachment 183911 [details] Patch
Comment on attachment 183911 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=183911&action=review Just a couple nits. Most of them don't apply to this patch even. :) > Source/WebCore/html/parser/HTMLToken.h:145 > + bool isAll8BitData() const > + { > + return m_isAll8BitData; > + } I think I would have written this function named "dataIsAll8Bit()" :) to make it clear that it's only talking about the data member. :) Obviously that's a separate patch here. Just idly commenting. > Source/WebCore/html/parser/HTMLToken.h:158 > + // FIXME: Distinguish between a missing public identifer and an empty one. > + WTF::Vector<UChar>& publicIdentifier() const > + { > + ASSERT(m_type == HTMLTokenTypes::DOCTYPE); > + return m_doctypeData->m_publicIdentifier; > + } There is no reason to make these fancy vectors anyway. There is 1 DOCTYPE node per document. :) We can make our handling of it as inefficient as we like. :) We're not saving anything by using fancy vector types. > Source/WebCore/html/parser/HTMLToken.h:172 > + void clearExternalCharacters() > + { > + m_externalCharacters = 0; > + m_externalCharactersLength = 0; > + m_isAll8BitData = false; > + } Does isAll8bit not apply to just data then? > Source/WebCore/html/parser/HTMLToken.h:270 > + , m_isAll8BitData(false) > + , m_attributes(attributes) This does not appear to set m_selfClosing... > Source/WebCore/html/parser/HTMLToken.h:299 > + // We don't want to copy the the characters out of the Token, so we > + // keep a pointer to its buffer instead. This buffer is owned by the > + // Token and causes a lifetime dependence between these objects. > + // > + // FIXME: Add a mechanism for "internalizing" the characters when the > + // HTMLToken is destructed. > + const UChar* m_externalCharacters; > + size_t m_externalCharactersLength; This is obviously wrong in the threaded-parser world. At least as implemented. But we'll fix that now that you've made the code fixable. :)
Did you want this marked for review? Or EWSed?
(In reply to comment #3) > (From update of attachment 183911 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=183911&action=review > > Just a couple nits. Most of them don't apply to this patch even. :) > > > Source/WebCore/html/parser/HTMLToken.h:145 > > + bool isAll8BitData() const > > + { > > + return m_isAll8BitData; > > + } > > I think I would have written this function named "dataIsAll8Bit()" :) to make it clear that it's only talking about the data member. :) Obviously that's a separate patch here. Just idly commenting. -> Punt to future patch. > > Source/WebCore/html/parser/HTMLToken.h:158 > > + // FIXME: Distinguish between a missing public identifer and an empty one. > > + WTF::Vector<UChar>& publicIdentifier() const > > + { > > + ASSERT(m_type == HTMLTokenTypes::DOCTYPE); > > + return m_doctypeData->m_publicIdentifier; > > + } > > There is no reason to make these fancy vectors anyway. There is 1 DOCTYPE node per document. :) We can make our handling of it as inefficient as we like. :) We're not saving anything by using fancy vector types. Agreed! -> Punt to future patch. > > Source/WebCore/html/parser/HTMLToken.h:172 > > + void clearExternalCharacters() > > + { > > + m_externalCharacters = 0; > > + m_externalCharactersLength = 0; > > + m_isAll8BitData = false; > > + } > > Does isAll8bit not apply to just data then? We only track m_isAll8BitData for the HTMLToken's notion of data, which includes the characters (but not attributes, etc). > > Source/WebCore/html/parser/HTMLToken.h:270 > > + , m_isAll8BitData(false) > > + , m_attributes(attributes) > > This does not appear to set m_selfClosing... Fixed. > > Source/WebCore/html/parser/HTMLToken.h:299 > > + // We don't want to copy the the characters out of the Token, so we > > + // keep a pointer to its buffer instead. This buffer is owned by the > > + // Token and causes a lifetime dependence between these objects. > > + // > > + // FIXME: Add a mechanism for "internalizing" the characters when the > > + // HTMLToken is destructed. > > + const UChar* m_externalCharacters; > > + size_t m_externalCharactersLength; > > This is obviously wrong in the threaded-parser world. At least as implemented. But we'll fix that now that you've made the code fixable. :) Agreed. We need to be smarter about taking the whole character string rather than always copying in the threaded case.
> Did you want this marked for review? Or EWSed? This patch doesn't compile yet. I'm working through the compile errors now.
Created attachment 183921 [details] Patch
Comment on attachment 183921 [details] Patch Attachment 183921 [details] did not pass qt-ews (qt): Output: http://queues.webkit.org/results/16010886
Comment on attachment 183921 [details] Patch Attachment 183921 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/16035483
Comment on attachment 183921 [details] Patch Attachment 183921 [details] did not pass qt-wk2-ews (qt): Output: http://queues.webkit.org/results/16036480
Created attachment 183927 [details] Patch
Comment on attachment 183927 [details] Patch LGTM.
Comment on attachment 183927 [details] Patch Clearing flags on attachment: 183927 Committed r140403: <http://trac.webkit.org/changeset/140403>
All reviewed patches have been landed. Closing bug.