Bug 45649 - Move adjustLexerState to the HTMLTokenizer
: Move adjustLexerState to the HTMLTokenizer
Status: RESOLVED FIXED
: WebKit
New Bugs
: 528+ (Nightly build)
: Other Mac OS X 10.5
: P2 Normal
Assigned To:
:
:
:
:
  Show dependency treegraph
 
Reported: 2010-09-13 03:23 PST by
Modified: 2010-09-14 09:14 PST (History)


Attachments
Patch (7.94 KB, patch)
2010-09-13 03:25 PST, Adam Barth
no flags Review Patch | Details | Formatted Diff | Diff
Patch (7.87 KB, patch)
2010-09-13 11:53 PST, Adam Barth
no flags Review Patch | Details | Formatted Diff | Diff
Patch for landing (11.95 KB, patch)
2010-09-13 12:29 PST, Adam Barth
no flags Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2010-09-13 03:23:40 PST
Move adjustLexerState to the HTMLTokenizer
------- Comment #1 From 2010-09-13 03:25:54 PST -------
Created an attachment (id=67391) [details]
Patch
------- Comment #2 From 2010-09-13 10:05:06 PST -------
(From update of attachment 67391 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=67391&action=prettypatch

Seems we could remove includes of TreeBuilder.h now, no?

> WebCore/html/parser/HTMLTokenizer.h:142
> +    // Updates the tokenizer's state according to the given tag name.  This is
> +    // an approxmation of how the tree builder would update the tokenizer's
> +    // state.  This method is useful for approximating HTML tokenization.  To
> +    // get exactly the correct tokenization, you need the real tree builder.
> +    void updateState(const AtomicString& tagName, Frame*);
If this is an approximation, can we name it to say such?  approximateStateForTag or something?
------- Comment #3 From 2010-09-13 11:53:31 PST -------
Created an attachment (id=67446) [details]
Patch
------- Comment #4 From 2010-09-13 12:01:14 PST -------
(From update of attachment 67446 [details])
> +    // Updates the tokenizer's state according to the given tag name.  This is
> +    // an approxmation of how the tree builder would update the tokenizer's
> +    // state.  This method is useful for approximating HTML tokenization.  To
> +    // get exactly the correct tokenization, you need the real tree builder.

Typo "approxmation"
Two spaces after periods.

I am uncomfortable with the use of the verb "approximate" here. First, it’s easy to confuse the verb with the adjective. And if this was an adjectival form then it would have a return value.

Your comment uses the verb "update" and I think that should be in the function name. Maybe updateStateForFakeElement? Or something else like that? Probably not because the element is not fake.

Another problem with "approximate" is that it doesn’t say what’s wrong. To decide if it’s OK to use the function you need to know how far off “correct” it is, not just that it’s not perfect. It’s not enough to say that it’s not 100% correct; that’s enough to get someone off the hook in a court of law or mathematical proof but is not so great for high-quality software development.

commit-queue- to give you a chance to consider my comments
------- Comment #5 From 2010-09-13 12:05:41 PST -------
> Your comment uses the verb "update" and I think that should be in the function name. Maybe updateStateForFakeElement? Or something else like that? Probably not because the element is not fake.

:)

I originally called it "updateState" but Eric asked me to change it.

> Another problem with "approximate" is that it doesn’t say what’s wrong. To decide if it’s OK to use the function you need to know how far off “correct” it is, not just that it’s not perfect. It’s not enough to say that it’s not 100% correct; that’s enough to get someone off the hook in a court of law or mathematical proof but is not so great for high-quality software development.

We might eventually need something fancier here where we run the tree builder in a side-effect free way.  The main thing this function doesn't understand is foreign content.  I'll add more detail to the comment.
------- Comment #6 From 2010-09-13 12:29:07 PST -------
Created an attachment (id=67454) [details]
Patch for landing
------- Comment #7 From 2010-09-14 07:26:07 PST -------
(From update of attachment 67454 [details])
Clearing flags on attachment: 67454

Committed r67467: <http://trac.webkit.org/changeset/67467>
------- Comment #8 From 2010-09-14 07:26:12 PST -------
All reviewed patches have been landed.  Closing bug.
------- Comment #9 From 2010-09-14 09:14:14 PST -------
http://trac.webkit.org/changeset/67467 might have broken GTK Linux 32-bit Debug
The following changes are on the blame list:
http://trac.webkit.org/changeset/67465
http://trac.webkit.org/changeset/67466
http://trac.webkit.org/changeset/67467
http://trac.webkit.org/changeset/67468