Bug 62971 - view-source doesn't colorize </script> correctly
Summary: view-source doesn't colorize </script> correctly
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Adam Barth
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-06-20 02:33 PDT by WebKit Review Bot
Modified: 2012-01-12 03:43 PST (History)
5 users (show)

See Also:


Attachments
Patch (20.16 KB, patch)
2011-12-27 15:19 PST, Adam Barth
no flags Details | Formatted Diff | Diff
Patch for landing (20.23 KB, patch)
2012-01-03 15:55 PST, Adam Barth
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description WebKit Review Bot 2011-06-20 02:33:57 PDT
view-source doesn't colorize </script> correctly
Requested by abarth on #webkit.
Comment 1 Alexey Proskuryakov 2011-06-20 09:03:14 PDT
See bug 61774 for more information (which should have been in this bug, but using the bot precludes one from filing good bugs).
Comment 2 Adam Barth 2011-06-20 10:10:22 PDT
I'm actually surprised there isn't another bug to dupe this one against.
Comment 3 Adam Barth 2011-10-13 15:51:58 PDT
This bug is worth fixing, but it's non-trivial.
Comment 4 Adam Barth 2011-12-27 15:19:19 PST
Created attachment 120621 [details]
Patch
Comment 5 Adam Barth 2011-12-27 15:22:33 PST
Note: Once we fix this bug, we should try adding these ASSERTs back:
http://trac.webkit.org/changeset/89258/trunk/Source/WebCore/html/parser/HTMLToken.h
Comment 6 Eric Seidel (no email) 2012-01-03 13:20:15 PST
Comment on attachment 120621 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=120621&action=review

LGTM if the perf is OK.

> Source/WebCore/ChangeLog:35
> +            previous segment of source.  We copy them into our accumulation
> +            buffer and adjust the token base offset to account for the extra
> +            characters.

Did you perf test this change?  Could you please mention the effect (if any) in the ChangeLog?

> Source/WebCore/ChangeLog:56
> +          - Previously, we cleared the temporary buffer lazily when we needed
> +            to add new characters to it.  Now we clear it eagerly so that it's
> +            length tells us whether we're currently using it to store
> +            characters.

Moar perf fear.

> Source/WebCore/html/parser/HTMLTokenizer.cpp:168
> +    m_temporaryBuffer.clear();

Is temporary buffer a spec name?  Seems we could give htis a more descriptive name about what it contains or how it's used...

> Source/WebCore/html/parser/HTMLTokenizer.cpp:1590
> +    characters.append('<');
> +    characters.append('/');

Can we assert that we're in the "waiting for end tag state"?  Seems this should at least have a comment that the only time we buffer characters is when searching for an end tag.
Comment 7 Adam Barth 2012-01-03 15:49:22 PST
I've re-tested the perf and there's no measurable difference.
Comment 8 Adam Barth 2012-01-03 15:53:12 PST
(In reply to comment #6)
> (From update of attachment 120621 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=120621&action=review
>
> > Source/WebCore/ChangeLog:56
> > +          - Previously, we cleared the temporary buffer lazily when we needed
> > +            to add new characters to it.  Now we clear it eagerly so that it's
> > +            length tells us whether we're currently using it to store
> > +            characters.
> 
> Moar perf fear.

Yes.  That was from before we had the benchmark.  :)

> > Source/WebCore/html/parser/HTMLTokenizer.cpp:168
> > +    m_temporaryBuffer.clear();
> 
> Is temporary buffer a spec name?  Seems we could give htis a more descriptive name about what it contains or how it's used...

It's a name from the spec:

http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#temporary-buffer

We don't need to use the spec name if something else would be clearer.

> > Source/WebCore/html/parser/HTMLTokenizer.cpp:1590
> > +    characters.append('<');
> > +    characters.append('/');
> 
> Can we assert that we're in the "waiting for end tag state"?  Seems this should at least have a comment that the only time we buffer characters is when searching for an end tag.

I've added a FIXME about adding this ASSERT.  I need to study this more closely to get the full list of states in which we use the temporary buffer.
Comment 9 Adam Barth 2012-01-03 15:55:38 PST
Created attachment 121013 [details]
Patch for landing
Comment 10 WebKit Review Bot 2012-01-03 18:42:14 PST
Comment on attachment 121013 [details]
Patch for landing

Clearing flags on attachment: 121013

Committed r103999: <http://trac.webkit.org/changeset/103999>
Comment 11 WebKit Review Bot 2012-01-03 18:42:19 PST
All reviewed patches have been landed.  Closing bug.