Bug 43746 - Port view-source to new parser
Summary: Port view-source to new parser
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Other OS X 10.5
: P2 Normal
Assignee: Adam Barth
URL:
Keywords:
Depends on: 43765 43766
Blocks:
  Show dependency treegraph
 
Reported: 2010-08-09 14:19 PDT by Adam Barth
Modified: 2010-08-11 00:31 PDT (History)
4 users (show)

See Also:


Attachments
Patch (13.99 KB, patch)
2010-08-09 14:20 PDT, Adam Barth
no flags Details | Formatted Diff | Diff
now with HTMLViewSourceParser (17.51 KB, patch)
2010-08-09 14:57 PDT, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (31.32 KB, patch)
2010-08-09 20:27 PDT, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (14.82 KB, patch)
2010-08-10 18:38 PDT, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (31.29 KB, patch)
2010-08-10 18:48 PDT, Adam Barth
eric: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Adam Barth 2010-08-09 14:19:46 PDT
Port view-source to new parser
Comment 1 Adam Barth 2010-08-09 14:20:45 PDT
Created attachment 63928 [details]
Patch
Comment 2 Adam Barth 2010-08-09 14:21:11 PDT
I'm going to try another approach of creating the view-source view directly from the input stream.
Comment 3 Adam Barth 2010-08-09 14:57:26 PDT
Created attachment 63931 [details]
now with HTMLViewSourceParser
Comment 4 Adam Barth 2010-08-09 19:59:54 PDT
Comment on attachment 63931 [details]
now with HTMLViewSourceParser

Ok.  I've got this working.  I'm going to upload the changes in three dependent bugs for ease of reviewing.
Comment 5 Adam Barth 2010-08-09 20:27:16 PDT
Created attachment 63970 [details]
Patch
Comment 6 Adam Barth 2010-08-09 20:28:14 PDT
I should say that most (all?) of the code in HTMLViewSourceDocument should really be in a HTMLViewSourceTreeBuilder, but I didn't have the heart to do that in this patch.
Comment 7 Early Warning System Bot 2010-08-09 21:10:35 PDT
Attachment 63970 [details] did not build on qt:
Build output: http://queues.webkit.org/results/3750009
Comment 8 Eric Seidel (no email) 2010-08-09 23:22:04 PDT
Attachment 63970 [details] did not build on mac:
Build output: http://queues.webkit.org/results/3745016
Comment 9 WebKit Review Bot 2010-08-09 23:52:23 PDT
Attachment 63970 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/3731006
Comment 10 Eric Seidel (no email) 2010-08-10 00:49:43 PDT
Comment on attachment 63970 [details]
Patch

WebCore/html/HTMLToken.h:293
 +      int m_startIndex;
I'm surprised you didn't use your little Range object here.

WebCore/html/HTMLViewSourceDocument.cpp:105
 +      if (!m_current)
This seems like an old treebuilder concept.  Strange to see it in this Document class?

WebCore/html/HTMLViewSourceDocument.cpp:144
 +      String tagName = String(token.name().data(), token.name().size());
I'm slightly surprised Token doesn't have a method to do this for you, but I guess that's what AtomicToken is all about?

WebCore/html/HTMLViewSourceDocument.cpp:162
 +          if (tagName == "base" && name == "href") {
We have AtomicStrings for these.

WebCore/html/HTMLViewSourceDocument.cpp:156
 +          String name = String(iter->m_name.data(), iter->m_name.size());
Don't we have a String-From-Vector conversion function?  Or shouldn't we use an inline for these?  Seems silly that this is the 3rd time in this patch so far... :)

WebCore/html/HTMLViewSourceDocument.cpp:168
 +          index = addRange(source, index, iter->m_valueRange.m_start - token.startIndex(), "");
We keep doing this same code too.  Maybe this needs a wrapper around addRange?  Or something to do this subtraction?  Seems like a lot of copy/paste code here.

WebCore/html/HTMLViewSourceDocument.cpp:170
 +          bool isLink = name == "src" || name == "href";
We have AtomicStrings for these.

WebCore/html/HTMLViewSourceDocument.cpp:171
 +          index = addRange(source, index, iter->m_valueRange.m_end - token.startIndex(), "webkit-html-attribute-value", isLink, tagName == "a");
And these...

WebCore/html/HTMLViewSourceDocument.cpp:276
 +      if (start < end) {
Early return instead?

WebCore/html/HTMLViewSourceDocument.cpp:277
 +          String text = source.substring(start, end - start);
I might have put this whole block in a function "spanForText".  Not sure I understand welle enough yet.  The setting m_current feels like something wanting to call a function to return whatever your'e about to set m_current too.

Overall I think this is great.  I'd like to see a refreshed patch though.
Comment 11 Adam Barth 2010-08-10 16:53:28 PDT
(In reply to comment #10)
> (From update of attachment 63970 [details])
> WebCore/html/HTMLToken.h:293
>  +      int m_startIndex;
> I'm surprised you didn't use your little Range object here.

Fixed.

> WebCore/html/HTMLViewSourceDocument.cpp:105
>  +      if (!m_current)
> This seems like an old treebuilder concept.  Strange to see it in this Document class?

Yes.  This is wrong code inherited from the old HTMLViewSourceDocument.  I'd like to move it to an HTMLViewSourceTreeBuilder at some point.

> I'm slightly surprised Token doesn't have a method to do this for you, but I guess that's what AtomicToken is all about?

Yes.  Token had a bunch of string caching infrastructure at some point, but we moved all that to atomic token.

> WebCore/html/HTMLViewSourceDocument.cpp:162
>  +          if (tagName == "base" && name == "href") {
> We have AtomicStrings for these.

True!  Now that we don't need to do case-insensitive compares we can use the atomics.

> WebCore/html/HTMLViewSourceDocument.cpp:156
>  +          String name = String(iter->m_name.data(), iter->m_name.size());
> Don't we have a String-From-Vector conversion function?  Or shouldn't we use an inline for these?  Seems silly that this is the 3rd time in this patch so far... :)

We have one, but not for vectors with inline capacity, I believe.

> WebCore/html/HTMLViewSourceDocument.cpp:168
>  +          index = addRange(source, index, iter->m_valueRange.m_start - token.startIndex(), "");
> We keep doing this same code too.  Maybe this needs a wrapper around addRange?  Or something to do this subtraction?  Seems like a lot of copy/paste code here.

Hum...  Ok.  I look at that.

> WebCore/html/HTMLViewSourceDocument.cpp:170
>  +          bool isLink = name == "src" || name == "href";
> We have AtomicStrings for these.

Yep.

> WebCore/html/HTMLViewSourceDocument.cpp:171
>  +          index = addRange(source, index, iter->m_valueRange.m_end - token.startIndex(), "webkit-html-attribute-value", isLink, tagName == "a");
> And these...

Yep.

> WebCore/html/HTMLViewSourceDocument.cpp:276
>  +      if (start < end) {
> Early return instead?

Sure.

> WebCore/html/HTMLViewSourceDocument.cpp:277
>  +          String text = source.substring(start, end - start);
> I might have put this whole block in a function "spanForText".  Not sure I understand welle enough yet.  The setting m_current feels like something wanting to call a function to return whatever your'e about to set m_current too.

Hum...  I'm not sure I quite understand.

> Overall I think this is great.  I'd like to see a refreshed patch though.

Okiedokes.
Comment 12 Adam Barth 2010-08-10 17:58:47 PDT
> > WebCore/html/HTMLViewSourceDocument.cpp:156
> >  +          String name = String(iter->m_name.data(), iter->m_name.size());
> > Don't we have a String-From-Vector conversion function?  Or shouldn't we use an inline for these?  Seems silly that this is the 3rd time in this patch so far... :)
> 
> We have one, but not for vectors with inline capacity, I believe.

I tried adding this, but I failed.  I'm not sure why, but I couldn't get it to compile.  :(

> > WebCore/html/HTMLViewSourceDocument.cpp:168
> >  +          index = addRange(source, index, iter->m_valueRange.m_start - token.startIndex(), "");
> > We keep doing this same code too.  Maybe this needs a wrapper around addRange?  Or something to do this subtraction?  Seems like a lot of copy/paste code here.
> 
> Hum...  Ok.  I look at that.

The problem is that these are all independent pieces of data.  I could turn the "-" into a "," but I'm not sure that would buy us much.
Comment 13 Adam Barth 2010-08-10 18:38:55 PDT
Created attachment 64065 [details]
Patch
Comment 14 Early Warning System Bot 2010-08-10 18:45:14 PDT
Attachment 64065 [details] did not build on qt:
Build output: http://queues.webkit.org/results/3752029
Comment 15 Eric Seidel (no email) 2010-08-10 18:48:10 PDT
Attachment 64065 [details] did not build on mac:
Build output: http://queues.webkit.org/results/3775036
Comment 16 Adam Barth 2010-08-10 18:48:22 PDT
Created attachment 64066 [details]
Patch
Comment 17 Eric Seidel (no email) 2010-08-10 23:36:34 PDT
Comment on attachment 64066 [details]
Patch

Slightly sad you weren't able to make the String constructor work.  It probably had to be a template itself or something.

WebCore/html/HTMLToken.h:75
 +      int startIndex() const { return m_range.m_start; }
Feels slightly odd to not just expose the Range object, but seems fine.

WebCore/html/HTMLViewSourceParser.cpp:41
 +      ASSERT_NOT_REACHED();
I'm surprised you have to implement insert()

WebCore/html/HTMLViewSourceParser.cpp:79
 +      // FIXME: The tokenizer should do this work for us.
I don't understand what you mean here.  Could you file a bug?

Looks great!
Comment 18 Adam Barth 2010-08-11 00:31:22 PDT
Committed r65132: <http://trac.webkit.org/changeset/65132>