Bug 25512 - Texts in some unfinished tags aren't visible in view-source mode
Summary: Texts in some unfinished tags aren't visible in view-source mode
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL: data:text/html,<textarea>foobar
Keywords:
Depends on:
Blocks:
 
Reported: 2009-05-01 14:46 PDT by Shinichiro Hamaji
Modified: 2009-06-11 11:16 PDT (History)
3 users (show)

See Also:


Attachments
Handle texts in unfinished special tags. (9.31 KB, patch)
2009-05-01 15:05 PDT, Shinichiro Hamaji
mjs: review-
Details | Formatted Diff | Diff
fixes for layouttest and rename "special tag" (15.83 KB, patch)
2009-05-04 20:07 PDT, Shinichiro Hamaji
mjs: review-
Details | Formatted Diff | Diff
patch v3 (15.74 KB, patch)
2009-06-01 22:55 PDT, Shinichiro Hamaji
no flags Details | Formatted Diff | Diff
patch v4 (15.69 KB, patch)
2009-06-04 00:20 PDT, Shinichiro Hamaji
no flags Details | Formatted Diff | Diff
patch v5 (15.72 KB, patch)
2009-06-04 21:52 PDT, Shinichiro Hamaji
abarth: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Shinichiro Hamaji 2009-05-01 14:46:45 PDT
For HTML file whose content is "<textarea>foobar", "foobar" isn't seen even with view-source mode. This problem happens with script, style, textarea, title, xmp, and iframe tags.

Related chromium bug: http://code.google.com/p/chromium/issues/detail?id=10686
Comment 1 Shinichiro Hamaji 2009-05-01 15:05:08 PDT
Created attachment 29948 [details]
Handle texts in unfinished special tags.

 LayoutTests/ChangeLog                              |   17 +++++++++
 .../fast/frames/resources/viewsource-frame-3.html  |    1 +
 .../frames/viewsource-unfinished-tags-expected.txt |    1 +
 .../fast/frames/viewsource-unfinished-tags.html    |   39 ++++++++++++++++++++
 WebCore/ChangeLog                                  |   16 ++++++++
 WebCore/html/HTMLTokenizer.cpp                     |   39 ++++++++++++--------
 6 files changed, 97 insertions(+), 16 deletions(-)
Comment 2 Alexey Proskuryakov 2009-05-02 02:32:43 PDT
Is this patch meant for review? Please mark it as such by clicking Edit link to the right of it if it is.

Briefly looking over the code, I think that "specialTagParsed" is not a great name for the variable. What do these tags have in common? Answering that question should help giving a more self-documanting name to it.
Comment 3 Shinichiro Hamaji 2009-05-02 14:01:41 PDT
Comment on attachment 29948 [details]
Handle texts in unfinished special tags.

Thanks for the review! I forgot to change the flag for review, sorry.

Regarding the name "special", I agree that this name isn't good, but it's a bit difficult to find a good name... How about "non HTML text" since these tags have non HTML text node as their child node? If it sounds good to you, I'll modify the patch in the beginning of the next week. I think we need to change the function name "parseSpecial" to "parseNonHtmlText" as well.

Thanks!
Comment 4 Shinichiro Hamaji 2009-05-04 20:07:09 PDT
Created attachment 30009 [details]
fixes for layouttest and rename "special tag"

 LayoutTests/ChangeLog                              |   24 +++++++++
 ...tion-using-js-object-with-toString-expected.txt |    3 +
 ...dow-location-using-js-object-with-toString.html |    4 +-
 ...indow-shadow-location-using-string-expected.txt |    3 +
 .../window-shadow-location-using-string.html       |    4 +-
 .../viewsource-frame-unfinished-script.html        |    1 +
 .../viewsource-frame-unfinished-textarea.html      |    1 +
 .../frames/viewsource-unfinished-tags-expected.txt |    3 +
 .../fast/frames/viewsource-unfinished-tags.html    |   46 ++++++++++++++++++
 WebCore/ChangeLog                                  |   20 ++++++++
 WebCore/html/HTMLTokenizer.cpp                     |   50 ++++++++++----------
 WebCore/html/HTMLTokenizer.h                       |    4 +-
 12 files changed, 132 insertions(+), 31 deletions(-)
Comment 5 Shinichiro Hamaji 2009-05-04 20:12:15 PDT
Comment on attachment 30009 [details]
fixes for layouttest and rename "special tag"

Updates from the previous patch were:

- Renamed functions which contain "Special" using  "NonHTMLText.
- Fixed layouttest by using inAnyNonHTMLText().
- Added </iframe> for two tests and fixed the expected output.
Comment 6 Shinichiro Hamaji 2009-05-12 14:58:17 PDT
Ping for review?
Comment 7 Maciej Stachowiak 2009-05-22 01:18:07 PDT
Comment on attachment 29948 [details]
Handle texts in unfinished special tags.

In general the fix looks good, but I would suggest doing performance tests such as iBench HTML or other page load test benchmarks to confirm no regression, since the tokenizer is very performance-sensitive. Please repost with performance data.
Comment 8 Maciej Stachowiak 2009-05-22 02:31:50 PDT
Comment on attachment 30009 [details]
fixes for layouttest and rename "special tag"

Looks great! Like I said in the other patch, though, I'd like to see some perf testing of this. Please repost with perf testing results.
Comment 9 Shinichiro Hamaji 2009-05-22 03:19:17 PDT
Comment on attachment 29948 [details]
Handle texts in unfinished special tags.

Sorry, the next patch contains this patch and I should have marked this as obsolete.
Comment 10 Shinichiro Hamaji 2009-05-22 03:31:41 PDT
> In general the fix looks good, but I would suggest doing performance tests such
> as iBench HTML or other page load test benchmarks to confirm no regression,
> since the tokenizer is very performance-sensitive. Please repost with
> performance data.

I see. Thanks for the advice! I'll learn the tool and update this patch.
Comment 11 Shinichiro Hamaji 2009-06-01 05:27:58 PDT
I ran iBench and there seem to be no performance regression. I ran HTML load speed test 3 times.

Before this patch:

HTML Load Speed   	  	  	  
   All iterations	  15.15	  	  
   First iteration (downloaded)	  2.07	  	  
   Subsequent iteration (cached)	  1.87

HTML Load Speed   	  	  	  
   All iterations	  14.63	  	  
   First iteration (downloaded)	  1.86	  	  
   Subsequent iteration (cached)	  1.82

HTML Load Speed   	  	  	  
   All iterations	  14.66	  	  
   First iteration (downloaded)	  1.9	  	  
   Subsequent iteration (cached)	  1.82

After this patch:

HTML Load Speed   	  	  	  
   All iterations	  15.51	  	  
   First iteration (downloaded)	  2.6	  	  
   Subsequent iteration (cached)	  1.84

HTML Load Speed   	  	  	  
   All iterations	  14.61	  	  
   First iteration (downloaded)	  1.86	  	  
   Subsequent iteration (cached)	  1.82

HTML Load Speed   	  	  	  
   All iterations	  14.53	  	  
   First iteration (downloaded)	  1.87	  	  
   Subsequent iteration (cached)	  1.81

It seems that the "First iteration" of the first trial has bigger variance than other tests due to network speed fluctuation. Other results show this patch don't introduce performance regression.
Comment 12 Alexey Proskuryakov 2009-06-01 07:02:57 PDT
See also: bug 8961 and its duplicates. Maybe this is a duplicate, too?
Comment 13 Shinichiro Hamaji 2009-06-01 19:20:08 PDT
(In reply to comment #12)
> See also: bug 8961 and its duplicates. Maybe this is a duplicate, too?

I think this is not the duplicate of the bug 8961. Hyatt's patch didn't fix this bug and my patch didn't fix the bug 8961 as well.

By the way I've just found that this is the duplicate of this bug: https://bugs.webkit.org/show_bug.cgi?id=24860
Comment 14 Shinichiro Hamaji 2009-06-01 22:55:18 PDT
Created attachment 30856 [details]
patch v3

 LayoutTests/ChangeLog                              |   20 ++++++++
 ...tion-using-js-object-with-toString-expected.txt |    3 +
 ...dow-location-using-js-object-with-toString.html |    4 +-
 ...indow-shadow-location-using-string-expected.txt |    3 +
 .../window-shadow-location-using-string.html       |    4 +-
 .../viewsource-frame-unfinished-script.html        |    1 +
 .../viewsource-frame-unfinished-textarea.html      |    1 +
 .../frames/viewsource-unfinished-tags-expected.txt |    3 +
 .../fast/frames/viewsource-unfinished-tags.html    |   46 ++++++++++++++++++
 WebCore/ChangeLog                                  |   18 +++++++
 WebCore/html/HTMLTokenizer.cpp                     |   50 ++++++++++----------
 WebCore/html/HTMLTokenizer.h                       |    4 +-
 12 files changed, 126 insertions(+), 31 deletions(-)
Comment 15 Shinichiro Hamaji 2009-06-01 22:59:31 PDT
As I've found Ian is saying that re-parsing is invalid, I updated the patch so that this patch only affects for view-source mode and there should be no security issues.

https://bugs.webkit.org/show_bug.cgi?id=6314
Comment 16 Shinichiro Hamaji 2009-06-04 00:20:34 PDT
Created attachment 30939 [details]
patch v4

 LayoutTests/ChangeLog                              |   20 ++++++++
 ...tion-using-js-object-with-toString-expected.txt |    3 +
 ...dow-location-using-js-object-with-toString.html |    4 +-
 ...indow-shadow-location-using-string-expected.txt |    3 +
 .../window-shadow-location-using-string.html       |    4 +-
 .../viewsource-frame-unfinished-script.html        |    1 +
 .../viewsource-frame-unfinished-textarea.html      |    1 +
 .../frames/viewsource-unfinished-tags-expected.txt |    3 +
 .../fast/frames/viewsource-unfinished-tags.html    |   46 ++++++++++++++++++
 WebCore/ChangeLog                                  |   18 +++++++
 WebCore/html/HTMLTokenizer.cpp                     |   50 ++++++++++----------
 WebCore/html/HTMLTokenizer.h                       |    4 +-
 12 files changed, 126 insertions(+), 31 deletions(-)
Comment 17 Shinichiro Hamaji 2009-06-04 00:21:40 PDT
Comment on attachment 30939 [details]
patch v4

Sorry, I fixed my email address in ChangeLog.
Comment 18 Shinichiro Hamaji 2009-06-04 21:52:08 PDT
Created attachment 30990 [details]
patch v5

 LayoutTests/ChangeLog                              |   22 +++++++++
 ...tion-using-js-object-with-toString-expected.txt |    3 +
 ...dow-location-using-js-object-with-toString.html |    4 +-
 ...indow-shadow-location-using-string-expected.txt |    3 +
 .../window-shadow-location-using-string.html       |    4 +-
 .../viewsource-frame-unfinished-script.html        |    1 +
 .../viewsource-frame-unfinished-textarea.html      |    1 +
 .../frames/viewsource-unfinished-tags-expected.txt |    3 +
 .../fast/frames/viewsource-unfinished-tags.html    |   46 ++++++++++++++++++
 WebCore/ChangeLog                                  |   20 ++++++++
 WebCore/html/HTMLTokenizer.cpp                     |   50 ++++++++++----------
 WebCore/html/HTMLTokenizer.h                       |    4 +-
 12 files changed, 130 insertions(+), 31 deletions(-)
Comment 19 Shinichiro Hamaji 2009-06-04 21:53:09 PDT
Comment on attachment 30990 [details]
patch v5

I inserted some line breaks in ChangeLog as I was advised that we should keep lines in ChangeLog short.
Comment 20 Adam Barth 2009-06-09 12:42:23 PDT
Comment on attachment 30990 [details]
patch v5

I'm not an expert in the HTML tokenizer, but this patch LGTM.
Comment 21 Brent Fulgham 2009-06-11 11:16:20 PDT
Landed in @r44608.