Bug 25512

Summary: Texts in some unfinished tags aren't visible in view-source mode
Product: WebKit Reporter: Shinichiro Hamaji <hamaji>
Component: DOMAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, hamaji, hyatt
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
URL: data:text/html,<textarea>foobar
Attachments:
Description Flags
Handle texts in unfinished special tags.
mjs: review-
fixes for layouttest and rename "special tag"
mjs: review-
patch v3
none
patch v4
none
patch v5 abarth: review+

Shinichiro Hamaji
Reported 2009-05-01 14:46:45 PDT
For HTML file whose content is "<textarea>foobar", "foobar" isn't seen even with view-source mode. This problem happens with script, style, textarea, title, xmp, and iframe tags. Related chromium bug: http://code.google.com/p/chromium/issues/detail?id=10686
Attachments
Handle texts in unfinished special tags. (9.31 KB, patch)
2009-05-01 15:05 PDT, Shinichiro Hamaji
mjs: review-
fixes for layouttest and rename "special tag" (15.83 KB, patch)
2009-05-04 20:07 PDT, Shinichiro Hamaji
mjs: review-
patch v3 (15.74 KB, patch)
2009-06-01 22:55 PDT, Shinichiro Hamaji
no flags
patch v4 (15.69 KB, patch)
2009-06-04 00:20 PDT, Shinichiro Hamaji
no flags
patch v5 (15.72 KB, patch)
2009-06-04 21:52 PDT, Shinichiro Hamaji
abarth: review+
Shinichiro Hamaji
Comment 1 2009-05-01 15:05:08 PDT
Created attachment 29948 [details] Handle texts in unfinished special tags. LayoutTests/ChangeLog | 17 +++++++++ .../fast/frames/resources/viewsource-frame-3.html | 1 + .../frames/viewsource-unfinished-tags-expected.txt | 1 + .../fast/frames/viewsource-unfinished-tags.html | 39 ++++++++++++++++++++ WebCore/ChangeLog | 16 ++++++++ WebCore/html/HTMLTokenizer.cpp | 39 ++++++++++++-------- 6 files changed, 97 insertions(+), 16 deletions(-)
Alexey Proskuryakov
Comment 2 2009-05-02 02:32:43 PDT
Is this patch meant for review? Please mark it as such by clicking Edit link to the right of it if it is. Briefly looking over the code, I think that "specialTagParsed" is not a great name for the variable. What do these tags have in common? Answering that question should help giving a more self-documanting name to it.
Shinichiro Hamaji
Comment 3 2009-05-02 14:01:41 PDT
Comment on attachment 29948 [details] Handle texts in unfinished special tags. Thanks for the review! I forgot to change the flag for review, sorry. Regarding the name "special", I agree that this name isn't good, but it's a bit difficult to find a good name... How about "non HTML text" since these tags have non HTML text node as their child node? If it sounds good to you, I'll modify the patch in the beginning of the next week. I think we need to change the function name "parseSpecial" to "parseNonHtmlText" as well. Thanks!
Shinichiro Hamaji
Comment 4 2009-05-04 20:07:09 PDT
Created attachment 30009 [details] fixes for layouttest and rename "special tag" LayoutTests/ChangeLog | 24 +++++++++ ...tion-using-js-object-with-toString-expected.txt | 3 + ...dow-location-using-js-object-with-toString.html | 4 +- ...indow-shadow-location-using-string-expected.txt | 3 + .../window-shadow-location-using-string.html | 4 +- .../viewsource-frame-unfinished-script.html | 1 + .../viewsource-frame-unfinished-textarea.html | 1 + .../frames/viewsource-unfinished-tags-expected.txt | 3 + .../fast/frames/viewsource-unfinished-tags.html | 46 ++++++++++++++++++ WebCore/ChangeLog | 20 ++++++++ WebCore/html/HTMLTokenizer.cpp | 50 ++++++++++---------- WebCore/html/HTMLTokenizer.h | 4 +- 12 files changed, 132 insertions(+), 31 deletions(-)
Shinichiro Hamaji
Comment 5 2009-05-04 20:12:15 PDT
Comment on attachment 30009 [details] fixes for layouttest and rename "special tag" Updates from the previous patch were: - Renamed functions which contain "Special" using "NonHTMLText. - Fixed layouttest by using inAnyNonHTMLText(). - Added </iframe> for two tests and fixed the expected output.
Shinichiro Hamaji
Comment 6 2009-05-12 14:58:17 PDT
Ping for review?
Maciej Stachowiak
Comment 7 2009-05-22 01:18:07 PDT
Comment on attachment 29948 [details] Handle texts in unfinished special tags. In general the fix looks good, but I would suggest doing performance tests such as iBench HTML or other page load test benchmarks to confirm no regression, since the tokenizer is very performance-sensitive. Please repost with performance data.
Maciej Stachowiak
Comment 8 2009-05-22 02:31:50 PDT
Comment on attachment 30009 [details] fixes for layouttest and rename "special tag" Looks great! Like I said in the other patch, though, I'd like to see some perf testing of this. Please repost with perf testing results.
Shinichiro Hamaji
Comment 9 2009-05-22 03:19:17 PDT
Comment on attachment 29948 [details] Handle texts in unfinished special tags. Sorry, the next patch contains this patch and I should have marked this as obsolete.
Shinichiro Hamaji
Comment 10 2009-05-22 03:31:41 PDT
> In general the fix looks good, but I would suggest doing performance tests such > as iBench HTML or other page load test benchmarks to confirm no regression, > since the tokenizer is very performance-sensitive. Please repost with > performance data. I see. Thanks for the advice! I'll learn the tool and update this patch.
Shinichiro Hamaji
Comment 11 2009-06-01 05:27:58 PDT
I ran iBench and there seem to be no performance regression. I ran HTML load speed test 3 times. Before this patch: HTML Load Speed All iterations 15.15 First iteration (downloaded) 2.07 Subsequent iteration (cached) 1.87 HTML Load Speed All iterations 14.63 First iteration (downloaded) 1.86 Subsequent iteration (cached) 1.82 HTML Load Speed All iterations 14.66 First iteration (downloaded) 1.9 Subsequent iteration (cached) 1.82 After this patch: HTML Load Speed All iterations 15.51 First iteration (downloaded) 2.6 Subsequent iteration (cached) 1.84 HTML Load Speed All iterations 14.61 First iteration (downloaded) 1.86 Subsequent iteration (cached) 1.82 HTML Load Speed All iterations 14.53 First iteration (downloaded) 1.87 Subsequent iteration (cached) 1.81 It seems that the "First iteration" of the first trial has bigger variance than other tests due to network speed fluctuation. Other results show this patch don't introduce performance regression.
Alexey Proskuryakov
Comment 12 2009-06-01 07:02:57 PDT
See also: bug 8961 and its duplicates. Maybe this is a duplicate, too?
Shinichiro Hamaji
Comment 13 2009-06-01 19:20:08 PDT
(In reply to comment #12) > See also: bug 8961 and its duplicates. Maybe this is a duplicate, too? I think this is not the duplicate of the bug 8961. Hyatt's patch didn't fix this bug and my patch didn't fix the bug 8961 as well. By the way I've just found that this is the duplicate of this bug: https://bugs.webkit.org/show_bug.cgi?id=24860
Shinichiro Hamaji
Comment 14 2009-06-01 22:55:18 PDT
Created attachment 30856 [details] patch v3 LayoutTests/ChangeLog | 20 ++++++++ ...tion-using-js-object-with-toString-expected.txt | 3 + ...dow-location-using-js-object-with-toString.html | 4 +- ...indow-shadow-location-using-string-expected.txt | 3 + .../window-shadow-location-using-string.html | 4 +- .../viewsource-frame-unfinished-script.html | 1 + .../viewsource-frame-unfinished-textarea.html | 1 + .../frames/viewsource-unfinished-tags-expected.txt | 3 + .../fast/frames/viewsource-unfinished-tags.html | 46 ++++++++++++++++++ WebCore/ChangeLog | 18 +++++++ WebCore/html/HTMLTokenizer.cpp | 50 ++++++++++---------- WebCore/html/HTMLTokenizer.h | 4 +- 12 files changed, 126 insertions(+), 31 deletions(-)
Shinichiro Hamaji
Comment 15 2009-06-01 22:59:31 PDT
As I've found Ian is saying that re-parsing is invalid, I updated the patch so that this patch only affects for view-source mode and there should be no security issues. https://bugs.webkit.org/show_bug.cgi?id=6314
Shinichiro Hamaji
Comment 16 2009-06-04 00:20:34 PDT
Created attachment 30939 [details] patch v4 LayoutTests/ChangeLog | 20 ++++++++ ...tion-using-js-object-with-toString-expected.txt | 3 + ...dow-location-using-js-object-with-toString.html | 4 +- ...indow-shadow-location-using-string-expected.txt | 3 + .../window-shadow-location-using-string.html | 4 +- .../viewsource-frame-unfinished-script.html | 1 + .../viewsource-frame-unfinished-textarea.html | 1 + .../frames/viewsource-unfinished-tags-expected.txt | 3 + .../fast/frames/viewsource-unfinished-tags.html | 46 ++++++++++++++++++ WebCore/ChangeLog | 18 +++++++ WebCore/html/HTMLTokenizer.cpp | 50 ++++++++++---------- WebCore/html/HTMLTokenizer.h | 4 +- 12 files changed, 126 insertions(+), 31 deletions(-)
Shinichiro Hamaji
Comment 17 2009-06-04 00:21:40 PDT
Comment on attachment 30939 [details] patch v4 Sorry, I fixed my email address in ChangeLog.
Shinichiro Hamaji
Comment 18 2009-06-04 21:52:08 PDT
Created attachment 30990 [details] patch v5 LayoutTests/ChangeLog | 22 +++++++++ ...tion-using-js-object-with-toString-expected.txt | 3 + ...dow-location-using-js-object-with-toString.html | 4 +- ...indow-shadow-location-using-string-expected.txt | 3 + .../window-shadow-location-using-string.html | 4 +- .../viewsource-frame-unfinished-script.html | 1 + .../viewsource-frame-unfinished-textarea.html | 1 + .../frames/viewsource-unfinished-tags-expected.txt | 3 + .../fast/frames/viewsource-unfinished-tags.html | 46 ++++++++++++++++++ WebCore/ChangeLog | 20 ++++++++ WebCore/html/HTMLTokenizer.cpp | 50 ++++++++++---------- WebCore/html/HTMLTokenizer.h | 4 +- 12 files changed, 130 insertions(+), 31 deletions(-)
Shinichiro Hamaji
Comment 19 2009-06-04 21:53:09 PDT
Comment on attachment 30990 [details] patch v5 I inserted some line breaks in ChangeLog as I was advised that we should keep lines in ChangeLog short.
Adam Barth
Comment 20 2009-06-09 12:42:23 PDT
Comment on attachment 30990 [details] patch v5 I'm not an expert in the HTML tokenizer, but this patch LGTM.
Brent Fulgham
Comment 21 2009-06-11 11:16:20 PDT
Landed in @r44608.
Note You need to log in before you can comment on or make changes to this bug.