Bug 14959 - No back forward entry added for pages created in javascript
Summary: No back forward entry added for pages created in javascript
Status: RESOLVED LATER
Alias: None
Product: WebKit
Classification: Unclassified
Component: History (show other bugs)
Version: 523.x (Safari 3)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL: javascript:document.open();document.w...
Keywords:
Depends on:
Blocks:
 
Reported: 2007-08-13 13:15 PDT by Matt Perry
Modified: 2012-04-08 04:03 PDT (History)
10 users (show)

See Also:


Attachments
work in progress (13.70 KB, patch)
2007-08-20 18:04 PDT, Matt Perry
no flags Details | Formatted Diff | Diff
proposed patch (22.20 KB, patch)
2007-08-29 17:59 PDT, Matt Perry
no flags Details | Formatted Diff | Diff
updated patch (22.19 KB, patch)
2007-12-21 17:08 PST, Matt Perry
darin: review-
Details | Formatted Diff | Diff
revised patch (22.87 KB, patch)
2008-01-10 12:11 PST, Matt Perry
darin: review-
Details | Formatted Diff | Diff
revised patch (fixed) (24.62 KB, patch)
2008-01-14 12:18 PST, Matt Perry
beidson: review-
Details | Formatted Diff | Diff
reduced test case (229 bytes, text/html)
2008-02-01 17:07 PST, Matt Perry
no flags Details
Patch (14.23 KB, patch)
2012-03-27 02:08 PDT, Pravin D
no flags Details | Formatted Diff | Diff
Patch (33.75 KB, text/plain)
2012-04-03 23:00 PDT, Pravin D
no flags Details
Patch (33.66 KB, text/plain)
2012-04-03 23:35 PDT, Pravin D
no flags Details
Proposed Patch (15.98 KB, patch)
2012-04-04 03:38 PDT, Pravin D
no flags Details | Formatted Diff | Diff
TestCases_Patch (17.29 KB, patch)
2012-04-04 03:49 PDT, Pravin D
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ec2-cr-linux-01 (6.52 MB, application/zip)
2012-04-04 08:09 PDT, WebKit Review Bot
no flags Details
Proposed Patch (31.81 KB, patch)
2012-04-07 07:40 PDT, Pravin D
abarth: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Matt Perry 2007-08-13 13:15:04 PDT
Simple test case: paste this URL into your browser javascript:document.open();document.write("<body>test</body>");document.close()

In IE and FF, a new page is created, and you can click Back to go to the page you were on before.  In Safari, no back/forward item is created.  Instead, the current page is temporarily replaced (you can get back to it by navigating elsewhere, going back, then reloading).

This is not a regression, as it happens in shipping Safari on OS X (10.4.10).
Comment 1 Matt Perry 2007-08-14 14:55:01 PDT
I'm going to try working on this.  Here's my idea so far, so maybe someone can tell me whether it sounds reasonable or if I am way off base:

1. Route the document.write() text through the FrameLoader using SubstituteData, so the normal load events get fired.
  - document.open will start a load with an empty SubstituteData
  - document.write will append to the SubstituteData, which will write to the tokenizer
  - document.close will finish the load
This part will require some changes to the way SubstituteDate can be loaded (right now it must come in a single chunk).

2. Store SubstituteData on the HistoryItem, so the page can be restored when you navigate back to it.
Comment 2 Derk-Jan Hartman 2007-08-15 08:30:15 PDT
since document.open came up on IRC yesterday, might wanna take a look at bug #14959
Comment 3 Derk-Jan Hartman 2007-08-15 08:31:04 PDT
Eh correction #14968 of course.
Comment 4 Matt Perry 2007-08-20 18:04:36 PDT
Created attachment 16045 [details]
work in progress

Here's what I have so far, in case anyone cares.  It's certainly not finished, as there are some bugs, failing tests, and there's no new tests included.  But the basic idea is there (which is generally to have document.write go through the Frameloader instead of writing directly to the tokenizer).

If anyone cares to take a look, any suggestions would be appreciated.

One issue I haven't figured out how to solve yet is that my call to FrameLoader::load in Document::open results in the document being closed and reopened again.  It also doesn't work very nicely with the page cache, which I've turned off so far.
Comment 5 Matt Perry 2007-08-24 14:31:22 PDT
After struggling to get my original approach to work, I've decided to go another route.  Using the FrameLoader code caused too many problems.

Now I'm going to attempt something simpler:
1. In document.open(), add a new HistoryItem that holds an empty SubstituteData.
2. In document.write(), append to the SubstituteData.
3. When navigating to a HistoryItem with a valid SubstituteData, load that data as if it were the page contents.
Comment 6 Matt Perry 2007-08-29 17:59:02 PDT
Created attachment 16160 [details]
proposed patch

This patch adds support for putting document.open() generated pages in session history.  javascript: urls should also create new history items, but I'll attack that in another patch (should be much simpler).
Comment 7 Matt Perry 2007-10-09 11:15:01 PDT
Just commenting on this bug so it doesn't get lost.
Comment 8 Maciej Stachowiak 2007-10-13 19:35:19 PDT
Don't worry, it's not lost. We are trying to get through the review queue which sadly had a bit of a backlog.
Comment 9 Matt Perry 2007-12-21 17:08:48 PST
Created attachment 18052 [details]
updated patch

Fixes to previous patch to make it apply and build again.
Comment 10 Darin Adler 2007-12-29 15:14:35 PST
Comment on attachment 18052 [details]
updated patch

Patch looks great.

I have a few complaints that should be relatively simple to fix.

Many apologies for not reviewing this long ago. I had assumed that someone else was going to review. Thank you very much for working on this.

 1311     // Params set to match the old behavior, though replace=false is probably a more sane default. 
 1312     open("text/html", true);

I don't think the issue here is "sane default" but rather matching other browsers, since open is a public DOM function accessible from JavaScript. Can this change to replace = false? If not, then why not? This comment makes it seem like something is broken. Is something broken here? If so, what? What test would demonstrate the breakage?

 113     String mimeType("text/html");
 114     if (args.size() >= 1) {
 115         mimeType = String(args[0]->toString(exec)).lower();
 116         if (mimeType != "text/html" && mimeType != "replace")
 117             mimeType = "text/plain";  // anything other than text/html is treated as plaintext.
 118     }

It's unnecessarily inefficient to actually allocate the buffer for the string "text/html" -- the logic should be changed so that it gets set on both sides of an if.

Generally, we do not ever check the size of arguments. Instead we rely on the fact that arguments not passed are undefined. There are very few counterexamples to this (although one of them is earlier in this very function). And the JavaScript arguments list object makes this easy by returning undefined for any values past the end of the list. So this code should check args[0]->isUndefined() rather than args.size(). It's also worth testing behavior of other browsers when the value null is passed in -- is this treated as the MIME type "text/plain" or as "text/html"? It would be good to have a few tests that cover this. The comment doesn't make it clear at all why the value "replace" is allowed for mimeType and I must admit I don't understand why. The comment should start with a capital letter if it ends with a period.

 120     bool replace = false;
 121     if (args.size() >= 2)
 122         replace = equalIgnoringCase("replace", String(args[1]->toString(exec)));

The best way to handle this is probably to just leave out the if statement. We can let values like undefined get converted to string and rejected since they aren't equal to "replace".

 920     // Contains the text written to the document by script, eg through document.write().
 921     RefPtr<SharedBuffer> m_scriptedDocumentBuf;

How about a name that matches the comment more closely? Perhaps m_textWrittenByScript.

I don't like the existing name because we don't normally abbreviate buffer as "buf" and I don't see how "scripted document buffer" makes sense, since there is no such thing as a scripted document.

 1578         open(String("text/html"), false);

Why the explicit cast to String here?

1567          write(DeprecatedString("<html>"));
 1582         write(String("<html>"));

Can we get rid of the DeprecatedString overload?

 1587     if (m_scriptedDocumentBuf) {
 1588         CString utf8 = text.utf8();
 1589         m_scriptedDocumentBuf->append(utf8.data(), utf8.length());
 1590     }

Is there a good reason to convert this text to UTF-8 then decode it back to UTF-16 later? We really don't want to be converting back and forth between UTF-16 and UTF-8 unless there's a benefit to doing so. I'd like to avoid the extra memory allocation too.

 350     void open(const String&, bool);

While we do omit argument names in headers when they are not needed, this is a case where they are needed. It's not at all clear what the string or boolean are. We leave the names out when the argument type or function name makes the name superfluous.

 740     bool newItem = false;

The boolean should not be named "new item" because it is a boolean flag, not an item. It should be named something like isItemNew, itemIsNew or historyItemIsNew.

 755     DeprecatedString generatedURLString("generated:");
 756     generatedURLString.append(m_frame->document()->url());
 757     KURL generatedURL(generatedURLString);

For code like this, it's generally better to not make a temporary -- for one thing it means we have to revisit this callsite when eliminating DeprecatedString. This should simply be:

    KURL generatedURL("generated:" + m_frame->document()->url());

Perhaps the URL scheme should also be changed to something with a vendor prefix. Just plain "generated:" seems like it might conflict with other uses. Something with webkit in it would probably be better. Can the URL used here be detected by JavaScript? If so, what effect will the URL have on site compatibility? What URL do you see on other web browsers in this case? Do any of the tests detect this generated URL?

 759     SubstituteData substituteData(buffer, mimeType, String("UTF-8"), m_URL, generatedURL);
 760     item->setSubstituteData(substituteData);

This should use UTF-16 and we should not be converting text to UTF-8 at all. And I think there's no need to have a named temporary here.

20822108 void FrameLoader::load(const ResourceRequest& request, const NavigationAction& action, FrameLoadType type, PassRefPtr<FormState> formState)

 2113 void FrameLoader::load(const ResourceRequest& request, const NavigationAction& action, FrameLoadType type, PassRefPtr<FormState> formState, const SubstituteData& substituteData)

We're really concerned about the huge number of load functions in the FrameLoader class and we're trying to fix that. Adding yet another is unfortunate. On the other hand, since this effectively just gives the last parameter a default value of null, I think it's OK.
Comment 11 Matt Perry 2008-01-10 12:11:14 PST
Created attachment 18369 [details]
revised patch
Comment 12 Matt Perry 2008-01-10 12:23:09 PST
(In reply to comment #10)
> (From update of attachment 18052 [details] [edit])
> Patch looks great.
> 
> I have a few complaints that should be relatively simple to fix.
> 
> Many apologies for not reviewing this long ago. I had assumed that someone else
> was going to review. Thank you very much for working on this.

Thanks for taking the time to do a thorough review.

>  1311     // Params set to match the old behavior, though replace=false is
> probably a more sane default. 
>  1312     open("text/html", true);
> 
> I don't think the issue here is "sane default" but rather matching other
> browsers, since open is a public DOM function accessible from JavaScript. Can
> this change to replace = false? If not, then why not? This comment makes it
> seem like something is broken. Is something broken here? If so, what? What test
> would demonstrate the breakage?

I revised the comment, which I think was misleading.  Document::open() is not called from JavaScript (the one with params is).  open() is called in various places internal to WebCore, and I just wanted to preserve the old calling defaults to avoid unintentionally changing anything.  An example callsite is DOMImplementation::createHTMLDocument().

>     KURL generatedURL("generated:" + m_frame->document()->url());
> 
> Perhaps the URL scheme should also be changed to something with a vendor
> prefix. Just plain "generated:" seems like it might conflict with other uses.
> Something with webkit in it would probably be better. Can the URL used here be
> detected by JavaScript? If so, what effect will the URL have on site
> compatibility? What URL do you see on other web browsers in this case? Do any
> of the tests detect this generated URL?

I went with "webkitgenerated".  As far as I can tell, this url (used for the substitute data's responseURL) is entirely internal to WebCore, and not visible to the user or JavaScript at all.  The main reason in making it different is so it looks like a different page to the FrameLoader, and we don't try to do a reload.  Firefox uses "wyciwyg:" (what you cache is what you get).

I believe I took care of all your other suggestions.
Comment 13 Eric Seidel (no email) 2008-01-10 21:42:42 PST
Comment on attachment 18369 [details]
revised patch

Looks great Matt.  I'll leave this for Darin to review for real though.

I should note, there are two single line ifs which differ from the style guidelines:

} else {
 120         mimeType = "text/html";
 121     }

9     if (m_textWrittenByScript) {
 1590         m_textWrittenByScript->append(reinterpret_cast<const char*>(text.characters()),
 1591                                       text.length() * sizeof(UChar));
 1592     }

Otherwise the patch as a whole looks sane.  I'll leave darin to do the final r+.
Comment 14 Darin Adler 2008-01-11 18:34:02 PST
Comment on attachment 18369 [details]
revised patch

It would be good to make this do something predictable when toString raises an exception, but it's not like we're doing that well elsewhere.

r=me
Comment 15 Darin Adler 2008-01-13 12:14:43 PST
I applied the patch and ran tests. And all the new tests failed.
Comment 16 Darin Adler 2008-01-13 12:16:56 PST
Comment on attachment 18369 [details]
revised patch

The patch is great, but the file document-open.js is missing from the tests directory. We need a version of the patch that includes that new file.
Comment 17 Matt Perry 2008-01-14 12:18:22 PST
Created attachment 18447 [details]
revised patch (fixed)

Oops, sorry about that.  Resource files added.
Comment 18 Darin Adler 2008-01-14 12:48:55 PST
Comment on attachment 18447 [details]
revised patch (fixed)

OK. I'll try to land this next time I have some free time.
Comment 19 Darin Adler 2008-01-27 00:49:57 PST
Committed revision 29816.
Comment 20 Darin Adler 2008-01-31 15:07:05 PST
This seems to have broken back/forward on Gmail.

'I can't navigate to Gmail's inbox page from Gmail Compose's page. When I login into my gmail account and click "Compose Mail", the url shows <http://mail.google.com/mail/#compose>. Clicking the back toolbar icon, should go to  <http://mail.google.com/mail/#inbox> but the page redirects and remains at  <http://mail.google.com/mail/?ui=2&view=bsp&ver=ymdfwq781tpu>".

Given that, I think we're going to have to roll this change out.
Comment 21 Darin Adler 2008-01-31 23:47:24 PST
Unfortunately, this patch broke Gmail! Rolling it out and re-opening the bug.
Comment 22 Kevin McCullough 2008-02-01 10:51:30 PST
Another case that may be easier to reduce was found here:

* STEPS TO REPRODUCE
1. open http://www.greatschools.net/content/schoolChoiceCenter.page
2. Click any article, but not the menu (ex. Choosing a School:)
3. Click Back button till you get to original page.

* RESULTS
I get a page with a picture link, then another page with another link before finally coming back to original fully displayed page.
Comment 23 Matt Perry 2008-02-01 17:07:07 PST
Created attachment 18860 [details]
reduced test case

Thanks for the alternate repro case, Kevin.  I was able to make a reduced test case from it.  The problem is from the way ads are added to the page: by write()ing to an iframe's contentWindow.document.
Comment 24 Alexey Proskuryakov 2008-02-05 01:36:45 PST
Comment on attachment 18447 [details]
revised patch (fixed)

Clearing the review flag to get this out of commit queue.
Comment 25 Derk-Jan Hartman 2008-10-30 14:12:52 PDT
Can someone please update this ticket ? The issue seems to be forgotten event though it is apparently fixed if someone would spend the required time on getting the patch in the tree.
Comment 26 Matt Perry 2008-10-30 14:16:14 PDT
Sorry, I sort of dropped the ball on this.  My initial patch caused some regressions with Gmail and other sites.  I then got distracted by other things and haven't had time to work on it since.  I don't know when I'm going to have time to fix the issues with my patch.
Comment 27 Pravin D 2012-03-27 02:08:25 PDT
Created attachment 134004 [details]
Patch
Comment 28 Pravin D 2012-03-27 02:12:58 PDT
Added a patch for the issue, using Matt Perry's patch. The new patch has the fix for document.write in a iFrame , i.e  fixes the navigation in Gmail (Comment #20  by From Darin Adler).
Comment 29 Adam Barth 2012-03-27 02:23:37 PDT
Comment on attachment 134004 [details]
Patch

This patch contains numerous style violations and lacks any tests.  If you're interested in contributing to WebKit, you might want to start with an easier bug.  This one pretty tricky.
Comment 30 Adam Barth 2012-03-27 16:33:19 PDT
Comment on attachment 134004 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=134004&action=review

> Source/WebCore/bindings/js/JSHTMLDocumentCustom.cpp:128
> +    // See bug <https://bugs.webkit.org/show_bug.cgi?id=14959> for discussion.

We don't usually put comments like this in the source files.  We use "svn annotate" and ChangeLogs to find the bug threads about the code.

> Source/WebCore/bindings/js/JSHTMLDocumentCustom.cpp:129
> +    UString firstString = exec->argument(0).toString(exec)->value(exec);

What if toString throws an exception?

> Source/WebCore/bindings/js/JSHTMLDocumentCustom.cpp:132
> +    if (mimeType != "text/plain")

This check doesn't seem like it can be right.  What if the mimeType is "text/plain; charset=utf-8" ?

> Source/WebCore/bindings/js/JSHTMLDocumentCustom.cpp:133
> +            mimeType = "text/html";

This indent is wrong.

> Source/WebCore/dom/Document.cpp:2125
> +    // This method is called by various places in the WebCore code. Arguments
> +    // chosen for legacy reasons.

This comment is pretty vague.  Why else would a function exist other than to be called from various places?

> Source/WebCore/dom/Document.cpp:2163
>      if (m_frame)
> -        m_frame->loader()->didExplicitOpen();
> +        m_textWrittenByScript = SharedBuffer::create(); // Buffer to record data written by document.write calls.
> +        m_frame->loader()->didExplicitOpen(mimeType, replace, m_textWrittenByScript.get());

Did you mean to put { } around the body of this if-statement or did you mean for the second line to be executed unconditionally?
Comment 31 Darin Adler 2012-03-27 17:34:50 PDT
Comment on attachment 134004 [details]
Patch

The patch also needs to include a test case showing the bug is fixed.
Comment 32 Pravin D 2012-04-03 23:00:28 PDT
Created attachment 135517 [details]
Patch
Comment 33 Pravin D 2012-04-03 23:35:10 PDT
Created attachment 135522 [details]
Patch
Comment 34 Pravin D 2012-04-03 23:49:28 PDT
Have added a revised patch with test case.
Comment 35 Pravin D 2012-04-04 03:38:52 PDT
Created attachment 135548 [details]
Proposed Patch
Comment 36 Pravin D 2012-04-04 03:49:56 PDT
Created attachment 135549 [details]
TestCases_Patch
Comment 37 Pravin D 2012-04-04 04:17:41 PDT
(In reply to comment #30)
> > Source/WebCore/bindings/js/JSHTMLDocumentCustom.cpp:129
> > +    UString firstString = exec->argument(0).toString(exec)->value(exec);
> 
> What if toString throws an exception?

Have added check to test if "argument(0)" is Undefined or not.
 
> > Source/WebCore/bindings/js/JSHTMLDocumentCustom.cpp:132
> > +    if (mimeType != "text/plain")
> 
> This check doesn't seem like it can be right.  What if the mimeType is "text/plain; charset=utf-8" ?

mime type is extracted by checking if the type string contains the "MIMETYPE" in a case insensitive way.

Also changed the structure/functions from the previous patch. Now there is one Document::open(mime, replace, ownerDoc) which is called from JSHTMLDocument instead of the regular Document::open(ownerDoc). Document::open(mime, replace, ownerDoc)  in turn calls the Document::open(ownerDoc) and also handles adding/replacing history item contents through FrameLoader. Document::open(mime, replace, ownerDoc) is also responsible in deciding if a history item must be modified/added for that particular frame.

Moved the actual History Item manipulation from FrameLoader to HistoryController class

PS:
I have 2 seperate patches, one for the code changes and one for the test cases. EWS was/is failing for all platforms and style. So thought adding code changes and Test cases could help... Maybe not so !!
Comment 38 WebKit Review Bot 2012-04-04 08:09:04 PDT
Comment on attachment 135549 [details]
TestCases_Patch

Attachment 135549 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/12325810

New failing tests:
http/tests/navigation/document-open-html-adds-history-item.html
http/tests/navigation/document-open-default-new-window-adds-history-item.html
http/tests/navigation/document-open-html-replace-no-history-item.html
http/tests/navigation/document-open-default-adds-history-item.html
http/tests/navigation/document-write-without-open-adds-history-item.html
http/tests/navigation/document-open-in-iframe-no-history.html
Comment 39 WebKit Review Bot 2012-04-04 08:09:11 PDT
Created attachment 135591 [details]
Archive of layout-test-results from ec2-cr-linux-01

The attached test failures were seen while running run-webkit-tests on the chromium-ews.
Bot: ec2-cr-linux-01  Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'>  Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Comment 40 Brady Eidson 2012-04-04 08:49:21 PDT
The code change and test cases need to be in the same patch.
Comment 41 Pravin D 2012-04-05 02:05:23 PDT
Ok, i'll upload another patch with both code changes and test cases.
Comment 42 Pravin D 2012-04-07 07:40:32 PDT
Created attachment 136129 [details]
Proposed Patch
Comment 43 Adam Barth 2012-04-08 00:05:35 PDT
Comment on attachment 136129 [details]
Proposed Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=136129&action=review

This patch has many serious problems.  As I wrote in Comment #29: If you're interested in contributing to WebKit, you might want to start with an easier bug.  This one pretty tricky.

> Source/WebCore/loader/HistoryController.cpp:890
> +    KURL generatedURL(ParsedURLString, "webkitgenerated:" + url);

What is the "webkitgenerated" scheme?  That looks like something you just made up.
Comment 44 Adam Barth 2012-04-08 00:06:48 PDT
This bug has too much noise in it to be useful for actually solving this problem.
Comment 45 Pravin D 2012-04-08 04:03:10 PDT
> This patch has many serious problems.  As I wrote in Comment #29: If you're interested in contributing to WebKit, you might want to start with an easier bug.  This one pretty tricky.
> 
> > Source/WebCore/loader/HistoryController.cpp:890
> > +    KURL generatedURL(ParsedURLString, "webkitgenerated:" + url);
> 
> What is the "webkitgenerated" scheme?  That looks like something you just made up.

Had to use a different URL for the substitute data.  Appended it with webkitgenerated. Was just trying to generate a different URL, from the given one...