Bug 19822

Summary: REGRESSION (r30243): setting location.hash to "#" causes a reload
Product: WebKit Reporter: Cameron Zwarich (cpst) <zwarich>
Component: Page LoadingAssignee: Cameron Zwarich (cpst) <zwarich>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, beidson, bjorn.tipling, darin, ddkilzer, jurkuipers, khooyp, oliver, sam
Priority: P1 Keywords: HasReduction, InRadar, Regression
Version: 528+ (Nightly build)   
Hardware: Mac   
OS: OS X 10.5   
URL: http://today.ask.com/default
Attachments:
Description Flags
Reduction
none
Patch that should work
none
Proposed patch
none
Proposed patch with test
none
Proposed patch with test
beidson: review-
Revised proposed patch beidson: review+

Description Cameron Zwarich (cpst) 2008-06-29 23:39:34 PDT
In Safari 3.1.1, http://today.ask.com/default loads fine, but in ToT (r34883) it keeps on reloading. This is because location.hash is set to "#" on each load.
Comment 1 Cameron Zwarich (cpst) 2008-06-29 23:40:12 PDT
Created attachment 22004 [details]
Reduction
Comment 2 Cameron Zwarich (cpst) 2008-06-30 11:39:45 PDT
*** Bug 17627 has been marked as a duplicate of this bug. ***
Comment 3 Cameron Zwarich (cpst) 2008-06-30 11:41:51 PDT
This seems related to bug 13067.
Comment 4 David Kilzer (:ddkilzer) 2008-06-30 12:46:47 PDT
<rdar://problem/6044293>
Comment 5 David Kilzer (:ddkilzer) 2008-06-30 12:47:39 PDT
It would be interesting to know when this regressed via WebKitTools/Scripts/bisect-builds.
Comment 6 Brady Eidson 2008-06-30 13:18:30 PDT
I've narrowed this to between r30471 and r30584
Comment 7 Brady Eidson 2008-06-30 13:23:26 PDT
nevermind, it broke earlier than that.  sigh
Comment 8 David Kilzer (:ddkilzer) 2008-06-30 13:24:16 PDT
(In reply to comment #6)
> I've narrowed this to between r30471 and r30584

Odd.  The bisect-builds script with Safari DP 4 on Leopard 10.5.4 reports:

Works: r30240  Fails: r30267
Comment 9 Cameron Zwarich (cpst) 2008-06-30 13:33:50 PDT
(In reply to comment #8)
> Odd.  The bisect-builds script with Safari DP 4 on Leopard 10.5.4 reports:
> 
> Works: r30240  Fails: r30267

The only suspicion revision is r30243:

http://trac.webkit.org/changeset/30243
Comment 10 Cameron Zwarich (cpst) 2008-06-30 13:34:29 PDT
(In reply to comment #9) 
> The only suspicion revision is r30243:

Oops. That was a typo. I meant "suspicious" revision.
Comment 11 Brady Eidson 2008-06-30 13:36:01 PDT
http://trac.webkit.org/changeset/30243 seems like a likely candidate
Comment 12 Brady Eidson 2008-06-30 13:36:33 PDT
Dammit, people, if all 3 of us are working on this...  I'm just gonna pull out and let you two hash it out  :)
Comment 13 Brady Eidson 2008-06-30 14:32:33 PDT
I confirmed, as suspected, that 30243 caused this regression.
Comment 14 Brady Eidson 2008-06-30 14:33:47 PDT
Oh boy, it was a huge patch that fixed about a dozen different fixes.
Comment 15 David Kilzer (:ddkilzer) 2008-06-30 14:55:48 PDT
git-bisect confirms that the regression occurred in r30243.
Comment 16 Cameron Zwarich (cpst) 2008-06-30 19:16:34 PDT
I'm assigning this to myself. I'll take a shot at this tonight, but I probably won't have time if it drags into tomorrow.

I also apologize for the prior confusion about who was working on this.
Comment 17 Darin Adler 2008-06-30 19:29:42 PDT
Sorry my patch broke this. Cameron, I'm glad you're going to fix this. Let me know if you need anything from me.
Comment 18 Cameron Zwarich (cpst) 2008-06-30 21:28:10 PDT
Strangely enough, the problem is that "url.ref() == str" is false even though both strings are empty. The comparison is ultimately done in StringHash::equal() in platform/text/StringHash.h, so the bug is likely there.
Comment 19 Cameron Zwarich (cpst) 2008-06-30 23:25:31 PDT
The problematic code in StringImpl.h:equal() is

            if (a == b)
                return true;
            if (!a || !b)
                return false;

It is possible for a StringImpl* of an empty string to be non-null, so this code is incorrect. However, if I change it to take the length of a non-null StringImpl*, it passes the check in JSLocation::setHash(), but later on in loading it dies with this backtrace:

#0  0x02b3dc3d in WebCore::StringImpl::length (this=0xffffffff) at text/StringImpl.h:86
#1  0x028dcd3b in WebCore::StringHash::equal (a=0xffffffff, b=0x0) at text/StringHash.h:42
#2  0x02e9f126 in WebCore::equal (a=0xffffffff, b=0x0) at /Users/Cameron/WebKit/WebCore/platform/text/StringImpl.cpp:879
#3  0x028dccdf in WebCore::operator== (a=@0x14d24968, b=@0xbfffe1cc) at text/PlatformString.h:224
#4  0x028f52b5 in WTF::HashTable<WebCore::String, WebCore::String, WTF::IdentityExtractor<WebCore::String>, WebCore::StringHash, WTF::HashTraits<WebCore::String>, WTF::HashTraits<WebCore::String> >::isEmptyBucket (value=@0x14d24968) at HashTable.h:327

I guess 0xffffffff is an empty or deleted value in the hash table. Should it be getting used here? What is the best way to fix this?
Comment 20 Cameron Zwarich (cpst) 2008-06-30 23:26:43 PDT
Created attachment 22017 [details]
Patch that should work

Here is the patch that should work, but fails due to the problem mentioned in the previous post.
Comment 21 Geoffrey Garen 2008-06-30 23:58:32 PDT
It is a (correct) feature that the null string and the empty string do not compare equal. Either KURL::ref needs to change to return an empty string instead of a null string in the !hasRef() case, for the sake of compatibility with substring operations, which return the empty string when no such substring exists, or clients performing substring operations need to account for null vs empty.
Comment 22 Cameron Zwarich (cpst) 2008-07-01 00:57:26 PDT
Geoff straightened me out on IRC. I guess I thought WebCore's Strings should work like UString. The problem is that KURL::ref() possibly returns a null string. The natural thing to do is to make it return the empty string, but we should check its callsites first. It has 7 callsites in WebCore:

void JSLocation::setHash(ExecState* exec, JSValue* value)
{
    ...
    if (url.ref() == str)
        return;
    ...
}

We want url.ref() here to be the empty string, not the null string, as that fixes the bug.

String KURL::prettyURL() const
{
    ...
    if (m_fragmentEnd != m_queryEnd) {
        result.append('#');
        append(result, ref());
    }
    ...
}

Clearly, null and empty are the same here.

String HTMLAnchorElement::hash() const
{
    return "#" + href().ref();
}

It doesn't matter here, because the result is immediately being concatenated with another string.

String Location::hash() const
{
    if (!m_frame)
        return String();

    const KURL& url = this->url();
    return url.ref().isNull() ? "" : "#" + url.ref();
}

We can simply change the isNull() here to an isEmpty(), and the result of the second call is immediately concatenated. Not changing it causes the following tests to fail, so there is already test coverage here:

fast/dom/location-hash.html
http/tests/security/local-user-CSS-from-remote.html
http/tests/security/xss-DENIED-assign-location-hash.html

AccessibilityObject* AccessibilityRenderObject::internalLinkElement() const
{
    HTMLAnchorElement* anchor = anchorElement();
    if (!anchor)
        return 0;
    
    KURL linkURL = anchor->href();
    String ref = linkURL.ref();
    if (ref.isEmpty())
        return 0;
    
    // check if URL is the same as current URL
    linkURL.setRef("");
    ...
}

It doesn't matter here, because String::isEmpty() is the same for null and empty strings.

void FrameLoader::gotoAnchor()
{
    // If our URL has no ref, then we have no place we need to jump to.
    // OTOH If CSS target was set previously, we want to set it to 0, recalc
    // and possibly repaint because :target pseudo class may have been
    // set (see bug 11321).
    if (!m_URL.hasRef() && !(m_frame->document() && m_frame->document()->getCSSTarget()))
        return;

    String ref = m_URL.ref();
    if (gotoAnchor(ref))
        return;

    // Try again after decoding the ref, based on the document's encoding.
    if (m_decoder)
        gotoAnchor(decodeURLEscapeSequences(ref, m_decoder->encoding()));
}

Calling gotoAnchor() with an empty ref (as opposed to a null ref) seems to be fine, as the test fast/css/target-fragment-match.html for bug 11321 still passes.
Comment 23 Cameron Zwarich (cpst) 2008-07-01 01:44:16 PDT
Created attachment 22018 [details]
Proposed patch

Here is a patch that fixes the problem. My attempts to make a layout test have all failed, but I will keep on trying. If anyone has an idea about how to get DRT to keep on going with the scheduled location changes, please let me know.
Comment 24 Cameron Zwarich (cpst) 2008-07-01 03:00:26 PDT
Created attachment 22020 [details]
Proposed patch with test

I found the http/tests/loading/ directory.
Comment 25 Cameron Zwarich (cpst) 2008-07-01 03:06:20 PDT
Created attachment 22021 [details]
Proposed patch with test

I forgot to include a description of the test in the test output.
Comment 26 David Kilzer (:ddkilzer) 2008-07-01 07:15:42 PDT
Great analysis, Cameron!
Comment 27 Brady Eidson 2008-07-01 07:41:42 PDT
Comment on attachment 22021 [details]
Proposed patch with test

r=me!
Comment 28 Alexey Proskuryakov 2008-07-01 08:06:11 PDT
Comment on attachment 22021 [details]
Proposed patch with test

I think that a better explanation of expected results in the test would be beneficial. It would be very hard to figure out which order of callbacks is right otherwise, and might even make someone land incorrect results on top of correct ones.
Comment 29 Brady Eidson 2008-07-01 08:11:53 PDT
Comment on attachment 22021 [details]
Proposed patch with test

Changed my mind - Alexey raises a good point.  Additionally, tests that rely on DRT's load delegate output would do well to point out that the test was designed for DRT and might not be effective in the browser.
Comment 30 Darin Adler 2008-07-01 08:18:28 PDT
There's a logical that ref() returns null instead of empty when there is no ref. It allows the caller to distinguish the case of an empty ref "<url>#" from no ref at all "<url>". That seems like a good design to me. Does it need to change? Can we change the callers instead?
Comment 31 Cameron Zwarich (cpst) 2008-07-01 11:15:45 PDT
(In reply to comment #30)
> There's a logical that ref() returns null instead of empty when there is no
> ref. It allows the caller to distinguish the case of an empty ref "<url>#" from
> no ref at all "<url>". That seems like a good design to me. Does it need to
> change? Can we change the callers instead?

I agree with your point. It doesn't seem like it would ever a problem, because anyone that wants to know whether there is a fragment checks url.hasRef() instead of url.ref().isNull(), but Brady pointed out to me that conversions into platform-specific URL classes could mean that bugs get introduced elsewhere. I'll change the one caller to account for the case where they are both empty.

I will also include a more detailed description of failure in the layout test, as well as pointing out that if it fails in the browser it will simply reload forever.
Comment 32 Cameron Zwarich (cpst) 2008-07-01 11:49:04 PDT
Created attachment 22031 [details]
Revised proposed patch

Here is a new patch incorporating the comments posted in this bug.

Here is the output when the test fails:

main frame - didStartProvisionalLoadForFrame
main frame - didCommitLoadForFrame
main frame - willPerformClientRedirectToURL: http://127.0.0.1:8000/loading/location-hash-reload-cycle.html 
main frame - didCancelClientRedirectForFrame
main frame - willPerformClientRedirectToURL: http://127.0.0.1:8000/loading/location-hash-reload-cycle.html 
main frame - didHandleOnloadEventsForFrame
main frame - willPerformClientRedirectToURL: http://127.0.0.1:8000/loading/location-hash-reload-cycle.html 
main frame - didFinishDocumentLoadForFrame
main frame - didFinishLoadForFrame
This test checks that no loader actions occur when setting window.location.hash to the empty string or "#". If this test fails when run in a browser, it will reload continuously. If this test fails when run in DumpRenderTree, the FrameLoader callback output will contain willPerformClientRedirectToURL callbacks.

Both of the window.location.hashes change the output independently of each other. When run in a browser, it is very noticeable that it is reloading forever.
Comment 33 Brady Eidson 2008-07-01 11:59:02 PDT
Comment on attachment 22031 [details]
Revised proposed patch

r=me
Comment 34 Cameron Zwarich (cpst) 2008-07-01 12:06:41 PDT
Landed in r34927.
Comment 35 Darin Adler 2008-07-01 15:06:51 PDT
Comment on attachment 22031 [details]
Revised proposed patch

Cameron, I know you meant this:

    if (oldRef == str || (oldRef.isNull() && str.isEmpty()))

You forgot to use oldRef in the old part of the expression.
Comment 36 Cameron Zwarich (cpst) 2008-07-01 15:09:25 PDT
(In reply to comment #35)
> (From update of attachment 22031 [details] [edit])
> Cameron, I know you meant this:
> 
>     if (oldRef == str || (oldRef.isNull() && str.isEmpty()))
> 
> You forgot to use oldRef in the old part of the expression.

How did Brady and I miss that? r=you to fix it?
Comment 37 Darin Adler 2008-07-01 16:05:08 PDT
(In reply to comment #36)
> How did Brady and I miss that? r=you to fix it?

As you probably already noticed, I fixed it.
Comment 38 Cameron Zwarich (cpst) 2008-07-18 22:48:25 PDT
*** Bug 17832 has been marked as a duplicate of this bug. ***