<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>61572</bug_id>
          
          <creation_ts>2011-05-26 15:51:43 -0700</creation_ts>
          <short_desc>KURL::protocolIs(const char* protocol) asserts in Debug builds with valid protocols</short_desc>
          <delta_ts>2011-06-21 09:08:58 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>WebCore Misc.</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>All</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords>InRadar</keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="David Kilzer (:ddkilzer)">ddkilzer</reporter>
          <assigned_to name="Andy Estes">aestes</assigned_to>
          <cc>aestes</cc>
    
    <cc>darin</cc>
    
    <cc>joepeck</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>410950</commentid>
    <comment_count>0</comment_count>
    <who name="David Kilzer (:ddkilzer)">ddkilzer</who>
    <bug_when>2011-05-26 15:51:43 -0700</bug_when>
    <thetext>* SUMMARY
KURL::protocolIs(const char* protocol) throws an assertion in Debug builds when a (custom) protocol is compared that contains anything other than letters:

ASSERTION FAILED: isASCIILower(lowercaseLetter)
Source/WebCore/platform/KURL.cpp(63) : bool WebCore::isLetterMatchIgnoringCase(UChar, char)
1   _ZN7WebCoreL25isLetterMatchIgnoringCaseEtc
2   WebCore::KURL::protocolIs(char const*) const

Note that Section 3.1 of RFC 3986 &lt;http://tools.ietf.org/html/rfc3986#section-3.1&gt; states that a scheme must start with a letter, and be composed of letters, digits, &quot;+&quot;, &quot;-&quot; and &quot;.&quot; characters.  Thus this code that assumes that all characters in the scheme are letters is incorrect:

    // Do the comparison without making a new string object.
    for (int i = 0; i &lt; m_schemeEnd; ++i) {
        if (!protocol[i] || !isLetterMatchIgnoringCase(m_string[i], protocol[i]))
            return false;
    }</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>410993</commentid>
    <comment_count>1</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2011-05-26 16:41:02 -0700</bug_when>
    <thetext>This is working as designed.

The protocolIs function was designed to be used only with protocols that contain only letters. We can add the feature of checking for other protocols, and that will make it slower. We’re not assuming that all valid protocols contain entirely letters, just that all the ones we are using with this function are entirely letters.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>411042</commentid>
    <comment_count>2</comment_count>
    <who name="David Kilzer (:ddkilzer)">ddkilzer</who>
    <bug_when>2011-05-26 17:30:20 -0700</bug_when>
    <thetext>&lt;rdar://problem/9510987&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>422979</commentid>
    <comment_count>3</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2011-06-17 17:28:09 -0700</bug_when>
    <thetext>The issue is not just an assertion. The code actually works wrong for custom protocols with non-letters.

When I originally wrote this we wanted it to be super-fast for protocols like &quot;http&quot; and &quot;file&quot; and it seemed the right tradeoff to only work with letters. If we want a slightly slower one that can work for any protocol, it&apos;s easy to change this into that.

Or we could even have both, although I don&apos;t see an obvious way to automatically choose the faster one at compile time.

The current function could be named fastAllLetterProtocolIs or something annoying like that.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>422982</commentid>
    <comment_count>4</comment_count>
    <who name="Andy Estes">aestes</who>
    <bug_when>2011-06-17 17:32:30 -0700</bug_when>
    <thetext>*** Bug 62917 has been marked as a duplicate of this bug. ***</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>422986</commentid>
    <comment_count>5</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2011-06-17 17:35:46 -0700</bug_when>
    <thetext>(In reply to comment #3)
&gt; The current function could be named fastAllLetterProtocolIs or something annoying like that.

Not sure we need a super-fast version, though.

Please also see my comments in the patch in bug 62917.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>424099</commentid>
    <comment_count>6</comment_count>
    <who name="Andy Estes">aestes</who>
    <bug_when>2011-06-20 17:07:04 -0700</bug_when>
    <thetext>(In reply to comment #3)
&gt; The issue is not just an assertion. The code actually works wrong for custom protocols with non-letters.
&gt; 
&gt; When I originally wrote this we wanted it to be super-fast for protocols like &quot;http&quot; and &quot;file&quot; and it seemed the right tradeoff to only work with letters. If we want a slightly slower one that can work for any protocol, it&apos;s easy to change this into that.
&gt; 
&gt; Or we could even have both, although I don&apos;t see an obvious way to automatically choose the faster one at compile time.
&gt; 
&gt; The current function could be named fastAllLetterProtocolIs or something annoying like that.

Since the valid non-letter scheme characters (&apos;0&apos;-&apos;9&apos;, &apos;+&apos;, &apos;-&apos;, &apos;.&apos;) all fall in the range [32, 63], ORing with 0x20 is a no-op. I think we can make a special version of isLetterMatchIgnoringCase() that works only for protocol characters. The check will be identical to isLetterMatchIgnoringCase() but will have a more permissive assertion.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>424102</commentid>
    <comment_count>7</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2011-06-20 17:07:55 -0700</bug_when>
    <thetext>Good idea!</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>424110</commentid>
    <comment_count>8</comment_count>
      <attachid>97894</attachid>
    <who name="Andy Estes">aestes</who>
    <bug_when>2011-06-20 17:22:29 -0700</bug_when>
    <thetext>Created attachment 97894
Patch</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>424141</commentid>
    <comment_count>9</comment_count>
      <attachid>97894</attachid>
    <who name="Darin Adler">darin</who>
    <bug_when>2011-06-20 18:05:09 -0700</bug_when>
    <thetext>Comment on attachment 97894
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=97894&amp;action=review

Is the isLetterMatchIgnoringCase function used anywhere else? If not, then we probably should remove it.

&gt; Source/WebCore/platform/KURL.cpp:249
&gt; +    ASSERT(isSchemeChar(character));
&gt; +    ASSERT(isASCIILower(schemeCharacter) || (!isASCIIUpper(schemeCharacter) &amp;&amp; isSchemeChar(schemeCharacter)));
&gt; +    return (character | 0x20) == schemeCharacter;

I think we want another assertion:

    ASSERT(schemeCharacter &amp; 0x20);</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>424175</commentid>
    <comment_count>10</comment_count>
    <who name="Andy Estes">aestes</who>
    <bug_when>2011-06-20 18:50:55 -0700</bug_when>
    <thetext>(In reply to comment #9)
&gt; (From update of attachment 97894 [details])
&gt; View in context: https://bugs.webkit.org/attachment.cgi?id=97894&amp;action=review
&gt; 
&gt; Is the isLetterMatchIgnoringCase function used anywhere else? If not, then we probably should remove it.

It&apos;s used in numerous other places in KURL.

&gt; 
&gt; &gt; Source/WebCore/platform/KURL.cpp:249
&gt; &gt; +    ASSERT(isSchemeChar(character));
&gt; &gt; +    ASSERT(isASCIILower(schemeCharacter) || (!isASCIIUpper(schemeCharacter) &amp;&amp; isSchemeChar(schemeCharacter)));
&gt; &gt; +    return (character | 0x20) == schemeCharacter;
&gt; 
&gt; I think we want another assertion:
&gt; 
&gt;     ASSERT(schemeCharacter &amp; 0x20);

Agreed. Thanks for the review!</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>424281</commentid>
    <comment_count>11</comment_count>
    <who name="Andy Estes">aestes</who>
    <bug_when>2011-06-20 23:05:29 -0700</bug_when>
    <thetext>Committed r89336: &lt;http://trac.webkit.org/changeset/89336&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>424287</commentid>
    <comment_count>12</comment_count>
      <attachid>97894</attachid>
    <who name="Andy Estes">aestes</who>
    <bug_when>2011-06-20 23:10:48 -0700</bug_when>
    <thetext>Comment on attachment 97894
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=97894&amp;action=review

&gt; Source/WebCore/platform/KURL.cpp:1788
&gt; +        if (!isSchemeCharacterMatchIgnoringCase(url[i], protocol[i]))

I landed this without this change. Unlike the KURL member function, many strings are passed into the protocolIs() free function that are not normalized URLs and trigger assertions in isSchemeCharacterMatchIgnoringCase(). Of course the free function will trigger an assertion if the second argument isn&apos;t a letter-only protocol just like the member function did before this patch.

I&apos;m not sure how to handle this. It looks like some, but not all, call sites can be converted to call the KURL member function instead.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>424489</commentid>
    <comment_count>13</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2011-06-21 09:08:58 -0700</bug_when>
    <thetext>I suggest making a slightly less optimized version for the case where we need it. I don’t know why I was so obsessed with using the | 0x20 trick.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="1"
              isprivate="0"
          >
            <attachid>97894</attachid>
            <date>2011-06-20 17:22:29 -0700</date>
            <delta_ts>2011-06-20 23:10:48 -0700</delta_ts>
            <desc>Patch</desc>
            <filename>bug-61572-20110620172228.patch</filename>
            <type>text/plain</type>
            <size>3084</size>
            <attacher name="Andy Estes">aestes</attacher>
            
              <data encoding="base64">U3VidmVyc2lvbiBSZXZpc2lvbjogODkzMTIKZGlmZiAtLWdpdCBhL1NvdXJjZS9XZWJDb3JlL0No
YW5nZUxvZyBiL1NvdXJjZS9XZWJDb3JlL0NoYW5nZUxvZwppbmRleCBlMTExODk2MjdhODQ5NjNl
NmRjYTAyYjZmZTQzZDEzZjE0Y2FkYWFiLi5kNWVhY2YwMmQ5NTQ0NDA0YzM5YmExMmNhZDI0YzFh
OTFiZjNjYmQ3IDEwMDY0NAotLS0gYS9Tb3VyY2UvV2ViQ29yZS9DaGFuZ2VMb2cKKysrIGIvU291
cmNlL1dlYkNvcmUvQ2hhbmdlTG9nCkBAIC0xLDMgKzEsMjMgQEAKKzIwMTEtMDYtMjAgIEFuZHkg
RXN0ZXMgIDxhZXN0ZXNAYXBwbGUuY29tPgorCisgICAgICAgIFJldmlld2VkIGJ5IE5PQk9EWSAo
T09QUyEpLgorCisgICAgICAgIEtVUkw6OnByb3RvY29sSXMoY29uc3QgY2hhciogcHJvdG9jb2wp
IGFzc2VydHMgaW4gRGVidWcgYnVpbGRzIHdpdGgKKyAgICAgICAgdmFsaWQgcHJvdG9jb2xzCisg
ICAgICAgIGh0dHBzOi8vYnVncy53ZWJraXQub3JnL3Nob3dfYnVnLmNnaT9pZD02MTU3MgorCisg
ICAgICAgIE5vIG5ldyB0ZXN0cy4gTm8gY29kZSBjdXJyZW50bHkgY2FsbHMgcHJvdG9jb2xJcygp
IHdpdGggYSBwcm90b2NvbCB0aGF0CisgICAgICAgIGNvbnRhaW5zIGEgbm9uLWxldHRlciBjaGFy
YWN0ZXIuCisKKyAgICAgICAgKiBwbGF0Zm9ybS9LVVJMLmNwcDoKKyAgICAgICAgKFdlYkNvcmU6
OmlzU2NoZW1lQ2hhcmFjdGVyTWF0Y2hJZ25vcmluZ0Nhc2UpOiBBIGhlbHBlciBmdW5jdGlvbiB0
aGF0CisgICAgICAgIGNvbXBhcmVzIHR3byBjaGFyYWN0ZXJzIGlnbm9yaW5nIGNhc2UuIEl0IGFz
c3VtZXMgKGFuZCBhc3NlcnRzKSB0aGF0CisgICAgICAgIGJvdGggY2hhcmFjdGVycyBhcmUgdmFs
aWQgc2NoZW1lIGNoYXJhY3RlcnMsIGFuZCB0aGF0IGlmIHRoZSBzZWNvbmQKKyAgICAgICAgYXJn
dW1lbnQgaXMgYSBsZXR0ZXIgdGhhdCBpdCBpcyBsb3dlcmNhc2UuCisgICAgICAgIChXZWJDb3Jl
OjpLVVJMOjpwcm90b2NvbElzKTogQ2FsbCBpc1NjaGVtZUNoYXJhY3Rlck1hdGNoSWdub3JpbmdD
YXNlKCkKKyAgICAgICAgaW5zdGVhZCBvZiBpc0xldHRlck1hdGNoSWdub3JpbmdDYXNlKCkuCisg
ICAgICAgIChXZWJDb3JlOjpwcm90b2NvbElzKTogRGl0dG8uCisKIDIwMTEtMDYtMjAgIEFkYW0g
QmFydGggIDxhYmFydGhAd2Via2l0Lm9yZz4KIAogICAgICAgICBSZXZpZXdlZCBieSBFcmljIFNl
aWRlbC4KZGlmZiAtLWdpdCBhL1NvdXJjZS9XZWJDb3JlL3BsYXRmb3JtL0tVUkwuY3BwIGIvU291
cmNlL1dlYkNvcmUvcGxhdGZvcm0vS1VSTC5jcHAKaW5kZXggODYxMWVkMTg5OWQwMTYyMWMwZWRj
NDJiYjlkZDE2MzA5ZmFlMDRmZi4uMjkwMDQyYjVlM2Y2ZDlhZDdhNjA1M2VkODQ4MjllMDc3MTM2
M2RmNSAxMDA2NDQKLS0tIGEvU291cmNlL1dlYkNvcmUvcGxhdGZvcm0vS1VSTC5jcHAKKysrIGIv
U291cmNlL1dlYkNvcmUvcGxhdGZvcm0vS1VSTC5jcHAKQEAgLTI0MSw2ICsyNDEsMTMgQEAgc3Rh
dGljIGlubGluZSBib29sIGlzSVB2NkNoYXIodW5zaWduZWQgY2hhciBjKSB7IHJldHVybiBjaGFy
YWN0ZXJDbGFzc1RhYmxlW2NdICYKIHN0YXRpYyBpbmxpbmUgYm9vbCBpc1BhdGhTZWdtZW50RW5k
Q2hhcihjaGFyIGMpIHsgcmV0dXJuIGNoYXJhY3RlckNsYXNzVGFibGVbc3RhdGljX2Nhc3Q8dW5z
aWduZWQgY2hhcj4oYyldICYgUGF0aFNlZ21lbnRFbmRDaGFyOyB9CiBzdGF0aWMgaW5saW5lIGJv
b2wgaXNQYXRoU2VnbWVudEVuZENoYXIoVUNoYXIgYykgeyByZXR1cm4gYyA8PSAweGZmICYmIChj
aGFyYWN0ZXJDbGFzc1RhYmxlW2NdICYgUGF0aFNlZ21lbnRFbmRDaGFyKTsgfQogc3RhdGljIGlu
bGluZSBib29sIGlzQmFkQ2hhcih1bnNpZ25lZCBjaGFyIGMpIHsgcmV0dXJuIGNoYXJhY3RlckNs
YXNzVGFibGVbY10gJiBCYWRDaGFyOyB9CisgICAgCitzdGF0aWMgaW5saW5lIGJvb2wgaXNTY2hl
bWVDaGFyYWN0ZXJNYXRjaElnbm9yaW5nQ2FzZShjaGFyIGNoYXJhY3RlciwgY2hhciBzY2hlbWVD
aGFyYWN0ZXIpCit7CisgICAgQVNTRVJUKGlzU2NoZW1lQ2hhcihjaGFyYWN0ZXIpKTsKKyAgICBB
U1NFUlQoaXNBU0NJSUxvd2VyKHNjaGVtZUNoYXJhY3RlcikgfHwgKCFpc0FTQ0lJVXBwZXIoc2No
ZW1lQ2hhcmFjdGVyKSAmJiBpc1NjaGVtZUNoYXIoc2NoZW1lQ2hhcmFjdGVyKSkpOworICAgIHJl
dHVybiAoY2hhcmFjdGVyIHwgMHgyMCkgPT0gc2NoZW1lQ2hhcmFjdGVyOworfQogCiBzdGF0aWMg
aW5saW5lIGludCBoZXhEaWdpdFZhbHVlKFVDaGFyIGMpCiB7CkBAIC03MDIsNyArNzA5LDcgQEAg
Ym9vbCBLVVJMOjpwcm90b2NvbElzKGNvbnN0IGNoYXIqIHByb3RvY29sKSBjb25zdAogCiAgICAg
Ly8gRG8gdGhlIGNvbXBhcmlzb24gd2l0aG91dCBtYWtpbmcgYSBuZXcgc3RyaW5nIG9iamVjdC4K
ICAgICBmb3IgKGludCBpID0gMDsgaSA8IG1fc2NoZW1lRW5kOyArK2kpIHsKLSAgICAgICAgaWYg
KCFwcm90b2NvbFtpXSB8fCAhaXNMZXR0ZXJNYXRjaElnbm9yaW5nQ2FzZShtX3N0cmluZ1tpXSwg
cHJvdG9jb2xbaV0pKQorICAgICAgICBpZiAoIXByb3RvY29sW2ldIHx8ICFpc1NjaGVtZUNoYXJh
Y3Rlck1hdGNoSWdub3JpbmdDYXNlKG1fc3RyaW5nW2ldLCBwcm90b2NvbFtpXSkpCiAgICAgICAg
ICAgICByZXR1cm4gZmFsc2U7CiAgICAgfQogICAgIHJldHVybiAhcHJvdG9jb2xbbV9zY2hlbWVF
bmRdOyAvLyBXZSBzaG91bGQgaGF2ZSBjb25zdW1lZCBhbGwgY2hhcmFjdGVycyBpbiB0aGUgYXJn
dW1lbnQuCkBAIC0xNzc4LDcgKzE3ODUsNyBAQCBib29sIHByb3RvY29sSXMoY29uc3QgU3RyaW5n
JiB1cmwsIGNvbnN0IGNoYXIqIHByb3RvY29sKQogICAgIGZvciAoaW50IGkgPSAwOyA7ICsraSkg
ewogICAgICAgICBpZiAoIXByb3RvY29sW2ldKQogICAgICAgICAgICAgcmV0dXJuIHVybFtpXSA9
PSAnOic7Ci0gICAgICAgIGlmICghaXNMZXR0ZXJNYXRjaElnbm9yaW5nQ2FzZSh1cmxbaV0sIHBy
b3RvY29sW2ldKSkKKyAgICAgICAgaWYgKCFpc1NjaGVtZUNoYXJhY3Rlck1hdGNoSWdub3JpbmdD
YXNlKHVybFtpXSwgcHJvdG9jb2xbaV0pKQogICAgICAgICAgICAgcmV0dXJuIGZhbHNlOwogICAg
IH0KIH0K
</data>
<flag name="review"
          id="91981"
          type_id="1"
          status="+"
          setter="darin"
    />
          </attachment>
      

    </bug>

</bugzilla>