<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>24419</bug_id>
          
          <creation_ts>2009-03-05 22:26:41 -0800</creation_ts>
          <short_desc>WordAwareIterator not working in WebCore/editing/Editor.cpp</short_desc>
          <delta_ts>2009-04-06 13:10:20 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>WebCore Misc.</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>INVALID</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>0</everconfirmed>
          <reporter name="Diego Escalante Urrelo">diegoe</reporter>
          <assigned_to name="Nobody">webkit-unassigned</assigned_to>
          <cc>gustavo</cc>
    
    <cc>mrowe</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>112538</commentid>
    <comment_count>0</comment_count>
    <who name="Diego Escalante Urrelo">diegoe</who>
    <bug_when>2009-03-05 22:26:41 -0800</bug_when>
    <thetext>WebCore/editing/Editor.cpp contains the code that calls checkSpellingOfString on the current backend. While implementing that method for GTK+, I realized that this code (and I might be wrong) should be handing me the string to analyze /word by word/, not as a whole.
Example:
- wrong:
            client-&gt;checkSpellingOfString(&quot;helol john&quot;, len, &amp;misspellingLocation, &amp;misspellingLength);
- right:
            client-&gt;checkSpellingOfString(&quot;helol&quot;, len, &amp;misspellingLocation, &amp;misspellingLength);
(second iteration)
            client-&gt;checkSpellingOfString(&quot;john&quot;, len, &amp;misspellingLocation, &amp;misspellingLength);

I base my theory on the name of the text iterator (WordAwareIterator?) and on the while() loop that calls checkSpellingOfString, with an .advance() call and an .atEnd() check. If this was not supposed to split by word, then why the while()? Why .advance() the iterator?

Based on all that, I suggest that this is a bug in the text iterator and that it should actually be giving me words and not the full phrase.

static String findFirstMisspellingInRange(EditorClient* client, Range* searchRange, int&amp; firstMisspellingOffset, bool markAll)
{
    ASSERT_ARG(client, client);
    ASSERT_ARG(searchRange, searchRange);
    
    WordAwareIterator it(searchRange);
    firstMisspellingOffset = 0;
    
    String firstMisspelling;
    int currentChunkOffset = 0;

    while (!it.atEnd()) {
        const UChar* chars = it.characters();
        int len = it.length();
        
        // Skip some work for one-space-char hunks
        if (!(len == 1 &amp;&amp; chars[0] == &apos; &apos;)) {
            
            int misspellingLocation = -1;
            int misspellingLength = 0;
            client-&gt;checkSpellingOfString(chars, len, &amp;misspellingLocation, &amp;misspellingLength);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

            // 5490627 shows that there was some code path here where the String constructor below crashes.
            // We don&apos;t know exactly what combination of bad input caused this, so we&apos;re making this much
            // more robust against bad input on release builds.
            ASSERT(misspellingLength &gt;= 0);
            ASSERT(misspellingLocation &gt;= -1);
            ASSERT(misspellingLength == 0 || misspellingLocation &gt;= 0);
            ASSERT(misspellingLocation &lt; len);
            ASSERT(misspellingLength &lt;= len);
            ASSERT(misspellingLocation + misspellingLength &lt;= len);
            
            if (misspellingLocation &gt;= 0 &amp;&amp; misspellingLength &gt; 0 &amp;&amp; misspellingLocation &lt; len &amp;&amp; misspellingLength &lt;= len &amp;&amp; misspellingLocation + misspellingLength &lt;= len) {
                
                // Remember first-encountered misspelling and its offset
                if (!firstMisspelling) {
                    firstMisspellingOffset = currentChunkOffset + misspellingLocation;
                    firstMisspelling = String(chars + misspellingLocation, misspellingLength);
                }
                
                // Mark this instance if we&apos;re marking all instances. Otherwise bail out because we found the first one.
                if (!markAll)
                    break;
                
                // Compute range of misspelled word
                RefPtr&lt;Range&gt; misspellingRange = TextIterator::subrange(searchRange, currentChunkOffset + misspellingLocation, misspellingLength);
                
                // Store marker for misspelled word
                ExceptionCode ec = 0;
                misspellingRange-&gt;startContainer(ec)-&gt;document()-&gt;addMarker(misspellingRange.get(), DocumentMarker::Spelling);
                ASSERT(ec == 0);
            }
        }
        
        currentChunkOffset += len;
        it.advance();
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    }
    
    return firstMisspelling;
}</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>112550</commentid>
    <comment_count>1</comment_count>
    <who name="Mark Rowe (bdash)">mrowe</who>
    <bug_when>2009-03-06 00:06:01 -0800</bug_when>
    <thetext>It&apos;s not quite clear to me what you think is wrong. it.characters() returns a pointer into a given buffer. it.length() returns the number of characters in the current word. The buffer that it.characters() points into may be substantially longer than the value returned by it.length(), which I suspect is what is throwing you off. That is because the underlying buffer is probably the one associated with a DOM node or similar, and allocating a new buffer per word would be slooooow.
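
<![CDATA[To make the pointer-plus-length point concrete, here is a minimal standalone sketch. It is not WebKit code: Chunk, splitIntoChunks, chunkText and textIgnoringLength are hypothetical names, char16_t stands in for UChar, and splitting on single spaces is a simplification of whatever the real iterator does. It only shows why a chunk can look like the whole phrase when the length is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for what an iterator hands out per step: a pointer
// into a larger shared buffer plus the number of characters in this chunk.
// The chunk is NOT null-terminated at chars + length.
struct Chunk {
    const char16_t* chars;
    int length;
};

// Split a buffer on single spaces without copying any characters.
// Every Chunk points into `buffer`, so `buffer` must outlive the chunks.
std::vector<Chunk> splitIntoChunks(const std::u16string& buffer)
{
    std::vector<Chunk> chunks;
    std::size_t start = 0;
    for (std::size_t i = 0; i <= buffer.size(); ++i) {
        if (i == buffer.size() || buffer[i] == u' ') {
            if (i > start)
                chunks.push_back({buffer.data() + start, static_cast<int>(i - start)});
            start = i + 1;
        }
    }
    return chunks;
}

// Correct way to read a chunk: copy exactly `length` characters.
std::u16string chunkText(const Chunk& chunk)
{
    assert(chunk.length >= 0);
    return std::u16string(chunk.chars, static_cast<std::size_t>(chunk.length));
}

// What a caller sees when it ignores the length and reads up to the
// terminator of the underlying buffer instead.
std::u16string textIgnoringLength(const Chunk& chunk)
{
    return std::u16string(chunk.chars);
}
```

With a buffer holding "helol john", the first chunk read through chunkText is just "helol", while textIgnoringLength on the same chunk yields the whole phrase. A checkSpellingOfString implementation that treats chars as a null-terminated string would see the second behaviour.]]>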

Am I missing something here?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>113383</commentid>
    <comment_count>2</comment_count>
    <who name="Gustavo Noronha (kov)">gustavo</who>
    <bug_when>2009-03-12 07:24:23 -0700</bug_when>
    <thetext>I think this is working correctly. What happens is that the patch in https://bugs.webkit.org/show_bug.cgi?id=15616 doesn&apos;t return the correct values for misspellingLocation and misspellingLength.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>116662</commentid>
    <comment_count>3</comment_count>
    <who name="Gustavo Noronha (kov)">gustavo</who>
    <bug_when>2009-04-06 13:10:20 -0700</bug_when>
    <thetext>I think we have pretty much confirmed by now that this is working correctly. I&apos;ll close the bug and let Diego reopen it if he disagrees. =)</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>