Bug 39802 - Implement SegmentedString::lookAheadSlowCase
Summary: Implement SegmentedString::lookAheadSlowCase
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Other OS X 10.5
: P2 Normal
Assignee: Adam Barth
Depends on:
Blocks: 39259
  Show dependency treegraph
Reported: 2010-05-26 18:34 PDT by Adam Barth
Modified: 2010-05-26 23:54 PDT (History)
1 user (show)

See Also:

Work in progress (4.58 KB, patch)
2010-05-26 18:34 PDT, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (5.55 KB, patch)
2010-05-26 19:26 PDT, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (5.71 KB, patch)
2010-05-26 21:38 PDT, Adam Barth
no flags Details | Formatted Diff | Diff
Patch for landing (5.70 KB, patch)
2010-05-26 23:30 PDT, Adam Barth
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Adam Barth 2010-05-26 18:34:02 PDT
Implement SegmentedString::lookAheadSlowCase
Comment 1 Adam Barth 2010-05-26 18:34:40 PDT
Created attachment 57184 [details]
Work in progress
Comment 2 Adam Barth 2010-05-26 19:26:14 PDT
Created attachment 57187 [details]
Comment 3 Darin Adler 2010-05-26 20:51:52 PDT
Comment on attachment 57187 [details]

> +        if (!m_pushedChar1 && string.length() <= (unsigned) m_currentString.m_length) {

We should use C++ casts such as static_cast, not C-style casts.

> +        if (compare(string.characters(), futureCharacters.data(), string.length() * sizeof(UChar)))

I believe the "* sizeof(UChar)" here is incorrect.

The use of Vector<UChar> specifically for the result of the advance function seems slightly quirky to me. The function always returns a fixed number of characters, so paying for the range check and such as you append characters to a vector doesn't make sense. I would have the function take a UChar* instead.

Since strings made from vectors use two malloc blocks as opposed to a single block in the type where a vector is not involved, it would be better to use String::createUninitialized and create a string of exactly the right size. As long as advance doesn't have a preference for Vector<UChar> then you could use that easily.

Do you really need to name these local comparison functions in the cryptic memcmp style? If you're going to define your own functions anyway, maybe just have them return booleans instead of begin memcmp-style -1,0,1 functions, and give them normal names.

review- because of the "* sizeof(UChar)" error.
Comment 4 Adam Barth 2010-05-26 21:16:06 PDT
> review- because of the "* sizeof(UChar)" error.

Good catch.  That snuck in because of an earlier iteration.
Comment 5 Adam Barth 2010-05-26 21:38:59 PDT
Created attachment 57191 [details]

Thanks to Darin, we now pass the doctype suite in the resumer harness.  I haven't turned on the test because it fails with the old parser.
Comment 6 Darin Adler 2010-05-26 23:05:43 PDT
Comment on attachment 57191 [details]

> +        *consumedCharacters++ = *current();

Just a nitpick: I would write:

    consumedCharacters[i] = *current();

But the two are equivalent.

> +        if (string.length() > length())
> +            return NotEnoughCharacters;
> +        unsigned count = string.length();

If you set up count one line earlier then you wouldn't have to evaluate string.length() twice.

> +        UChar* consumedCharacters = 0;

No need to initialize to zero here.

Comment 7 Adam Barth 2010-05-26 23:30:39 PDT
Created attachment 57198 [details]
Patch for landing
Comment 8 Adam Barth 2010-05-26 23:54:15 PDT
Comment on attachment 57198 [details]
Patch for landing

Clearing flags on attachment: 57198

Committed r60274: <http://trac.webkit.org/changeset/60274>
Comment 9 Adam Barth 2010-05-26 23:54:22 PDT
All reviewed patches have been landed.  Closing bug.