Bug 42976

Summary: Implement IDBKeyPath parser.
Product: WebKit Reporter: Marcus Bulach <bulach>
Component: New BugsAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: andreip, bulach, cevans, jorlow, skylined
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Other   
OS: OS X 10.5   
Bug Depends on:    
Bug Blocks: 43276    
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch abarth: review+

Description Marcus Bulach 2010-07-26 09:01:24 PDT
Implement IDBKeyPath parser.
Comment 1 Marcus Bulach 2010-07-26 09:04:14 PDT
Created attachment 62579 [details]
Patch
Comment 2 Jeremy Orlow 2010-07-26 09:39:10 PDT
Comment on attachment 62579 [details]
Patch

WebCore/storage/IndexedDatabaseRequest.idl:36
 +          // FIXME: IDBKeyPath is public only to allow testing.
FIXMEs should have an action...i.e. delete as soon as we can because it's only here for testing...fix elsewhere.

LayoutTests/storage/indexeddb/idb-keypath-expected.txt:17
 +  indexedDB.makeKeyPath('a[34][20].foo[2].bar')
maybe insert a newline via "debug('');" between test cases? (here and elsewhere)



LayoutTests/storage/indexeddb/script-tests/idb-keypath.js:9
 +      for (;keyPath.hasNext(); keyPath.next(), ++i) {
space before keyPath...but why not put the var i = 0 in there?



WebCore/storage/IDBKeyPath.cpp:50
 +      Token token;
What does this do?

WebCore/storage/IDBKeyPath.cpp:63
 +  void IDBKeyPath::next()
I guess this next interface stuff makes sense in terms of easily testing (without building a lot of infrastructure that's just going to go away) but for the real implementation, I don't see why we don't just give a pointer to the raw array + size check.  This stuff seems overly complicated.

WebCore/storage/IDBKeyPath.cpp:173
 +      return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
this is a subset of javascript identifiers




Do we really want to support array syntax within keyRanges?  Also, what do we do when we see malformed keyPaths?  Right now it seems we try our best to parse them, but I don't think this is the right behavior.  I think we should raise an exception on parse errors.  (The create function should return 0 and the caller would raise.)  All of this really needs to be specced out tho.

Haven't looked at the parser or tokenizer in detail yet.  To be honest, it feels a little heavyweight for our needs, but I'm guessing it'll pay off in the long run.  We'll probably want multiple eyes looking at it though...since it'd be one of the easier ways to introduce security issues, I'd think.
Comment 3 Marcus Bulach 2010-07-26 12:08:09 PDT
Created attachment 62593 [details]
Patch
Comment 4 Marcus Bulach 2010-07-26 12:33:09 PDT
thanks for the comments so far, Jeremy!

brief background / reasoning:
1. not sure whether or not we'll support arrays, but it seems that if we support named properties, we should support indexed values as well.
2. the reason for a separate class is only to expose for tests, since we have no way to exercise this code (yet).
3. just passing the array with a known struct will be fine, the API is mostly exposed for tests. however, the basic concept would still be the same: I think it'd be nice to have the "SerializedScriptValueExtractor" just traverse some kind of structure: get a name or index, de-serialize, repeat.. having this separation would allow us to decouple the the extractor from the parser.

(In reply to comment #2)
> (From update of attachment 62579 [details])
> WebCore/storage/IndexedDatabaseRequest.idl:36
>  +          // FIXME: IDBKeyPath is public only to allow testing.
> FIXMEs should have an action...i.e. delete as soon as we can because it's only here for testing...fix elsewhere.
Done.

> 
> LayoutTests/storage/indexeddb/idb-keypath-expected.txt:17
>  +  indexedDB.makeKeyPath('a[34][20].foo[2].bar')
> maybe insert a newline via "debug('');" between test cases? (here and elsewhere)
Done in the "check" function above.

> 
> 
> 
> LayoutTests/storage/indexeddb/script-tests/idb-keypath.js:9
>  +      for (;keyPath.hasNext(); keyPath.next(), ++i) {
> space before keyPath...but why not put the var i = 0 in there?
done.

> 
> 
> 
> WebCore/storage/IDBKeyPath.cpp:50
>  +      Token token;
> What does this do?
duh, leftover. removed.

> 
> WebCore/storage/IDBKeyPath.cpp:63
>  +  void IDBKeyPath::next()
> I guess this next interface stuff makes sense in terms of easily testing (without building a lot of infrastructure that's just going to go away) but for the real implementation, I don't see why we don't just give a pointer to the raw array + size check.  This stuff seems overly complicated.

Agreed. See above.

> 
> WebCore/storage/IDBKeyPath.cpp:173
>  +      return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
> this is a subset of javascript identifiers
Not yet done, I'll try to find the authoritative list of identifiers.


> 
> 
> 
> 
> Do we really want to support array syntax within keyRanges?  Also, what do we do when we see malformed keyPaths?  Right now it seems we try our best to parse them, but I don't think this is the right behavior.  I think we should raise an exception on parse errors.  (The create function should return 0 and the caller would raise.)  All of this really needs to be specced out tho.

Indeed. The keyPath is not yet fully specced, so I'm flying a bit blind here.
However, I think the main purpose of decoupling the "SerializedScriptValueExtractor" from the "KeyPathParser" is a "Good Thing", and will allow us to move to a point where the "Extractor"s (both JSC and V8) don't care about what a KeyPath is, as long as it provides an iterator of property names / indexes.
thoughts?

ahn, I also exposed a "parserError", the caller can do whatever we decide (either throw an exception, ignore, use what was able to parse, ...)

> 
> Haven't looked at the parser or tokenizer in detail yet.  To be honest, it feels a little heavyweight for our needs, but I'm guessing it'll pay off in the long run.  We'll probably want multiple eyes looking at it though...since it'd be one of the easier ways to introduce security issues, I'd think.

as mentioned somewhere else, there are two parts that worry me:
1. parsing the keypath (this bit).
2. extracting the value of a serializedscriptvalue..

both will work on unsafe, external data, and the more eyes (and tests!) we get the better.
feel free to add anyone else to this bug.
Comment 5 Jeremy Orlow 2010-07-27 06:32:05 PDT
(In reply to comment #4)
> thanks for the comments so far, Jeremy!
> 
> brief background / reasoning:
> 1. not sure whether or not we'll support arrays, but it seems that if we support named properties, we should support indexed values as well.
> 2. the reason for a separate class is only to expose for tests, since we have no way to exercise this code (yet).
> 3. just passing the array with a known struct will be fine, the API is mostly exposed for tests. however, the basic concept would still be the same: I think it'd be nice to have the "SerializedScriptValueExtractor" just traverse some kind of structure: get a name or index, de-serialize, repeat.. having this separation would allow us to decouple the the extractor from the parser.

I agree they should be separate.  But is there any possible way we can make the "deserialize" code within SerializedScriptValue use the same extractor?

> (In reply to comment #2)
> > (From update of attachment 62579 [details] [details])
> > WebCore/storage/IndexedDatabaseRequest.idl:36
> >  +          // FIXME: IDBKeyPath is public only to allow testing.
> > FIXMEs should have an action...i.e. delete as soon as we can because it's only here for testing...fix elsewhere.
> Done.
> 
> > 
> > LayoutTests/storage/indexeddb/idb-keypath-expected.txt:17
> >  +  indexedDB.makeKeyPath('a[34][20].foo[2].bar')
> > maybe insert a newline via "debug('');" between test cases? (here and elsewhere)
> Done in the "check" function above.
> 
> > 
> > 
> > 
> > LayoutTests/storage/indexeddb/script-tests/idb-keypath.js:9
> >  +      for (;keyPath.hasNext(); keyPath.next(), ++i) {
> > space before keyPath...but why not put the var i = 0 in there?
> done.
> 
> > 
> > 
> > 
> > WebCore/storage/IDBKeyPath.cpp:50
> >  +      Token token;
> > What does this do?
> duh, leftover. removed.
> 
> > 
> > WebCore/storage/IDBKeyPath.cpp:63
> >  +  void IDBKeyPath::next()
> > I guess this next interface stuff makes sense in terms of easily testing (without building a lot of infrastructure that's just going to go away) but for the real implementation, I don't see why we don't just give a pointer to the raw array + size check.  This stuff seems overly complicated.
> 
> Agreed. See above.
> 
> > 
> > WebCore/storage/IDBKeyPath.cpp:173
> >  +      return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
> > this is a subset of javascript identifiers
> Not yet done, I'll try to find the authoritative list of identifiers.
> 
> 
> > 
> > 
> > 
> > 
> > Do we really want to support array syntax within keyRanges?  Also, what do we do when we see malformed keyPaths?  Right now it seems we try our best to parse them, but I don't think this is the right behavior.  I think we should raise an exception on parse errors.  (The create function should return 0 and the caller would raise.)  All of this really needs to be specced out tho.
> 
> Indeed. The keyPath is not yet fully specced, so I'm flying a bit blind here.
> However, I think the main purpose of decoupling the "SerializedScriptValueExtractor" from the "KeyPathParser" is a "Good Thing", and will allow us to move to a point where the "Extractor"s (both JSC and V8) don't care about what a KeyPath is, as long as it provides an iterator of property names / indexes.
> thoughts?
> 
> ahn, I also exposed a "parserError", the caller can do whatever we decide (either throw an exception, ignore, use what was able to parse, ...)

But if there's an error, should we even return an object?
 
> > 
> > Haven't looked at the parser or tokenizer in detail yet.  To be honest, it feels a little heavyweight for our needs, but I'm guessing it'll pay off in the long run.  We'll probably want multiple eyes looking at it though...since it'd be one of the easier ways to introduce security issues, I'd think.
> 
> as mentioned somewhere else, there are two parts that worry me:
> 1. parsing the keypath (this bit).
> 2. extracting the value of a serializedscriptvalue..
> 
> both will work on unsafe, external data, and the more eyes (and tests!) we get the better.
> feel free to add anyone else to this bug.
Comment 6 Andrei Popescu 2010-07-27 07:37:36 PDT
> indexedDB.makeKeyPath('foo.bar.zoo')

Hmm, makeKeyPath() is not a method that exists in the spec. Rather than extending the indexedDB object with methods that are intended solely for testing, perhaps we should add a couple of methods to LayoutTestController instead?

> module storage {
>     // FIXME: IDBKeyPath is public only to allow testing, remove as soon as we have more comprehensive > test infrastructure for IndexedDB. 
>     interface [
>         Conditional=INDEXED_DATABASE
>     ] IDBKeyPath {

Oh, I see. Looking around, I cannot find any precedent for this. So maybe it's better to have a LTC method that returns true / false depending on whether the input is a valid keypath and was successfully parsed or not? I realize this doesn't allow as extensive testing, but I am not sure it's ok to add non-standard interfaces for testing, either.

>168     else
> 169        state = Start;

Here we are in an array, after the right bracket and we jump to the Start state, which allows an identifier. Does this mean that the following sequence will parse successfully?

identifier[100]anotherIdentifier

Looks great, otherwise!
Comment 7 Adam Barth 2010-07-27 08:33:35 PDT
Comment on attachment 62593 [details]
Patch

WebCore/storage/IDBKeyPath.cpp:193
 +      return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
Should we use the version of this function in ASCIITypes.h?

WebCore/storage/IDBKeyPath.cpp:203
 +          token.start = token.end = m_ptr;
Please don't use this form of assignment.

WebCore/storage/IDBKeyPath.cpp:206
 +      ASSERT(m_ptr <= m_end);
Should this be m_ptr < m_end ?

WebCore/storage/IDBKeyPath.cpp:237
 +      return TokError;
I'd move this into the default case, but that's just a stylistic point.

WebCore/storage/IDBKeyPath.cpp:242
 +      const UChar* runStart = m_ptr;
I'd just call that "start"

WebCore/storage/IDBKeyPath.cpp:249
 +      token.stringToken = builder.toString();
Why are we using a string builder here?  The append is outside the loop.

WebCore/storage/IDBKeyPath.cpp:265
 +          return TokError;
so 01 only tokenizes the "0" and leaves the 1.  I presume that's what you mean.

WebCore/storage/IDBKeyPath.cpp:271
 +      for (i = 0; i < token.end - token.start; i++) {
Please break "token.end - token.start" out as a length variable since you're using it more than once.

WebCore/storage/IDBKeyPath.cpp:275
 +      buffer[i] = 0;
Yuck.  This seems error prone.  I can see that it's correct, but just computing the int directly seems way easier.
Comment 8 Marcus Bulach 2010-07-27 09:59:12 PDT
Created attachment 62703 [details]
Patch
Comment 9 Marcus Bulach 2010-07-27 10:33:14 PDT
thanks everyone!
this is not yet final: I'll remove the IDLs and add support via TestShell / DRT wrappers soon, but would appreciate if you could take another look at the parser?

replies inline:

jorlow:
(In reply to comment #5)
> (In reply to comment #4)
> > thanks for the comments so far, Jeremy!
> > 
> > brief background / reasoning:
> > 1. not sure whether or not we'll support arrays, but it seems that if we support named properties, we should support indexed values as well.
> > 2. the reason for a separate class is only to expose for tests, since we have no way to exercise this code (yet).
> > 3. just passing the array with a known struct will be fine, the API is mostly exposed for tests. however, the basic concept would still be the same: I think it'd be nice to have the "SerializedScriptValueExtractor" just traverse some kind of structure: get a name or index, de-serialize, repeat.. having this separation would allow us to decouple the the extractor from the parser.
> 
> I agree they should be separate.  But is there any possible way we can make the "deserialize" code within SerializedScriptValue use the same extractor?

yep, this is going to be the second step.
either:
1. homogeneize the SerializedScriptValue interfaces to provide something like
"getNamedPropertyValue()" and "getIndexedPropertyValue()" so that the extractor doesn't care if it's talking to V8 or JSC.

2. pass this array (or iterator) of identifiers, and let the SerializedScriptValues deal with it.

I'm not there yet to decide, but would like to go with (1) rather than (2)..

> > Indeed. The keyPath is not yet fully specced, so I'm flying a bit blind here.
> > However, I think the main purpose of decoupling the "SerializedScriptValueExtractor" from the "KeyPathParser" is a "Good Thing", and will allow us to move to a point where the "Extractor"s (both JSC and V8) don't care about what a KeyPath is, as long as it provides an iterator of property names / indexes.
> > thoughts?
> > 
> > ahn, I also exposed a "parserError", the caller can do whatever we decide (either throw an exception, ignore, use what was able to parse, ...)
> 
> But if there's an error, should we even return an object?

this is pending decision on the spec.

also: I fixed the JS identifiers.

andreip:

(In reply to comment #6)
> > indexedDB.makeKeyPath('foo.bar.zoo')
> 
> Hmm, makeKeyPath() is not a method that exists in the spec. Rather than extending the indexedDB object with methods that are intended solely for testing, perhaps we should add a couple of methods to LayoutTestController instead?
  
yep, I'll do this shortly (need to create WebKit wrappers, etc..)
but meanwhile, could you please take another look at the parsing bits?
I hope the next patches will only add the wrappers and not touch the logic.

> 
> > module storage {
> >     // FIXME: IDBKeyPath is public only to allow testing, remove as soon as we have more comprehensive > test infrastructure for IndexedDB. 
> >     interface [
> >         Conditional=INDEXED_DATABASE
> >     ] IDBKeyPath {
> 
> Oh, I see. Looking around, I cannot find any precedent for this. So maybe it's better to have a LTC method that returns true / false depending on whether the input is a valid keypath and was successfully parsed or not? I realize this doesn't allow as extensive testing, but I am not sure it's ok to add non-standard interfaces for testing, either.
> 
> >168     else
> > 169        state = Start;
> 
> Here we are in an array, after the right bracket and we jump to the Start state, which allows an identifier. Does this mean that the following sequence will parse successfully?
> 
> identifier[100]anotherIdentifier

good point, added test case.

> 
> Looks great, otherwise!
thanks!


abarth:

(In reply to comment #7)
> (From update of attachment 62593 [details])
> WebCore/storage/IDBKeyPath.cpp:193
>  +      return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
> Should we use the version of this function in ASCIITypes.h?
done.

> 
> WebCore/storage/IDBKeyPath.cpp:203
>  +          token.start = token.end = m_ptr;
> Please don't use this form of assignment.
fixed.

> 
> WebCore/storage/IDBKeyPath.cpp:206
>  +      ASSERT(m_ptr <= m_end);
> Should this be m_ptr < m_end ?
done.

> 
> WebCore/storage/IDBKeyPath.cpp:237
>  +      return TokError;
> I'd move this into the default case, but that's just a stylistic point.
not sure I understood this..

> 
> WebCore/storage/IDBKeyPath.cpp:242
>  +      const UChar* runStart = m_ptr;
> I'd just call that "start"
done.

> 
> WebCore/storage/IDBKeyPath.cpp:249
>  +      token.stringToken = builder.toString();
> Why are we using a string builder here?  The append is outside the loop.
left over, using String directly.

> 
> WebCore/storage/IDBKeyPath.cpp:265
>  +          return TokError;
> so 01 only tokenizes the "0" and leaves the 1.  I presume that's what you mean.
it was an error, added a test case and fixed.

> 
> WebCore/storage/IDBKeyPath.cpp:271
>  +      for (i = 0; i < token.end - token.start; i++) {
> Please break "token.end - token.start" out as a length variable since you're using it more than once.
see below.
  
> 
> WebCore/storage/IDBKeyPath.cpp:275
>  +      buffer[i] = 0;
> Yuck.  This seems error prone.  I can see that it's correct, but just computing the int directly seems way easier.
I completely changed it to use to String::toIntStrict(), looks much simpler and safer now.
Comment 10 Marcus Bulach 2010-07-28 08:32:24 PDT
Created attachment 62830 [details]
Patch
Comment 11 Marcus Bulach 2010-07-28 08:34:39 PDT
Hi,

I completely removed the IDL and (un)related methods, so now IDBKeyPath is not accessible via javascript.
I added a unit test, and as soon as IndexedDatabase starts using it, we'll be able to have more comprehensive and cross-platform LayoutTests.

Another look, please?

Thanks!

(In reply to comment #10)
> Created an attachment (id=62830) [details]
> Patch
Comment 12 Andrei Popescu 2010-08-02 03:08:19 PDT
> void IDBKeyPath::parse()
> (...)
>  case Array : {
> (...)
> m_lexer.next();
 > token = m_lexer.currentToken();

Same question here as in 42976: You are only checking for dot or left bracket. But what if tokEnd follows? Seems like you're returning ParserErrorAfterArray in that case?
Comment 13 Marcus Bulach 2010-08-02 03:58:30 PDT
Created attachment 63186 [details]
Patch
Comment 14 Marcus Bulach 2010-08-02 04:03:06 PDT
(In reply to comment #12)
> > void IDBKeyPath::parse()
> > (...)
> >  case Array : {
> > (...)
> > m_lexer.next();
>  > token = m_lexer.currentToken();
> 
> Same question here as in 42976: You are only checking for dot or left bracket. But what if tokEnd follows? Seems like you're returning ParserErrorAfterArray in that case?

yes, good point.
fixed in the latest patch, and also tidy up with the following rule:
1. added an "End" state.
2. on any given state, there's a set of "state changes" depending on the condition.
3. if there's an error, it returns.

I also added a couple of test for this case as well.

hopefully this simplifies the state transitions, but I'd surely appreciate more eyes looking into it. another look, please?
Comment 15 Andrei Popescu 2010-08-02 04:08:31 PDT
LGTM
Comment 16 Marcus Bulach 2010-08-04 13:26:45 PDT
Created attachment 63487 [details]
Patch
Comment 17 Marcus Bulach 2010-08-04 13:42:06 PDT
Created attachment 63489 [details]
Patch
Comment 18 Marcus Bulach 2010-08-05 02:29:48 PDT
replying Jeremy's comments from https://bugs.webkit.org/show_bug.cgi?id=43276
the major change is that I removed the class and instead there's a function that returns a vector of IDBKeyPathElements.
plus some other clean ups. more below.
another look please?


(In reply to comment #11)
> (From update of attachment 63313 [details])
> WebCore/storage/IDBKeyPath.cpp:101
>  +              if (token == TokIdentifier)
> Use a switch?
hmm, it doesn't seem it'd make it any more readable.

> 
> WebCore/storage/IDBKeyPath.cpp:46
>  +      , m_currentToken(0)
> Use the enum value
this is gone.

> 
> WebCore/storage/IDBKeyPath.cpp:58
>  +      return m_currentToken < m_tokens.size();
> I guess I'd suggest just converting over all this hasNext/next stuff to just returning the raw vector.  I don't feel strongly, but I just don't see much of a point of creating this interface.

removed the class, converted to a single function that returns a vector of elements.

> 
> WebCore/storage/IDBKeyPath.cpp:70
>  +          return "";
> Maybe you should just let it crash?  I.e. not do bounds checking?
obsolete.

> 
> WebCore/storage/IDBKeyPath.cpp:90
>  +      return m_parserError;
> Maybe all the methods should be ASSERTing that a parse error never happened?
the function now returns an error, it may be better to let the caller decide what to do with it?

> 
> WebCore/storage/IDBKeyPath.cpp:103
>  +              else if (token == TokLBracket)
> I'd lean towards spelling these all the way out.
Done.

> 
> WebCore/storage/IDBKeyPath.cpp:116
>  +              Token idbToken;
> these 2 token variable names are confusing.  The idb doesn't make it any less so.  Maybe 'outputToken' or something?
There's now a TokenType (with the names spelled out as above), and IDBKeyPathElement, let me know if it's any clearer.

> 
> 
> 
> general note: you should talk to the guy who does fuzzers (skylined..?) and see if we can get some coverage on this
indeed! once we agree on the implementation, I'll add more people to it.

> 
> also, could we just do this more simply?  for example, in python pseudo code I could do this:
> 
> tokens = []
> if (path == ""):
>   return []
> for (component in path.split(".")):
>   if (component ~= /^([a-zA-Z0-9_]+)\[([0-9]+)\]$/):
>     tokens.append($1, $2)
>   else if (component ~= /^([a-zA-Z0-9_]+)$/):
>     tokens.append($1, null)
>   else:
>     return error
> return tokens
> And that'd be about it.  Do we have any sort of regex support build into WebKit already?  The split would be pretty easy to do as well.  I'm just concerned that we're jumping the gun with all of this complexity.
> 
> Anyhow...

I don't think this would be much simpler.. We'd still need to get the numbers out of the indices.
There's the n-ary array case "foo[3][4]", etc.. seems that this single-pass parser might be simpler in the end?

> 
> 
> WebCore/storage/IDBKeyPath.cpp:145
>  +              ASSERT(token.type == TokNumber);
> this seems mildly redundant, but I'm ok with it
> 
> WebCore/storage/IDBKeyPath.cpp:163
>  +                  state = Array;
> but this is an invalid state!

not really. it's perfectly possible to have n-ary array, i.e., "foo[3][9][5].age". ValidKeyPath1 test exercises this.

> 
> WebCore/storage/IDBKeyPath.cpp:147
>  +              idbToken.hasIndex = true;
> maybe assert it was false?
obsolete.

> 
> WebCore/storage/IDBKeyPath.cpp:190
>  +  
> default:
>   raise error
as above, perhaps the caller should decide?

> 
> WebCore/storage/IDBKeyPath.cpp:99
>  +          case Start : {
> maybe pull this out of the state machine?
done.

> 
> WebCore/storage/IDBKeyPath.cpp:278
>  +          return TokError;
> I'd just do an m_ptr >= m_end first and then return...then no need for an else
done.

> 
> WebCore/storage/IDBKeyPath.cpp:292
>  +      return TokError;
> same again...test the error condition and return rather than putting everythign in a block
done.

> 
> WebCore/storage/IDBKeyPath.cpp:252
>  +      if (m_ptr < m_end && isSafeIdentifierStartCharacter(*m_ptr))
> ditto
done.

> 
> WebCore/storage/IDBKeyPath.h:84
>  +      class Lexer {
> Why does this need to be defined here?
moved to the cpp file.

> 
> WebCore/storage/IDBKeyPath.h:116
>  +          const UChar* m_ptr;
> should this be called m_cursor?
hmm, cursor and idb is a combination already in use..
isn't ptr clear?

> 
> WebCore/storage/IDBKeyPath.h:100
>  +          TokenType next()
> these could all be on one line
removed.
Comment 19 Jeremy Orlow 2010-08-05 07:00:13 PDT
Comment on attachment 63489 [details]
Patch

get rid of the indexeddatabaserequest stuff
Comment 20 Marcus Bulach 2010-08-05 12:33:22 PDT
Created attachment 63618 [details]
Patch
Comment 21 Marcus Bulach 2010-08-05 12:33:57 PDT
(In reply to comment #19)
> (From update of attachment 63489 [details])
> get rid of the indexeddatabaserequest stuff

done, it should now contain only the parser and associated tests.
Comment 22 Marcus Bulach 2010-08-11 09:29:45 PDT
Created attachment 64124 [details]
Patch
Comment 23 Marcus Bulach 2010-08-11 09:32:29 PDT
Fixed some minor comments as a result of https://bugs.webkit.org/show_bug.cgi?id=43276

skylined: your name was mentioned as a possible reviewer, would you mind help us here?
Comment 24 Adam Barth 2010-08-11 22:39:51 PDT
Comment on attachment 64124 [details]
Patch

WebCore/storage/IDBKeyPath.cpp:58
 +        return m_currentTokenType;
Bad indent.  You can also put this all on one line.  Why does the name of the access not match the name of the member variable?

WebCore/storage/IDBKeyPath.cpp:67
 +      const IDBKeyPathElement& currentElement()
Also can be on one line.
Comment 25 Marcus Bulach 2010-08-12 05:44:04 PDT
Created attachment 64214 [details]
Patch
Comment 26 Jeremy Orlow 2010-08-12 05:46:49 PDT
you gotten an r+ so unless you've made non-trivial changes, just commit.
Comment 27 Marcus Bulach 2010-08-12 05:56:33 PDT
thanks Adam! all comments addressed.

Chris: jorlow suggested adding you as a reviewer for this patch, would you mind taking a look at it?

quick background: this patch parses an "IDBKeyPath" (part of indexed database). an "IDBKeyPath" basically specifies an expression that can be applied to an object in order to extract a property, say "items[3].age" would be applied to an object to obtain the age of the third element in items array of said object.

for completeness, the whole chain besides patch contains:
the "extractor" bit, i.e., given parsed key path and an object, returns the key.
https://bugs.webkit.org/show_bug.cgi?id=43276

the utility process (sandbox) that will run this code in the chromium side:
http://codereview.chromium.org/3043037/show

please, ping me if you need more information about this.

(In reply to comment #24)
> (From update of attachment 64124 [details])
> WebCore/storage/IDBKeyPath.cpp:58
>  +        return m_currentTokenType;
> Bad indent.  You can also put this all on one line.  Why does the name of the access not match the name of the member variable?
Done.

> 
> WebCore/storage/IDBKeyPath.cpp:67
>  +      const IDBKeyPathElement& currentElement()
> Also can be on one line.
Done.
Comment 28 Chris Evans 2010-08-12 16:54:03 PDT
Looks like a lot of care has been taken; thanks.
What about:
+    int number = numberAsString.toIntStrict(&ok);

Does toIntStrict() permit negatives? If so, is that desired and should it be rejected?

What about a crazy large int? Should that be rejected out of hand here or does that check belong at a different level?
Comment 29 Jeremy Orlow 2010-08-13 01:36:59 PDT
Crazy large int sounds like a good test case, btw.
Comment 30 Jeremy Orlow 2010-08-13 02:31:42 PDT
Comment on attachment 64214 [details]
Patch

Please address Evan's comments.  I'll take one more quick skim through and rubber stamp it then.
Comment 31 Marcus Bulach 2010-08-13 08:11:34 PDT
Created attachment 64340 [details]
Patch
Comment 32 Marcus Bulach 2010-08-13 08:16:12 PDT
thanks for the quick reply chris! comments addressed, more inline:

(In reply to comment #28)
> Looks like a lot of care has been taken; thanks.
> What about:
> +    int number = numberAsString.toIntStrict(&ok);
> 
> Does toIntStrict() permit negatives? If so, is that desired and should it be rejected?

good point! made it unsigned, and using toUIntStrict.

> 
> What about a crazy large int? Should that be rejected out of hand here or does that check belong at a different level?
added a test for it (toUIntStrict fails for larger-than-int). also, you're correct: higher level (https://bugs.webkit.org/show_bug.cgi?id=43276) tests for the value being available in the array, it never de-references as-is.

another quick look please?
Comment 33 Chris Evans 2010-08-13 13:35:40 PDT
LGTM
Comment 34 Adam Barth 2010-08-14 23:10:32 PDT
Comment on attachment 64340 [details]
Patch

Thanks for getting a thorough review.  It looks like everyone's comments have been addressed.
Comment 35 Marcus Bulach 2010-08-16 05:45:23 PDT
Committed r65417: <http://trac.webkit.org/changeset/65417>