Created attachment 43784 [details] [TEST CASE] Shows Poor Performance of Stringify with undefined I recently ran some benchmarks for native JSON in a few browsers and stumbled on the fact that WebKit's performance with the "undefined" value was considerably slow. A version of my test case is attached which highlights the issues. Here is a screenshot showing the improvement I got: http://grab.by/MjK Safari 4.0.4 ~120ms WebKit Nightly ~249ms My Build ~10ms My change still passes all tests. No tests are added, since this is really a benchmark.
Created attachment 43785 [details] [PATCH] Replace substr() with truncate() in the rollback case I'm new to this area of WebKit, so please review carefully! Also, should I add an assert that the new truncate length is less then the existing length?
Does Dromero test JSON parsing yet? Would be nice if it did. I suspect this would work fine, I don't know if this breaks a UString design principle though. I'm not sure if there are other cases where UString::Rep has more data that its len member claims it does.
Comment on attachment 43785 [details] [PATCH] Replace substr() with truncate() in the rollback case > + void truncate(int length) const { m_rep->len = length; } I am concerned about this change -- i'm not convinced that it's safe (as rep's may be shared). What is the object structure that leads to us need to roll back the string? I'm also wondering if maybe it's worth just switching to Vector<UChar> while we build the string, which will remove any data sharing issues we could hit.
(In reply to comment #3) > What is the object structure that leads to us need to roll > back the string? The JSON result is continually built by appending onto a string builder (UString). JSON specifies that we should not put undefined values in the resulting sting. The case where it is rolled back is when contents were already put on the builder string, and then we encountered an "undefined" value, and we have to remove the contents we already put on the string. So in the following case: { a: undefined } The builder would follow these steps: Action Builder So Far ----------- -------------- 1. start object { 2. put key and seperator {"a": 3. encounter undefined value... rollback { 4. finish properties, close object {} 5. done
(In reply to comment #4) > (In reply to comment #3) > > What is the object structure that leads to us need to roll > > back the string? > > The JSON result is continually built by appending onto a string builder > (UString). JSON specifies that we should not put undefined values in the > resulting sting. > > The case where it is rolled back is when contents were already put on the > builder string, and then we encountered an "undefined" value, and we have to > remove the contents we already put on the string. So in the following case: > > { a: undefined } > > The builder would follow these steps: > > Action Builder So Far > ----------- -------------- > 1. start object { > 2. put key and seperator {"a": > 3. encounter undefined value... rollback { > 4. finish properties, close object {} > 5. done Yeah i saw it once i actually looked at the code. I'm just testing a patch that switches stringification over to basically using a Vector<UChar>. I'm testing perf, etc now.
Created attachment 43802 [details] Patch
Comment on attachment 43802 [details] Patch > + class StringBuilder : public Vector<UChar> { Subclassing Vector is unsafe, because it does not have a virtual destructor. Maybe adding private operator delete would make it a little safer? I'm not 100% clear on what the best way to make it safe is. > + for (size_t i = 0; i < len; i++) > + Vector<UChar>::append(str[i]); Should we ASSERT that the only low ASCII is used here? Otherwise, conversion from signed to unsigned could go badly (not to mention that we don't know what encoding it is). > + inline void append(const char ch) Someone could ask you why you don't use full words instead of those "ch", "len" and "str". > + Vector<UChar>::append(ch); Would a simple "using Vector<UChar>::append" achieve the same result? r=me
Committed r51352
(In reply to comment #7) > (From update of attachment 43802 [details]) > > + class StringBuilder : public Vector<UChar> { > > Subclassing Vector is unsafe, because it does not have a virtual destructor. > Maybe adding private operator delete would make it a little safer? I'm not 100% > clear on what the best way to make it safe is. Private inheritance might work better. Then you can expose Vector functions with "using" individually. > Should we ASSERT that the only low ASCII is used here? Otherwise, conversion > from signed to unsigned could go badly (not to mention that we don't know what > encoding it is). Yes. > > + inline void append(const char ch) > > Someone could ask you why you don't use full words instead of those "ch", "len" > and "str". Like me. > Would a simple "using Vector<UChar>::append" achieve the same result? Yes.
(In reply to comment #9) > > Subclassing Vector is unsafe, because it does not have a virtual destructor. > > Maybe adding private operator delete would make it a little safer? I'm not 100% > > clear on what the best way to make it safe is. Since StringBuilder doesn't add any data members or override any virtual functions, I think it's not unsafe.