RESOLVED FIXED 3539
Array join and toString methods do not support circular references
https://bugs.webkit.org/show_bug.cgi?id=3539
Summary Array join and toString methods do not support circular references
David Wheeler
Reported 2005-06-14 19:19:33 PDT
If I have a circular reference in WebKit JavaScript, I can't dig into it effectively. For example, given this code: var ary = [1, 2]; ary.push(ary); for (var i = 0; i < ary.length; i++) { document.write(typeof ary[0] + ': ' + ary[i] + ' '); } In Safari, this outputs "number: 1 number: 2". In Firefox, it outputs "number: 1 number: 2 number: 1,2, ". The upshot is that I cannot access any of the elements of ary when I get at it through ary[2], but I should be able to. ary == ary[2] does evaluate to true, so I'm not sure what's happening here.
Attachments
testcase (264 bytes, text/html)
2005-06-14 23:27 PDT, Joost de Valk (AlthA)
no flags
Patch to fix the *type* of error given to what it should be (641 bytes, patch)
2005-06-25 06:32 PDT, Oliver Hunt
darin: review-
Patch to match behaviour of firefox and IE (2.36 KB, patch)
2005-06-25 23:21 PDT, Oliver Hunt
darin: review-
edge case test for patch (295 bytes, text/html)
2005-06-26 00:19 PDT, Oliver Hunt
no flags
Another edge case, slightly less ugly than previous (243 bytes, text/html)
2005-06-26 01:12 PDT, Oliver Hunt
no flags
test case for cyclic references in array.toString (952 bytes, text/html)
2005-06-27 04:54 PDT, Oliver Hunt
no flags
expected results of cyclic reference testcase (80 bytes, text/plain)
2005-06-27 04:56 PDT, Oliver Hunt
no flags
testcase for tostring with non-cyclic, but deep hierarchies (925 bytes, text/html)
2005-06-27 05:30 PDT, Oliver Hunt
no flags
expected output from the deep hierarchies test (121 bytes, text/plain)
2005-06-27 05:31 PDT, Oliver Hunt
no flags
sort() Test Case (286 bytes, text/html)
2005-06-27 15:45 PDT, David Wheeler
no flags
Quick test for join and toString operating concurrently (414 bytes, text/html)
2005-06-28 03:51 PDT, Oliver Hunt
no flags
Reimplemented patch with HashSet (3.65 KB, patch)
2005-10-29 21:52 PDT, Oliver Hunt
darin: review-
Hopefully the final patch (1.07 KB, patch)
2005-11-29 17:07 PST, Oliver Hunt
no flags
Fixed HashMap to be typesafe (1.10 KB, patch)
2005-11-29 17:25 PST, Oliver Hunt
mjs: review+
Testcase for toString on arrays with self references (1.49 KB, text/html)
2005-11-29 17:49 PST, Oliver Hunt
no flags
Testcase, though it needs a better name (1.50 KB, text/html)
2005-11-29 18:28 PST, Oliver Hunt
no flags
Expected output from testcase (512 bytes, text/plain)
2005-11-29 18:29 PST, Oliver Hunt
no flags
Joost de Valk (AlthA)
Comment 1 2005-06-14 23:27:28 PDT
Created attachment 2348 [details] testcase
Joost de Valk (AlthA)
Comment 2 2005-06-14 23:28:21 PDT
I'll confirm this one, since it doesn't do the same as it does in Firefox either. But, please attach the testcase yourself next time :).
David Wheeler
Comment 3 2005-06-17 11:54:35 PDT
I forgot to mention that the reason it does not print what's in the third slot of the array is that an exception is thrown. The exception is: TypeError – No default value undefined Line: 0 My guess is that this is triggered somehow by the circular reference. Also, this only comes up if I iterate over the array using for() or do something else like call sort(). If I just reach into the circular reference, like this: document.write(ary[2][1]); things work fine. HTH.
Oliver Hunt
Comment 4 2005-06-25 06:29:56 PDT
Righto have been going through ECMA 262 document, and it looks like mozilla/firefox's result is non standard. That said the current error given is also not ideal, and that's a result of ObjectImp::defaultValue() unconditionally blatting an exception. I'm attaching a patch that fixes the behaviour of defaultValue() so that the user gets a RangeError (specifically for a stack overflow, the root cause of this problem). The bug is made apparent in the Array.join method which is defined to produce a string containing a list of each element of the array. If an element is undefined or null then an empty string is substituted. Logically in the event of an exception the exception should be propogated (the spec doesn't say otherwise), so that's what we do. Is this an ideal solution? It's what the standard says, but is it what is expected?
Oliver Hunt
Comment 5 2005-06-25 06:32:11 PDT
Created attachment 2640 [details] Patch to fix the *type* of error given to what it should be Fixes the behaviour of ObjectImp::defaultValue() to prevent it from overriding an already thrown exception
David Wheeler
Comment 6 2005-06-25 10:25:28 PDT
Can you cite the passage(s) in the ECSM 262 standard that demonstrate that Firefox/Mozilla's behavior is non-standard? It seems to me that any first-class programming language should properly support circular references.
Brendan Eich
Comment 7 2005-06-25 11:32:28 PDT
ECMA-262 does not specify behavior in the case of divergence through ToString. This is a flaw in the standard. Edition 3 section 16 allows extensions such as the one in Mozilla's SpiderMonkey engine. In 1998 or so, leading up to Edition 3 work, I implemented and proposed toSource and sharp variables (after Common Lisp) to handle cycles and join points in object graphs. Example from this bug: $ Linux_All_DBG.OBJ/js js> a = [1, 2] 1,2 js> a.push(a) 3 js> a 1,2, js> a.toSource() #1=[1, 2, #1#] In general, in SpiderMonkey, uneval(v) produces a string s such that eval(s) is either === v if v is primitive, or an isomorphic object graph if v is an object reference. In the object case, uneval tries to call toSource, just as toString is tried from [[DefaultValue]] in 262. Some on the committee were against any such complete serialization notation, claiming serialization was too domain-specific to capture in the standard. That is a cop-out: standards follow the 80/20 rule too, or should. Do cycles and join points fall into the hard-case 20 percent? I think not. Does serializing hooks for private native data fall there? Probably, but that's a different case from cycles and join points. There may also have been aesthetic objections, without counterproposals. This is all water under the bridge, unless some subset of ECMA TG1 rallies again for Edition 4. Hope this helps, /be
Oliver Hunt
Comment 8 2005-06-25 22:10:41 PDT
The join operator (what an arrays ToString is implemented as) is definied in ECMA-262 15.4.4.5 for easy reference: <quote> Ar r a y . pr o t o t y pe . j o i n ( s e pa r a t o r ) The el ement s of t he ar r ay ar e conver t ed t o st r i ngs, and t hese st r i ngs ar e t hen concat enat ed, separ at ed by occur r ences of t he separat or. I f no separ at or i s pr ovi ded, a si ngl e comma i s used as t he separ at or . The join met hod t akes one ar gument , separat or, and per f or ms t he f ol l owi ng st eps: 1. Cal l t he [ [ Get ]] met hod of t hi s obj ect wi t h ar gument "length". 2. Cal l ToUi nt 32( Resul t ( 1) ) . 3. I f separat or i s undef i ned, l et separat or be t he si ngl e- char act er st r i ng ",". 4. Cal l ToSt r i ng( separat or) . 5. If Resul t (2) i s zero, ret urn t he empt y st ri ng. 6. Cal l t he [ [ Get ]] met hod of t hi s obj ect wi t h ar gument "0". 7. If Resul t (6) i s undef i ned or nul l , use t he empt y st r i ng; ot her wi se, cal l ToSt r i ng( Resul t ( 6) ) . 8. Let R be Resul t ( 7) . 9. Let k be 1. 10. I f k equal s Resul t ( 2) , r et ur n R. 11. Let S be a st r i ng val ue pr oduced by concat enat i ng R and Resul t ( 4) . 12. Cal l t he [ [ Get ]] met hod of t hi s obj ect wi t h ar gument ToSt r i ng( k) . 13. I f Resul t ( 12) i s undef i ned or nul l , use t he empt y st r i ng; ot her wi se, cal l ToSt r i ng( Resul t ( 12) ) . 14. Let R be a st r i ng val ue pr oduced by concat enat i ng S and Resul t ( 13) . 15. I ncr ease k by 1. 16. Go t o st ep 10. The length pr oper t y of t he join met hod i s 1. NOTE The join f unct i on i s i nt ent i onal l y generi c; i t does not requi re t hat i t s t hi s val ue be an Array obj ect . Theref ore, i t can be t ransf erred t o ot her ki nds of obj ect s f or use as a met hod. Whet her t he join f unct i on can be appl i ed successf ul l y t o a host obj ect i s i mpl ement at i on- dependent . </quote> Given it does not specifically handle exceptions then tthe behaviour should match standard programming semantics -- the exception should be propogated. Brendan: does mozilla just track each object ToString/join has been called in? or does it have a more elegant way of killing off self-recursion? David: JavaScript handles the self recursion easily -- ToString is a recursive function, and it behaves in exactly the same way any language would behave in this case...
Oliver Hunt
Comment 9 2005-06-25 22:21:11 PDT
Given firefox/ie behave the same, and conversation in #webkit, will provide a fix to match firefox/IE behaviour
Oliver Hunt
Comment 10 2005-06-25 23:21:38 PDT
Created attachment 2649 [details] Patch to match behaviour of firefox and IE Okay this patch is a somewhat inelegant solution, but it matches the behaviour of firefox and IE I'm not obsoleting the old patch as a i believe it does fix a bug anyway
Oliver Hunt
Comment 11 2005-06-26 00:19:43 PDT
Created attachment 2651 [details] edge case test for patch sample file demonstrating an array of depth larger than the stack allows -- people running firefox/mac may want to be careful on this one... firefox doesn't seem to like it at all. In this case we just get a straight stack overflow which raises the question of what we should do if toString encounters an exception...
Brendan Eich
Comment 12 2005-06-26 01:03:41 PDT
(In reply to comment #8) > 13. If Result (12) is undefined or null, use the empty string; otherwise, > call ToString(Result(12)). [snip] > Given it does not specifically handle exceptions then the behaviour should > match standard programming semantics -- the exception should be propogated. Section 16, the second item in the second unordered list, allows extensions to conceal errors. It's also striking that runaway recursion under toString is not dealt with by the spec -- that's a flaw, again, IMO. > Brendan: does mozilla just track each object ToString/join has been called in? > or does it have a more elegant way of killing off self-recursion? The only way to implement this correctly is to map all objects reached by the depth-first search being done by join or toString, using storage local to the current activation to hold the map. See http://lxr.mozilla.org/mozilla/search?string=EnterSharpObject. /be
Oliver Hunt
Comment 13 2005-06-26 01:12:40 PDT
Created attachment 2652 [details] Another edge case, slightly less ugly than previous This is another test case, partly to see what firefox does if an exception occurs during Array.toString firefox/mac doesn't behave well with this one either
Brendan Eich
Comment 14 2005-06-26 01:57:06 PDT
Firefox 1.0.3 on Linux behaves well enough on those tests, considering. Deer Park alpha 1 on Linux fails immediately with out of memory from document.write -- I'm not sure why. /be
Oliver Hunt
Comment 15 2005-06-27 04:54:20 PDT
Created attachment 2665 [details] test case for cyclic references in array.toString test case that works within the webkit test suite
Oliver Hunt
Comment 16 2005-06-27 04:56:11 PDT
Created attachment 2666 [details] expected results of cyclic reference testcase expected results for webkit testcase
Oliver Hunt
Comment 17 2005-06-27 05:30:22 PDT
Created attachment 2669 [details] testcase for tostring with non-cyclic, but deep hierarchies This test requires the first patch (which allows the correct exception to be propogated)
Oliver Hunt
Comment 18 2005-06-27 05:31:38 PDT
Created attachment 2670 [details] expected output from the deep hierarchies test
Oliver Hunt
Comment 19 2005-06-27 05:35:26 PDT
Should the two patches be merged into one? Strictly speaking the first patch fixes the type of exception the user would see, so might not be considered part of this bug... on the otherhand it does obscure the actual error that has occured. Otherwise if no one has any complaints i'll flag these for review
David Wheeler
Comment 20 2005-06-27 10:19:25 PDT
FWIW, I saw this bug with .sort(), too. If that just uses toString(), then it's the same bug.
Oliver Hunt
Comment 21 2005-06-27 15:35:03 PDT
I did a quick test, and sort seems to work fine -- you sure it wasn't a cycle in the array causing toString to die? With the patches applied it doesn't crash and burn at all, which seem to imply that any problems were in the output...
David Wheeler
Comment 22 2005-06-27 15:45:45 PDT
Created attachment 2676 [details] sort() Test Case No, I'm not sure, but this new test case demonstrates the problem using sort(). If it's just failing because sort uses toString() internally, then great. If not, then is it related at all?
Oliver Hunt
Comment 23 2005-06-27 15:57:25 PDT
Yep that passes with the patches applied -- so sort is using either array.tostring or array.join, either way it works.
Darin Adler
Comment 24 2005-06-27 17:24:59 PDT
Comment on attachment 2640 [details] Patch to fix the *type* of error given to what it should be Looks like a good change, but it's not formatted correctly. Please look at <http://webkit.opendarwin.org/coding/coding-style.html>. Also, it would be better if this patch had a separate bug report, and a test case demonstrating it's fixed, rather than being part of the "does not support circular references" bug.
Darin Adler
Comment 25 2005-06-27 17:36:28 PDT
Comment on attachment 2649 [details] Patch to match behaviour of firefox and IE Looks like this is on the right track. Formatting is not really right <http://webkit.opendarwin.org/coding/coding-style.html>. Before landing this, we need to figure out exactly which tests to land with it; at the moment there are lots of different tests attached to the bug and they need to be put in layout test form. I'm not comfortable with using a giant global array as part of the solution here. I suggest we make a small fixed-size global array, then switch over to using a hash table instead once we pass a certain recursion threshold to avoid O(n^2) behavior. That also has the pleasant quality of making the global much smaller. Is it OK to share a single array between both Join and ToString? I'd like to see a test that checks for that.
Oliver Hunt
Comment 26 2005-06-27 18:41:33 PDT
Comment on attachment 2670 [details] expected output from the deep hierarchies test Moved to bug 3743
Oliver Hunt
Comment 27 2005-06-27 18:42:38 PDT
Comment on attachment 2669 [details] testcase for tostring with non-cyclic, but deep hierarchies Moved to bug 3743
Oliver Hunt
Comment 28 2005-06-27 18:44:40 PDT
Comment on attachment 2640 [details] Patch to fix the *type* of error given to what it should be Moved to bug 3743
David Wheeler
Comment 29 2005-06-27 18:51:43 PDT
(In reply to comment #25) > Before landing this, we need to figure out exactly which tests to land with it; > at the moment there are lots of different tests attached to the bug and they > need to be put in layout test form. For my future reference, what is "layout test form"? Is it documented somewhere I can find it so that I can submit future tests using it? Thanks!
Oliver Hunt
Comment 30 2005-06-27 19:01:01 PDT
Comment on attachment 2651 [details] edge case test for patch obsoleted by layout tests
Oliver Hunt
Comment 31 2005-06-27 19:02:19 PDT
Comment on attachment 2652 [details] Another edge case, slightly less ugly than previous obsoleted by layout tests
Oliver Hunt
Comment 32 2005-06-27 19:04:22 PDT
Comment on attachment 2669 [details] testcase for tostring with non-cyclic, but deep hierarchies moved to bug 3743
Oliver Hunt
Comment 33 2005-06-27 19:24:32 PDT
attachment 2665 [details] and attachment 2666 [details] seem to match the layout format -- at least they fit in with the the standard webkit tests
Oliver Hunt
Comment 34 2005-06-28 03:51:04 PDT
Created attachment 2683 [details] Quick test for join and toString operating concurrently Here's a test that demonstrates the use of join and toString in a big unholy mess. Not a webkit test atm, but should be sometime early tomorrow (NZST). toString just falls through to the join method in an Array so given the absence of concurrent execution in kjs they are fine anyway. Each call to join/toString increases the stack depth, and we currently allocate enough space to fit the entire maximum call depth. Darin: re: the hashmap should i just switch over to using that now? rather than applying the current array driven look up? and what should i be using as a hashmap implementation, from irc it sounds like there's an internal one we should be using...
Geoffrey Garen
Comment 35 2005-07-25 14:13:21 PDT
I have landed the patch for bug #3991.
David Wheeler
Comment 36 2005-07-25 14:41:04 PDT
(In reply to comment #35) > I have landed the patch for bug #3991. Uh, does that mean that this issue has been fixed?
Oliver Hunt
Comment 37 2005-10-29 21:52:39 PDT
Created attachment 4532 [details] Reimplemented patch with HashSet Okay, now HashSet is available in JSC I've got round to rewriting patch to use it. Needs to be reindented, and needs a half-decent testcase, but it's here now :)
Darin Adler
Comment 38 2005-11-09 07:39:33 PST
Comment on attachment 4532 [details] Reimplemented patch with HashSet I don't understand why it's necessary to copy thisObj into a second local variable currentRef. Lets just use thisObj directly unless there's a reason not to. I think it would be easier to read this if the code said: if (visitedElems.contains(thisObj)) return String(""); before anything else (even before declaring separator and str), rather than indenting all the code inside an if statement. That would also probably make the diff smaller and easier to verify. Also, I think the HashSet should be: HashSet< ObjectImp*, PointerHash<ObjectImp*> > There's an advantage to HashSet<void*>, in that we're guaranteed all such sets are a single template expansion, preventing any possible code bloat. But a disadvantage to HashSet<void*> is that without type safety, it's possible to make certain mistakes with a set (for example, adding and removing the same object, but a pointer to a different part in the case of multiple inheritance). In this case, I think the code is simple enough that I don't think there are any real problems, but I'd prefer to get off on the right foot using this class. There are a few minor formatting problems, for example: ObjectImp *currentRef=thisObj; the above needs spaces around the "=". And: + if(!visitedElems.contains(currentRef)) { the above needs a space after the if and before the opening parenthesis. And: + if ( exec->hadException() ) + break; if you're going to modify the code, you should fix the format of that by removing the extra spaces inside the parentheses. Also, there are a number of 1-line if and else statements with braces; style guide says not to use braces in these cases.
Oliver Hunt
Comment 39 2005-11-29 17:07:36 PST
Created attachment 4860 [details] Hopefully the final patch Reformatted and implemented darins suggestion
Oliver Hunt
Comment 40 2005-11-29 17:25:37 PST
Created attachment 4861 [details] Fixed HashMap to be typesafe
Oliver Hunt
Comment 41 2005-11-29 17:49:30 PST
Created attachment 4862 [details] Testcase for toString on arrays with self references New testcase
Oliver Hunt
Comment 42 2005-11-29 18:28:38 PST
Created attachment 4863 [details] Testcase, though it needs a better name
Oliver Hunt
Comment 43 2005-11-29 18:29:29 PST
Created attachment 4864 [details] Expected output from testcase
Maciej Stachowiak
Comment 44 2005-11-30 23:45:57 PST
Comment on attachment 4861 [details] Fixed HashMap to be typesafe r=me
Maciej Stachowiak
Comment 45 2005-11-30 23:45:58 PST
Comment on attachment 4861 [details] Fixed HashMap to be typesafe r=me
Note You need to log in before you can comment on or make changes to this bug.