Bug 3539 - Array join and toString methods do not support circular references
Summary: Array join and toString methods do not support circular references
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 412
Hardware: Mac OS X 10.4
: P2 Normal
Assignee: Oliver Hunt
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-06-14 19:19 PDT by David Wheeler
Modified: 2005-12-10 11:04 PST (History)
2 users (show)

See Also:


Attachments
testcase (264 bytes, text/html)
2005-06-14 23:27 PDT, Joost de Valk (AlthA)
no flags Details
Patch to fix the *type* of error given to what it should be (641 bytes, patch)
2005-06-25 06:32 PDT, Oliver Hunt
darin: review-
Details | Formatted Diff | Diff
Patch to match behaviour of firefox and IE (2.36 KB, patch)
2005-06-25 23:21 PDT, Oliver Hunt
darin: review-
Details | Formatted Diff | Diff
edge case test for patch (295 bytes, text/html)
2005-06-26 00:19 PDT, Oliver Hunt
no flags Details
Another edge case, slightly less ugly than previous (243 bytes, text/html)
2005-06-26 01:12 PDT, Oliver Hunt
no flags Details
test case for cyclic references in array.toString (952 bytes, text/html)
2005-06-27 04:54 PDT, Oliver Hunt
no flags Details
expected results of cyclic reference testcase (80 bytes, text/plain)
2005-06-27 04:56 PDT, Oliver Hunt
no flags Details
testcase for tostring with non-cyclic, but deep hierarchies (925 bytes, text/html)
2005-06-27 05:30 PDT, Oliver Hunt
no flags Details
expected output from the deep hierarchies test (121 bytes, text/plain)
2005-06-27 05:31 PDT, Oliver Hunt
no flags Details
sort() Test Case (286 bytes, text/html)
2005-06-27 15:45 PDT, David Wheeler
no flags Details
Quick test for join and toString operating concurrently (414 bytes, text/html)
2005-06-28 03:51 PDT, Oliver Hunt
no flags Details
Reimplemented patch with HashSet (3.65 KB, patch)
2005-10-29 21:52 PDT, Oliver Hunt
darin: review-
Details | Formatted Diff | Diff
Hopefully the final patch (1.07 KB, patch)
2005-11-29 17:07 PST, Oliver Hunt
no flags Details | Formatted Diff | Diff
Fixed HashMap to be typesafe (1.10 KB, patch)
2005-11-29 17:25 PST, Oliver Hunt
mjs: review+
Details | Formatted Diff | Diff
Testcase for toString on arrays with self references (1.49 KB, text/html)
2005-11-29 17:49 PST, Oliver Hunt
no flags Details
Testcase, though it needs a better name (1.50 KB, text/html)
2005-11-29 18:28 PST, Oliver Hunt
no flags Details
Expected output from testcase (512 bytes, text/plain)
2005-11-29 18:29 PST, Oliver Hunt
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description David Wheeler 2005-06-14 19:19:33 PDT
If I have a circular reference in WebKit JavaScript, I can't dig into it effectively. For example, given this 
code:

  var ary = [1, 2];
  ary.push(ary);
  for (var i = 0; i < ary.length; i++) {
    document.write(typeof ary[0] + ': ' + ary[i] + ' ');
  }

In Safari, this outputs "number: 1 number: 2". In Firefox, it outputs "number: 1 number: 2 number: 1,2, 
". The upshot is that I cannot access any of the elements of ary when I get at it through ary[2], but I 
should be able to. ary == ary[2] does evaluate to true, so I'm not sure what's happening here.
Comment 1 Joost de Valk (AlthA) 2005-06-14 23:27:28 PDT
Created attachment 2348 [details]
testcase
Comment 2 Joost de Valk (AlthA) 2005-06-14 23:28:21 PDT
I'll confirm this one, since it doesn't do the same as it does in Firefox either. But, please attach the 
testcase yourself next time :).
Comment 3 David Wheeler 2005-06-17 11:54:35 PDT
I forgot to mention that the reason it does not print what's in the third slot of the array is that an 
exception is thrown. The exception is:

  TypeError – No default value
  undefined                                                        Line: 0

My guess is that this is triggered somehow by the circular reference. Also, this only comes up if I iterate 
over the array using for() or do something else like call sort(). If I just reach into the circular reference, 
like this:
  
  document.write(ary[2][1]);

things work fine.

HTH.
Comment 4 Oliver Hunt 2005-06-25 06:29:56 PDT
Righto have been going through ECMA 262 document, and it looks like mozilla/firefox's result is non 
standard.

That said the current error given is also not ideal, and that's a result of ObjectImp::defaultValue() 
unconditionally blatting an exception.  I'm attaching a patch that fixes the behaviour of defaultValue() 
so that the user gets a RangeError (specifically for a stack overflow, the root cause of this problem).

The bug is made apparent in the Array.join method which is defined to produce a string containing a 
list of each element of the array.  If an element is undefined or null then an empty string is substituted.  
Logically in the event of an exception the exception should be propogated (the spec doesn't say 
otherwise), so that's what we do. 

Is this an ideal solution?  It's what the standard says, but is it what is expected?
Comment 5 Oliver Hunt 2005-06-25 06:32:11 PDT
Created attachment 2640 [details]
Patch to fix the *type* of error given to what it should be

Fixes the behaviour of ObjectImp::defaultValue() to prevent it from overriding
an already thrown exception
Comment 6 David Wheeler 2005-06-25 10:25:28 PDT
Can you cite the passage(s) in the ECSM 262 standard that demonstrate that Firefox/Mozilla's behavior is 
non-standard? It seems to me that any first-class programming language should properly support circular 
references.
Comment 7 Brendan Eich 2005-06-25 11:32:28 PDT
ECMA-262 does not specify behavior in the case of divergence through ToString. 
This is a flaw in the standard.

Edition 3 section 16 allows extensions such as the one in Mozilla's SpiderMonkey
engine.  In 1998 or so, leading up to Edition 3 work, I implemented and proposed
toSource and sharp variables (after Common Lisp) to handle cycles and join
points in object graphs.  Example from this bug:

$ Linux_All_DBG.OBJ/js
js> a = [1, 2]
1,2
js> a.push(a)
3
js> a
1,2,
js> a.toSource()
#1=[1, 2, #1#]

In general, in SpiderMonkey, uneval(v) produces a string s such that eval(s) is
either === v if v is primitive, or an isomorphic object graph if v is an object
reference.  In the object case, uneval tries to call toSource, just as toString
is tried from [[DefaultValue]] in 262.

Some on the committee were against any such complete serialization notation,
claiming serialization was too domain-specific to capture in the standard.  That
is a cop-out: standards follow the 80/20 rule too, or should.  Do cycles and
join points fall into the hard-case 20 percent?  I think not.  Does serializing
hooks for private native data fall there?  Probably, but that's a different case
from cycles and join points.

There may also have been aesthetic objections, without counterproposals.  This
is all water under the bridge, unless some subset of ECMA TG1 rallies again for
Edition 4.

Hope this helps,

/be
Comment 8 Oliver Hunt 2005-06-25 22:10:41 PDT
The join operator (what an arrays ToString is implemented as) is definied in ECMA-262 15.4.4.5

for easy reference:
<quote>
Ar r a y . pr o t o t y pe . j o i n ( s e pa r a t o r ) 
The el ement s of t he ar r ay ar e conver t ed t o st r i ngs, and t hese st r i ngs ar e t hen concat enat 
ed, separ at ed 
by occur r ences of t he separat or. I f no separ at or i s pr ovi ded, a si ngl e comma i s used as t he 
separ at or . 
The join met hod t akes one ar gument , separat or, and per f or ms t he f ol l owi ng st eps: 
1. Cal l t he [ [ Get ]] met hod of t hi s obj ect wi t h ar gument "length". 
2. Cal l ToUi nt 32( Resul t ( 1) ) . 
3. I f separat or i s undef i ned, l et separat or be t he si ngl e- char act er st r i ng ",". 
4. Cal l ToSt r i ng( separat or) . 
5. If Resul t (2) i s zero, ret urn t he empt y st ri ng. 
6. Cal l t he [ [ Get ]] met hod of t hi s obj ect wi t h ar gument "0". 
7. If Resul t (6) i s undef i ned or nul l , use t he empt y st r i ng; ot her wi se, cal l ToSt r i ng( Resul t 
( 6) ) . 
8. Let R be Resul t ( 7) . 
9. Let k be 1. 
10. I f k equal s Resul t ( 2) , r et ur n R. 
11. Let S be a st r i ng val ue pr oduced by concat enat i ng R and Resul t ( 4) . 
12. Cal l t he [ [ Get ]] met hod of t hi s obj ect wi t h ar gument ToSt r i ng( k) . 
13. I f Resul t ( 12) i s undef i ned or nul l , use t he empt y st r i ng; ot her wi se, cal l ToSt r i ng( Resul 
t ( 12) ) . 
14. Let R be a st r i ng val ue pr oduced by concat enat i ng S and Resul t ( 13) . 
15. I ncr ease k by 1. 
16. Go t o st ep 10. 
The length pr oper t y of t he join met hod i s 1. 
NOTE 
The join f unct i on i s i nt ent i onal l y generi c; i t does not requi re t hat i t s t hi s val ue be an Array 
obj ect . 
Theref ore, i t can be t ransf erred t o ot her ki nds of obj ect s f or use as a met hod. Whet her t he join 
f unct i on can be appl i ed successf ul l y t o a host obj ect i s i mpl ement at i on- dependent . 
</quote>

Given it does not specifically handle exceptions then tthe behaviour should match standard 
programming semantics -- the exception should be propogated.

Brendan: does mozilla just track each object ToString/join has been called in? or does it have a more 
elegant way of killing off self-recursion?

David: JavaScript handles the self recursion easily -- ToString is a recursive function, and it behaves in 
exactly the same way any language would behave in this case...
Comment 9 Oliver Hunt 2005-06-25 22:21:11 PDT
Given firefox/ie behave the same, and conversation in #webkit, will provide a fix to match firefox/IE 
behaviour

Comment 10 Oliver Hunt 2005-06-25 23:21:38 PDT
Created attachment 2649 [details]
Patch to match behaviour of firefox and IE

Okay this patch is a somewhat inelegant solution, but it matches the behaviour
of firefox and IE

I'm not obsoleting the old patch as a i believe it does fix a bug anyway
Comment 11 Oliver Hunt 2005-06-26 00:19:43 PDT
Created attachment 2651 [details]
edge case test for patch

sample file demonstrating an array of depth larger than the stack allows --
people running firefox/mac may want to be careful on this one... firefox
doesn't seem to like it at all.

In this case we just get a straight stack overflow which raises the question of
what we should do if toString encounters an exception...
Comment 12 Brendan Eich 2005-06-26 01:03:41 PDT
(In reply to comment #8)
> 13. If Result (12) is undefined or null, use the empty string; otherwise,
>     call ToString(Result(12)).
[snip]
> Given it does not specifically handle exceptions then the behaviour should
> match standard programming semantics -- the exception should be propogated.

Section 16, the second item in the second unordered list, allows extensions to
conceal errors.  It's also striking that runaway recursion under toString is not
dealt with by the spec -- that's a flaw, again, IMO.

> Brendan: does mozilla just track each object ToString/join has been called in?
> or does it have a more elegant way of killing off self-recursion?

The only way to implement this correctly is to map all objects reached by the
depth-first search being done by join or toString, using storage local to the
current activation to hold the map.  See
http://lxr.mozilla.org/mozilla/search?string=EnterSharpObject.

/be
Comment 13 Oliver Hunt 2005-06-26 01:12:40 PDT
Created attachment 2652 [details]
Another edge case, slightly less ugly than previous

This is another test case, partly to see what firefox does if an exception
occurs during Array.toString
firefox/mac doesn't behave well with this one either
Comment 14 Brendan Eich 2005-06-26 01:57:06 PDT
Firefox 1.0.3 on Linux behaves well enough on those tests, considering.  Deer
Park alpha 1 on Linux fails immediately with out of memory from document.write
-- I'm not sure why.

/be
Comment 15 Oliver Hunt 2005-06-27 04:54:20 PDT
Created attachment 2665 [details]
test case for cyclic references in array.toString

test case that works within the webkit test suite
Comment 16 Oliver Hunt 2005-06-27 04:56:11 PDT
Created attachment 2666 [details]
expected results of cyclic reference testcase

expected results for webkit testcase
Comment 17 Oliver Hunt 2005-06-27 05:30:22 PDT
Created attachment 2669 [details]
testcase for tostring with non-cyclic, but deep hierarchies

This test requires the first patch (which allows the correct exception to be
propogated)
Comment 18 Oliver Hunt 2005-06-27 05:31:38 PDT
Created attachment 2670 [details]
expected output from the deep hierarchies test
Comment 19 Oliver Hunt 2005-06-27 05:35:26 PDT
Should the two patches be merged into one?  Strictly speaking the first patch fixes the type of exception 
the user would see, so might not be considered part of this bug... on the otherhand it does obscure the 
actual error that has occured.

Otherwise if no one has any complaints i'll flag these for review
Comment 20 David Wheeler 2005-06-27 10:19:25 PDT
FWIW, I saw this bug with .sort(), too. If that just uses toString(), then it's the same bug.
Comment 21 Oliver Hunt 2005-06-27 15:35:03 PDT
I did a quick test, and sort seems to work fine -- you sure it wasn't a cycle in the array causing toString to 
die?  With the patches applied it doesn't crash and burn at all, which seem to imply that any problems 
were in the output...
Comment 22 David Wheeler 2005-06-27 15:45:45 PDT
Created attachment 2676 [details]
sort() Test Case

No, I'm not sure, but this new test case demonstrates the problem using sort().
If it's just failing because sort uses toString() internally, then great. If
not, then is it related at all?
Comment 23 Oliver Hunt 2005-06-27 15:57:25 PDT
Yep that passes with the patches applied -- so sort is using either array.tostring or array.join, either way it 
works.
Comment 24 Darin Adler 2005-06-27 17:24:59 PDT
Comment on attachment 2640 [details]
Patch to fix the *type* of error given to what it should be

Looks like a good change, but it's not formatted correctly. Please look at
<http://webkit.opendarwin.org/coding/coding-style.html>.

Also, it would be better if this patch had a separate bug report, and a test
case demonstrating it's fixed, rather than being part of the "does not support
circular references" bug.
Comment 25 Darin Adler 2005-06-27 17:36:28 PDT
Comment on attachment 2649 [details]
Patch to match behaviour of firefox and IE

Looks like this is on the right track.

Formatting is not really right
<http://webkit.opendarwin.org/coding/coding-style.html>.

Before landing this, we need to figure out exactly which tests to land with it;
at the moment there are lots of different tests attached to the bug and they
need to be put in layout test form.

I'm not comfortable with using a giant global array as part of the solution
here. I suggest we make a small fixed-size global array, then switch over to
using a hash table instead once we pass a certain recursion threshold to avoid
O(n^2) behavior. That also has the pleasant quality of making the global much
smaller.

Is it OK to share a single array between both Join and ToString? I'd like to
see a test that checks for that.
Comment 26 Oliver Hunt 2005-06-27 18:41:33 PDT
Comment on attachment 2670 [details]
expected output from the deep hierarchies test

Moved to bug 3743
Comment 27 Oliver Hunt 2005-06-27 18:42:38 PDT
Comment on attachment 2669 [details]
testcase for tostring with non-cyclic, but deep hierarchies

Moved to bug 3743
Comment 28 Oliver Hunt 2005-06-27 18:44:40 PDT
Comment on attachment 2640 [details]
Patch to fix the *type* of error given to what it should be

Moved to bug 3743
Comment 29 David Wheeler 2005-06-27 18:51:43 PDT
(In reply to comment #25)

> Before landing this, we need to figure out exactly which tests to land with it;
> at the moment there are lots of different tests attached to the bug and they
> need to be put in layout test form.

For my future reference, what is "layout test form"? Is it documented somewhere I can find it so that I can 
submit future tests using it?

Thanks!
Comment 30 Oliver Hunt 2005-06-27 19:01:01 PDT
Comment on attachment 2651 [details]
edge case test for patch

obsoleted by layout tests
Comment 31 Oliver Hunt 2005-06-27 19:02:19 PDT
Comment on attachment 2652 [details]
Another edge case, slightly less ugly than previous

obsoleted by layout tests
Comment 32 Oliver Hunt 2005-06-27 19:04:22 PDT
Comment on attachment 2669 [details]
testcase for tostring with non-cyclic, but deep hierarchies

moved to bug 3743
Comment 33 Oliver Hunt 2005-06-27 19:24:32 PDT
attachment 2665 [details] and attachment 2666 [details] seem to match the layout format -- at least
they fit in with the the standard webkit tests
Comment 34 Oliver Hunt 2005-06-28 03:51:04 PDT
Created attachment 2683 [details]
Quick test for join and toString operating concurrently

Here's a test that demonstrates the use of join and toString in a big unholy
mess.  Not a webkit test atm, but should be sometime early tomorrow (NZST).  

toString just falls through to the join method in an Array so given the absence
of concurrent execution in kjs they are fine anyway.  Each call to
join/toString increases the stack depth, and we currently allocate enough space
to fit the entire maximum call depth.

Darin: re: the hashmap should i just switch over to using that now? rather than
applying the current array driven look up? and what should i be using as a
hashmap implementation, from irc it sounds like there's an internal one we
should be using...
Comment 35 Geoffrey Garen 2005-07-25 14:13:21 PDT
I have landed the patch for bug #3991.
Comment 36 David Wheeler 2005-07-25 14:41:04 PDT
(In reply to comment #35)
> I have landed the patch for bug #3991.

Uh, does that mean that this issue has been fixed?
Comment 37 Oliver Hunt 2005-10-29 21:52:39 PDT
Created attachment 4532 [details]
Reimplemented patch with HashSet

Okay, now HashSet is available in JSC I've got round to rewriting patch to use
it.

Needs to be reindented, and needs a half-decent testcase, but it's here now :)
Comment 38 Darin Adler 2005-11-09 07:39:33 PST
Comment on attachment 4532 [details]
Reimplemented patch with HashSet

I don't understand why it's necessary to copy thisObj into a second local
variable currentRef. Lets just use thisObj directly unless there's a reason not
to.

I think it would be easier to read this if the code said:

    if (visitedElems.contains(thisObj))
	return String("");

before anything else (even before declaring separator and str), rather than
indenting all the code inside an if statement. That would also probably make
the diff smaller and easier to verify.

Also, I think the HashSet should be:

    HashSet< ObjectImp*, PointerHash<ObjectImp*> >

There's an advantage to HashSet<void*>, in that we're guaranteed all such sets
are a single template expansion, preventing any possible code bloat. But a
disadvantage to HashSet<void*> is that without type safety, it's possible to
make certain mistakes with a set (for example, adding and removing the same
object, but a pointer to a different part in the case of multiple inheritance).
In this case, I think the code is simple enough that I don't think there are
any real problems, but I'd prefer to get off on the right foot using this
class.

There are a few minor formatting problems, for example:

    ObjectImp *currentRef=thisObj;

the above needs spaces around the "=". And:

+    if(!visitedElems.contains(currentRef)) {	 

the above needs a space after the if and before the opening parenthesis. And:

+	 if ( exec->hadException() )
+	   break;

if you're going to modify the code, you should fix the format of that by
removing the extra spaces inside the parentheses.

Also, there are a number of 1-line if and else statements with braces; style
guide says not to use braces in these cases.
Comment 39 Oliver Hunt 2005-11-29 17:07:36 PST
Created attachment 4860 [details]
Hopefully the final patch

Reformatted and implemented darins suggestion
Comment 40 Oliver Hunt 2005-11-29 17:25:37 PST
Created attachment 4861 [details]
Fixed HashMap to be typesafe
Comment 41 Oliver Hunt 2005-11-29 17:49:30 PST
Created attachment 4862 [details]
Testcase for toString on arrays with self references

New testcase
Comment 42 Oliver Hunt 2005-11-29 18:28:38 PST
Created attachment 4863 [details]
Testcase, though it needs a better name
Comment 43 Oliver Hunt 2005-11-29 18:29:29 PST
Created attachment 4864 [details]
Expected output from testcase
Comment 44 Maciej Stachowiak 2005-11-30 23:45:57 PST
Comment on attachment 4861 [details]
Fixed HashMap to be typesafe

r=me
Comment 45 Maciej Stachowiak 2005-11-30 23:45:58 PST
Comment on attachment 4861 [details]
Fixed HashMap to be typesafe

r=me