Bug 132333 - Remove unneeded CPU(BIG_ENDIAN) handling in LLInt after new bytecode format.
Summary: Remove unneeded CPU(BIG_ENDIAN) handling in LLInt after new bytecode format.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Mark Lam
URL:
Keywords: InRadar
: 193966 (view as bug list)
Depends on: 132330 170945
Blocks: 128743
  Show dependency treegraph
 
Reported: 2014-04-29 05:22 PDT by Tomas Popela
Modified: 2019-01-29 23:07 PST (History)
18 users (show)

See Also:


Attachments
Proposed patch (4.56 KB, patch)
2014-04-29 05:26 PDT, Tomas Popela
buildbot: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from webkit-ews-10 for mac-mountainlion-wk2 (524.43 KB, application/zip)
2014-04-29 05:55 PDT, Build Bot
no flags Details
Proposed patch v2 (6.04 KB, patch)
2014-04-30 05:28 PDT, Tomas Popela
no flags Details | Formatted Diff | Diff
Patch (7.10 KB, patch)
2017-05-31 08:06 PDT, Tomas Popela
no flags Details | Formatted Diff | Diff
Patch (7.12 KB, patch)
2017-06-01 04:44 PDT, Tomas Popela
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews122 for ios-simulator-wk2 (7.22 MB, application/zip)
2017-06-01 10:25 PDT, Build Bot
no flags Details
Patch (12.67 KB, patch)
2018-06-06 12:31 PDT, Caitlin Potter (:caitp)
no flags Details | Formatted Diff | Diff
Patch (11.29 KB, patch)
2018-06-06 12:55 PDT, Caitlin Potter (:caitp)
no flags Details | Formatted Diff | Diff
Patch for landing (11.12 KB, patch)
2018-06-13 15:20 PDT, Caitlin Potter (:caitp)
no flags Details | Formatted Diff | Diff
proposed patch. (2.47 KB, patch)
2019-01-29 10:55 PST, Mark Lam
ysuzuki: review+
Details | Formatted Diff | Diff
Patch for landing. (2.71 KB, patch)
2019-01-29 11:18 PST, Mark Lam
no flags Details | Formatted Diff | Diff
patch for landing. (2.72 KB, patch)
2019-01-29 11:50 PST, Mark Lam
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Tomas Popela 2014-04-29 05:22:02 PDT
When loading operand variable from instruction in _llint_op_get_from_scope and _llint_op_put_to_scope use loadpFromInstruction instead of loadisFromInstruction.
The operand is saved as a pointer (taken from http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/bytecode/CodeBlock.cpp#L1763) :

instructions[i + 6].u.pointer = reinterpret_cast<void*>(op.operand);

So it should be loaded as an pointer (as it already is get/putGlobalVar macros).

Before this change (with https://bugs.webkit.org/show_bug.cgi?id=131495 applied) this simple test case was crashing jsc. After this change it works (and stress suite as well).

$ cat test.js
var a = 1;
print (a+1);
Comment 1 Tomas Popela 2014-04-29 05:25:44 PDT
The change is first comment should be https://bugs.webkit.org/show_bug.cgi?id=132330
Comment 2 Tomas Popela 2014-04-29 05:26:15 PDT
Created attachment 230367 [details]
Proposed patch
Comment 3 Build Bot 2014-04-29 05:55:39 PDT
Comment on attachment 230367 [details]
Proposed patch

Attachment 230367 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/6219900116795392

New failing tests:
http/tests/media/track/track-webvtt-slow-loading-2.html
Comment 4 Build Bot 2014-04-29 05:55:41 PDT
Created attachment 230369 [details]
Archive of layout-test-results from webkit-ews-10 for mac-mountainlion-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: webkit-ews-10  Port: mac-mountainlion-wk2  Platform: Mac OS X 10.8.5
Comment 5 Tomas Popela 2014-04-29 07:19:39 PDT
I just noticed that I oversaw two crashes in test suite (ftl-putbyiddirect.js and ftl-putbyid.js). So we should postpone the review until I investigate if it is fault of this patch (on x86_64 it is not crashing, but on ppc64 it is) or not.
Comment 6 Mark Lam 2014-04-29 07:22:14 PDT
Comment on attachment 230367 [details]
Proposed patch

Cancelling the review for now until Tomas investigates issues.
Comment 7 Tomas Popela 2014-04-30 05:28:23 PDT
Created attachment 230475 [details]
Proposed patch v2

After investigating the crashes I found that the operand in instruction is saved differently in CodeBlock ( http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/bytecode/CodeBlock.cpp#L1763 ) and in LLIntSlowPaths ( http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/llint/LLIntSlowPaths.cpp#L1418 ).

Namely in CodeBlock.cpp it is saved with:
     instructions[i + 6].u.pointer = reinterpret_cast<void*>(op.operand);
in LLIntSlowPaths.cpp
     pc[6].u.operand = slot.cachedOffset();

As we are using loadpFromInstruction to load the operand value I unified the code in LLIntSlowPaths.cpp with CodeBlock.cpp. After this change the tests from tests/stress (as well as my test scripts) are now passing on ppc64 as well as on x86_64.
Comment 8 Michael Catanzaro 2016-01-02 09:43:36 PST
Hi Tom, does this patch still work, or does it need to be updated?
Comment 9 Tomas Popela 2016-01-03 23:08:44 PST
(In reply to comment #8)
> Hi Tom, does this patch still work, or does it need to be updated?

It definitely needs to be updated.
Comment 10 Tomas Popela 2017-05-31 08:06:15 PDT
Created attachment 311588 [details]
Patch
Comment 11 Saam Barati 2017-05-31 15:26:12 PDT
Comment on attachment 311588 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=311588&action=review

> Source/JavaScriptCore/ChangeLog:8
> +        The operand in PutToScope and GetFromScope instructions is not set

I'm surprised that other opcodes don't suffer from this as well.

> Source/JavaScriptCore/bytecode/BytecodeDumper.cpp:1633
> +        intptr_t operand = getOperand(*(++it), type); // Operand

nit: This comment seems out of date given it now reads "operand getOperand // operand"

Same as above

> Source/JavaScriptCore/bytecode/CodeBlock.cpp:662
> +            if (op.type == ClosureVar || op.type == ClosureVarWithVarInjectionChecks || op.type == GlobalProperty || op.type == GlobalPropertyWithVarInjectionChecks)

This does not look complete. I think ModuleVar also uses this as a int value.
Comment 12 Filip Pizlo 2017-05-31 15:34:28 PDT
Comment on attachment 311588 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=311588&action=review

> Source/JavaScriptCore/ChangeLog:21
> +        The operand in PutToScope and GetFromScope instructions is not set
> +        according to how it's being accessed in the LowLevelInterpreter*.asm
> +        files. For some of the variable types it is accessed as pointer and
> +        for others as int32_t. The problem is that the value is always set to
> +        pointer (actually it's not true for LocalClosureVar as for it there is
> +        an early return) in CodeBlock::finishCreation() by issuing
> +        instructions[i + 6].u.pointer = reinterpret_cast<void*>(op.operand)
> +        there. On x86_64 it's not a problem, but on big endian arches it leads
> +        to crashes. Change the code to use the operand or pointer based on how
> +        the operand is accessed in the LowLevelInterpreter*.asm files. Also
> +        modify and correct how the operand is printed in the BytecodeDumper to
> +        get the correct output even on big endian arches. With this change
> +        there is no difference between results of stress tests on x86_64 and
> +        ppc64 and s390x on linux.

I think that if we want to fix this issue, then we need a much more systematic approach.  JSC assumes big endian all over the place.  So, rather than having one-off fixes to load the right bits, we should explore whether or not we actually care about big endian.  If we do, then we should:

- Come up with a new design of how Instruction is laid out.
- Come up with a new design for how to access JSValue innards consistently.  Right now we interchange the concept of low bits and first bits.

If we approach the bit endian problem with one-off fixes, then I don't believe JSC will ever really be able to work on big endian - we will always have a long tail of things that go wrong.  So, I think that as a project we need to seriously consider not supporting big endian at all.  And if we want to support big endian, then I'd like to see a more comprehensive solution so that it's not brittle.
Comment 13 Tomas Popela 2017-06-01 02:30:22 PDT
(In reply to Saam Barati from comment #11)
> > Source/JavaScriptCore/bytecode/CodeBlock.cpp:662
> > +            if (op.type == ClosureVar || op.type == ClosureVarWithVarInjectionChecks || op.type == GlobalProperty || op.type == GlobalPropertyWithVarInjectionChecks)
> 
> This does not look complete. I think ModuleVar also uses this as a int value.

OK, I was not sure about how the operand is used in ModuleVar as for the PutToScope it's emitting an error https://trac.webkit.org/browser/trunk/Source/JavaScriptCore/llint/LowLevelInterpreter64.asm?rev=217650#L2340 and there is no occurence in GetFromScope. But I see that there is a note in https://trac.webkit.org/browser/trunk/Source/JavaScriptCore/llint/LLIntSlowPaths.cpp?rev=217650#L1529 that the ModuleVar is always converted to ClosureVar so it should be as you said. That leaves us with:

integer:
GlobalProperty, ClosureVar, LocalClosureVar, GlobalPropertyWithVarInjectionChecks, ClosureVarWithVarInjectionChecks, ModuleVar

pointer:
GlobalVar, GlobalLexicalVar, GlobalVarWithVarInjectionChecks, GlobalLexicalVarWithVarInjectionChecks, UnresolvedProperty, UnresolvedPropertyWithVarInjectionChecks, Dynamic

and I think that should be it.
Comment 14 Tomas Popela 2017-06-01 03:23:18 PDT
(In reply to Filip Pizlo from comment #12)
> I think that if we want to fix this issue, then we need a much more
> systematic approach.  JSC assumes big endian all over the place.  So, rather
> than having one-off fixes to load the right bits, we should explore whether
> or not we actually care about big endian.  If we do, then we should:
> 
> - Come up with a new design of how Instruction is laid out.
> - Come up with a new design for how to access JSValue innards consistently. 
> Right now we interchange the concept of low bits and first bits.
> 
> If we approach the bit endian problem with one-off fixes, then I don't
> believe JSC will ever really be able to work on big endian - we will always
> have a long tail of things that go wrong.  So, I think that as a project we
> need to seriously consider not supporting big endian at all.  And if we want
> to support big endian, then I'd like to see a more comprehensive solution so
> that it's not brittle.

Thank you for your insights Filip. I can just say that this and bug 170945 were the only long standing issues that we had with running WebKitGTK+ with CLoop on big endian arches. Now after they are patched we really don't have any problems so far. Also I should note that we are not using the bmalloc, but the system malloc.

The question whether we (as the WebKit project) care or don't care about big endians is really a question for you - JSC devs. I don't know whether it adds an extra burden for your work or not. But we (me talking for Fedora Project and Red Hat; and I think that Debian as well - Berto?) are actually glad that we have a way (CLoop) to run JSC on these arches, so we would like to see the big endians supported.

Also on the other side I have to admit that my knowledge of the JSC internals is very poor and we really don't have anyone in WebKitGTK+ team that could improve the state as you suggest.

Also do you think that having the build bot that's running CLoop on some big endian platform could improve the situation?

So you think that in the meantime this patch is not suitable to be included?
Comment 15 Tomas Popela 2017-06-01 04:44:00 PDT
Created attachment 311689 [details]
Patch
Comment 16 Build Bot 2017-06-01 10:25:38 PDT
Comment on attachment 311689 [details]
Patch

Attachment 311689 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/3854213

New failing tests:
webrtc/audio-peer-connection-webaudio.html
webrtc/peer-connection-audio-unmute.html
Comment 17 Build Bot 2017-06-01 10:25:40 PDT
Created attachment 311721 [details]
Archive of layout-test-results from ews122 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews122  Port: ios-simulator-wk2  Platform: Mac OS X 10.12.5
Comment 18 Michael Catanzaro 2017-07-02 09:12:03 PDT
(In reply to Filip Pizlo from comment #12)
> So, I think that as a project we
> need to seriously consider not supporting big endian at all.

I think it would be nice to big endian architectures if we continued to support cloop. Does supporting cloop and only cloop also entail a significant maintenance burden? If so, we should probably make sure attempting to build WebKit on a big endian architecture ends in some hard build failure with an explanation, to avoid misplaced expectations.
Comment 19 Michael Catanzaro 2018-03-13 13:00:12 PDT
This is getting tiresome. Tom, can you please upload an updated version of whatever you want to be committed? It seems that between Fedora and Debian we have three big endian architectures that are mandatory to support, so the need to keep cloop working seems clear. And I'm tired of carrying this downstream.
Comment 20 Tomas Popela 2018-03-14 04:22:33 PDT
(In reply to Michael Catanzaro from comment #19)
> This is getting tiresome. Tom, can you please upload an updated version of
> whatever you want to be committed?

The attached patch is the version that we are using (in downstream we just stripped things that are meant for the debug build).
Comment 21 Michael Catanzaro 2018-03-14 06:54:05 PDT
(In reply to Tomas Popela from comment #20)
> The attached patch is the version that we are using (in downstream we just
> stripped things that are meant for the debug build).

OK, thanks.

I'm definitely not qualified to review this, and I see most JSC reviewers are already CCed on this bug, so hopefully somebody take this opportunity to review. Otherwise, I'll give an r+ after a few days, as we clearly need to land something here.
Comment 22 Mark Lam 2018-03-14 08:07:12 PDT
Comment on attachment 311689 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=311689&action=review

I understand Filip's concern and generally agree with him that we should have a less fragile solution in the long term.  But, I see value in this patch in that it documents some of the idiosyncrasies of our bytecode.  That said, there are issues that need to be resolved, and you need to do some extra homework first to resolve these.  See below ...

r- until issues are resolved because we don't want to land something that may silently destabilize the VM.

> Source/JavaScriptCore/bytecode/CodeBlock.cpp:665
> +            if (op.type == ClosureVar || op.type == ClosureVarWithVarInjectionChecks || op.type == GlobalProperty || op.type == GlobalPropertyWithVarInjectionChecks || op.type == ModuleVar)
> +                instructions[i + 6].u.operand = op.operand;
> +            else
> +                instructions[i + 6].u.pointer = reinterpret_cast<void*>(op.operand);

Looking at LowLevelInterpreter64.asm, I see that ClosureVar, ClosureVarWithVarInjectionChecks, GlobalProperty, and GlobalPropertyWithVarInjectionChecks calls macros that do loadis from the 6th operand.  However, I didn't see ModuleVar do the same.  Can you provide details of why ModuleVar should be included in this set?

Secondly, I also see that GlobalProperty, GlobalPropertyWithVarInjectionChecks, and ModuleVar also calls getConstantScope() which does loadp from the 6th operand.  This is in disagreement with the above.  Can you research why this is ok?  Based on this info, your solution may be incomplete, or in error, or perhaps operand 6 is overloaded and used 2 ways depending on context.  Please get the data that shows us what the solution should be.
Comment 23 Tomas Popela 2018-03-14 08:12:59 PDT
Thank Mark for review..

(In reply to Mark Lam from comment #22)
> Looking at LowLevelInterpreter64.asm, I see that ClosureVar,
> ClosureVarWithVarInjectionChecks, GlobalProperty, and
> GlobalPropertyWithVarInjectionChecks calls macros that do loadis from the
> 6th operand.  However, I didn't see ModuleVar do the same.  Can you provide
> details of why ModuleVar should be included in this set?

See https://bugs.webkit.org/show_bug.cgi?id=132333#c13

> Secondly, I also see that GlobalProperty,
> GlobalPropertyWithVarInjectionChecks, and ModuleVar also calls
> getConstantScope() which does loadp from the 6th operand.  This is in
> disagreement with the above.  Can you research why this is ok?  Based on
> this info, your solution may be incomplete, or in error, or perhaps operand
> 6 is overloaded and used 2 ways depending on context.  Please get the data
> that shows us what the solution should be.

I will look into this.
Comment 24 Filip Pizlo 2018-03-14 08:30:42 PDT
Comment on attachment 311689 [details]
Patch

I second Mark's r-.

This is an insane solution to this bug.  Probably you would be a lot happier if there was a pointerOperand field in the union of type uintptr_t.  You would then replace all uses of operand with uses of pointerOperand.
Comment 25 Michael Catanzaro 2018-03-14 08:58:30 PDT
Thanks for the reviews!
Comment 26 Caitlin Potter (:caitp) 2018-06-05 07:42:29 PDT
Would it be acceptable to change `u.operand` to be an `intptr_t`, and `unsignedValue` to be a `uintptr_t`?

That seems easier than ensuring that people use `pointerOperand` instead of `pointer` or `operand` consistently.
Comment 27 Caitlin Potter (:caitp) 2018-06-06 12:31:20 PDT
Created attachment 342072 [details]
Patch

Here's my shot at this. Instead of adding conditional logic which could be hard to maintain, it changes get_from/put_to_scope to be consistent about always using loadp instead of loadis to get at the operand, and adds an intptr_t accessor to the Instruction union, though it isn't strictly needed.
Comment 28 Caitlin Potter (:caitp) 2018-06-06 12:55:26 PDT
Created attachment 342075 [details]
Patch
Comment 29 Michael Catanzaro 2018-06-06 15:45:33 PDT
Thanks very much, Caitlin!
Comment 30 Tomas Popela 2018-06-07 00:32:43 PDT
Thank you Caitlin! This looks much saner!
Comment 31 Michael Catanzaro 2018-06-11 04:13:52 PDT
Mark, Filip, could you review it please?
Comment 32 Michael Catanzaro 2018-06-13 13:56:58 PDT
OK, any JSC reviewers, then. Maybe Yusuke?
Comment 33 Mark Lam 2018-06-13 14:13:02 PDT
Comment on attachment 342075 [details]
Patch

I'm saddened by the fact that this turns a 32bit load into a 64bit load.  However, it's only in the interpreter.  I suppose it won't have a significant dent on performance.  r=me.
Comment 34 Caitlin Potter (:caitp) 2018-06-13 14:25:24 PDT
(In reply to Mark Lam from comment #33)
> Comment on attachment 342075 [details]
> Patch
> 
> I'm saddened by the fact that this turns a 32bit load into a 64bit load. 
> However, it's only in the interpreter.  I suppose it won't have a
> significant dent on performance.  r=me.

As an alternative, we could have different opcodes for cases which treat the operand as a pointer, but that seems like a lot of work and not really worth it for the likely negligible perf impact. I guess we can revisit it if needed.
Comment 35 Caitlin Potter (:caitp) 2018-06-13 14:25:52 PDT
Comment on attachment 342075 [details]
Patch

here goes...
Comment 36 WebKit Commit Bot 2018-06-13 14:27:13 PDT
Comment on attachment 342075 [details]
Patch

Rejecting attachment 342075 [details] from commit-queue.

Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-03', 'validate-changelog', '--check-oops', '--non-interactive', 342075, '--port=mac']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

/Volumes/Data/EWS/WebKit/Source/JavaScriptCore/ChangeLog neither lists a valid reviewer nor contains the string "Unreviewed" or "Rubber stamp" (case insensitive).

Full output: http://webkit-queues.webkit.org/results/8168067
Comment 37 Mark Lam 2018-06-13 14:29:05 PDT
(In reply to WebKit Commit Bot from comment #36)
> /Volumes/Data/EWS/WebKit/Source/JavaScriptCore/ChangeLog neither lists a
> valid reviewer nor contains the string "Unreviewed" or "Rubber stamp" (case
> insensitive).

Please upload a fixed patch for landing.
Comment 38 Caitlin Potter (:caitp) 2018-06-13 15:20:10 PDT
Created attachment 342697 [details]
Patch for landing
Comment 39 WebKit Commit Bot 2018-06-13 15:51:35 PDT
Comment on attachment 342697 [details]
Patch for landing

Clearing flags on attachment: 342697

Committed r232816: <https://trac.webkit.org/changeset/232816>
Comment 40 WebKit Commit Bot 2018-06-13 15:51:37 PDT
All reviewed patches have been landed.  Closing bug.
Comment 41 Radar WebKit Bug Importer 2018-06-13 15:52:43 PDT
<rdar://problem/41105252>
Comment 42 Tomas Popela 2018-06-14 01:41:33 PDT
So I tried current TOT on our JSC CLoop CI (without applying the patch that I previously provided here) and I got 524 failures when running the stress suite.
Comment 43 Michael Catanzaro 2018-06-14 08:00:56 PDT
Tomas's original patch here fixed the stress test suite in combination with the diff posted in bug #182923.

To run the stress tests:

$ ./Tools/Scripts/build-webkit --jsc-only --debug --system-malloc --no-jit
$ ./Tools/Scripts/run-jsc-stress-tests --no-copy --jsc WebKitBuild/Debug/bin/jsc --no-jit --memory-limited -v JSTests/stress
Comment 44 Caitlin Potter (:caitp) 2018-06-14 08:15:23 PDT
(In reply to Tomas Popela from comment #42)
> So I tried current TOT on our JSC CLoop CI (without applying the patch that
> I previously provided here) and I got 524 failures when running the stress
> suite.

I believe this specific problem is addressed, but there seem to be some unrelated ones that are more apparent now that this crash is fixed.

I'm looking for a solution to the stress/dead-value-with-mov-hint-in-another-block.js failure right now (the root cause of this seems to be in `put_by_id`, and always taking the slow path for that specific opcode mitigates the problem).

I think it's worth filing a new bug for those, since they do appear to be distinct.
Comment 45 Tomas Popela 2018-06-14 08:23:58 PDT
So this is Jenkins's execute shell that I'm using:

# https://bugs.webkit.org/show_bug.cgi?id=132333
curl https://src.fedoraproject.org/cgit/rpms/webkit2gtk3.git/plain/cloop-big-endians.patch | patch -p1
# https://bugs.webkit.org/show_bug.cgi?id=182923 - fix page size
curl https://src.fedoraproject.org/rpms/webkit2gtk3/raw/master/f/page-size.patch | patch -p1

# Build WebKit
./Tools/Scripts/build-webkit --jsc-only --release --system-malloc --no-jit

# Don't run tests that require lot if memory if we don't have it..
memoryAvailable=$(cat /proc/meminfo | grep MemAvailable | awk '{ print $2}')
memoryLimited=""
if [ $memoryAvailable -lt 2000000 ]; then
  memoryLimited="--memory-limited"
fi

# Increase the stack size for arches with bigger page size
# fixes stress/tail-call-no-stack-overflow.js
if [ $(getconf PAGE_SIZE) -gt 4096 ]; then
  ulimit -s 65536
fi

./Tools/Scripts/run-jsc-stress-tests --no-copy --jsc $(pwd)/WebKitBuild/Release/bin/jsc --no-jit $memoryLimited -v JSTests/stress

# fail job when tests fail
[ -f ./results/failed ] && { cat ./results/failed; echo "$(wc -l ./results/failed | cut -d' ' -f1) failure(s)"; exit 1; } || exit 0


And with this the only failed test on s390x and ppc64 is:
stress/tail-call-no-stack-overflow.js
Comment 46 Tomas Popela 2018-06-14 08:25:45 PDT
(In reply to Caitlin Potter (:caitp) from comment #44)
> (In reply to Tomas Popela from comment #42)
> > So I tried current TOT on our JSC CLoop CI (without applying the patch that
> > I previously provided here) and I got 524 failures when running the stress
> > suite.
> 
> I believe this specific problem is addressed, but there seem to be some
> unrelated ones that are more apparent now that this crash is fixed.

No, I don't think so. I would expect that this is 1:1 replacement for the patch I submitted here. Did you tried to run the stress suite?
Comment 47 Caitlin Potter (:caitp) 2018-06-14 08:33:45 PDT
(In reply to Tomas Popela from comment #46)
> (In reply to Caitlin Potter (:caitp) from comment #44)
> > (In reply to Tomas Popela from comment #42)
> > > So I tried current TOT on our JSC CLoop CI (without applying the patch that
> > > I previously provided here) and I got 524 failures when running the stress
> > > suite.
> > 
> > I believe this specific problem is addressed, but there seem to be some
> > unrelated ones that are more apparent now that this crash is fixed.
> 
> No, I don't think so. I would expect that this is 1:1 replacement for the
> patch I submitted here. Did you tried to run the stress suite?

Your patch changes CodeBlock::finishCreation to write the operand (pc+6) as either an int32 or a pointer, depending on the resolve type.

This patch changes the interpreter so that, regardless of the resolve type, we always read the operand as a pointer. It's possible there are some missed cases in LLInt, but I've checked fairly carefully. It's also possible there's other code that writes an int32 to (pc+6) somewhere, but if that's the case, I would expect different failures than the ones I'm seeing in the tests.

I'm not saying there aren't test failures, but they appear to be unrelated to the get_from_scope/put_to_scope issues. I've mentioned a specific example.
Comment 48 Caitlin Potter (:caitp) 2018-06-15 08:49:45 PDT
well, this patch clearly does get some things wrong, and it's not easy to find what those things are. My grep-fu and stepping through CLOOP hasn't turned anything up yet, and I don't have reason to think it's any optimizing layer stepping in and mucking things up so early in a failing test.

The failures don't appear directly related to get_from_scope/put_to_scope, which is weird, but yeah I don't know.

Since it will probably take a lot of time to find a more proper fix, it's probably a good idea to roll this out, and reconsider Tomas's version.

Considering there is some similar logic for the way PutByIdFlags / StructureID works, maybe it's not really so bad.
Comment 49 Michael Catanzaro 2018-06-15 11:32:07 PDT
Reverted r232816 for reason:

this patch clearly does get some things wrong, and it's not easy to find what those things are

Committed r232883: <https://trac.webkit.org/changeset/232883>
Comment 50 Michael Catanzaro 2018-06-16 10:25:01 PDT
So if we reconsider Tomas's solution, that brings us back to Mark's review in comment #22:

(In reply to Mark Lam from comment #22)
> Comment on attachment 311689 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=311689&action=review
> 
> I understand Filip's concern and generally agree with him that we should
> have a less fragile solution in the long term.  But, I see value in this
> patch in that it documents some of the idiosyncrasies of our bytecode.  That
> said, there are issues that need to be resolved, and you need to do some
> extra homework first to resolve these.  See below ...
> 
> r- until issues are resolved because we don't want to land something that
> may silently destabilize the VM.
> 
> > Source/JavaScriptCore/bytecode/CodeBlock.cpp:665
> > +            if (op.type == ClosureVar || op.type == ClosureVarWithVarInjectionChecks || op.type == GlobalProperty || op.type == GlobalPropertyWithVarInjectionChecks || op.type == ModuleVar)
> > +                instructions[i + 6].u.operand = op.operand;
> > +            else
> > +                instructions[i + 6].u.pointer = reinterpret_cast<void*>(op.operand);
> 
> Looking at LowLevelInterpreter64.asm, I see that ClosureVar,
> ClosureVarWithVarInjectionChecks, GlobalProperty, and
> GlobalPropertyWithVarInjectionChecks calls macros that do loadis from the
> 6th operand.  However, I didn't see ModuleVar do the same.  Can you provide
> details of why ModuleVar should be included in this set?
> 
> Secondly, I also see that GlobalProperty,
> GlobalPropertyWithVarInjectionChecks, and ModuleVar also calls
> getConstantScope() which does loadp from the 6th operand.  This is in
> disagreement with the above.  Can you research why this is ok?  Based on
> this info, your solution may be incomplete, or in error, or perhaps operand
> 6 is overloaded and used 2 ways depending on context.  Please get the data
> that shows us what the solution should be.
Comment 51 Tomas Popela 2018-11-07 03:11:27 PST
The attached patch needs to be reworked because of the new metadata format.
Comment 52 Tomas Popela 2019-01-17 04:56:22 PST
OK, trying the current TOT (after there were the metadata format changes and other changes done by Mark) I see that the usual case for breaking it doesn't work anymore :) (open jsc and type a=1 <enter> a+1 was usually enough). Looks like that this particular issue is probably resolved, but the stress suite still mostly crashes (need to investigate that).
Comment 53 Tomas Popela 2019-01-17 05:21:31 PST
Core was generated by `./WebKitBuild/Debug/bin/jsc JSTests/stress/add-constant-overflow-recovery.js '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fffa87f5808 in JSC::CodeBlock::finishCreation (this=0x7fffa4560000, vm=..., ownerExecutable=0x7fffa45c0000, unlinkedCodeBlock=0x7fffa4590000, scope=0x7fffa4a10000) at /home/tpopela/WebKit/Source/JavaScriptCore/bytecode/CodeBlock.cpp:550
550             m_instructionCount += opcodeLengths[opcodeID];
(gdb) bt
#0  0x00007fffa87f5808 in JSC::CodeBlock::finishCreation (this=0x7fffa4560000, vm=..., ownerExecutable=0x7fffa45c0000, unlinkedCodeBlock=0x7fffa4590000, scope=0x7fffa4a10000) at /home/tpopela/WebKit/Source/JavaScriptCore/bytecode/CodeBlock.cpp:550
#1  0x00007fffa9200698 in JSC::ProgramCodeBlock::create (vm=0x7fffa4fe0010, ownerExecutable=0x7fffa45c0000, unlinkedCodeBlock=0x7fffa4590000, scope=0x7fffa4a10000, sourceProvider=..., firstLineColumnOffset=1) at /home/tpopela/WebKit/Source/JavaScriptCore/bytecode/ProgramCodeBlock.h:61
#2  0x00007fffa91fbd14 in JSC::ScriptExecutable::newCodeBlockFor (this=0x7fffa45c0000, kind=JSC::CodeForCall, function=0x0, scope=0x7fffa4a10000, exception=@0x7fffd3949680: 0x0) at /home/tpopela/WebKit/Source/JavaScriptCore/runtime/ScriptExecutable.cpp:220
#3  0x00007fffa91fc9fc in JSC::ScriptExecutable::prepareForExecutionImpl (this=0x7fffa45c0000, vm=..., function=0x0, scope=0x7fffa4a10000, kind=JSC::CodeForCall, resultCodeBlock=@0x7fffd3949950: 0x7fffa4fe0100) at /home/tpopela/WebKit/Source/JavaScriptCore/runtime/ScriptExecutable.cpp:352
#4  0x00007fffa8c3a674 in JSC::ScriptExecutable::prepareForExecution<JSC::ProgramExecutable> (this=0x7fffa45c0000, vm=..., function=0x0, scope=0x7fffa4a10000, kind=JSC::CodeForCall, resultCodeBlock=@0x7fffd3949950: 0x7fffa4fe0100) at /home/tpopela/WebKit/Source/JavaScriptCore/bytecode/CodeBlock.h:1054
#5  0x00007fffa8c28b18 in JSC::Interpreter::executeProgram (this=0x1002e6760f0, source=..., callFrame=0x7fffa4b30048, thisObj=0x7fffa47d0080) at /home/tpopela/WebKit/Source/JavaScriptCore/interpreter/Interpreter.cpp:809
#6  0x00007fffa8f48724 in JSC::evaluate (exec=0x7fffa4b30048, source=..., thisValue=..., returnedException=...) at /home/tpopela/WebKit/Source/JavaScriptCore/runtime/Completion.cpp:106
#7  0x000000001005c41c in runWithOptions (globalObject=0x7fffa4b30000, options=..., success=@0x7fffd394a788: true) at /home/tpopela/WebKit/Source/JavaScriptCore/jsc.cpp:2469
#8  0x000000001005dad4 in <lambda(JSC::VM&, GlobalObject*, bool&)>::operator()(JSC::VM &, GlobalObject *, bool &) const (__closure=0x7fffd394a848, vm=..., globalObject=0x7fffa4b30000, success=@0x7fffd394a788: true) at /home/tpopela/WebKit/Source/JavaScriptCore/jsc.cpp:2933
#9  0x000000001005f4dc in runJSC<jscmain(int, char**)::<lambda(JSC::VM&, GlobalObject*, bool&)> >(const CommandLine &, bool, const <lambda(JSC::VM&, GlobalObject*, bool&)> &) (options=..., isWorker=false, func=...) at /home/tpopela/WebKit/Source/JavaScriptCore/jsc.cpp:2793
#10 0x000000001005dba8 in jscmain (argc=2, argv=0x7fffd394ad98) at /home/tpopela/WebKit/Source/JavaScriptCore/jsc.cpp:2926
#11 0x000000001005aba8 in main (argc=2, argv=0x7fffd394ad98) at /home/tpopela/WebKit/Source/JavaScriptCore/jsc.cpp:2293
(gdb) frame 0
#0  0x00007fffa87f5808 in JSC::CodeBlock::finishCreation (this=0x7fffa4560000, vm=..., ownerExecutable=0x7fffa45c0000, unlinkedCodeBlock=0x7fffa4590000, scope=0x7fffa4a10000) at /home/tpopela/WebKit/Source/JavaScriptCore/bytecode/CodeBlock.cpp:550
550             m_instructionCount += opcodeLengths[opcodeID];
(gdb) p opcodeID
$1 = 1929379840
(gdb) p opcodeLengths[opcodeID]
Cannot access memory at address 0x800175fd5d30
(gdb) info locals
opcodeID = 1929379840
instruction = @0x7fffd3948eb8: {m_instructions = @0x1002e6d9fc0, m_index = 241}
__for_range = @0x1002e6d9fc0: {m_instructions = {<WTF::VectorBuffer<unsigned char, 0>> = {<WTF::VectorBufferBase<unsigned char>> = {m_buffer = 0x1002e6da1c0 "/0\376\071\375\376\240\201\374\376", m_capacity = 354, m_size = 354}, <No data fields>}, <No data fields>}}
__for_begin = {<JSC::InstructionStream::BaseRef<WTF::Vector<unsigned char, 0, WTF::UnsafeVectorOverflow, 16> const>> = {m_instructions = @0x1002e6d9fc0, m_index = 241}, <No data fields>}
__for_end = {<JSC::InstructionStream::BaseRef<WTF::Vector<unsigned char, 0, WTF::UnsafeVectorOverflow, 16> const>> = {m_instructions = @0x1002e6d9fc0, m_index = 354}, <No data fields>}
throwScope = {<JSC::ExceptionScope> = {m_vm = @0x7fffa4fe0010, m_previousScope = 0x7fffd39494d8, m_location = {stackPosition = 0x0, functionName = 0x7fffa97bd3f0 <JSC::CodeBlock::finishCreation(JSC::VM&, JSC::ScriptExecutable*, JSC::UnlinkedCodeBlock*, JSC::JSScope*)::__FUNCTION__> "finishCreation",
      file = 0x7fffa97b8420 "/home/tpopela/WebKit/Source/JavaScriptCore/bytecode/CodeBlock.cpp", line = 411}, m_recursionDepth = 5}, m_isReleased = false}
__FUNCTION__ = "finishCreation"
shouldUpdateFunctionHasExecutedCache = false
stronglyReferencedModuleEnvironments = {m_impl = {static m_maxLoad = 2, static m_minLoad = 6, m_table = 0x0, m_tableSize = 0, m_tableSizeMask = 0, m_keyCount = 0, m_deletedCount = 0, m_iterators = 0x0, m_mutex = std::unique_ptr<WTF::Lock> = {get() = 0x1002e6d94c0}}}
link_profile = {__this = 0x7fffa4560000}
link_arrayProfile = {<No data fields>}
link_objectAllocationProfile = {__vm = @0x7fffa4fe0010, __this = 0x7fffa4560000}
link_arrayAllocationProfile = {<No data fields>}
link_hitCountForLLIntCaching = {<No data fields>}
__PRETTY_FUNCTION__ = "bool JSC::CodeBlock::finishCreation(JSC::VM&, JSC::ScriptExecutable*, JSC::UnlinkedCodeBlock*, JSC::JSScope*)"
Comment 54 Mark Lam 2019-01-17 09:54:50 PST
(In reply to Tomas Popela from comment #53)
> Core was generated by `./WebKitBuild/Debug/bin/jsc
> JSTests/stress/add-constant-overflow-recovery.js '.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x00007fffa87f5808 in JSC::CodeBlock::finishCreation
> (this=0x7fffa4560000, vm=..., ownerExecutable=0x7fffa45c0000,
> unlinkedCodeBlock=0x7fffa4590000, scope=0x7fffa4a10000) at
> /home/tpopela/WebKit/Source/JavaScriptCore/bytecode/CodeBlock.cpp:550
> 550             m_instructionCount += opcodeLengths[opcodeID];
...
> (gdb) p opcodeID
> $1 = 1929379840

This is obviously bad news.  opcodeID should be a small int value.  In this specific case, opcodeID comes from instruction->opcodeID().  You should debug how that invalid opcodeID came to be.
Comment 55 Tomas Popela 2019-01-22 06:06:36 PST
Blame the endianess:

(gdb) p /t opcodeID
$2 = 1110011000000000000000000000000

1110011 binary == 115 decimal

(gdb) p JSC::opcodeNames[115]
$3 = 0x7ffff7320740 "op_jless"

(and that's the same as on my x86_64 machine, now I need to debug how it got there - I suspect some write from the *.asm files to wrong offset)
Comment 56 Mark Lam 2019-01-22 10:12:23 PST
(In reply to Tomas Popela from comment #55)
> Blame the endianess:
> 
> (gdb) p /t opcodeID
> $2 = 11010011000000000000000000000000
> 
> 1110011 binary == 115 decimal
> 
> (gdb) p JSC::opcodeNames[115]
> $3 = 0x7ffff7320740 "op_jless"
> 
> (and that's the same as on my x86_64 machine, now I need to debug how it got
> there - I suspect some write from the *.asm files to wrong offset)

.asm files do not write opcodeIDs.  Are you still working with a CLoop build (as the bug title indicates) or has the scope changed?

I suggest you start with the bytecode structs in the generated BytecodeStructs.h. For op_jless, look at struct OpJLess' emitImpl() method.  Step into that code.  AFAICT, the bytecode stream should be written correctly in an endian conforming way by the BytecodeGenerator using these emitter methods.

Next, check struct Instruction which reads provides the opcodeID() method and see if things are behaving correctly there.

Keep in mind that bytecodes can be narrow (8-bit) or wide (32-bit).  I don't see how this can impact endianness, but you should be aware of the 2 cases and check them both.
Comment 57 Tomas Popela 2019-01-28 05:24:19 PST
(In reply to Mark Lam from comment #56)
> Are you still working with a CLoop build
> (as the bug title indicates) or has the scope changed?

Yes, still CLoop and big endians.

> I suggest you start with the bytecode structs in the generated
> BytecodeStructs.h. For op_jless, look at struct OpJLess' emitImpl() method.
> Step into that code.  AFAICT, the bytecode stream should be written
> correctly in an endian conforming way by the BytecodeGenerator using these
> emitter methods.

For the narrow one, everything is ok. The problems are causing the wide ones. It's written as it's supposed to be:

on ppc64

220             write(u.bytes[3]);
(gdb) p u
$26 = {i = 115, bytes = "\000\000\000s"}
(gdb) p u.bytes[3]
$27 = 115 's'
(gdb) s
JSC::InstructionStreamWriter::write (this=0x101a83c0, byte=115 's') at /home/tpopela/WebKit/Source/JavaScriptCore/bytecode/InstructionStream.h:204

(the first number is the index to gen->m_writer->m_instructions->m_buffer in DerivedSources/JavaScriptCore/BytecodeStructs.h


245 115 u.bytes[3]
246   0 u.bytes[2]
247   0 u.bytes[1]
248   0 u.bytes[0]

on x86_64

(gdb) p u
$2 = {
  i = 115,
  bytes = "s\000\000"
}
(gdb)

242 115 u.bytes[0]
243   0 u.bytes[1]
244   0 u.bytes[2]
245   0 u.bytes[3]

> Next, check struct Instruction which reads provides the opcodeID() method
> and see if things are behaving correctly there.

No, the opcodeID() from Instruction.h reads the wrong value when it's a wide one.

> Keep in mind that bytecodes can be narrow (8-bit) or wide (32-bit).  I don't
> see how this can impact endianness, but you should be aware of the 2 cases
> and check them both.

As I wrote above, then narrow one is ok, the wide one is causing problems.
Comment 58 Mark Lam 2019-01-29 07:58:12 PST
(In reply to Tomas Popela from comment #57)
> (In reply to Mark Lam from comment #56)
> > Are you still working with a CLoop build
> > (as the bug title indicates) or has the scope changed?
> 
> Yes, still CLoop and big endians.
> 
> > I suggest you start with the bytecode structs in the generated
> > BytecodeStructs.h. For op_jless, look at struct OpJLess' emitImpl() method.
> > Step into that code.  AFAICT, the bytecode stream should be written
> > correctly in an endian conforming way by the BytecodeGenerator using these
> > emitter methods.
> 
> For the narrow one, everything is ok. The problems are causing the wide
> ones. It's written as it's supposed to be:
> 
> on ppc64
> 
> 220             write(u.bytes[3]);
> (gdb) p u
> $26 = {i = 115, bytes = "\000\000\000s"}
> (gdb) p u.bytes[3]
> $27 = 115 's'
> (gdb) s
> JSC::InstructionStreamWriter::write (this=0x101a83c0, byte=115 's') at
> /home/tpopela/WebKit/Source/JavaScriptCore/bytecode/InstructionStream.h:204
> 
> (the first number is the index to gen->m_writer->m_instructions->m_buffer in
> DerivedSources/JavaScriptCore/BytecodeStructs.h
> 
> 
> 245 115 u.bytes[3]
> 246   0 u.bytes[2]
> 247   0 u.bytes[1]
> 248   0 u.bytes[0]
> 
> on x86_64
> 
> (gdb) p u
> $2 = {
>   i = 115,
>   bytes = "s\000\000"
> }
> (gdb)
> 
> 242 115 u.bytes[0]
> 243   0 u.bytes[1]
> 244   0 u.bytes[2]
> 245   0 u.bytes[3]
> 
> > Next, check struct Instruction which reads provides the opcodeID() method
> > and see if things are behaving correctly there.
> 
> No, the opcodeID() from Instruction.h reads the wrong value when it's a wide
> one.
> 
> > Keep in mind that bytecodes can be narrow (8-bit) or wide (32-bit).  I don't
> > see how this can impact endianness, but you should be aware of the 2 cases
> > and check them both.
> 
> As I wrote above, then narrow one is ok, the wide one is causing problems.

I think I know the issue.  Let me check something and get back to you today.
Comment 59 Mark Lam 2019-01-29 08:47:49 PST
Comment on attachment 342697 [details]
Patch for landing

This patch is obsolete now after the new bytecode format.
Comment 60 Mark Lam 2019-01-29 08:49:20 PST
Comment on attachment 342697 [details]
Patch for landing

Correction.  This patch was already landed.  We shouldn't have re-opened this bug.  Instead, we should have opened a new bug for the new / remaining issues (because it is our practice to have 1 bug per change).
Comment 61 Mark Lam 2019-01-29 08:52:02 PST
(In reply to Mark Lam from comment #58)
> (In reply to Tomas Popela from comment #57)
> > As I wrote above, then narrow one is ok, the wide one is causing problems.
> 
> I think I know the issue.  Let me check something and get back to you today.

I'll address the issue in https://bugs.webkit.org/show_bug.cgi?id=193966.
Comment 62 Michael Catanzaro 2019-01-29 09:50:33 PST
(In reply to Mark Lam from comment #60)
> Comment on attachment 342697 [details]
> Patch for landing
> 
> Correction.  This patch was already landed.  We shouldn't have re-opened
> this bug.  Instead, we should have opened a new bug for the new / remaining
> issues (because it is our practice to have 1 bug per change).

This bug was reopened because the patch was rolled out. No patch from this bug survives in trunk.
Comment 63 Mark Lam 2019-01-29 10:48:24 PST
(In reply to Michael Catanzaro from comment #62)
> (In reply to Mark Lam from comment #60)
> > Comment on attachment 342697 [details]
> > Patch for landing
> > 
> > Correction.  This patch was already landed.  We shouldn't have re-opened
> > this bug.  Instead, we should have opened a new bug for the new / remaining
> > issues (because it is our practice to have 1 bug per change).
> 
> This bug was reopened because the patch was rolled out. No patch from this
> bug survives in trunk.

I see.  Sorry.  I'll dupe 193966 to this bug.
Comment 64 Mark Lam 2019-01-29 10:49:52 PST
*** Bug 193966 has been marked as a duplicate of this bug. ***
Comment 65 Mark Lam 2019-01-29 10:52:39 PST
Previously committed patch was rolled out.  Will upload new fix.
Comment 66 Mark Lam 2019-01-29 10:55:12 PST
Created attachment 360475 [details]
proposed patch.
Comment 67 Yusuke Suzuki 2019-01-29 10:59:51 PST
Comment on attachment 360475 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=360475&action=review

r=me with suggestion.

> Source/JavaScriptCore/bytecode/InstructionStream.h:-230
> -#endif // !CPU(BIG_ENDIAN)

Should we use the above conversion by bitwise_cast, since non-active union field access is UB.
Like the following. (I'm not sure the following can be compiled well).
auto bytes = bitwise_cast<uint8_t[4]>(i);
Comment 68 Mark Lam 2019-01-29 11:10:05 PST
Comment on attachment 360475 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=360475&action=review

>> Source/JavaScriptCore/bytecode/InstructionStream.h:-230
>> -#endif // !CPU(BIG_ENDIAN)
> 
> Should we use the above conversion by bitwise_cast, since non-active union field access is UB.
> Like the following. (I'm not sure the following can be compiled well).
> auto bytes = bitwise_cast<uint8_t[4]>(i);

Sure.  I'll make the change.
Comment 69 Mark Lam 2019-01-29 11:18:48 PST
Created attachment 360479 [details]
Patch for landing.

Thanks for the review.
Comment 70 Mark Lam 2019-01-29 11:45:43 PST
(In reply to Mark Lam from comment #68)
> Comment on attachment 360475 [details]
> proposed patch.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=360475&action=review
> 
> >> Source/JavaScriptCore/bytecode/InstructionStream.h:-230
> >> -#endif // !CPU(BIG_ENDIAN)
> > 
> > Should we use the above conversion by bitwise_cast, since non-active union field access is UB.
> > Like the following. (I'm not sure the following can be compiled well).
> > auto bytes = bitwise_cast<uint8_t[4]>(i);
> 
> Sure.  I'll make the change.

Turns out bitwise_cast won't work because "substitution failure [with ToType = unsigned char [4], FromType = unsigned int]: function cannot return array type 'unsigned char [4]'.

I'll just use a memcpy.
Comment 71 Mark Lam 2019-01-29 11:50:18 PST
Created attachment 360482 [details]
patch for landing.
Comment 72 WebKit Commit Bot 2019-01-29 14:25:41 PST
Comment on attachment 360482 [details]
patch for landing.

Clearing flags on attachment: 360482

Committed r240684: <https://trac.webkit.org/changeset/240684>
Comment 73 WebKit Commit Bot 2019-01-29 14:25:44 PST
All reviewed patches have been landed.  Closing bug.
Comment 74 Tomas Popela 2019-01-29 23:07:45 PST
Wow! Thank you Mark! It was nice to come to the work, look at the JSC CLoop CI that I'm running on various arches and it was (nearly) green on big endians (previously it was 8 months and 5 days ago)! There are some stress tests failures and I opened bug 194007 for them. Thank you again! And tzagallo for the new bytecode format!