WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
136436
REGRESSION(
r173031
): crashes during run-layout-jsc on x86/Linux
https://bugs.webkit.org/show_bug.cgi?id=136436
Summary
REGRESSION(r173031): crashes during run-layout-jsc on x86/Linux
Julien Brianceau
Reported
2014-09-01 14:41:22 PDT
Since
r173031
, 200+ crashes seen when performing run-layout-jsc on x86 32-bit/Linux/gcc 4.7.3
Attachments
run-layout-jsc results (tested on r173161)
(133.04 KB, application/x-compressed-tar)
2014-09-01 14:44 PDT
,
Julien Brianceau
no flags
Details
Workaround proposal
(1.33 KB, patch)
2014-09-01 14:56 PDT
,
Julien Brianceau
no flags
Details
Formatted Diff
Diff
Fix proposal
(1.75 KB, patch)
2014-09-02 15:00 PDT
,
Julien Brianceau
no flags
Details
Formatted Diff
Diff
Patch
(1.89 KB, patch)
2014-09-02 15:22 PDT
,
Michael Saboff
no flags
Details
Formatted Diff
Diff
Updated patch from review comments.
(1.98 KB, patch)
2014-09-02 16:11 PDT
,
Michael Saboff
no flags
Details
Formatted Diff
Diff
Updated patch after discussion with ggaren
(9.83 KB, patch)
2014-09-04 12:52 PDT
,
Michael Saboff
ggaren
: review+
Details
Formatted Diff
Diff
Show Obsolete
(4)
View All
Add attachment
proposed patch, testcase, etc.
Julien Brianceau
Comment 1
2014-09-01 14:44:08 PDT
Created
attachment 237463
[details]
run-layout-jsc results (tested on
r173161
)
Julien Brianceau
Comment 2
2014-09-01 14:56:08 PDT
Created
attachment 237465
[details]
Workaround proposal It appears that in my env (Ubuntu Linux x86 32-bit / gcc 4.7.3): execCallee->setScope(exec->scope()); // exec pointer is valid at this point execCallee->setCodeBlock(0); // exec pointer is null now execCallee->setCallerFrame(exec); // setting null caller frame
Julien Brianceau
Comment 3
2014-09-01 14:57:27 PDT
With this patch, run-layout-jsc is running fine again: All 523 tests passed!
Akos Kiss
Comment 4
2014-09-01 15:22:08 PDT
Could you provide a disassembly of operationCallEval? That might help understanding why exec is overwritten. I'm just curious.
Julien Brianceau
Comment 5
2014-09-01 15:47:26 PDT
(In reply to
comment #4
)
> Could you provide a disassembly of operationCallEval? That might help understanding why exec is overwritten. I'm just curious.
Sure. Here is the disassembly, taken from Source/JavaScriptCore/CMakeFiles/JavaScriptCore.dir/jit/JITOperations.cpp.o file, in release mode with -g: 00003233 <operationCallEval>: EncodedJSValue JIT_OPERATION operationCallEval(ExecState* exec, ExecState* execCallee) { 3233: 55 push %ebp 3234: 89 e5 mov %esp,%ebp 3236: 53 push %ebx 3237: 83 ec 34 sub $0x34,%esp 323a: e8 fc ff ff ff call 323b <operationCallEval+0x8> 323f: 81 c3 02 00 00 00 add $0x2,%ebx ASSERT(exec->codeBlock()->codeType() != FunctionCode 3245: 8b 45 08 mov 0x8(%ebp),%eax 3248: 89 04 24 mov %eax,(%esp) 324b: e8 fc ff ff ff call 324c <operationCallEval+0x19> 3250: 89 04 24 mov %eax,(%esp) 3253: e8 fc ff ff ff call 3254 <operationCallEval+0x21> || !exec->codeBlock()->needsActivation() || exec->hasActivation()); 3258: 83 f8 02 cmp $0x2,%eax 325b: 75 58 jne 32b5 <operationCallEval+0x82> directPutByVal(exec, asObject(baseValue), subscript, value); } EncodedJSValue JIT_OPERATION operationCallEval(ExecState* exec, ExecState* execCallee) { ASSERT(exec->codeBlock()->codeType() != FunctionCode 325d: 8b 45 08 mov 0x8(%ebp),%eax 3260: 89 04 24 mov %eax,(%esp) 3263: e8 fc ff ff ff call 3264 <operationCallEval+0x31> 3268: 89 04 24 mov %eax,(%esp) 326b: e8 fc ff ff ff call 326c <operationCallEval+0x39> 3270: 84 c0 test %al,%al 3272: 74 41 je 32b5 <operationCallEval+0x82> 3274: 8b 45 08 mov 0x8(%ebp),%eax 3277: 89 04 24 mov %eax,(%esp) 327a: e8 fc ff ff ff call 327b <operationCallEval+0x48> 327f: 83 f0 01 xor $0x1,%eax 3282: 84 c0 test %al,%al 3284: 74 2f je 32b5 <operationCallEval+0x82> 3286: 8d 83 c0 21 00 00 lea 0x21c0(%ebx),%eax 328c: 89 44 24 0c mov %eax,0xc(%esp) 3290: 8d 83 00 c0 00 00 lea 0xc000(%ebx),%eax 3296: 89 44 24 08 mov %eax,0x8(%esp) 329a: c7 44 24 04 62 02 00 movl $0x262,0x4(%esp) 32a1: 00 32a2: 8d 83 90 20 00 00 lea 0x2090(%ebx),%eax 32a8: 89 04 24 mov %eax,(%esp) 32ab: e8 fc ff ff ff call 32ac <operationCallEval+0x79> 32b0: e8 fc ff ff ff call 32b1 <operationCallEval+0x7e> || !exec->codeBlock()->needsActivation() || exec->hasActivation()); execCallee->setScope(exec->scope()); 32b5: 8b 45 08 mov 0x8(%ebp),%eax 32b8: 89 04 24 mov %eax,(%esp) 32bb: e8 fc ff ff ff call 32bc <operationCallEval+0x89> 32c0: 89 44 24 04 mov %eax,0x4(%esp) 32c4: 8b 45 0c mov 0xc(%ebp),%eax 32c7: 89 04 24 mov %eax,(%esp) 32ca: e8 fc ff ff ff call 32cb <operationCallEval+0x98> execCallee->setCodeBlock(0); 32cf: c7 44 24 04 00 00 00 movl $0x0,0x4(%esp) 32d6: 00 32d7: 8b 45 0c mov 0xc(%ebp),%eax 32da: 89 04 24 mov %eax,(%esp) 32dd: e8 fc ff ff ff call 32de <operationCallEval+0xab> execCallee->setCallerFrame(exec); 32e2: 8b 45 08 mov 0x8(%ebp),%eax 32e5: 89 44 24 04 mov %eax,0x4(%esp) 32e9: 8b 45 0c mov 0xc(%ebp),%eax 32ec: 89 04 24 mov %eax,(%esp) 32ef: e8 fc ff ff ff call 32f0 <operationCallEval+0xbd> if (!isHostFunction(execCallee->calleeAsValue(), globalFuncEval)) 32f4: 8d 45 f0 lea -0x10(%ebp),%eax 32f7: 8b 55 0c mov 0xc(%ebp),%edx 32fa: 89 54 24 04 mov %edx,0x4(%esp) 32fe: 89 04 24 mov %eax,(%esp) 3301: e8 fc ff ff ff call 3302 <operationCallEval+0xcf> [...]
Akos Kiss
Comment 6
2014-09-02 08:18:57 PDT
Something very nasty is happening here. I've put my hands on an x86/32-bit chroot and built an EFL/debug jsc. When I run jsc --useLLInt=false eval.js, where eval.js is: var c = ""; eval(c); I also get the segfault. Running it in gdb and setting a breakpoint at JSC::operationCallEval reveals the following: (gdb) disas JSC::operationCallEval Dump of assembler code for function JSC::operationCallEval(JSC::ExecState*, JSC::ExecState*): 0x08808885 <+0>: push %ebp 0x08808886 <+1>: mov %esp,%ebp 0x08808888 <+3>: push %ebx 0x08808889 <+4>: sub $0x34,%esp 0x0880888c <+7>: call 0x86bfa30 <__x86.get_pc_thunk.bx> 0x08808891 <+12>: add $0xc8476f,%ebx => 0x08808897 <+18>: mov 0x8(%ebp),%eax At this point, esp=0xffffcda0 and ebp=0xffffcdd8 (the difference between them is 0x38, just as expected from the code above). The contents of the stack from *ebp are: 0xffffcdd8: 0xffffce28 // <=ebp, this is the result of push %ebp, i.e., before calling operationCallEval, ebp was 0xffffce28 0xffffcddc: 0xf5f1eef0 // this is the return address pushed on stack by the call to operationCallEval 0xffffcde0: 0xffffce28 // this is the first argument of operationCallEval, i.e., ExecState* exec (the previous ebp) 0xffffcde4: 0xffffcdd8 // this is the second argument of operationCallEval, i.e., ExecState* execCallee And this last line is strange: execCallee should point to a region of memory which has the layout of { callerFrame, returnAddress, codeBlock, scopeChain, callee, argCount, thisArg, args... }. However, in this case, the arguments of operationCallEval (i.e., the pointers exec and execCallee) on the stack are overlapping with the contents of execCallee itself. So, calling execCallee->setCodeBlock(0) zeros out execCallee[2], which is also exec. "Interestingly", this was also happening before the patch landed in
r173031
, but fortunately, exec was not used after the call to setCodeBlock, so the problem did not reveal itself. So, swapping the lines acts as a workaround but there is something nastier underneath, which would be worth of investigation. IMHO.
Julien Brianceau
Comment 7
2014-09-02 08:24:24 PDT
(In reply to
comment #6
)
> "Interestingly", this was also happening before the patch landed in
r173031
, but fortunately, exec was not used after the call to setCodeBlock, so the problem did not reveal itself. So, swapping the lines acts as a workaround but there is something nastier underneath, which would be worth of investigation. IMHO.
I fully agree with you, that's why I named my patch "workaround" and not "fix" :D
Michael Saboff
Comment 8
2014-09-02 10:07:09 PDT
(In reply to
comment #6
)
> Something very nasty is happening here. I've put my hands on an x86/32-bit chroot and built an EFL/debug jsc. When I run jsc --useLLInt=false eval.js, where eval.js is: > > var c = ""; > eval(c); > > I also get the segfault. Running it in gdb and setting a breakpoint at JSC::operationCallEval reveals the following: > > (gdb) disas JSC::operationCallEval > Dump of assembler code for function JSC::operationCallEval(JSC::ExecState*, JSC::ExecState*): > 0x08808885 <+0>: push %ebp > 0x08808886 <+1>: mov %esp,%ebp > 0x08808888 <+3>: push %ebx > 0x08808889 <+4>: sub $0x34,%esp > 0x0880888c <+7>: call 0x86bfa30 <__x86.get_pc_thunk.bx> > 0x08808891 <+12>: add $0xc8476f,%ebx > => 0x08808897 <+18>: mov 0x8(%ebp),%eax > > At this point, esp=0xffffcda0 and ebp=0xffffcdd8 (the difference between them is 0x38, just as expected from the code above). The contents of the stack from *ebp are: > > 0xffffcdd8: 0xffffce28 // <=ebp, this is the result of push %ebp, i.e., before calling operationCallEval, ebp was 0xffffce28 > 0xffffcddc: 0xf5f1eef0 // this is the return address pushed on stack by the call to operationCallEval > 0xffffcde0: 0xffffce28 // this is the first argument of operationCallEval, i.e., ExecState* exec (the previous ebp) > 0xffffcde4: 0xffffcdd8 // this is the second argument of operationCallEval, i.e., ExecState* execCallee > > And this last line is strange: execCallee should point to a region of memory which has the layout of { callerFrame, returnAddress, codeBlock, scopeChain, callee, argCount, thisArg, args... }. However, in this case, the arguments of operationCallEval (i.e., the pointers exec and execCallee) on the stack are overlapping with the contents of execCallee itself. So, calling execCallee->setCodeBlock(0) zeros out execCallee[2], which is also exec. > > "Interestingly", this was also happening before the patch landed in
r173031
, but fortunately, exec was not used after the call to setCodeBlock, so the problem did not reveal itself. So, swapping the lines acts as a workaround but there is something nastier underneath, which would be worth of investigation. IMHO.
I don't think that this is due to
r173031
. The issue is likely that the space pointed to by execCallee overlaps the stacked argument location on architectures that pass arguments via the stack. In JIT::compileCallEval(), we use the stack pointer as the calleeFrame pointer and then make the call via callOperationNoExceptionCheck(). For X86-32, we use poke(reg, offset) where the offset is a positive offset of the stack pointer to pass the arguments. I think requires that we either move up execCallee in the stack (harder) or move down the outgoing argument area for architectures that use stacked arguments (easier).
Julien Brianceau
Comment 9
2014-09-02 15:00:59 PDT
Created
attachment 237517
[details]
Fix proposal
Michael Saboff
Comment 10
2014-09-02 15:22:35 PDT
Created
attachment 237519
[details]
Patch Well, we think alike. I was working on this one and testing when you posted. I prefer keying this off of argument registers and not CPU type. BTW, X86-32 on Mac works with the old code on debug because the arguments are moved to local stack variables and for release because the args are moved into registers.
Julien Brianceau
Comment 11
2014-09-02 15:44:46 PDT
(In reply to
comment #10
)
> Created an attachment (id=237519) [details] > Patch > > Well, we think alike. I was working on this one and testing when you posted. I prefer keying this off of argument registers and not CPU type. > > BTW, X86-32 on Mac works with the old code on debug because the arguments are moved to local stack variables and for release because the args are moved into registers.
Looks good to me. As the stack must remain aligned (otherwise it will crash in an alignment check in the LLINT), do you think it's worth to assert something like this too ? ASSERT(!((sizeof(Register)*4) % stackAlignmentBytes()));
Akos Kiss
Comment 12
2014-09-02 15:51:53 PDT
(In reply to
comment #10
)
> Created an attachment (id=237519) [details] > Patch
I was wondering why to subtract sizeof(Register) * 4 from sp, why not sizeof(CallerFrameAndPC)? (And then, the arithmetic is not even necessary, since the result is already in regT1.) A stack/call frame layout problem also exists on ARM32. It would be worth looking at
https://bugs.webkit.org/show_bug.cgi?id=132740
. There, a highly similar approach has been suggested (not reviewed yet / titled dirty hack).
Michael Saboff
Comment 13
2014-09-02 16:11:44 PDT
Created
attachment 237526
[details]
Updated patch from review comments.
Geoffrey Garen
Comment 14
2014-09-03 11:48:00 PDT
Comment on
attachment 237526
[details]
Updated patch from review comments. View in context:
https://bugs.webkit.org/attachment.cgi?id=237526&action=review
> Source/JavaScriptCore/jit/JITCall32_64.cpp:227 > +#if NUMBER_OF_ARGUMENT_REGISTERS < 4 > + // Add stack space for arguments for architectures that pass most / all args on the stack > + int32_t addedStackSpace = (sizeof(GPRReg) * 4 + stackAlignmentBytes() - 1) & ~(stackAlignmentBytes() - 1); > + addPtr(TrustedImm32(-addedStackSpace), stackPointerRegister);
Who actually writes these four GPRs to the stack? Why isn't that function responsible for making room on the stack?
Michael Saboff
Comment 15
2014-09-04 12:52:23 PDT
Created
attachment 237640
[details]
Updated patch after discussion with ggaren
Geoffrey Garen
Comment 16
2014-09-04 13:50:31 PDT
Comment on
attachment 237640
[details]
Updated patch after discussion with ggaren View in context:
https://bugs.webkit.org/attachment.cgi?id=237640&action=review
r=me
> Source/JavaScriptCore/ChangeLog:10 > + That stack pointer provides space for the worse case number of stacked
"worst"
> Source/JavaScriptCore/jit/JITOperations.cpp:-615 > - execCallee->setCallerFrame(exec);
Does the LLInt need updating to perform this callerFrame store as well, or does it do so already?
Michael Saboff
Comment 17
2014-09-04 14:03:37 PDT
(In reply to
comment #16
)
> Does the LLInt need updating to perform this callerFrame store as well, or does it do so already?
The LLInt has its own slow path that includes the setting the callerFrame.
Michael Saboff
Comment 18
2014-09-04 14:23:56 PDT
Committed
r173282
: <
http://trac.webkit.org/changeset/173282
>
Csaba Osztrogonác
Comment 19
2014-09-05 00:47:16 PDT
***
Bug 132740
has been marked as a duplicate of this bug. ***
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug