Bug 153113

Summary: FTL B3 should be just as fast as FTL LLVM on Octane/crypto
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Filip Pizlo <fpizlo>
Status: RESOLVED FIXED    
Severity: Normal CC: barraclough, benjamin, commit-queue, ggaren, keith_miller, mark.lam, mhahnenb, msaboff, oliver, saam, sam
Priority: P2    
Version: WebKit Nightly Build   
Hardware: All   
OS: All   
Bug Depends on:    
Bug Blocks: 150279, 153197    
Attachments:
Description Flags
it's a start
none
a bit more
none
starting to make sense
none
more
none
lots more
none
more
none
IRC appears to work
none
more
none
sort of getting it to work
none
trying more things
none
things
none
the patch
none
the patch
none
did more things
none
the patch
saam: review+
performance none

Filip Pizlo
Reported 2016-01-14 18:09:52 PST
Things.
Attachments
it's a start (31.53 KB, patch)
2016-01-14 18:19 PST, Filip Pizlo
no flags
a bit more (41.45 KB, patch)
2016-01-14 19:52 PST, Filip Pizlo
no flags
starting to make sense (43.84 KB, patch)
2016-01-15 13:05 PST, Filip Pizlo
no flags
more (52.81 KB, patch)
2016-01-15 22:04 PST, Filip Pizlo
no flags
lots more (71.46 KB, patch)
2016-01-16 13:02 PST, Filip Pizlo
no flags
more (86.23 KB, patch)
2016-01-16 16:40 PST, Filip Pizlo
no flags
IRC appears to work (91.19 KB, patch)
2016-01-16 17:25 PST, Filip Pizlo
no flags
more (109.85 KB, patch)
2016-01-16 19:38 PST, Filip Pizlo
no flags
sort of getting it to work (129.23 KB, patch)
2016-01-16 23:55 PST, Filip Pizlo
no flags
trying more things (151.28 KB, patch)
2016-01-17 11:35 PST, Filip Pizlo
no flags
things (151.39 KB, patch)
2016-01-17 12:27 PST, Filip Pizlo
no flags
the patch (94.67 KB, patch)
2016-01-17 13:08 PST, Filip Pizlo
no flags
the patch (94.07 KB, patch)
2016-01-17 13:09 PST, Filip Pizlo
no flags
did more things (104.05 KB, patch)
2016-01-17 14:08 PST, Filip Pizlo
no flags
the patch (105.53 KB, patch)
2016-01-17 15:01 PST, Filip Pizlo
saam: review+
performance (65.23 KB, text/plain)
2016-01-17 15:48 PST, Filip Pizlo
no flags
Filip Pizlo
Comment 1 2016-01-14 18:19:25 PST
Created attachment 269023 [details] it's a start
Filip Pizlo
Comment 2 2016-01-14 19:52:50 PST
Created attachment 269028 [details] a bit more
Filip Pizlo
Comment 3 2016-01-14 21:18:49 PST
Here's my current thinking on how to proceed. I think I'll get rid of ForwardLiveness and use a much more deliberate approach: Our state mapping tells us what is live at any given moment. We use a separate liveness pass to collect the kill lists for all instructions. The kill lists tell us about the ways that things die at an Inst. There are four ways: - An early def could die above an Inst. - An early use could kill above an Inst. - A late def could die below an Inst. - A late use could kill below an Inst. Once we are armed with our state mapping and the kill lists, we can process the Inst in the order of its execution and update the state mapping based on defs and kill lists. We start above the Inst with a state that tells us what is live before it. This includes things that will die at the top. We can handle early uses and defs by giving them registers that are not live. We know which registers become live because our mapping tells us this. Then we can apply the early deaths. Then we process late uses and defs, and then apply the late deaths. This algorithm must proceed in the order of the Inst's execution. We can use the notion of locking at each stage - early and then late - to indicate which registers have already been purposed. All early uses will hold the lock while we are processing the early part of the instruction, and then release it before we start processing the late stuff. In between early and late, after we have unlocked things used early, we can perform early deaths. All early defs will hold the lock early and then release it after we process late things. As a special case, UseDef will also release the lock late. Things get weird with UseDef. Lucky for us, Insts will have a small number of UseDefs, so the algorithm doesn't have to be efficient and we can be sure that if we handle UseDefs early, then the greedy approach should just work. The locking strategy ensures that we won't try to reuse the register we already picked. The only additional logic for UseDef is that when we pick a register for it, which happens early, we look ahead to make sure that we didn't already lock the register late. The only registers that we could have already locked late are the ones that are clobbered late or defined late. One of the beautiful things about this approach is that we can process the Inst in-place. We won't edit the uses until we have accounted for them. We won't edit the defs until we have accounted for them. So, we can compute forward liveness while also editing the instruction.
Filip Pizlo
Comment 4 2016-01-15 13:05:22 PST
Created attachment 269090 [details] starting to make sense
Filip Pizlo
Comment 5 2016-01-15 22:04:17 PST
Filip Pizlo
Comment 6 2016-01-16 13:02:28 PST
Created attachment 269178 [details] lots more
Filip Pizlo
Comment 7 2016-01-16 16:40:36 PST
Filip Pizlo
Comment 8 2016-01-16 17:25:55 PST
Created attachment 269183 [details] IRC appears to work It's not saying much, but the current status of the patch is that IRC still appears to work. I haven't enabled register scavenging yet.
Filip Pizlo
Comment 9 2016-01-16 19:38:12 PST
Filip Pizlo
Comment 10 2016-01-16 23:55:16 PST
Created attachment 269187 [details] sort of getting it to work I'm no longer sure if this is a great idea.
Filip Pizlo
Comment 11 2016-01-17 11:35:58 PST
Created attachment 269190 [details] trying more things
Filip Pizlo
Comment 12 2016-01-17 12:27:36 PST
Created attachment 269194 [details] things This is a speed-up now that I've taken a different approach.
Filip Pizlo
Comment 13 2016-01-17 13:08:45 PST
Created attachment 269196 [details] the patch
Filip Pizlo
Comment 14 2016-01-17 13:09:22 PST
Created attachment 269197 [details] the patch
WebKit Commit Bot
Comment 15 2016-01-17 13:10:42 PST
Attachment 269197 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/b3/air/AirCustom.cpp:49: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirCustom.h:74: The parameter name "inst" adds no information, so it should be removed. [readability/parameter_name] [5] ERROR: Source/JavaScriptCore/b3/air/AirValidate.cpp:88: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:1204: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirLogRegisterPressure.cpp:60: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirLogRegisterPressure.cpp:69: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirReportUsedRegisters.cpp:55: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirAllocateStack.cpp:178: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirAllocateStack.cpp:281: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/B3CheckSpecial.cpp:112: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirFixObviousSpills.cpp:117: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirFixObviousSpills.cpp:183: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirFixObviousSpills.cpp:234: Place brace on its own line for function definitions. [whitespace/braces] [4] Total errors found: 13 in 40 files If any of these errors are false positives, please file a bug against check-webkit-style.
Filip Pizlo
Comment 16 2016-01-17 14:08:50 PST
Created attachment 269198 [details] did more things We're now at parity with LLVM on Octane/crypto.
Filip Pizlo
Comment 17 2016-01-17 15:01:24 PST
Created attachment 269201 [details] the patch
WebKit Commit Bot
Comment 18 2016-01-17 15:03:21 PST
Attachment 269201 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/b3/air/AirCustom.cpp:49: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirValidate.cpp:88: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:1204: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirLogRegisterPressure.cpp:60: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirLogRegisterPressure.cpp:69: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirReportUsedRegisters.cpp:55: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirCustom.h:74: The parameter name "inst" adds no information, so it should be removed. [readability/parameter_name] [5] ERROR: Source/JavaScriptCore/b3/air/AirAllocateStack.cpp:178: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirAllocateStack.cpp:281: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/B3CheckSpecial.cpp:112: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirFixObviousSpills.cpp:117: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirFixObviousSpills.cpp:183: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/air/AirFixObviousSpills.cpp:234: Place brace on its own line for function definitions. [whitespace/braces] [4] Total errors found: 13 in 42 files If any of these errors are false positives, please file a bug against check-webkit-style.
Filip Pizlo
Comment 19 2016-01-17 15:48:31 PST
Created attachment 269203 [details] performance
Saam Barati
Comment 20 2016-01-18 23:18:25 PST
Comment on attachment 269201 [details] the patch View in context: https://bugs.webkit.org/attachment.cgi?id=269201&action=review r=me with a couple comments > Source/JavaScriptCore/b3/air/AirAllocateStack.cpp:210 > + block->insts().removeAllMatching( > + [&] (const Inst& inst) -> bool { > + return !inst; > + }); Could/should this be done after looping over the block? > Source/JavaScriptCore/b3/air/AirFixObviousSpills.cpp:202 > + case Arg::Width32: Should this check for alias->mode != AllBits? > Source/JavaScriptCore/runtime/Options.h:346 > + v(bool, logAirRegisterPressure, false, nullptr) \ when do we use the "log" prefix versus the "dump" prefix? > Source/WTF/wtf/CheckedArithmetic.h:822 > + return Checked<T, RecordOverflow>(value) * checkedSum<T>(args...); Shouldn't this be: Checked<T, RecordOverflow>(value) * checkedProduct<T>(args...); ?
Filip Pizlo
Comment 21 2016-01-19 10:34:15 PST
(In reply to comment #20) > Comment on attachment 269201 [details] > the patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=269201&action=review > > r=me with a couple comments > > > Source/JavaScriptCore/b3/air/AirAllocateStack.cpp:210 > > + block->insts().removeAllMatching( > > + [&] (const Inst& inst) -> bool { > > + return !inst; > > + }); > > Could/should this be done after looping over the block? Wow, yeah. > > > Source/JavaScriptCore/b3/air/AirFixObviousSpills.cpp:202 > > + case Arg::Width32: > > Should this check for alias->mode != AllBits? No. The Width64 case has such a check because if you're observing all 64 bits (that's what Width64 means) then the alias better account for all bits (that's what AllBits means). But if we only observe 32-bits (Width32) then any of the alias modes work. > > > Source/JavaScriptCore/runtime/Options.h:346 > > + v(bool, logAirRegisterPressure, false, nullptr) \ > > when do we use the "log" prefix versus the "dump" prefix? This felt like it was like logB3PhaseTimes and logGC. > > > Source/WTF/wtf/CheckedArithmetic.h:822 > > + return Checked<T, RecordOverflow>(value) * checkedSum<T>(args...); > > Shouldn't this be: > Checked<T, RecordOverflow>(value) * checkedProduct<T>(args...); > ? Lol of course.
Filip Pizlo
Comment 22 2016-01-19 11:29:30 PST
Note You need to log in before you can comment on or make changes to this bug.