| Summary: | Change offlineasm to emit more efficient LLInt code. | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | WebKit | Reporter: | Mark Lam <mark.lam> | ||||||||
| Component: | JavaScriptCore | Assignee: | Mark Lam <mark.lam> | ||||||||
| Status: | RESOLVED FIXED | ||||||||||
| Severity: | Normal | CC: | webkit-bug-importer | ||||||||
| Priority: | P2 | Keywords: | InRadar | ||||||||
| Version: | WebKit Nightly Build | ||||||||||
| Hardware: | Unspecified | ||||||||||
| OS: | Unspecified | ||||||||||
| Attachments: |
|
||||||||||
Created attachment 460419 [details]
LLIntAssembly-before.h
Created attachment 460420 [details]
LLIntAssembly-before-wo-unnecessary-lsl.h
Created attachment 460421 [details]
LLIntAssembly-after.h
Committed r295766 (251771@main): <https://commits.webkit.org/251771@main> Reviewed commits have been landed. Closing PR #1686 and removing active labels. |
1. Ruby treats numeric 0 as truthy. However, there's a test in arm64LowerMalformedLoadStoreAddresses which assumes a value of 0 would be false. As a result, we see offlineasm emit inefficient LLInt code like this: ".loc 3 821\n" "movz x16, #0 \n" // LowLevelInterpreter64.asm:821 "add x13, x3, x16 \n" "ldr x0, [x13] \n" ... instead of this: ".loc 3 821\n" "ldr x0, [x3] \n" // LowLevelInterpreter64.asm:821 This patch fixes this. 2. offlineasm's emitARM64MoveImmediate chooses to use `movn` instead of `movz` based on whether a 64-bit value is negative or not. Instead, it should be making that decision based on the number of halfwords (16-bits) in the value that is 0xffff vs 0. As a result, offlineasm emits code like this: ".loc 1 1638\n" "movn x27, #1, lsl #48 \n" // LowLevelInterpreter.asm:1638 "movk x27, #0, lsl #32 \n" "movk x27, #0, lsl #16 \n" "movk x27, #0 \n" ... instead of this: ".loc 1 1638\n" "movz x27, #65534, lsl #48 \n" // LowLevelInterpreter.asm:1638 This patch fixes this. 3. offlineasm is trivially assuming the range of immediate offsets for ldr/str instructions is [-255..4095]. However, that's only the range for byte sized load-stores. For 32-bit, the range is [-255..16380]. For 64-bit, the range is [-255..32760]. As a result, offlineasm emits code like this: ".loc 1 633\n" "movn x16, #16383 \n" // LowLevelInterpreter.asm:633 ".loc 1 1518\n" "and x3, x3, x16 \n" // LowLevelInterpreter.asm:1518 ".loc 1 1519\n" "movz x16, #16088 \n" // LowLevelInterpreter.asm:1519 "add x17, x3, x16 \n" "ldr x3, [x17] \n" ... instead of this: ".loc 1 633\n" "movn x17, #16383 \n" // LowLevelInterpreter.asm:633 ".loc 1 1518\n" "and x3, x3, x17 \n" // LowLevelInterpreter.asm:1518 ".loc 1 1519\n" "ldr x3, [x3, #16088] \n" // LowLevelInterpreter.asm:1519 This patch fixes this for 64-bit and 32-bit load-stores. 16-bit load-stores also has a wider range, but for now, it will continue to use the conservative range. This patch also introduces an `isMalformedArm64LoadAStoreAddress` so that this range check can be done consistently in all the places that checks for it. 4. offlineasm is eagerly emitting no-op arguments in instructions, e.g. "lsl #0", and adding 0. As a result, offlineasm emits code like this: ".loc 3 220\n" "movz x13, #51168, lsl #0 \n" // LowLevelInterpreter64.asm:220 "add x17, x1, x13, lsl #0 \n" "ldr w4, [x17, #0] \n" ... instead of this: ".loc 3 220\n" "movz x13, #51168 \n" // LowLevelInterpreter64.asm:220 "add x17, x1, x13 \n" "ldr w4, [x17] \n" This unnecessary arguments are actually very common throughout the emitted LLIntAssembly.h. This patch removes these unnecessary arguments, which makes the emitted LLInt code more human readable due to less clutter.