JIT: Inline ZEND_BW_NOT#22560
Conversation
Under the tracing JIT, ~ (ZEND_BW_NOT) was not inlined and fell back to the ZEND_BW_NOT_SPEC helper, whose result could be allocated to a stack slot aliasing a spilled loop-carried CV, clobbering it and producing a wrong result. Inline ZEND_BW_NOT for the LONG case (x ^ -1) like the other bitwise ops, removing the helper call and its temporary slot.
|
Confirmed the miscompile on PHP-8.4 with JIT on ( Test is misnumbered though: the file and The root cause is more general than The inline's worth taking on its own. The open question is whether to model the helper's EX_VAR write at the source and close the whole class; the anti-dependency case is already stubbed in |
JIT: inline ZEND_BW_NOT (fixes tracing-JIT miscompilation of
~in a masked expression)Fixes #22559.
Summary
Under the tracing JIT, the bitwise-NOT operator
~is not inlined - it is emitted as a call to theZEND_BW_NOT_SPECVM helper. In a side trace, the helper's result is written to a stack slot that can alias a spilled, loop-carried CV, silently clobbering that CV. The result is a wrong value for code as ordinary as a flags computation, e.g. an integer that is provably a byte comes out negative.This PR inlines
ZEND_BW_NOTfor the LONG case (asx ^ -1), the same wayZEND_BW_OR/AND/XORare already inlined. That removes the helper call and its temporary result slot, so the collision can't happen - and it's a small perfwin. A regression test is included.
Interpreter and function-JIT were always correct; only the tracing JIT was wrong.
1. Symptom
Every operand of the
|chain is byte-sized (the~…&0x80term is0or0x04), so$fis provably in0..0xBD. Under the tracing JIT it becomes negative - the correct low byte (0x97) with all upper bits set, i.e. a full-width~value.2. Research — pinning the affected versions
Built each branch from git and tested with the JIT verified active
(
opcache_get_status()['jit']['on'] === true):So this is an IR-JIT regression, introduced with the IR JIT in 8.4 - 8.3's DynASM JIT is fine. (Two methodology notes that cost us time and are worth recording: a
--disable-allbuild omits opcache on optional-opcache versions, so the JIT never runs and the repro falsely passes — always verifyjit.on; and the bug needsopcache.jit_hot_side_exitat its default (8), because the harness default of1compiles side traces so eagerly the buggy one never forms.)This is not #22115 (loop-PHI CV register dropped from a side-exit SNAPSHOT). Building that fix resolves its reproducer but leaves this one failing - same subsystem, different defect.
3. Proof — from the disassembly
The buggy side trace (
opcache.jit_debug+ capstone), annotated:The register allocator assigned the same frame slot
0xa0to (a) the spilled loop-carried CV holding$fand (b) the result of theZEND_BW_NOThelper call. The helper result clobbers the spilled$f; the reload then loads~($a^$value)RLOAD … {%r13} # BIND(0xa0)shares slot0xa0with theBW_NOToperand/result.)This is
~-specific because~is the only bitwise op the JIT does not inline;AND/OR/XORare inlined, so(x ^ 0x80)in place of~xavoids the helper and works - which is the one-operator source-level workaround.4. The fix
Inline
ZEND_BW_NOTfor the LONG case, mirroring the other bitwise ops:ext/opcache/jit/zend_jit_ir.c- newzend_jit_bw_not(): emitsres = ir_XOR_L(op1, -1)(~x == x ^ -1), stores the LONG result. No helper, no temp slot.ext/opcache/jit/zend_jit.c- function-JIT dispatch:case ZEND_BW_NOT(definitely-LONG fast path; otherwise falls through to the VM handler as before).ext/opcache/jit/zend_jit_trace.c- tracing-JIT codegen dispatchcase ZEND_BW_NOT(withCHECK_OP1_TRACE_TYPE()), plusZEND_BW_NOTadded to theADD_OP1_TRACE_GUARD()group so the trace guards op1's type.55 insertions, no deletions. The inlined path is taken only when op1 is definitely
LONG; any other type keeps the previous helper-based behaviour.5. Why this fix
BW_NOTis the only bitwise operator not inlined by the JIT (BW_OR/AND/XORalready are). Inlining it removes the helper call entirely - so there is no result slot to alias a spilled CV - and it's a small performance win, not just a correctness patch.ir_XOR_L), touches only the JIT, and gates on a known-LONGoperand; anything else is unchanged.zend_jit_trace_allocate_registers). That is more general but deeper and riskier; inliningBW_NOTfixes the observedbug directly and is the more upstream-friendly change. (Maintainers may of course prefer to also harden the allocator.)
Testing
ext/opcache/tests/jit/gh22558.phpt(overridesopcache.jit_hot_side_exitback to8so the bug reproduces underrun-tests).ok; the~result is byte-identical between the tracing JIT and the interpreter across-300..300and edge values (PHP_INT_MIN/MAX).ext/opcache/testssuite on 8.4: 883 tests, 0 failed (853 passed, 29 skipped).