
tarantool / luajit / 7262987147

19 Dec 2023 02:10PM UTC coverage: 88.225% (-0.4%) from 88.616%

Build 7262987147, triggered by a push via GitHub.

fckxorg
test: add tests for debugging extensions

This patch adds tests for LuaJIT debugging
extensions for lldb and gdb.

5336 of 5969 branches covered (89.4%)

Branch coverage included in aggregate %.

20475 of 23287 relevant lines covered (87.92%)

1285545.26 hits per line

Source File

/src/lj_opt_fold.c (76.09% covered)
/*
** FOLD: Constant Folding, Algebraic Simplifications and Reassociation.
** ABCelim: Array Bounds Check Elimination.
** CSE: Common-Subexpression Elimination.
** Copyright (C) 2005-2017 Mike Pall. See Copyright Notice in luajit.h
*/

#define lj_opt_fold_c
#define LUA_CORE

#include <math.h>

#include "lj_obj.h"

#if LJ_HASJIT

#include "lj_buf.h"
#include "lj_str.h"
#include "lj_tab.h"
#include "lj_ir.h"
#include "lj_jit.h"
#include "lj_ircall.h"
#include "lj_iropt.h"
#include "lj_trace.h"
#if LJ_HASFFI
#include "lj_ctype.h"
#include "lj_carith.h"
#endif
#include "lj_vm.h"
#include "lj_strscan.h"
#include "lj_strfmt.h"

/* Here's a short description of how the FOLD engine processes instructions:
**
** The FOLD engine receives a single instruction stored in fins (J->fold.ins).
** The instruction and its operands are used to select matching fold rules.
** These are applied iteratively until a fixed point is reached.
**
** The 8 bit opcode of the instruction itself plus the opcodes of the
** two instructions referenced by its operands form a 24 bit key
** 'ins left right' (unused operands -> 0, literals -> lowest 8 bits).
**
** This key is used for partial matching against the fold rules. The
** left/right operand fields of the key are successively masked with
** the 'any' wildcard, from most specific to least specific:
**
**   ins left right
**   ins any  right
**   ins left any
**   ins any  any
**
** The masked key is used to lookup a matching fold rule in a semi-perfect
** hash table. If a matching rule is found, the related fold function is run.
** Multiple rules can share the same fold function. A fold rule may return
** one of several special values:
**
** - NEXTFOLD means no folding was applied, because an additional test
**   inside the fold function failed. Matching continues against less
**   specific fold rules. Finally the instruction is passed on to CSE.
**
** - RETRYFOLD means the instruction was modified in-place. Folding is
**   retried as if this instruction had just been received.
**
** All other return values are terminal actions -- no further folding is
** applied:
**
** - INTFOLD(i) returns a reference to the integer constant i.
**
** - LEFTFOLD and RIGHTFOLD return the left/right operand reference
**   without emitting an instruction.
**
** - CSEFOLD and EMITFOLD pass the instruction directly to CSE or emit
**   it without passing through any further optimizations.
**
** - FAILFOLD, DROPFOLD and CONDFOLD only apply to instructions which have
**   no result (e.g. guarded assertions): FAILFOLD means the guard would
**   always fail, i.e. the current trace is pointless. DROPFOLD means
**   the guard is always true and has been eliminated. CONDFOLD is a
**   shortcut for FAILFOLD + cond (i.e. drop if true, otherwise fail).
**
** - Any other return value is interpreted as an IRRef or TRef. This
**   can be a reference to an existing or a newly created instruction.
**   Only the least-significant 16 bits (IRRef1) are used to form a TRef
**   which is finally returned to the caller.
**
** The FOLD engine receives instructions both from the trace recorder and
** substituted instructions from LOOP unrolling. This means all types
** of instructions may end up here, even though the recorder bypasses
** FOLD in some cases. Thus all loads, stores and allocations must have
** an any/any rule to avoid being passed on to CSE.
**
** Carefully read the following requirements before adding or modifying
** any fold rules:
**
** Requirement #1: All fold rules must preserve their destination type.
**
** Consistently use INTFOLD() (KINT result) or lj_ir_knum() (KNUM result).
** Never use lj_ir_knumint() which can have either a KINT or KNUM result.
**
** Requirement #2: Fold rules should not create *new* instructions which
** reference operands *across* PHIs.
**
** E.g. a RETRYFOLD with 'fins->op1 = fleft->op1' is invalid if the
** left operand is a PHI. Then fleft->op1 would point across the PHI
** frontier to an invariant instruction. Adding a PHI for this instruction
** would be counterproductive. The solution is to add a barrier which
** prevents folding across PHIs, i.e. 'PHIBARRIER(fleft)' in this case.
** The only exception is for recurrences with high latencies like
** repeated int->num->int conversions.
**
** One could relax this condition a bit if the referenced instruction is
** a PHI, too. But this often leads to worse code due to excessive
** register shuffling.
**
** Note: returning *existing* instructions (e.g. LEFTFOLD) is ok, though.
** Even returning fleft->op1 would be ok, because a new PHI will be added,
** if needed. But again, this leads to excessive register shuffling and
** should be avoided.
**
** Requirement #3: The set of all fold rules must be monotonic to guarantee
** termination.
**
** The goal is optimization, so one primarily wants to add strength-reducing
** rules. This means eliminating an instruction or replacing an instruction
** with one or more simpler instructions. Don't add fold rules which point
** in the other direction.
**
** Some rules (like commutativity) do not directly reduce the strength of
** an instruction, but enable other fold rules (e.g. by moving constants
** to the right operand). These rules must be made unidirectional to avoid
** cycles.
**
** Rule of thumb: the trace recorder expands the IR and FOLD shrinks it.
*/

/* Some local macros to save typing. Undef'd at the end. */
#define IR(ref)		(&J->cur.ir[(ref)])
#define fins		(&J->fold.ins)
#define fleft		(J->fold.left)
#define fright		(J->fold.right)
#define knumleft	(ir_knum(fleft)->n)
#define knumright	(ir_knum(fright)->n)

/* Pass IR on to next optimization in chain (FOLD). */
#define emitir(ot, a, b)	(lj_ir_set(J, (ot), (a), (b)), lj_opt_fold(J))

/* Fold function type. Fastcall on x86 significantly reduces their size. */
typedef IRRef (LJ_FASTCALL *FoldFunc)(jit_State *J);

/* Macros for the fold specs, so buildvm can recognize them. */
#define LJFOLD(x)
#define LJFOLDX(x)
#define LJFOLDF(name)	static TRef LJ_FASTCALL fold_##name(jit_State *J)
/* Note: They must be at the start of a line or buildvm ignores them! */

/* Barrier to prevent using operands across PHIs. */
#define PHIBARRIER(ir)	if (irt_isphi((ir)->t)) return NEXTFOLD

/* Barrier to prevent folding across a GC step.
** GC steps can only happen at the head of a trace and at LOOP.
** And the GC is only driven forward if there's at least one allocation.
*/
#define gcstep_barrier(J, ref) \
  ((ref) < J->chain[IR_LOOP] && \
   (J->chain[IR_SNEW] || J->chain[IR_XSNEW] || \
    J->chain[IR_TNEW] || J->chain[IR_TDUP] || \
    J->chain[IR_CNEW] || J->chain[IR_CNEWI] || \
    J->chain[IR_BUFSTR] || J->chain[IR_TOSTR] || J->chain[IR_CALLA]))

/* -- Constant folding for FP numbers ------------------------------------- */

LJFOLD(ADD KNUM KNUM)
LJFOLD(SUB KNUM KNUM)
LJFOLD(MUL KNUM KNUM)
LJFOLD(DIV KNUM KNUM)
LJFOLD(LDEXP KNUM KNUM)
LJFOLD(MIN KNUM KNUM)
LJFOLD(MAX KNUM KNUM)
LJFOLDF(kfold_numarith)
{
  lua_Number a = knumleft;
  lua_Number b = knumright;
  lua_Number y = lj_vm_foldarith(a, b, fins->o - IR_ADD);
  return lj_ir_knum(J, y);
}

LJFOLD(NEG KNUM FLOAD)
LJFOLD(ABS KNUM FLOAD)
LJFOLDF(kfold_numabsneg)
{
  lua_Number a = knumleft;
  lua_Number y = lj_vm_foldarith(a, a, fins->o - IR_ADD);
  return lj_ir_knum(J, y);
}

LJFOLD(LDEXP KNUM KINT)
LJFOLDF(kfold_ldexp)
{
#if LJ_TARGET_X86ORX64
  UNUSED(J);
  return NEXTFOLD;
#else
  return lj_ir_knum(J, ldexp(knumleft, fright->i));
#endif
}

LJFOLD(FPMATH KNUM any)
LJFOLDF(kfold_fpmath)
{
  lua_Number a = knumleft;
  lua_Number y = lj_vm_foldfpm(a, fins->op2);
  return lj_ir_knum(J, y);
}

LJFOLD(CALLN KNUM any)
LJFOLDF(kfold_fpcall1)
{
  const CCallInfo *ci = &lj_ir_callinfo[fins->op2];
  if (CCI_TYPE(ci) == IRT_NUM) {
    double y = ((double (*)(double))ci->func)(knumleft);
    return lj_ir_knum(J, y);
  }
  return NEXTFOLD;
}

LJFOLD(CALLN CARG IRCALL_atan2)
LJFOLDF(kfold_fpcall2)
{
  if (irref_isk(fleft->op1) && irref_isk(fleft->op2)) {
    const CCallInfo *ci = &lj_ir_callinfo[fins->op2];
    double a = ir_knum(IR(fleft->op1))->n;
    double b = ir_knum(IR(fleft->op2))->n;
    double y = ((double (*)(double, double))ci->func)(a, b);
    return lj_ir_knum(J, y);
  }
  return NEXTFOLD;
}

LJFOLD(POW KNUM KNUM)
LJFOLDF(kfold_numpow)
{
  return lj_ir_knum(J, lj_vm_foldarith(knumleft, knumright, IR_POW - IR_ADD));
}

/* Must not use kfold_kref for numbers (could be NaN). */
LJFOLD(EQ KNUM KNUM)
LJFOLD(NE KNUM KNUM)
LJFOLD(LT KNUM KNUM)
LJFOLD(GE KNUM KNUM)
LJFOLD(LE KNUM KNUM)
LJFOLD(GT KNUM KNUM)
LJFOLD(ULT KNUM KNUM)
LJFOLD(UGE KNUM KNUM)
LJFOLD(ULE KNUM KNUM)
LJFOLD(UGT KNUM KNUM)
LJFOLDF(kfold_numcomp)
{
  return CONDFOLD(lj_ir_numcmp(knumleft, knumright, (IROp)fins->o));
}

/* -- Constant folding for 32 bit integers -------------------------------- */

static int32_t kfold_intop(int32_t k1, int32_t k2, IROp op)
{
  switch (op) {
  case IR_ADD: k1 += k2; break;
  case IR_SUB: k1 -= k2; break;
  case IR_MUL: k1 *= k2; break;
  case IR_MOD: k1 = lj_vm_modi(k1, k2); break;
  case IR_NEG: k1 = -k1; break;
  case IR_BAND: k1 &= k2; break;
  case IR_BOR: k1 |= k2; break;
  case IR_BXOR: k1 ^= k2; break;
  case IR_BSHL: k1 <<= (k2 & 31); break;
  case IR_BSHR: k1 = (int32_t)((uint32_t)k1 >> (k2 & 31)); break;
  case IR_BSAR: k1 >>= (k2 & 31); break;
  case IR_BROL: k1 = (int32_t)lj_rol((uint32_t)k1, (k2 & 31)); break;
  case IR_BROR: k1 = (int32_t)lj_ror((uint32_t)k1, (k2 & 31)); break;
  case IR_MIN: k1 = k1 < k2 ? k1 : k2; break;
  case IR_MAX: k1 = k1 > k2 ? k1 : k2; break;
  default: lj_assertX(0, "bad IR op %d", op); break;
  }
  return k1;
}

LJFOLD(ADD KINT KINT)
LJFOLD(SUB KINT KINT)
LJFOLD(MUL KINT KINT)
LJFOLD(MOD KINT KINT)
LJFOLD(NEG KINT KINT)
LJFOLD(BAND KINT KINT)
LJFOLD(BOR KINT KINT)
LJFOLD(BXOR KINT KINT)
LJFOLD(BSHL KINT KINT)
LJFOLD(BSHR KINT KINT)
LJFOLD(BSAR KINT KINT)
LJFOLD(BROL KINT KINT)
LJFOLD(BROR KINT KINT)
LJFOLD(MIN KINT KINT)
LJFOLD(MAX KINT KINT)
LJFOLDF(kfold_intarith)
{
  return INTFOLD(kfold_intop(fleft->i, fright->i, (IROp)fins->o));
}

LJFOLD(ADDOV KINT KINT)
LJFOLD(SUBOV KINT KINT)
LJFOLD(MULOV KINT KINT)
LJFOLDF(kfold_intovarith)
{
  lua_Number n = lj_vm_foldarith((lua_Number)fleft->i, (lua_Number)fright->i,
				 fins->o - IR_ADDOV);
  int32_t k = lj_num2int(n);
  if (n != (lua_Number)k)
    return FAILFOLD;
  return INTFOLD(k);
}

LJFOLD(BNOT KINT)
LJFOLDF(kfold_bnot)
{
  return INTFOLD(~fleft->i);
}

LJFOLD(BSWAP KINT)
LJFOLDF(kfold_bswap)
{
  return INTFOLD((int32_t)lj_bswap((uint32_t)fleft->i));
}

LJFOLD(LT KINT KINT)
LJFOLD(GE KINT KINT)
LJFOLD(LE KINT KINT)
LJFOLD(GT KINT KINT)
LJFOLD(ULT KINT KINT)
LJFOLD(UGE KINT KINT)
LJFOLD(ULE KINT KINT)
LJFOLD(UGT KINT KINT)
LJFOLD(ABC KINT KINT)
LJFOLDF(kfold_intcomp)
{
  int32_t a = fleft->i, b = fright->i;
  switch ((IROp)fins->o) {
  case IR_LT: return CONDFOLD(a < b);
  case IR_GE: return CONDFOLD(a >= b);
  case IR_LE: return CONDFOLD(a <= b);
  case IR_GT: return CONDFOLD(a > b);
  case IR_ULT: return CONDFOLD((uint32_t)a < (uint32_t)b);
  case IR_UGE: return CONDFOLD((uint32_t)a >= (uint32_t)b);
  case IR_ULE: return CONDFOLD((uint32_t)a <= (uint32_t)b);
  case IR_ABC:
  case IR_UGT: return CONDFOLD((uint32_t)a > (uint32_t)b);
  default: lj_assertJ(0, "bad IR op %d", fins->o); return FAILFOLD;
  }
}

LJFOLD(UGE any KINT)
LJFOLDF(kfold_intcomp0)
{
  if (fright->i == 0)
    return DROPFOLD;
  return NEXTFOLD;
}

/* -- Constant folding for 64 bit integers -------------------------------- */

static uint64_t kfold_int64arith(jit_State *J, uint64_t k1, uint64_t k2,
				 IROp op)
{
  UNUSED(J);
#if LJ_HASFFI
  switch (op) {
  case IR_ADD: k1 += k2; break;
  case IR_SUB: k1 -= k2; break;
  case IR_MUL: k1 *= k2; break;
  case IR_BAND: k1 &= k2; break;
  case IR_BOR: k1 |= k2; break;
  case IR_BXOR: k1 ^= k2; break;
  /* Shifts and rotates must stay in the full 64 bit width. */
  case IR_BSHL: k1 <<= (k2 & 63); break;
  case IR_BSHR: k1 >>= (k2 & 63); break;
  case IR_BSAR: k1 = (uint64_t)((int64_t)k1 >> (k2 & 63)); break;
  case IR_BROL: k1 = lj_rol(k1, (k2 & 63)); break;
  case IR_BROR: k1 = lj_ror(k1, (k2 & 63)); break;
  default: lj_assertJ(0, "bad IR op %d", op); break;
  }
#else
  UNUSED(k2); UNUSED(op);
  lj_assertJ(0, "FFI IR op without FFI");
#endif
  return k1;
}

LJFOLD(ADD KINT64 KINT64)
LJFOLD(SUB KINT64 KINT64)
LJFOLD(MUL KINT64 KINT64)
LJFOLD(BAND KINT64 KINT64)
LJFOLD(BOR KINT64 KINT64)
LJFOLD(BXOR KINT64 KINT64)
LJFOLDF(kfold_int64arith)
{
  return INT64FOLD(kfold_int64arith(J, ir_k64(fleft)->u64,
				    ir_k64(fright)->u64, (IROp)fins->o));
}

LJFOLD(DIV KINT64 KINT64)
LJFOLD(MOD KINT64 KINT64)
LJFOLD(POW KINT64 KINT64)
LJFOLDF(kfold_int64arith2)
{
#if LJ_HASFFI
  uint64_t k1 = ir_k64(fleft)->u64, k2 = ir_k64(fright)->u64;
  if (irt_isi64(fins->t)) {
    k1 = fins->o == IR_DIV ? lj_carith_divi64((int64_t)k1, (int64_t)k2) :
	 fins->o == IR_MOD ? lj_carith_modi64((int64_t)k1, (int64_t)k2) :
			     lj_carith_powi64((int64_t)k1, (int64_t)k2);
  } else {
    k1 = fins->o == IR_DIV ? lj_carith_divu64(k1, k2) :
	 fins->o == IR_MOD ? lj_carith_modu64(k1, k2) :
			     lj_carith_powu64(k1, k2);
  }
  return INT64FOLD(k1);
#else
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
#endif
}

LJFOLD(BSHL KINT64 KINT)
LJFOLD(BSHR KINT64 KINT)
LJFOLD(BSAR KINT64 KINT)
LJFOLD(BROL KINT64 KINT)
LJFOLD(BROR KINT64 KINT)
LJFOLDF(kfold_int64shift)
{
#if LJ_HASFFI
  uint64_t k = ir_k64(fleft)->u64;
  int32_t sh = (fright->i & 63);
  return INT64FOLD(lj_carith_shift64(k, sh, fins->o - IR_BSHL));
#else
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
#endif
}

LJFOLD(BNOT KINT64)
LJFOLDF(kfold_bnot64)
{
#if LJ_HASFFI
  return INT64FOLD(~ir_k64(fleft)->u64);
#else
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
#endif
}

LJFOLD(BSWAP KINT64)
LJFOLDF(kfold_bswap64)
{
#if LJ_HASFFI
  return INT64FOLD(lj_bswap64(ir_k64(fleft)->u64));
#else
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
#endif
}

LJFOLD(LT KINT64 KINT64)
LJFOLD(GE KINT64 KINT64)
LJFOLD(LE KINT64 KINT64)
LJFOLD(GT KINT64 KINT64)
LJFOLD(ULT KINT64 KINT64)
LJFOLD(UGE KINT64 KINT64)
LJFOLD(ULE KINT64 KINT64)
LJFOLD(UGT KINT64 KINT64)
LJFOLDF(kfold_int64comp)
{
#if LJ_HASFFI
  uint64_t a = ir_k64(fleft)->u64, b = ir_k64(fright)->u64;
  switch ((IROp)fins->o) {
  case IR_LT: return CONDFOLD((int64_t)a < (int64_t)b);
  case IR_GE: return CONDFOLD((int64_t)a >= (int64_t)b);
  case IR_LE: return CONDFOLD((int64_t)a <= (int64_t)b);
  case IR_GT: return CONDFOLD((int64_t)a > (int64_t)b);
  case IR_ULT: return CONDFOLD(a < b);
  case IR_UGE: return CONDFOLD(a >= b);
  case IR_ULE: return CONDFOLD(a <= b);
  case IR_UGT: return CONDFOLD(a > b);
  default: lj_assertJ(0, "bad IR op %d", fins->o); return FAILFOLD;
  }
#else
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
#endif
}

LJFOLD(UGE any KINT64)
LJFOLDF(kfold_int64comp0)
{
#if LJ_HASFFI
  if (ir_k64(fright)->u64 == 0)
    return DROPFOLD;
  return NEXTFOLD;
#else
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
#endif
}

/* -- Constant folding for strings ---------------------------------------- */

LJFOLD(SNEW KKPTR KINT)
LJFOLDF(kfold_snew_kptr)
{
  GCstr *s = lj_str_new(J->L, (const char *)ir_kptr(fleft), (size_t)fright->i);
  return lj_ir_kstr(J, s);
}

LJFOLD(SNEW any KINT)
LJFOLDF(kfold_snew_empty)
{
  if (fright->i == 0)
    return lj_ir_kstr(J, &J2G(J)->strempty);
  return NEXTFOLD;
}

LJFOLD(STRREF KGC KINT)
LJFOLDF(kfold_strref)
{
  GCstr *str = ir_kstr(fleft);
  lj_assertJ((MSize)fright->i <= str->len, "bad string ref");
  return lj_ir_kkptr(J, (char *)strdata(str) + fright->i);
}

LJFOLD(STRREF SNEW any)
LJFOLDF(kfold_strref_snew)
{
  PHIBARRIER(fleft);
  if (irref_isk(fins->op2) && fright->i == 0) {
    return fleft->op1;  /* strref(snew(ptr, len), 0) ==> ptr */
  } else {
    /* Reassociate: strref(snew(strref(str, a), len), b) ==> strref(str, a+b) */
    IRIns *ir = IR(fleft->op1);
    if (ir->o == IR_STRREF) {
      IRRef1 str = ir->op1;  /* IRIns * is not valid across emitir. */
      PHIBARRIER(ir);
      fins->op2 = emitir(IRTI(IR_ADD), ir->op2, fins->op2); /* Clobbers fins! */
      fins->op1 = str;
      fins->ot = IRT(IR_STRREF, IRT_PGC);
      return RETRYFOLD;
    }
  }
  return NEXTFOLD;
}

LJFOLD(CALLN CARG IRCALL_lj_str_cmp)
LJFOLDF(kfold_strcmp)
{
  if (irref_isk(fleft->op1) && irref_isk(fleft->op2)) {
    GCstr *a = ir_kstr(IR(fleft->op1));
    GCstr *b = ir_kstr(IR(fleft->op2));
    return INTFOLD(lj_str_cmp(a, b));
  }
  return NEXTFOLD;
}

/* -- Constant folding and forwarding for buffers ------------------------- */

/*
** Buffer ops perform stores, but their effect is limited to the buffer
** itself. Also, buffer ops are chained: a use of an op implies a use of
** all other ops up the chain. Conversely, if an op is unused, all ops
** up the chain can go unused. This largely eliminates the need to treat
** them as stores.
**
** Alas, treating them as normal (IRM_N) ops doesn't work, because they
** cannot be CSEd in isolation. CSE for IRM_N is implicitly done in LOOP
** or if FOLD is disabled.
**
** The compromise is to declare them as loads, emit them like stores and
** CSE whole chains manually when the BUFSTR is to be emitted. Any chain
** fragments left over from CSE are eliminated by DCE.
*/

/* BUFHDR is emitted like a store, see below. */

LJFOLD(BUFPUT BUFHDR BUFSTR)
LJFOLDF(bufput_append)
{
  /* New buffer, no other buffer op in between and same buffer? */
  if ((J->flags & JIT_F_OPT_FWD) &&
      !(fleft->op2 & IRBUFHDR_APPEND) &&
      fleft->prev == fright->op2 &&
      fleft->op1 == IR(fright->op2)->op1) {
    IRRef ref = fins->op1;
    IR(ref)->op2 = (fleft->op2 | IRBUFHDR_APPEND);  /* Modify BUFHDR. */
    IR(ref)->op1 = fright->op1;
    return ref;
  }
  return EMITFOLD;  /* Always emit, CSE later. */
}

LJFOLD(BUFPUT any any)
LJFOLDF(bufput_kgc)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD) && fright->o == IR_KGC) {
    GCstr *s2 = ir_kstr(fright);
    if (s2->len == 0) {  /* Empty string? */
      return LEFTFOLD;
    } else {
      if (fleft->o == IR_BUFPUT && irref_isk(fleft->op2) &&
	  !irt_isphi(fleft->t)) {  /* Join two constant string puts in a row. */
	GCstr *s1 = ir_kstr(IR(fleft->op2));
	IRRef kref = lj_ir_kstr(J, lj_buf_cat2str(J->L, s1, s2));
	/* lj_ir_kstr() may realloc the IR and invalidates any IRIns *. */
	IR(fins->op1)->op2 = kref;  /* Modify previous BUFPUT. */
	return fins->op1;
      }
    }
  }
  return EMITFOLD;  /* Always emit, CSE later. */
}

LJFOLD(BUFSTR any any)
LJFOLDF(bufstr_kfold_cse)
{
  lj_assertJ(fleft->o == IR_BUFHDR || fleft->o == IR_BUFPUT ||
	     fleft->o == IR_CALLL,
	     "bad buffer constructor IR op %d", fleft->o);
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD)) {
    if (fleft->o == IR_BUFHDR) {  /* No put operations? */
      if (!(fleft->op2 & IRBUFHDR_APPEND))  /* Empty buffer? */
	return lj_ir_kstr(J, &J2G(J)->strempty);
      fins->op1 = fleft->op1;
      fins->op2 = fleft->prev;  /* Relies on checks in bufput_append. */
      return CSEFOLD;
    } else if (fleft->o == IR_BUFPUT) {
      IRIns *irb = IR(fleft->op1);
      if (irb->o == IR_BUFHDR && !(irb->op2 & IRBUFHDR_APPEND))
	return fleft->op2;  /* Shortcut for a single put operation. */
    }
  }
  /* Try to CSE the whole chain. */
  if (LJ_LIKELY(J->flags & JIT_F_OPT_CSE)) {
    IRRef ref = J->chain[IR_BUFSTR];
    while (ref) {
      IRIns *irs = IR(ref), *ira = fleft, *irb = IR(irs->op1);
      while (ira->o == irb->o && ira->op2 == irb->op2) {
	lj_assertJ(ira->o == IR_BUFHDR || ira->o == IR_BUFPUT ||
		   ira->o == IR_CALLL || ira->o == IR_CARG,
		   "bad buffer constructor IR op %d", ira->o);
	if (ira->o == IR_BUFHDR && !(ira->op2 & IRBUFHDR_APPEND))
	  return ref;  /* CSE succeeded. */
	if (ira->o == IR_CALLL && ira->op2 == IRCALL_lj_buf_puttab)
	  break;
	ira = IR(ira->op1);
	irb = IR(irb->op1);
      }
      ref = irs->prev;
    }
  }
  return EMITFOLD;  /* No CSE possible. */
}

LJFOLD(CALLL CARG IRCALL_lj_buf_putstr_reverse)
LJFOLD(CALLL CARG IRCALL_lj_buf_putstr_upper)
LJFOLD(CALLL CARG IRCALL_lj_buf_putstr_lower)
LJFOLD(CALLL CARG IRCALL_lj_strfmt_putquoted)
LJFOLDF(bufput_kfold_op)
{
  if (irref_isk(fleft->op2)) {
    const CCallInfo *ci = &lj_ir_callinfo[fins->op2];
    SBuf *sb = lj_buf_tmp_(J->L);
    sb = ((SBuf * (LJ_FASTCALL *)(SBuf *, GCstr *))ci->func)(sb,
						       ir_kstr(IR(fleft->op2)));
    fins->o = IR_BUFPUT;
    fins->op1 = fleft->op1;
    fins->op2 = lj_ir_kstr(J, lj_buf_tostr(sb));
    return RETRYFOLD;
  }
  return EMITFOLD;  /* Always emit, CSE later. */
}

LJFOLD(CALLL CARG IRCALL_lj_buf_putstr_rep)
LJFOLDF(bufput_kfold_rep)
{
  if (irref_isk(fleft->op2)) {
    IRIns *irc = IR(fleft->op1);
    if (irref_isk(irc->op2)) {
      SBuf *sb = lj_buf_tmp_(J->L);
      sb = lj_buf_putstr_rep(sb, ir_kstr(IR(irc->op2)), IR(fleft->op2)->i);
      fins->o = IR_BUFPUT;
      fins->op1 = irc->op1;
      fins->op2 = lj_ir_kstr(J, lj_buf_tostr(sb));
      return RETRYFOLD;
    }
  }
  return EMITFOLD;  /* Always emit, CSE later. */
}

LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfxint)
LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfnum_int)
LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfnum_uint)
LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfnum)
LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfstr)
LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfchar)
LJFOLDF(bufput_kfold_fmt)
{
  IRIns *irc = IR(fleft->op1);
  lj_assertJ(irref_isk(irc->op2), "SFormat must be const");
  if (irref_isk(fleft->op2)) {
    SFormat sf = (SFormat)IR(irc->op2)->i;
    IRIns *ira = IR(fleft->op2);
    SBuf *sb = lj_buf_tmp_(J->L);
    switch (fins->op2) {
    case IRCALL_lj_strfmt_putfxint:
      sb = lj_strfmt_putfxint(sb, sf, ir_k64(ira)->u64);
      break;
    case IRCALL_lj_strfmt_putfstr:
      sb = lj_strfmt_putfstr(sb, sf, ir_kstr(ira));
      break;
    case IRCALL_lj_strfmt_putfchar:
      sb = lj_strfmt_putfchar(sb, sf, ira->i);
      break;
    case IRCALL_lj_strfmt_putfnum_int:
    case IRCALL_lj_strfmt_putfnum_uint:
    case IRCALL_lj_strfmt_putfnum:
    default: {
      const CCallInfo *ci = &lj_ir_callinfo[fins->op2];
      sb = ((SBuf * (*)(SBuf *, SFormat, lua_Number))ci->func)(sb, sf,
							 ir_knum(ira)->n);
      break;
      }
    }
    fins->o = IR_BUFPUT;
    fins->op1 = irc->op1;
    fins->op2 = lj_ir_kstr(J, lj_buf_tostr(sb));
    return RETRYFOLD;
  }
  return EMITFOLD;  /* Always emit, CSE later. */
}

/* -- Constant folding of pointer arithmetic ------------------------------ */

LJFOLD(ADD KGC KINT)
LJFOLD(ADD KGC KINT64)
LJFOLDF(kfold_add_kgc)
{
  GCobj *o = ir_kgc(fleft);
#if LJ_64
  ptrdiff_t ofs = (ptrdiff_t)ir_kint64(fright)->u64;
#else
  ptrdiff_t ofs = fright->i;
#endif
#if LJ_HASFFI
  if (irt_iscdata(fleft->t)) {
    CType *ct = ctype_raw(ctype_ctsG(J2G(J)), gco2cd(o)->ctypeid);
    if (ctype_isnum(ct->info) || ctype_isenum(ct->info) ||
	ctype_isptr(ct->info) || ctype_isfunc(ct->info) ||
	ctype_iscomplex(ct->info) || ctype_isvector(ct->info))
      return lj_ir_kkptr(J, (char *)o + ofs);
  }
#endif
  return lj_ir_kptr(J, (char *)o + ofs);
}

LJFOLD(ADD KPTR KINT)
LJFOLD(ADD KPTR KINT64)
LJFOLD(ADD KKPTR KINT)
LJFOLD(ADD KKPTR KINT64)
LJFOLDF(kfold_add_kptr)
{
  void *p = ir_kptr(fleft);
#if LJ_64
  ptrdiff_t ofs = (ptrdiff_t)ir_kint64(fright)->u64;
#else
  ptrdiff_t ofs = fright->i;
#endif
  return lj_ir_kptr_(J, fleft->o, (char *)p + ofs);
}

LJFOLD(ADD any KGC)
LJFOLD(ADD any KPTR)
LJFOLD(ADD any KKPTR)
LJFOLDF(kfold_add_kright)
{
  if (fleft->o == IR_KINT || fleft->o == IR_KINT64) {
    IRRef1 tmp = fins->op1; fins->op1 = fins->op2; fins->op2 = tmp;
    return RETRYFOLD;
  }
  return NEXTFOLD;
}

/* -- Constant folding of conversions ------------------------------------- */

LJFOLD(TOBIT KNUM KNUM)
LJFOLDF(kfold_tobit)
{
  return INTFOLD(lj_num2bit(knumleft));
}

LJFOLD(CONV KINT IRCONV_NUM_INT)
LJFOLDF(kfold_conv_kint_num)
{
  return lj_ir_knum(J, (lua_Number)fleft->i);
}

LJFOLD(CONV KINT IRCONV_NUM_U32)
LJFOLDF(kfold_conv_kintu32_num)
{
  return lj_ir_knum(J, (lua_Number)(uint32_t)fleft->i);
}

LJFOLD(CONV KINT IRCONV_INT_I8)
LJFOLD(CONV KINT IRCONV_INT_U8)
LJFOLD(CONV KINT IRCONV_INT_I16)
LJFOLD(CONV KINT IRCONV_INT_U16)
LJFOLDF(kfold_conv_kint_ext)
{
  int32_t k = fleft->i;
  if ((fins->op2 & IRCONV_SRCMASK) == IRT_I8) k = (int8_t)k;
  else if ((fins->op2 & IRCONV_SRCMASK) == IRT_U8) k = (uint8_t)k;
  else if ((fins->op2 & IRCONV_SRCMASK) == IRT_I16) k = (int16_t)k;
  else k = (uint16_t)k;
  return INTFOLD(k);
}

LJFOLD(CONV KINT IRCONV_I64_INT)
LJFOLD(CONV KINT IRCONV_U64_INT)
LJFOLD(CONV KINT IRCONV_I64_U32)
LJFOLD(CONV KINT IRCONV_U64_U32)
LJFOLDF(kfold_conv_kint_i64)
{
  if ((fins->op2 & IRCONV_SEXT))
    return INT64FOLD((uint64_t)(int64_t)fleft->i);
  else
    return INT64FOLD((uint64_t)(int64_t)(uint32_t)fleft->i);
}

LJFOLD(CONV KINT64 IRCONV_NUM_I64)
LJFOLDF(kfold_conv_kint64_num_i64)
{
  return lj_ir_knum(J, (lua_Number)(int64_t)ir_kint64(fleft)->u64);
}

LJFOLD(CONV KINT64 IRCONV_NUM_U64)
LJFOLDF(kfold_conv_kint64_num_u64)
{
  return lj_ir_knum(J, (lua_Number)ir_kint64(fleft)->u64);
}

LJFOLD(CONV KINT64 IRCONV_INT_I64)
LJFOLD(CONV KINT64 IRCONV_U32_I64)
LJFOLDF(kfold_conv_kint64_int_i64)
{
  return INTFOLD((int32_t)ir_kint64(fleft)->u64);
}

LJFOLD(CONV KNUM IRCONV_INT_NUM)
LJFOLDF(kfold_conv_knum_int_num)
{
  lua_Number n = knumleft;
  int32_t k = lj_num2int(n);
  if (irt_isguard(fins->t) && n != (lua_Number)k) {
    /* We're about to create a guard which always fails, like CONV +1.5.
    ** Some pathological loops cause this during LICM, e.g.:
    **   local x,k,t = 0,1.5,{1,[1.5]=2}
    **   for i=1,200 do x = x+ t[k]; k = k == 1 and 1.5 or 1 end
    **   assert(x == 300)
    */
    return FAILFOLD;
  }
  return INTFOLD(k);
}

LJFOLD(CONV KNUM IRCONV_U32_NUM)
LJFOLDF(kfold_conv_knum_u32_num)
{
#ifdef _MSC_VER
  {  /* Workaround for MSVC bug. */
    volatile uint32_t u = (uint32_t)knumleft;
    return INTFOLD((int32_t)u);
  }
#else
  return INTFOLD((int32_t)(uint32_t)knumleft);
#endif
}

LJFOLD(CONV KNUM IRCONV_I64_NUM)
LJFOLDF(kfold_conv_knum_i64_num)
{
  return INT64FOLD((uint64_t)(int64_t)knumleft);
}

LJFOLD(CONV KNUM IRCONV_U64_NUM)
LJFOLDF(kfold_conv_knum_u64_num)
{
  return INT64FOLD(lj_num2u64(knumleft));
}

LJFOLD(TOSTR KNUM any)
LJFOLDF(kfold_tostr_knum)
{
  return lj_ir_kstr(J, lj_strfmt_num(J->L, ir_knum(fleft)));
}

LJFOLD(TOSTR KINT any)
LJFOLDF(kfold_tostr_kint)
{
  return lj_ir_kstr(J, fins->op2 == IRTOSTR_INT ?
		       lj_strfmt_int(J->L, fleft->i) :
		       lj_strfmt_char(J->L, fleft->i));
}

LJFOLD(STRTO KGC)
LJFOLDF(kfold_strto)
{
  TValue n;
  if (lj_strscan_num(ir_kstr(fleft), &n))
    return lj_ir_knum(J, numV(&n));
  return FAILFOLD;
}

/* -- Constant folding of equality checks --------------------------------- */

/* Don't constant-fold away FLOAD checks against KNULL. */
LJFOLD(EQ FLOAD KNULL)
LJFOLD(NE FLOAD KNULL)
LJFOLDX(lj_opt_cse)

/* But fold all other KNULL compares, since only KNULL is equal to KNULL. */
LJFOLD(EQ any KNULL)
LJFOLD(NE any KNULL)
LJFOLD(EQ KNULL any)
LJFOLD(NE KNULL any)
LJFOLD(EQ KINT KINT)  /* Constants are unique, so same refs <==> same value. */
LJFOLD(NE KINT KINT)
LJFOLD(EQ KINT64 KINT64)
LJFOLD(NE KINT64 KINT64)
LJFOLD(EQ KGC KGC)
LJFOLD(NE KGC KGC)
LJFOLDF(kfold_kref)
{
  return CONDFOLD((fins->op1 == fins->op2) ^ (fins->o == IR_NE));
}

/* -- Algebraic shortcuts ------------------------------------------------- */

LJFOLD(FPMATH FPMATH IRFPM_FLOOR)
LJFOLD(FPMATH FPMATH IRFPM_CEIL)
LJFOLD(FPMATH FPMATH IRFPM_TRUNC)
LJFOLDF(shortcut_round)
{
  IRFPMathOp op = (IRFPMathOp)fleft->op2;
  if (op == IRFPM_FLOOR || op == IRFPM_CEIL || op == IRFPM_TRUNC)
    return LEFTFOLD;  /* round(round_left(x)) = round_left(x) */
  return NEXTFOLD;
}

LJFOLD(ABS ABS FLOAD)
LJFOLDF(shortcut_left)
{
  return LEFTFOLD;  /* f(g(x)) ==> g(x) */
}

LJFOLD(ABS NEG FLOAD)
LJFOLDF(shortcut_dropleft)
{
  PHIBARRIER(fleft);
  fins->op1 = fleft->op1;  /* abs(neg(x)) ==> abs(x) */
  return RETRYFOLD;
}

/* Note: no safe shortcuts with STRTO and TOSTR ("1e2" ==> +100 ==> "100"). */
LJFOLD(NEG NEG any)
LJFOLD(BNOT BNOT)
LJFOLD(BSWAP BSWAP)
971
LJFOLDF(shortcut_leftleft)
×
972
{
973
  PHIBARRIER(fleft);  /* See above. Fold would be ok, but not beneficial. */
×
974
  return fleft->op1;  /* f(g(x)) ==> x */
×
975
}
976

977
/* -- FP algebraic simplifications ---------------------------------------- */
978

979
/* FP arithmetic is tricky -- there's not much to simplify.
980
** Please note the following common pitfalls before sending "improvements":
981
**   x+0 ==> x  is INVALID for x=-0
982
**   0-x ==> -x is INVALID for x=+0
983
**   x*0 ==> 0  is INVALID for x=-0, x=+-Inf or x=NaN
984
*/
985

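The pitfalls listed above can be verified directly in standard C; a minimal sketch (the helper names are illustrative, not part of LuaJIT):

```c
#include <math.h>

/* Nonzero iff folding x+0 ==> x would change the sign of the result.
** True for x = -0: IEEE 754 addition gives (-0) + (+0) == +0. */
static int fold_add_zero_unsafe(double x)
{
  return signbit(x) != signbit(x + 0.0);
}

/* Nonzero iff folding 0-x ==> -x would change the sign of the result.
** True for x = +0: 0 - (+0) is +0, but -(+0) is -0. */
static int fold_zero_sub_unsafe(double x)
{
  return signbit(0.0 - x) != signbit(-x);
}
```
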
LJFOLD(ADD NEG any)
LJFOLDF(simplify_numadd_negx)
{
  PHIBARRIER(fleft);
  fins->o = IR_SUB;  /* (-a) + b ==> b - a */
  fins->op1 = fins->op2;
  fins->op2 = fleft->op1;
  return RETRYFOLD;
}

LJFOLD(ADD any NEG)
LJFOLDF(simplify_numadd_xneg)
{
  PHIBARRIER(fright);
  fins->o = IR_SUB;  /* a + (-b) ==> a - b */
  fins->op2 = fright->op1;
  return RETRYFOLD;
}

LJFOLD(SUB any KNUM)
LJFOLDF(simplify_numsub_k)
{
  lua_Number n = knumright;
  if (n == 0.0)  /* x - (+-0) ==> x */
    return LEFTFOLD;
  return NEXTFOLD;
}

LJFOLD(SUB NEG KNUM)
LJFOLDF(simplify_numsub_negk)
{
  PHIBARRIER(fleft);
  fins->op2 = fleft->op1;  /* (-x) - k ==> (-k) - x */
  fins->op1 = (IRRef1)lj_ir_knum(J, -knumright);
  return RETRYFOLD;
}

LJFOLD(SUB any NEG)
LJFOLDF(simplify_numsub_xneg)
{
  PHIBARRIER(fright);
  fins->o = IR_ADD;  /* a - (-b) ==> a + b */
  fins->op2 = fright->op1;
  return RETRYFOLD;
}

LJFOLD(MUL any KNUM)
LJFOLD(DIV any KNUM)
LJFOLDF(simplify_nummuldiv_k)
{
  lua_Number n = knumright;
  if (n == 1.0) {  /* x o 1 ==> x */
    return LEFTFOLD;
  } else if (n == -1.0) {  /* x o -1 ==> -x */
    IRRef op1 = fins->op1;
    fins->op2 = (IRRef1)lj_ir_ksimd(J, LJ_KSIMD_NEG);  /* Modifies fins. */
    fins->op1 = op1;
    fins->o = IR_NEG;
    return RETRYFOLD;
  } else if (fins->o == IR_MUL && n == 2.0) {  /* x * 2 ==> x + x */
    fins->o = IR_ADD;
    fins->op2 = fins->op1;
    return RETRYFOLD;
  } else if (fins->o == IR_DIV) {  /* x / 2^k ==> x * 2^-k */
    uint64_t u = ir_knum(fright)->u64;
    uint32_t ex = ((uint32_t)(u >> 52) & 0x7ff);
    if ((u & U64x(000fffff,ffffffff)) == 0 && ex - 1 < 0x7fd) {
      u = (u & ((uint64_t)1 << 63)) | ((uint64_t)(0x7fe - ex) << 52);
      fins->o = IR_MUL;  /* Multiply by exact reciprocal. */
      fins->op2 = lj_ir_knum_u64(J, u);
      return RETRYFOLD;
    }
  }
  return NEXTFOLD;
}

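The x / 2^k ==> x * 2^-k rewrite above relies on the reciprocal of a power of two being exactly representable. A standalone sketch of the same mantissa test and exponent flip (the function name is illustrative, not part of LuaJIT):

```c
#include <stdint.h>
#include <string.h>

/* If d is a power of two whose reciprocal is also a normal double,
** store the exact reciprocal in *recip and return 1; else return 0.
** Mirrors the fold rule: all mantissa bits must be zero and the
** biased exponent ex must satisfy ex - 1 < 0x7fd (unsigned compare),
** then the reciprocal is built by replacing ex with 0x7fe - ex while
** keeping the sign bit.
*/
static int exact_reciprocal(double d, double *recip)
{
  uint64_t u;
  uint32_t ex;
  memcpy(&u, &d, sizeof(u));
  ex = (uint32_t)(u >> 52) & 0x7ff;
  if ((u & 0x000fffffffffffffULL) != 0 || ex - 1 >= 0x7fd)
    return 0;  /* Not a power of two, or 1/d not representable. */
  u = (u & ((uint64_t)1 << 63)) | ((uint64_t)(0x7fe - ex) << 52);
  memcpy(recip, &u, sizeof(*recip));
  return 1;
}
```
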
LJFOLD(MUL NEG KNUM)
LJFOLD(DIV NEG KNUM)
LJFOLDF(simplify_nummuldiv_negk)
{
  PHIBARRIER(fleft);
  fins->op1 = fleft->op1;  /* (-a) o k ==> a o (-k) */
  fins->op2 = (IRRef1)lj_ir_knum(J, -knumright);
  return RETRYFOLD;
}

LJFOLD(MUL NEG NEG)
LJFOLD(DIV NEG NEG)
LJFOLDF(simplify_nummuldiv_negneg)
{
  PHIBARRIER(fleft);
  PHIBARRIER(fright);
  fins->op1 = fleft->op1;  /* (-a) o (-b) ==> a o b */
  fins->op2 = fright->op1;
  return RETRYFOLD;
}

LJFOLD(POW any KNUM)
LJFOLDF(simplify_numpow_k)
{
  if (knumright == 0)  /* x ^ 0 ==> 1 */
    return lj_ir_knum_one(J);  /* Result must be a number, not an int. */
  else if (knumright == 1)  /* x ^ 1 ==> x */
    return LEFTFOLD;
  else if (knumright == 2)  /* x ^ 2 ==> x * x */
    return emitir(IRTN(IR_MUL), fins->op1, fins->op1);
  else
    return NEXTFOLD;
}

/* -- Simplify conversions ------------------------------------------------ */

LJFOLD(CONV CONV IRCONV_NUM_INT)  /* _NUM */
LJFOLDF(shortcut_conv_num_int)
{
  PHIBARRIER(fleft);
  /* Only safe with a guarded conversion to int. */
  if ((fleft->op2 & IRCONV_SRCMASK) == IRT_NUM && irt_isguard(fleft->t))
    return fleft->op1;  /* f(g(x)) ==> x */
  return NEXTFOLD;
}

LJFOLD(CONV CONV IRCONV_INT_NUM)  /* _INT */
LJFOLD(CONV CONV IRCONV_U32_NUM)  /* _U32*/
LJFOLDF(simplify_conv_int_num)
{
  /* Fold even across PHI to avoid expensive num->int conversions in loop. */
  if ((fleft->op2 & IRCONV_SRCMASK) ==
      ((fins->op2 & IRCONV_DSTMASK) >> IRCONV_DSH))
    return fleft->op1;
  return NEXTFOLD;
}

LJFOLD(CONV CONV IRCONV_I64_NUM)  /* _INT or _U32 */
LJFOLD(CONV CONV IRCONV_U64_NUM)  /* _INT or _U32 */
LJFOLDF(simplify_conv_i64_num)
{
  PHIBARRIER(fleft);
  if ((fleft->op2 & IRCONV_SRCMASK) == IRT_INT) {
    /* Reduce to a sign-extension. */
    fins->op1 = fleft->op1;
    fins->op2 = ((IRT_I64<<5)|IRT_INT|IRCONV_SEXT);
    return RETRYFOLD;
  } else if ((fleft->op2 & IRCONV_SRCMASK) == IRT_U32) {
#if LJ_TARGET_X64
    return fleft->op1;
#else
    /* Reduce to a zero-extension. */
    fins->op1 = fleft->op1;
    fins->op2 = (IRT_I64<<5)|IRT_U32;
    return RETRYFOLD;
#endif
  }
  return NEXTFOLD;
}

LJFOLD(CONV CONV IRCONV_INT_I64)  /* _INT or _U32 */
LJFOLD(CONV CONV IRCONV_INT_U64)  /* _INT or _U32 */
LJFOLD(CONV CONV IRCONV_U32_I64)  /* _INT or _U32 */
LJFOLD(CONV CONV IRCONV_U32_U64)  /* _INT or _U32 */
LJFOLDF(simplify_conv_int_i64)
{
  int src;
  PHIBARRIER(fleft);
  src = (fleft->op2 & IRCONV_SRCMASK);
  if (src == IRT_INT || src == IRT_U32) {
    if (src == ((fins->op2 & IRCONV_DSTMASK) >> IRCONV_DSH)) {
      return fleft->op1;
    } else {
      fins->op2 = ((fins->op2 & IRCONV_DSTMASK) | src);
      fins->op1 = fleft->op1;
      return RETRYFOLD;
    }
  }
  return NEXTFOLD;
}

LJFOLD(CONV CONV IRCONV_FLOAT_NUM)  /* _FLOAT */
LJFOLDF(simplify_conv_flt_num)
{
  PHIBARRIER(fleft);
  if ((fleft->op2 & IRCONV_SRCMASK) == IRT_FLOAT)
    return fleft->op1;
  return NEXTFOLD;
}

/* Shortcut TOBIT + IRT_NUM <- IRT_INT/IRT_U32 conversion. */
LJFOLD(TOBIT CONV KNUM)
LJFOLDF(simplify_tobit_conv)
{
  /* Fold even across PHI to avoid expensive num->int conversions in loop. */
  if ((fleft->op2 & IRCONV_SRCMASK) == IRT_INT) {
    lj_assertJ(irt_isnum(fleft->t), "expected TOBIT number arg");
    return fleft->op1;
  } else if ((fleft->op2 & IRCONV_SRCMASK) == IRT_U32) {
    lj_assertJ(irt_isnum(fleft->t), "expected TOBIT number arg");
    fins->o = IR_CONV;
    fins->op1 = fleft->op1;
    fins->op2 = (IRT_INT<<5)|IRT_U32;
    return RETRYFOLD;
  }
  return NEXTFOLD;
}

/* Shortcut floor/ceil/round + IRT_NUM <- IRT_INT/IRT_U32 conversion. */
LJFOLD(FPMATH CONV IRFPM_FLOOR)
LJFOLD(FPMATH CONV IRFPM_CEIL)
LJFOLD(FPMATH CONV IRFPM_TRUNC)
LJFOLDF(simplify_floor_conv)
{
  if ((fleft->op2 & IRCONV_SRCMASK) == IRT_INT ||
      (fleft->op2 & IRCONV_SRCMASK) == IRT_U32)
    return LEFTFOLD;
  return NEXTFOLD;
}

/* Strength reduction of widening. */
LJFOLD(CONV any IRCONV_I64_INT)
LJFOLD(CONV any IRCONV_U64_INT)
LJFOLDF(simplify_conv_sext)
{
  IRRef ref = fins->op1;
  int64_t ofs = 0;
  if (!(fins->op2 & IRCONV_SEXT))
    return NEXTFOLD;
  PHIBARRIER(fleft);
  if (fleft->o == IR_XLOAD && (irt_isu8(fleft->t) || irt_isu16(fleft->t)))
    goto ok_reduce;
  if (fleft->o == IR_ADD && irref_isk(fleft->op2)) {
    ofs = (int64_t)IR(fleft->op2)->i;
    ref = fleft->op1;
  }
  /* Use scalar evolution analysis results to strength-reduce sign-extension. */
  if (ref == J->scev.idx) {
    IRRef lo = J->scev.dir ? J->scev.start : J->scev.stop;
    lj_assertJ(irt_isint(J->scev.t), "only int SCEV supported");
    if (lo && IR(lo)->o == IR_KINT && IR(lo)->i + ofs >= 0) {
    ok_reduce:
#if LJ_TARGET_X64
      /* Eliminate widening. All 32 bit ops do an implicit zero-extension. */
      return LEFTFOLD;
#else
      /* Reduce to a (cheaper) zero-extension. */
      fins->op2 &= ~IRCONV_SEXT;
      return RETRYFOLD;
#endif
    }
  }
  return NEXTFOLD;
}

/* Strength reduction of narrowing. */
LJFOLD(CONV ADD IRCONV_INT_I64)
LJFOLD(CONV SUB IRCONV_INT_I64)
LJFOLD(CONV MUL IRCONV_INT_I64)
LJFOLD(CONV ADD IRCONV_INT_U64)
LJFOLD(CONV SUB IRCONV_INT_U64)
LJFOLD(CONV MUL IRCONV_INT_U64)
LJFOLD(CONV ADD IRCONV_U32_I64)
LJFOLD(CONV SUB IRCONV_U32_I64)
LJFOLD(CONV MUL IRCONV_U32_I64)
LJFOLD(CONV ADD IRCONV_U32_U64)
LJFOLD(CONV SUB IRCONV_U32_U64)
LJFOLD(CONV MUL IRCONV_U32_U64)
LJFOLDF(simplify_conv_narrow)
{
  IROp op = (IROp)fleft->o;
  IRType t = irt_type(fins->t);
  IRRef op1 = fleft->op1, op2 = fleft->op2, mode = fins->op2;
  PHIBARRIER(fleft);
  op1 = emitir(IRT(IR_CONV, t), op1, mode);
  op2 = emitir(IRT(IR_CONV, t), op2, mode);
  fins->ot = IRT(op, t);
  fins->op1 = op1;
  fins->op2 = op2;
  return RETRYFOLD;
}

/* Special CSE rule for CONV. */
LJFOLD(CONV any any)
LJFOLDF(cse_conv)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_CSE)) {
    IRRef op1 = fins->op1, op2 = (fins->op2 & IRCONV_MODEMASK);
    uint8_t guard = irt_isguard(fins->t);
    IRRef ref = J->chain[IR_CONV];
    while (ref > op1) {
      IRIns *ir = IR(ref);
      /* Commoning with stronger checks is ok. */
      if (ir->op1 == op1 && (ir->op2 & IRCONV_MODEMASK) == op2 &&
          irt_isguard(ir->t) >= guard)
        return ref;
      ref = ir->prev;
    }
  }
  return EMITFOLD;  /* No fallthrough to regular CSE. */
}

/* FP conversion narrowing. */
LJFOLD(TOBIT ADD KNUM)
LJFOLD(TOBIT SUB KNUM)
LJFOLD(CONV ADD IRCONV_INT_NUM)
LJFOLD(CONV SUB IRCONV_INT_NUM)
LJFOLD(CONV ADD IRCONV_I64_NUM)
LJFOLD(CONV SUB IRCONV_I64_NUM)
LJFOLDF(narrow_convert)
{
  PHIBARRIER(fleft);
  /* Narrowing ignores PHIs and repeating it inside the loop is not useful. */
  if (J->chain[IR_LOOP])
    return NEXTFOLD;
  lj_assertJ(fins->o != IR_CONV || (fins->op2&IRCONV_CONVMASK) != IRCONV_TOBIT,
             "unexpected CONV TOBIT");
  return lj_opt_narrow_convert(J);
}

/* -- Integer algebraic simplifications ----------------------------------- */

LJFOLD(ADD any KINT)
LJFOLD(ADDOV any KINT)
LJFOLD(SUBOV any KINT)
LJFOLDF(simplify_intadd_k)
{
  if (fright->i == 0)  /* i o 0 ==> i */
    return LEFTFOLD;
  return NEXTFOLD;
}

LJFOLD(MULOV any KINT)
LJFOLDF(simplify_intmul_k)
{
  if (fright->i == 0)  /* i * 0 ==> 0 */
    return RIGHTFOLD;
  if (fright->i == 1)  /* i * 1 ==> i */
    return LEFTFOLD;
  if (fright->i == 2) {  /* i * 2 ==> i + i */
    fins->o = IR_ADDOV;
    fins->op2 = fins->op1;
    return RETRYFOLD;
  }
  return NEXTFOLD;
}

LJFOLD(SUB any KINT)
LJFOLDF(simplify_intsub_k)
{
  if (fright->i == 0)  /* i - 0 ==> i */
    return LEFTFOLD;
  fins->o = IR_ADD;  /* i - k ==> i + (-k) */
  fins->op2 = (IRRef1)lj_ir_kint(J, -fright->i);  /* Overflow for -2^31 ok. */
  return RETRYFOLD;
}

LJFOLD(SUB KINT any)
LJFOLD(SUB KINT64 any)
LJFOLDF(simplify_intsub_kleft)
{
  if (fleft->o == IR_KINT ? (fleft->i == 0) : (ir_kint64(fleft)->u64 == 0)) {
    fins->o = IR_NEG;  /* 0 - i ==> -i */
    fins->op1 = fins->op2;
    return RETRYFOLD;
  }
  return NEXTFOLD;
}

LJFOLD(ADD any KINT64)
LJFOLDF(simplify_intadd_k64)
{
  if (ir_kint64(fright)->u64 == 0)  /* i + 0 ==> i */
    return LEFTFOLD;
  return NEXTFOLD;
}

LJFOLD(SUB any KINT64)
LJFOLDF(simplify_intsub_k64)
{
  uint64_t k = ir_kint64(fright)->u64;
  if (k == 0)  /* i - 0 ==> i */
    return LEFTFOLD;
  fins->o = IR_ADD;  /* i - k ==> i + (-k) */
  fins->op2 = (IRRef1)lj_ir_kint64(J, (uint64_t)-(int64_t)k);
  return RETRYFOLD;
}

static TRef simplify_intmul_k(jit_State *J, int32_t k)
{
  /* Note: many more simplifications are possible, e.g. 2^k1 +- 2^k2.
  ** But this is mainly intended for simple address arithmetic.
  ** Also it's easier for the backend to optimize the original multiplies.
  */
  if (k == 0) {  /* i * 0 ==> 0 */
    return RIGHTFOLD;
  } else if (k == 1) {  /* i * 1 ==> i */
    return LEFTFOLD;
  } else if ((k & (k-1)) == 0) {  /* i * 2^k ==> i << k */
    fins->o = IR_BSHL;
    fins->op2 = lj_ir_kint(J, lj_fls((uint32_t)k));
    return RETRYFOLD;
  }
  return NEXTFOLD;
}

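The power-of-two test and shift reduction above can be exercised on plain integers; a sketch where highest_set_bit stands in for LuaJIT's lj_fls (helper names are illustrative):

```c
#include <stdint.h>

/* Index of the highest set bit; a stand-in for LuaJIT's lj_fls. */
static int highest_set_bit(uint32_t k)
{
  int n = 0;
  while (k >>= 1) n++;
  return n;
}

/* i * k, using the i * 2^n ==> i << n reduction when k is a power of
** two: (k & (k-1)) == 0 holds exactly when k has a single set bit. */
static uint32_t mul_by_k(uint32_t i, uint32_t k)
{
  if (k != 0 && (k & (k - 1)) == 0)
    return i << highest_set_bit(k);
  return i * k;  /* Not a power of two: keep the multiply. */
}
```
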
LJFOLD(MUL any KINT)
LJFOLDF(simplify_intmul_k32)
{
  if (fright->i >= 0)
    return simplify_intmul_k(J, fright->i);
  return NEXTFOLD;
}

LJFOLD(MUL any KINT64)
LJFOLDF(simplify_intmul_k64)
{
#if LJ_HASFFI
  if (ir_kint64(fright)->u64 < 0x80000000u)
    return simplify_intmul_k(J, (int32_t)ir_kint64(fright)->u64);
  return NEXTFOLD;
#else
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
#endif
}

LJFOLD(MOD any KINT)
LJFOLDF(simplify_intmod_k)
{
  int32_t k = fright->i;
  lj_assertJ(k != 0, "integer mod 0");
  if (k > 0 && (k & (k-1)) == 0) {  /* i % (2^k) ==> i & (2^k-1) */
    fins->o = IR_BAND;
    fins->op2 = lj_ir_kint(J, k-1);
    return RETRYFOLD;
  }
  return NEXTFOLD;
}

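The i % (2^k) ==> i & (2^k-1) rule above matches Lua's floored MOD semantics (the result takes the sign of the positive divisor), which a two's-complement mask computes even for negative i; C's native %, by contrast, truncates toward zero. A sketch (helper names are illustrative):

```c
#include <stdint.h>

/* Floored modulo, as Lua's % is defined: result has the divisor's sign. */
static int32_t floored_mod(int32_t i, int32_t k)
{
  int32_t r = i % k;                  /* C's truncated remainder. */
  if (r != 0 && (r ^ k) < 0) r += k;  /* Adjust when signs differ. */
  return r;
}

/* The fold's replacement, valid when k > 0 and (k & (k-1)) == 0. */
static int32_t mod_pow2(int32_t i, int32_t k)
{
  return i & (k - 1);
}
```
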
LJFOLD(MOD KINT any)
LJFOLDF(simplify_intmod_kleft)
{
  if (fleft->i == 0)
    return INTFOLD(0);
  return NEXTFOLD;
}

LJFOLD(SUB any any)
LJFOLD(SUBOV any any)
LJFOLDF(simplify_intsub)
{
  if (fins->op1 == fins->op2 && !irt_isnum(fins->t))  /* i - i ==> 0 */
    return irt_is64(fins->t) ? INT64FOLD(0) : INTFOLD(0);
  return NEXTFOLD;
}

LJFOLD(SUB ADD any)
LJFOLDF(simplify_intsubadd_leftcancel)
{
  if (!irt_isnum(fins->t)) {
    PHIBARRIER(fleft);
    if (fins->op2 == fleft->op1)  /* (i + j) - i ==> j */
      return fleft->op2;
    if (fins->op2 == fleft->op2)  /* (i + j) - j ==> i */
      return fleft->op1;
  }
  return NEXTFOLD;
}

LJFOLD(SUB SUB any)
LJFOLDF(simplify_intsubsub_leftcancel)
{
  if (!irt_isnum(fins->t)) {
    PHIBARRIER(fleft);
    if (fins->op2 == fleft->op1) {  /* (i - j) - i ==> 0 - j */
      fins->op1 = (IRRef1)lj_ir_kint(J, 0);
      fins->op2 = fleft->op2;
      return RETRYFOLD;
    }
  }
  return NEXTFOLD;
}

LJFOLD(SUB any SUB)
LJFOLDF(simplify_intsubsub_rightcancel)
{
  if (!irt_isnum(fins->t)) {
    PHIBARRIER(fright);
    if (fins->op1 == fright->op1)  /* i - (i - j) ==> j */
      return fright->op2;
  }
  return NEXTFOLD;
}

LJFOLD(SUB any ADD)
LJFOLDF(simplify_intsubadd_rightcancel)
{
  if (!irt_isnum(fins->t)) {
    PHIBARRIER(fright);
    if (fins->op1 == fright->op1) {  /* i - (i + j) ==> 0 - j */
      fins->op2 = fright->op2;
      fins->op1 = (IRRef1)lj_ir_kint(J, 0);
      return RETRYFOLD;
    }
    if (fins->op1 == fright->op2) {  /* i - (j + i) ==> 0 - j */
      fins->op2 = fright->op1;
      fins->op1 = (IRRef1)lj_ir_kint(J, 0);
      return RETRYFOLD;
    }
  }
  return NEXTFOLD;
}

LJFOLD(SUB ADD ADD)
LJFOLDF(simplify_intsubaddadd_cancel)
{
  if (!irt_isnum(fins->t)) {
    PHIBARRIER(fleft);
    PHIBARRIER(fright);
    if (fleft->op1 == fright->op1) {  /* (i + j1) - (i + j2) ==> j1 - j2 */
      fins->op1 = fleft->op2;
      fins->op2 = fright->op2;
      return RETRYFOLD;
    }
    if (fleft->op1 == fright->op2) {  /* (i + j1) - (j2 + i) ==> j1 - j2 */
      fins->op1 = fleft->op2;
      fins->op2 = fright->op1;
      return RETRYFOLD;
    }
    if (fleft->op2 == fright->op1) {  /* (j1 + i) - (i + j2) ==> j1 - j2 */
      fins->op1 = fleft->op1;
      fins->op2 = fright->op2;
      return RETRYFOLD;
    }
    if (fleft->op2 == fright->op2) {  /* (j1 + i) - (j2 + i) ==> j1 - j2 */
      fins->op1 = fleft->op1;
      fins->op2 = fright->op1;
      return RETRYFOLD;
    }
  }
  return NEXTFOLD;
}

LJFOLD(BAND any KINT)
LJFOLD(BAND any KINT64)
LJFOLDF(simplify_band_k)
{
  int64_t k = fright->o == IR_KINT ? (int64_t)fright->i :
                                     (int64_t)ir_k64(fright)->u64;
  if (k == 0)  /* i & 0 ==> 0 */
    return RIGHTFOLD;
  if (k == -1)  /* i & -1 ==> i */
    return LEFTFOLD;
  return NEXTFOLD;
}

LJFOLD(BOR any KINT)
LJFOLD(BOR any KINT64)
LJFOLDF(simplify_bor_k)
{
  int64_t k = fright->o == IR_KINT ? (int64_t)fright->i :
                                     (int64_t)ir_k64(fright)->u64;
  if (k == 0)  /* i | 0 ==> i */
    return LEFTFOLD;
  if (k == -1)  /* i | -1 ==> -1 */
    return RIGHTFOLD;
  return NEXTFOLD;
}

LJFOLD(BXOR any KINT)
LJFOLD(BXOR any KINT64)
LJFOLDF(simplify_bxor_k)
{
  int64_t k = fright->o == IR_KINT ? (int64_t)fright->i :
                                     (int64_t)ir_k64(fright)->u64;
  if (k == 0)  /* i xor 0 ==> i */
    return LEFTFOLD;
  if (k == -1) {  /* i xor -1 ==> ~i */
    fins->o = IR_BNOT;
    fins->op2 = 0;
    return RETRYFOLD;
  }
  return NEXTFOLD;
}

LJFOLD(BSHL any KINT)
LJFOLD(BSHR any KINT)
LJFOLD(BSAR any KINT)
LJFOLD(BROL any KINT)
LJFOLD(BROR any KINT)
LJFOLDF(simplify_shift_ik)
{
  int32_t mask = irt_is64(fins->t) ? 63 : 31;
  int32_t k = (fright->i & mask);
  if (k == 0)  /* i o 0 ==> i */
    return LEFTFOLD;
  if (k == 1 && fins->o == IR_BSHL) {  /* i << 1 ==> i + i */
    fins->o = IR_ADD;
    fins->op2 = fins->op1;
    return RETRYFOLD;
  }
  if (k != fright->i) {  /* i o k ==> i o (k & mask) */
    fins->op2 = (IRRef1)lj_ir_kint(J, k);
    return RETRYFOLD;
  }
#ifndef LJ_TARGET_UNIFYROT
  if (fins->o == IR_BROR) {  /* bror(i, k) ==> brol(i, (-k)&mask) */
    fins->o = IR_BROL;
    fins->op2 = (IRRef1)lj_ir_kint(J, (-k)&mask);
    return RETRYFOLD;
  }
#endif
  return NEXTFOLD;
}

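The bror(i, k) ==> brol(i, (-k)&mask) rewrite above uses the identity that rotating right by k equals rotating left by the complementary count modulo the word width; a 32-bit sketch (helper names are illustrative):

```c
#include <stdint.h>

/* 32-bit rotate left; the count is masked to 0..31 as in the fold rule. */
static uint32_t rol32(uint32_t x, int k)
{
  k &= 31;
  return k ? (x << k) | (x >> (32 - k)) : x;
}

/* 32-bit rotate right, same masking convention. */
static uint32_t ror32(uint32_t x, int k)
{
  k &= 31;
  return k ? (x >> k) | (x << (32 - k)) : x;
}
```
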
1597
LJFOLD(BSHL any BAND)
1598
LJFOLD(BSHR any BAND)
1599
LJFOLD(BSAR any BAND)
1600
LJFOLD(BROL any BAND)
1601
LJFOLD(BROR any BAND)
1602
LJFOLDF(simplify_shift_andk)
×
1603
{
1604
  IRIns *irk = IR(fright->op2);
×
1605
  PHIBARRIER(fright);
×
1606
  if ((fins->o < IR_BROL ? LJ_TARGET_MASKSHIFT : LJ_TARGET_MASKROT) &&
×
1607
      irk->o == IR_KINT) {  /* i o (j & mask) ==> i o j */
×
1608
    int32_t mask = irt_is64(fins->t) ? 63 : 31;
×
1609
    int32_t k = irk->i & mask;
×
1610
    if (k == mask) {
×
1611
      fins->op2 = fright->op1;
×
1612
      return RETRYFOLD;
×
1613
    }
1614
  }
1615
  return NEXTFOLD;
1616
}
1617

1618
LJFOLD(BSHL KINT any)
1619
LJFOLD(BSHR KINT any)
1620
LJFOLD(BSHL KINT64 any)
1621
LJFOLD(BSHR KINT64 any)
1622
LJFOLDF(simplify_shift1_ki)
2✔
1623
{
1624
  int64_t k = fleft->o == IR_KINT ? (int64_t)fleft->i :
2✔
1625
                                    (int64_t)ir_k64(fleft)->u64;
2✔
1626
  if (k == 0)  /* 0 o i ==> 0 */
2✔
1627
    return LEFTFOLD;
×
1628
  return NEXTFOLD;
1629
}
1630

1631
LJFOLD(BSAR KINT any)
1632
LJFOLD(BROL KINT any)
1633
LJFOLD(BROR KINT any)
1634
LJFOLD(BSAR KINT64 any)
1635
LJFOLD(BROL KINT64 any)
1636
LJFOLD(BROR KINT64 any)
1637
LJFOLDF(simplify_shift2_ki)
2✔
1638
{
1639
  int64_t k = fleft->o == IR_KINT ? (int64_t)fleft->i :
2✔
1640
                                    (int64_t)ir_k64(fleft)->u64;
2✔
1641
  if (k == 0 || k == -1)  /* 0 o i ==> 0; -1 o i ==> -1 */
2✔
1642
    return LEFTFOLD;
×
1643
  return NEXTFOLD;
1644
}
1645

1646
LJFOLD(BSHL BAND KINT)
1647
LJFOLD(BSHR BAND KINT)
1648
LJFOLD(BROL BAND KINT)
1649
LJFOLD(BROR BAND KINT)
1650
LJFOLDF(simplify_shiftk_andk)
6✔
1651
{
1652
  IRIns *irk = IR(fleft->op2);
6✔
1653
  PHIBARRIER(fleft);
6✔
1654
  if (irk->o == IR_KINT) {  /* (i & k1) o k2 ==> (i o k2) & (k1 o k2) */
6✔
1655
    int32_t k = kfold_intop(irk->i, fright->i, (IROp)fins->o);
5✔
1656
    fins->op1 = fleft->op1;
5✔
1657
    fins->op1 = (IRRef1)lj_opt_fold(J);
5✔
1658
    fins->op2 = (IRRef1)lj_ir_kint(J, k);
5✔
1659
    fins->ot = IRTI(IR_BAND);
5✔
1660
    return RETRYFOLD;
5✔
1661
  } else if (irk->o == IR_KINT64) {
1✔
1662
    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, fright->i,
1✔
1663
                                  (IROp)fins->o);
1✔
1664
    IROpT ot = fleft->ot;
1✔
1665
    fins->op1 = fleft->op1;
1✔
1666
    fins->op1 = (IRRef1)lj_opt_fold(J);
1✔
1667
    fins->op2 = (IRRef1)lj_ir_kint64(J, k);
1✔
1668
    fins->ot = ot;
1✔
1669
    return RETRYFOLD;
1✔
1670
  }
1671
  return NEXTFOLD;
1672
}
1673

1674
LJFOLD(BAND BSHL KINT)
1675
LJFOLD(BAND BSHR KINT)
1676
LJFOLDF(simplify_andk_shiftk)
14✔
1677
{
1678
  IRIns *irk = IR(fleft->op2);
14✔
1679
  if (irk->o == IR_KINT &&
14✔
1680
      kfold_intop(-1, irk->i, (IROp)fleft->o) == fright->i)
14✔
1681
    return LEFTFOLD;  /* (i o k1) & k2 ==> i, if (-1 o k1) == k2 */
2✔
1682
  return NEXTFOLD;
1683
}
1684

1685
LJFOLD(BAND BOR KINT)
1686
LJFOLD(BOR BAND KINT)
1687
LJFOLDF(simplify_andor_k)
×
1688
{
1689
  IRIns *irk = IR(fleft->op2);
×
1690
  PHIBARRIER(fleft);
×
1691
  if (irk->o == IR_KINT) {
×
1692
    int32_t k = kfold_intop(irk->i, fright->i, (IROp)fins->o);
×
1693
    /* (i | k1) & k2 ==> i & k2, if (k1 & k2) == 0. */
1694
    /* (i & k1) | k2 ==> i | k2, if (k1 | k2) == -1. */
1695
    if (k == (fins->o == IR_BAND ? 0 : -1)) {
×
1696
      fins->op1 = fleft->op1;
×
1697
      return RETRYFOLD;
×
1698
    }
1699
  }
1700
  return NEXTFOLD;
1701
}
1702

1703
LJFOLD(BAND BOR KINT64)
1704
LJFOLD(BOR BAND KINT64)
1705
LJFOLDF(simplify_andor_k64)
×
1706
{
1707
#if LJ_HASFFI
1708
  IRIns *irk = IR(fleft->op2);
×
1709
  PHIBARRIER(fleft);
×
1710
  if (irk->o == IR_KINT64) {
×
1711
    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, ir_k64(fright)->u64,
×
1712
                                  (IROp)fins->o);
×
1713
    /* (i | k1) & k2 ==> i & k2, if (k1 & k2) == 0. */
1714
    /* (i & k1) | k2 ==> i | k2, if (k1 | k2) == -1. */
1715
    if (k == (fins->o == IR_BAND ? (uint64_t)0 : ~(uint64_t)0)) {
×
1716
      fins->op1 = fleft->op1;
×
1717
      return RETRYFOLD;
×
1718
    }
1719
  }
1720
  return NEXTFOLD;
1721
#else
1722
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
1723
#endif
1724
}
1725

1726
/* -- Reassociation ------------------------------------------------------- */
1727

1728
LJFOLD(ADD ADD KINT)
1729
LJFOLD(MUL MUL KINT)
1730
LJFOLD(BAND BAND KINT)
1731
LJFOLD(BOR BOR KINT)
1732
LJFOLD(BXOR BXOR KINT)
1733
LJFOLDF(reassoc_intarith_k)
1,406✔
1734
{
1735
  IRIns *irk = IR(fleft->op2);
1,406✔
1736
  if (irk->o == IR_KINT) {
1,406✔
1737
    int32_t k = kfold_intop(irk->i, fright->i, (IROp)fins->o);
1,385✔
1738
    if (k == irk->i)  /* (i o k1) o k2 ==> i o k1, if (k1 o k2) == k1. */
1,385✔
1739
      return LEFTFOLD;
14✔
1740
    PHIBARRIER(fleft);
1,371✔
1741
    fins->op1 = fleft->op1;
102✔
1742
    fins->op2 = (IRRef1)lj_ir_kint(J, k);
102✔
1743
    return RETRYFOLD;  /* (i o k1) o k2 ==> i o (k1 o k2) */
102✔
1744
  }
1745
  return NEXTFOLD;
1746
}
1747

1748
LJFOLD(ADD ADD KINT64)
1749
LJFOLD(MUL MUL KINT64)
1750
LJFOLD(BAND BAND KINT64)
1751
LJFOLD(BOR BOR KINT64)
1752
LJFOLD(BXOR BXOR KINT64)
1753
LJFOLDF(reassoc_intarith_k64)
376✔
1754
{
1755
#if LJ_HASFFI
1756
  IRIns *irk = IR(fleft->op2);
376✔
1757
  if (irk->o == IR_KINT64) {
376✔
1758
    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, ir_k64(fright)->u64,
101✔
1759
                                  (IROp)fins->o);
101✔
1760
    PHIBARRIER(fleft);
101✔
1761
    fins->op1 = fleft->op1;
94✔
1762
    fins->op2 = (IRRef1)lj_ir_kint64(J, k);
94✔
1763
    return RETRYFOLD;  /* (i o k1) o k2 ==> i o (k1 o k2) */
94✔
1764
  }
1765
  return NEXTFOLD;
1766
#else
1767
  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
1768
#endif
1769
}
1770

1771
LJFOLD(BAND BAND any)
1772
LJFOLD(BOR BOR any)
1773
LJFOLDF(reassoc_dup)
×
1774
{
1775
  if (fins->op2 == fleft->op1 || fins->op2 == fleft->op2)
×
1776
    return LEFTFOLD;  /* (a o b) o a ==> a o b; (a o b) o b ==> a o b */
×
1777
  return NEXTFOLD;
1778
}
1779

1780
LJFOLD(MIN MIN any)
1781
LJFOLD(MAX MAX any)
1782
LJFOLDF(reassoc_dup_minmax)
6✔
1783
{
1784
  if (fins->op2 == fleft->op2)
6✔
1785
    return LEFTFOLD;  /* (a o b) o b ==> a o b */
×
1786
  return NEXTFOLD;
1787
}
1788

1789
LJFOLD(BXOR BXOR any)
1790
LJFOLDF(reassoc_bxor)
1✔
1791
{
1792
  PHIBARRIER(fleft);
1✔
1793
  if (fins->op2 == fleft->op1)  /* (a xor b) xor a ==> b */
1✔
1794
    return fleft->op2;
×
1795
  if (fins->op2 == fleft->op2)  /* (a xor b) xor b ==> a */
1✔
1796
    return fleft->op1;
×
1797
  return NEXTFOLD;
1798
}
1799

1800
LJFOLD(BSHL BSHL KINT)
1801
LJFOLD(BSHR BSHR KINT)
1802
LJFOLD(BSAR BSAR KINT)
1803
LJFOLD(BROL BROL KINT)
1804
LJFOLD(BROR BROR KINT)
1805
LJFOLDF(reassoc_shift)
1✔
1806
{
1807
  IRIns *irk = IR(fleft->op2);
1✔
1808
  PHIBARRIER(fleft);  /* The (shift any KINT) rule covers k2 == 0 and more. */
1✔
1809
  if (irk->o == IR_KINT) {  /* (i o k1) o k2 ==> i o (k1 + k2) */
1✔
1810
    int32_t mask = irt_is64(fins->t) ? 63 : 31;
1✔
1811
    int32_t k = (irk->i & mask) + (fright->i & mask);
1✔
1812
    if (k > mask) {  /* Combined shift too wide? */
1✔
1813
      if (fins->o == IR_BSHL || fins->o == IR_BSHR)
×
1814
        return mask == 31 ? INTFOLD(0) : INT64FOLD(0);
×
1815
      else if (fins->o == IR_BSAR)
×
1816
        k = mask;
1817
      else
1818
        k &= mask;
×
1819
    }
1820
    fins->op1 = fleft->op1;
1✔
1821
    fins->op2 = (IRRef1)lj_ir_kint(J, k);
1✔
1822
    return RETRYFOLD;
1✔
1823
  }
1824
  return NEXTFOLD;
1825
}
1826

1827
LJFOLD(MIN MIN KINT)
1828
LJFOLD(MAX MAX KINT)
LJFOLDF(reassoc_minmax_k)
{
  IRIns *irk = IR(fleft->op2);
  if (irk->o == IR_KINT) {
    int32_t a = irk->i;
    int32_t y = kfold_intop(a, fright->i, fins->o);
    if (a == y)  /* (x o k1) o k2 ==> x o k1, if (k1 o k2) == k1. */
      return LEFTFOLD;
    PHIBARRIER(fleft);
    fins->op1 = fleft->op1;
    fins->op2 = (IRRef1)lj_ir_kint(J, y);
    return RETRYFOLD;  /* (x o k1) o k2 ==> x o (k1 o k2) */
  }
  return NEXTFOLD;
}

/* -- Array bounds check elimination -------------------------------------- */

/* Eliminate ABC across PHIs to handle t[i-1] forwarding case.
** ABC(asize, (i+k)+(-k)) ==> ABC(asize, i), but only if it already exists.
** Could be generalized to (i+k1)+k2 ==> i+(k1+k2), but needs better disambig.
*/
LJFOLD(ABC any ADD)
LJFOLDF(abc_fwd)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_ABC)) {
    if (irref_isk(fright->op2)) {
      IRIns *add2 = IR(fright->op1);
      if (add2->o == IR_ADD && irref_isk(add2->op2) &&
          IR(fright->op2)->i == -IR(add2->op2)->i) {
        IRRef ref = J->chain[IR_ABC];
        IRRef lim = add2->op1;
        if (fins->op1 > lim) lim = fins->op1;
        while (ref > lim) {
          IRIns *ir = IR(ref);
          if (ir->op1 == fins->op1 && ir->op2 == add2->op1)
            return DROPFOLD;
          ref = ir->prev;
        }
      }
    }
  }
  return NEXTFOLD;
}

/* Eliminate ABC for constants.
** ABC(asize, k1), ABC(asize, k2) ==> ABC(asize, max(k1, k2))
** Drop second ABC if k2 is lower. Otherwise patch first ABC with k2.
*/
LJFOLD(ABC any KINT)
LJFOLDF(abc_k)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_ABC)) {
    IRRef ref = J->chain[IR_ABC];
    IRRef asize = fins->op1;
    while (ref > asize) {
      IRIns *ir = IR(ref);
      if (ir->op1 == asize && irref_isk(ir->op2)) {
        int32_t k = IR(ir->op2)->i;
        if (fright->i > k)
          ir->op2 = fins->op2;
        return DROPFOLD;
      }
      ref = ir->prev;
    }
    return EMITFOLD;  /* Already performed CSE. */
  }
  return NEXTFOLD;
}

/* Eliminate invariant ABC inside loop. */
LJFOLD(ABC any any)
LJFOLDF(abc_invar)
{
  /* Invariant ABC marked as PTR. Drop if op1 is invariant, too. */
  if (!irt_isint(fins->t) && fins->op1 < J->chain[IR_LOOP] &&
      !irt_isphi(IR(fins->op1)->t))
    return DROPFOLD;
  return NEXTFOLD;
}

/* -- Commutativity ------------------------------------------------------- */

/* The refs of commutative ops are canonicalized. Lower refs go to the right.
** Rationale behind this:
** - It (also) moves constants to the right.
** - It reduces the number of FOLD rules (e.g. (BOR any KINT) suffices).
** - It helps CSE to find more matches.
** - The assembler generates better code with constants at the right.
*/

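/* Illustrative example (not part of the rule set): recording 3 + x yields
** ADD(KINT(3), x). Constant refs sort below REF_BIAS, so the KINT ref is
** the lower one and comm_swap below canonicalizes this to ADD(x, KINT(3)),
** which a single (ADD any KINT) fold rule can then match.
*/
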
LJFOLD(ADD any any)
LJFOLD(MUL any any)
LJFOLD(ADDOV any any)
LJFOLD(MULOV any any)
LJFOLDF(comm_swap)
{
  if (fins->op1 < fins->op2) {  /* Move lower ref to the right. */
    IRRef1 tmp = fins->op1;
    fins->op1 = fins->op2;
    fins->op2 = tmp;
    return RETRYFOLD;
  }
  return NEXTFOLD;
}

LJFOLD(EQ any any)
LJFOLD(NE any any)
LJFOLDF(comm_equal)
{
  /* For non-numbers only: x == x ==> drop; x ~= x ==> fail */
  if (fins->op1 == fins->op2 && !irt_isnum(fins->t))
    return CONDFOLD(fins->o == IR_EQ);
  return fold_comm_swap(J);
}

LJFOLD(LT any any)
LJFOLD(GE any any)
LJFOLD(LE any any)
LJFOLD(GT any any)
LJFOLD(ULT any any)
LJFOLD(UGE any any)
LJFOLD(ULE any any)
LJFOLD(UGT any any)
LJFOLDF(comm_comp)
{
  /* For non-numbers only: x <=> x ==> drop; x <> x ==> fail */
  if (fins->op1 == fins->op2 && !irt_isnum(fins->t))
    return CONDFOLD((fins->o ^ (fins->o >> 1)) & 1);
  if (fins->op1 < fins->op2) {  /* Move lower ref to the right. */
    IRRef1 tmp = fins->op1;
    fins->op1 = fins->op2;
    fins->op2 = tmp;
    fins->o ^= 3;  /* GT <-> LT, GE <-> LE, does not affect U */
    return RETRYFOLD;
  }
  return NEXTFOLD;
}

LJFOLD(BAND any any)
LJFOLD(BOR any any)
LJFOLDF(comm_dup)
{
  if (fins->op1 == fins->op2)  /* x o x ==> x */
    return LEFTFOLD;
  return fold_comm_swap(J);
}

LJFOLD(MIN any any)
LJFOLD(MAX any any)
LJFOLDF(comm_dup_minmax)
{
  if (fins->op1 == fins->op2)  /* x o x ==> x */
    return LEFTFOLD;
  return NEXTFOLD;
}

LJFOLD(BXOR any any)
LJFOLDF(comm_bxor)
{
  if (fins->op1 == fins->op2)  /* i xor i ==> 0 */
    return irt_is64(fins->t) ? INT64FOLD(0) : INTFOLD(0);
  return fold_comm_swap(J);
}

/* -- Simplification of compound expressions ------------------------------ */

static TRef kfold_xload(jit_State *J, IRIns *ir, const void *p)
{
  int32_t k;
  switch (irt_type(ir->t)) {
  case IRT_NUM: return lj_ir_knum_u64(J, *(uint64_t *)p);
  case IRT_I8: k = (int32_t)*(int8_t *)p; break;
  case IRT_U8: k = (int32_t)*(uint8_t *)p; break;
  case IRT_I16: k = (int32_t)(int16_t)lj_getu16(p); break;
  case IRT_U16: k = (int32_t)(uint16_t)lj_getu16(p); break;
  case IRT_INT: case IRT_U32: k = (int32_t)lj_getu32(p); break;
  case IRT_I64: case IRT_U64: return lj_ir_kint64(J, *(uint64_t *)p);
  default: return 0;
  }
  return lj_ir_kint(J, k);
}

/* Turn: string.sub(str, a, b) == kstr
** into: string.byte(str, a) == string.byte(kstr, 1) etc.
** Note: this creates unaligned XLOADs on x86/x64.
*/
LJFOLD(EQ SNEW KGC)
LJFOLD(NE SNEW KGC)
LJFOLDF(merge_eqne_snew_kgc)
{
  GCstr *kstr = ir_kstr(fright);
  int32_t len = (int32_t)kstr->len;
  lj_assertJ(irt_isstr(fins->t), "bad equality IR type");

#if LJ_TARGET_UNALIGNED
#define FOLD_SNEW_MAX_LEN  4  /* Handle string lengths 0, 1, 2, 3, 4. */
#define FOLD_SNEW_TYPE8    IRT_I8  /* Creates shorter immediates. */
#else
#define FOLD_SNEW_MAX_LEN  1  /* Handle string lengths 0 or 1. */
#define FOLD_SNEW_TYPE8    IRT_U8  /* Prefer unsigned loads. */
#endif

  PHIBARRIER(fleft);
  if (len <= FOLD_SNEW_MAX_LEN) {
    IROp op = (IROp)fins->o;
    IRRef strref = fleft->op1;
    if (IR(strref)->o != IR_STRREF)
      return NEXTFOLD;
    if (op == IR_EQ) {
      emitir(IRTGI(IR_EQ), fleft->op2, lj_ir_kint(J, len));
      /* Caveat: fins/fleft/fright is no longer valid after emitir. */
    } else {
      /* NE is not expanded since this would need an OR of two conds. */
      if (!irref_isk(fleft->op2))  /* Only handle the constant length case. */
        return NEXTFOLD;
      if (IR(fleft->op2)->i != len)
        return DROPFOLD;
    }
    if (len > 0) {
      /* A 4 byte load for length 3 is ok -- all strings have an extra NUL. */
      uint16_t ot = (uint16_t)(len == 1 ? IRT(IR_XLOAD, FOLD_SNEW_TYPE8) :
                               len == 2 ? IRT(IR_XLOAD, IRT_U16) :
                               IRTI(IR_XLOAD));
      TRef tmp = emitir(ot, strref,
                        IRXLOAD_READONLY | (len > 1 ? IRXLOAD_UNALIGNED : 0));
      TRef val = kfold_xload(J, IR(tref_ref(tmp)), strdata(kstr));
      if (len == 3)
        tmp = emitir(IRTI(IR_BAND), tmp,
                     lj_ir_kint(J, LJ_ENDIAN_SELECT(0x00ffffff, 0xffffff00)));
      fins->op1 = (IRRef1)tmp;
      fins->op2 = (IRRef1)val;
      fins->ot = (IROpT)IRTGI(op);
      return RETRYFOLD;
    } else {
      return DROPFOLD;
    }
  }
  return NEXTFOLD;
}
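
/* Illustrative example: for string.sub(s, i, i+2) == "abc" the rule above
** guards the length (EQ len +3), loads 4 bytes from the string data with a
** READONLY|UNALIGNED XLOAD, masks off the spare byte with BAND +0x00ffffff
** (little-endian case), and compares against the constant bytes of "abc"
** folded by kfold_xload() -- no new string needs to be interned.
*/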

/* -- Loads --------------------------------------------------------------- */

/* Loads cannot be folded or passed on to CSE in general.
** Alias analysis is needed to check for forwarding opportunities.
**
** Caveat: *all* loads must be listed here or they end up at CSE!
*/

LJFOLD(ALOAD any)
LJFOLDX(lj_opt_fwd_aload)

/* From HREF fwd (see below). Must eliminate, not supported by fwd/backend. */
LJFOLD(HLOAD KKPTR)
LJFOLDF(kfold_hload_kkptr)
{
  UNUSED(J);
  lj_assertJ(ir_kptr(fleft) == niltvg(J2G(J)), "expected niltv");
  return TREF_NIL;
}

LJFOLD(HLOAD any)
LJFOLDX(lj_opt_fwd_hload)

LJFOLD(ULOAD any)
LJFOLDX(lj_opt_fwd_uload)

LJFOLD(CALLL any IRCALL_lj_tab_len)
LJFOLDX(lj_opt_fwd_tab_len)

/* Upvalue refs are really loads, but there are no corresponding stores.
** So CSE is ok for them, except for UREFO across a GC step (see below).
** If the referenced function is const, its upvalue addresses are const, too.
** This can be used to improve CSE by looking for the same address,
** even if the upvalues originate from a different function.
*/
LJFOLD(UREFO KGC any)
LJFOLD(UREFC KGC any)
LJFOLDF(cse_uref)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_CSE)) {
    IRRef ref = J->chain[fins->o];
    GCfunc *fn = ir_kfunc(fleft);
    GCupval *uv = gco2uv(gcref(fn->l.uvptr[(fins->op2 >> 8)]));
    while (ref > 0) {
      IRIns *ir = IR(ref);
      if (irref_isk(ir->op1)) {
        GCfunc *fn2 = ir_kfunc(IR(ir->op1));
        if (gco2uv(gcref(fn2->l.uvptr[(ir->op2 >> 8)])) == uv) {
          if (fins->o == IR_UREFO && gcstep_barrier(J, ref))
            break;
          return ref;
        }
      }
      ref = ir->prev;
    }
  }
  return EMITFOLD;
}

LJFOLD(HREFK any any)
LJFOLDX(lj_opt_fwd_hrefk)

LJFOLD(HREF TNEW any)
LJFOLDF(fwd_href_tnew)
{
  if (lj_opt_fwd_href_nokey(J))
    return lj_ir_kkptr(J, niltvg(J2G(J)));
  return NEXTFOLD;
}

LJFOLD(HREF TDUP KPRI)
LJFOLD(HREF TDUP KGC)
LJFOLD(HREF TDUP KNUM)
LJFOLDF(fwd_href_tdup)
{
  TValue keyv;
  lj_ir_kvalue(J->L, &keyv, fright);
  if (lj_tab_get(J->L, ir_ktab(IR(fleft->op1)), &keyv) == niltvg(J2G(J)) &&
      lj_opt_fwd_href_nokey(J))
    return lj_ir_kkptr(J, niltvg(J2G(J)));
  return NEXTFOLD;
}

/* We can safely FOLD/CSE array/hash refs and field loads, since there
** are no corresponding stores. But we need to check for any NEWREF with
** an aliased table, as it may invalidate all of the pointers and fields.
** Only HREF needs the NEWREF check -- AREF and HREFK already depend on
** FLOADs. And NEWREF itself is treated like a store (see below).
** LREF is constant (per trace) since coroutine switches are not inlined.
*/
LJFOLD(FLOAD TNEW IRFL_TAB_ASIZE)
LJFOLDF(fload_tab_tnew_asize)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD) && lj_opt_fwd_tptr(J, fins->op1))
    return INTFOLD(fleft->op1);
  return NEXTFOLD;
}

LJFOLD(FLOAD TNEW IRFL_TAB_HMASK)
LJFOLDF(fload_tab_tnew_hmask)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD) && lj_opt_fwd_tptr(J, fins->op1))
    return INTFOLD((1 << fleft->op2)-1);
  return NEXTFOLD;
}

LJFOLD(FLOAD TDUP IRFL_TAB_ASIZE)
LJFOLDF(fload_tab_tdup_asize)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD) && lj_opt_fwd_tptr(J, fins->op1))
    return INTFOLD((int32_t)ir_ktab(IR(fleft->op1))->asize);
  return NEXTFOLD;
}

LJFOLD(FLOAD TDUP IRFL_TAB_HMASK)
LJFOLDF(fload_tab_tdup_hmask)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD) && lj_opt_fwd_tptr(J, fins->op1))
    return INTFOLD((int32_t)ir_ktab(IR(fleft->op1))->hmask);
  return NEXTFOLD;
}

LJFOLD(HREF any any)
LJFOLD(FLOAD any IRFL_TAB_ARRAY)
LJFOLD(FLOAD any IRFL_TAB_NODE)
LJFOLD(FLOAD any IRFL_TAB_ASIZE)
LJFOLD(FLOAD any IRFL_TAB_HMASK)
LJFOLDF(fload_tab_ah)
{
  TRef tr = lj_opt_cse(J);
  return lj_opt_fwd_tptr(J, tref_ref(tr)) ? tr : EMITFOLD;
}

/* Strings are immutable, so we can safely FOLD/CSE the related FLOAD. */
LJFOLD(FLOAD KGC IRFL_STR_LEN)
LJFOLDF(fload_str_len_kgc)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD))
    return INTFOLD((int32_t)ir_kstr(fleft)->len);
  return NEXTFOLD;
}

LJFOLD(FLOAD SNEW IRFL_STR_LEN)
LJFOLDF(fload_str_len_snew)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD)) {
    PHIBARRIER(fleft);
    return fleft->op2;
  }
  return NEXTFOLD;
}

LJFOLD(FLOAD TOSTR IRFL_STR_LEN)
LJFOLDF(fload_str_len_tostr)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD) && fleft->op2 == IRTOSTR_CHAR)
    return INTFOLD(1);
  return NEXTFOLD;
}

/* The C type ID of cdata objects is immutable. */
LJFOLD(FLOAD KGC IRFL_CDATA_CTYPEID)
LJFOLDF(fload_cdata_typeid_kgc)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD))
    return INTFOLD((int32_t)ir_kcdata(fleft)->ctypeid);
  return NEXTFOLD;
}

/* Get the contents of immutable cdata objects. */
LJFOLD(FLOAD KGC IRFL_CDATA_PTR)
LJFOLD(FLOAD KGC IRFL_CDATA_INT)
LJFOLD(FLOAD KGC IRFL_CDATA_INT64)
LJFOLDF(fload_cdata_int64_kgc)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD)) {
    void *p = cdataptr(ir_kcdata(fleft));
    if (irt_is64(fins->t))
      return INT64FOLD(*(uint64_t *)p);
    else
      return INTFOLD(*(int32_t *)p);
  }
  return NEXTFOLD;
}

LJFOLD(FLOAD CNEW IRFL_CDATA_CTYPEID)
LJFOLD(FLOAD CNEWI IRFL_CDATA_CTYPEID)
LJFOLDF(fload_cdata_typeid_cnew)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD))
    return fleft->op1;  /* No PHI barrier needed. CNEW/CNEWI op1 is const. */
  return NEXTFOLD;
}

/* Pointer, int and int64 cdata objects are immutable. */
LJFOLD(FLOAD CNEWI IRFL_CDATA_PTR)
LJFOLD(FLOAD CNEWI IRFL_CDATA_INT)
LJFOLD(FLOAD CNEWI IRFL_CDATA_INT64)
LJFOLDF(fload_cdata_ptr_int64_cnew)
{
  if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD))
    return fleft->op2;  /* Fold even across PHI to avoid allocations. */
  return NEXTFOLD;
}

LJFOLD(FLOAD any IRFL_STR_LEN)
LJFOLD(FLOAD any IRFL_FUNC_ENV)
LJFOLD(FLOAD any IRFL_THREAD_ENV)
LJFOLD(FLOAD any IRFL_CDATA_CTYPEID)
LJFOLD(FLOAD any IRFL_CDATA_PTR)
LJFOLD(FLOAD any IRFL_CDATA_INT)
LJFOLD(FLOAD any IRFL_CDATA_INT64)
LJFOLD(VLOAD any any)  /* Vararg loads have no corresponding stores. */
LJFOLDX(lj_opt_cse)

/* All other field loads need alias analysis. */
LJFOLD(FLOAD any any)
LJFOLDX(lj_opt_fwd_fload)

/* This is for LOOP only. Recording handles SLOADs internally. */
LJFOLD(SLOAD any any)
LJFOLDF(fwd_sload)
{
  if ((fins->op2 & IRSLOAD_FRAME)) {
    TRef tr = lj_opt_cse(J);
    return tref_ref(tr) < J->chain[IR_RETF] ? EMITFOLD : tr;
  } else {
    lj_assertJ(J->slot[fins->op1] != 0, "uninitialized slot accessed");
    return J->slot[fins->op1];
  }
}

/* Only fold for KKPTR. The pointer _and_ the contents must be const. */
LJFOLD(XLOAD KKPTR any)
LJFOLDF(xload_kptr)
{
  TRef tr = kfold_xload(J, fins, ir_kptr(fleft));
  return tr ? tr : NEXTFOLD;
}

LJFOLD(XLOAD any any)
LJFOLDX(lj_opt_fwd_xload)

/* -- Write barriers ------------------------------------------------------ */

/* Write barriers are amenable to CSE, but not across any incremental
** GC steps.
**
** The same logic applies to open upvalue references, because a stack
** may be resized during a GC step (not the current stack, but maybe that
** of a coroutine).
*/
LJFOLD(TBAR any)
LJFOLD(OBAR any any)
LJFOLD(UREFO any any)
LJFOLDF(barrier_tab)
{
  TRef tr = lj_opt_cse(J);
  if (gcstep_barrier(J, tref_ref(tr)))  /* CSE across GC step? */
    return EMITFOLD;  /* Raw emit. Assumes fins is left intact by CSE. */
  return tr;
}

LJFOLD(TBAR TNEW)
LJFOLD(TBAR TDUP)
LJFOLDF(barrier_tnew_tdup)
{
  /* New tables are always white and never need a barrier. */
  if (fins->op1 < J->chain[IR_LOOP])  /* Except across a GC step. */
    return NEXTFOLD;
  return DROPFOLD;
}

/* -- Profiling ----------------------------------------------------------- */

LJFOLD(PROF any any)
LJFOLDF(prof)
{
  IRRef ref = J->chain[IR_PROF];
  if (ref+1 == J->cur.nins)  /* Drop neighbouring IR_PROF. */
    return ref;
  return EMITFOLD;
}

/* -- Stores and allocations ---------------------------------------------- */

/* Stores and allocations cannot be folded or passed on to CSE in general.
** But some stores can be eliminated with dead-store elimination (DSE).
**
** Caveat: *all* stores and allocs must be listed here or they end up at CSE!
*/

LJFOLD(ASTORE any any)
LJFOLD(HSTORE any any)
LJFOLDX(lj_opt_dse_ahstore)

LJFOLD(USTORE any any)
LJFOLDX(lj_opt_dse_ustore)

LJFOLD(FSTORE any any)
LJFOLDX(lj_opt_dse_fstore)

LJFOLD(XSTORE any any)
LJFOLDX(lj_opt_dse_xstore)

LJFOLD(NEWREF any any)  /* Treated like a store. */
LJFOLD(CALLA any any)
LJFOLD(CALLL any any)  /* Safeguard fallback. */
LJFOLD(CALLS any any)
LJFOLD(CALLXS any any)
LJFOLD(XBAR)
LJFOLD(RETF any any)  /* Modifies BASE. */
LJFOLD(TNEW any any)
LJFOLD(TDUP any)
LJFOLD(CNEW any any)
LJFOLD(XSNEW any any)
LJFOLD(BUFHDR any any)
LJFOLDX(lj_ir_emit)

/* ------------------------------------------------------------------------ */

/* Every entry in the generated hash table is a 32 bit pattern:
**
** xxxxxxxx iiiiiii lllllll rrrrrrrrrr
**
**   xxxxxxxx = 8 bit index into fold function table
**    iiiiiii = 7 bit folded instruction opcode
**    lllllll = 7 bit left instruction opcode
** rrrrrrrrrr = 8 bit right instruction opcode or 10 bits from literal field
*/

2401
#include "lj_folddef.h"
2402

2403
/* ------------------------------------------------------------------------ */
2404

2405
/* Fold IR instruction. */
2406
TRef LJ_FASTCALL lj_opt_fold(jit_State *J)
136,234✔
2407
{
2408
  uint32_t key, any;
136,234✔
2409
  IRRef ref;
136,234✔
2410

2411
  if (LJ_UNLIKELY((J->flags & JIT_F_OPT_MASK) != JIT_F_OPT_DEFAULT)) {
136,234✔
2412
    lj_assertJ(((JIT_F_OPT_FOLD|JIT_F_OPT_FWD|JIT_F_OPT_CSE|JIT_F_OPT_DSE) |
2,780✔
2413
                JIT_F_OPT_DEFAULT) == JIT_F_OPT_DEFAULT,
2414
               "bad JIT_F_OPT_DEFAULT");
2415
    /* Folding disabled? Chain to CSE, but not for loads/stores/allocs. */
2416
    if (!(J->flags & JIT_F_OPT_FOLD) && irm_kind(lj_ir_mode[fins->o]) == IRM_N)
2,780✔
2417
      return lj_opt_cse(J);
454✔
2418

2419
    /* No FOLD, forwarding or CSE? Emit raw IR for loads, except for SLOAD. */
2420
    if ((J->flags & (JIT_F_OPT_FOLD|JIT_F_OPT_FWD|JIT_F_OPT_CSE)) !=
2,326✔
2421
                    (JIT_F_OPT_FOLD|JIT_F_OPT_FWD|JIT_F_OPT_CSE) &&
723✔
2422
        irm_kind(lj_ir_mode[fins->o]) == IRM_L && fins->o != IR_SLOAD)
723✔
2423
      return lj_ir_emit(J);
512✔
2424

2425
    /* No FOLD or DSE? Emit raw IR for stores. */
2426
    if ((J->flags & (JIT_F_OPT_FOLD|JIT_F_OPT_DSE)) !=
1,814✔
2427
                    (JIT_F_OPT_FOLD|JIT_F_OPT_DSE) &&
55✔
2428
        irm_kind(lj_ir_mode[fins->o]) == IRM_S)
55✔
2429
      return lj_ir_emit(J);
24✔
2430
  }
2431

2432
  /* Fold engine start/retry point. */
2433
retry:
135,244✔
2434
  /* Construct key from opcode and operand opcodes (unless literal/none). */
2435
  key = ((uint32_t)fins->o << 17);
138,439✔
2436
  if (fins->op1 >= J->cur.nk) {
138,439✔
2437
    key += (uint32_t)IR(fins->op1)->o << 10;
133,694✔
2438
    *fleft = *IR(fins->op1);
133,694✔
2439
    if (fins->op1 < REF_TRUE)
133,694✔
2440
      fleft[1] = IR(fins->op1)[1];
15,939✔
2441
  }
2442
  if (fins->op2 >= J->cur.nk) {
138,439✔
2443
    key += (uint32_t)IR(fins->op2)->o;
75,280✔
2444
    *fright = *IR(fins->op2);
75,280✔
2445
    if (fins->op2 < REF_TRUE)
75,280✔
2446
      fright[1] = IR(fins->op2)[1];
50,612✔
2447
  } else {
2448
    key += (fins->op2 & 0x3ffu);  /* Literal mask. Must include IRCONV_*MASK. */
63,159✔
2449
  }
2450

2451
  /* Check for a match in order from most specific to least specific. */
2452
  any = 0;
138,439✔
2453
  for (;;) {
690,471✔
2454
    uint32_t k = key | (any & 0x1ffff);
414,455✔
2455
    uint32_t h = fold_hashkey(k);
414,455✔
2456
    uint32_t fh = fold_hash[h];  /* Lookup key in semi-perfect hash table. */
414,455✔
2457
    if ((fh & 0xffffff) == k || (fh = fold_hash[h+1], (fh & 0xffffff) == k)) {
414,455✔
2458
      ref = (IRRef)tref_ref(fold_func[fh >> 24](J));
137,541✔
2459
      if (ref != NEXTFOLD)
137,541✔
2460
        break;
2461
    }
2462
    if (any == 0xfffff)  /* Exhausted folding. Pass on to CSE. */
319,489✔
2463
      return lj_opt_cse(J);
43,473✔
2464
    any = (any | (any >> 10)) ^ 0xffc00;
276,016✔
2465
  }
2466

2467
  /* Return value processing, ordered by frequency. */
2468
  if (LJ_LIKELY(ref >= MAX_FOLD))
94,966✔
2469
    return TREF(ref, irt_t(IR(ref)->t));
82,769✔
2470
  if (ref == RETRYFOLD)
12,197✔
2471
    goto retry;
3,195✔
2472
  if (ref == KINTFOLD)
9,002✔
2473
    return lj_ir_kint(J, fins->i);
1,020✔
2474
  if (ref == FAILFOLD)
7,982✔
2475
    lj_trace_err(J, LJ_TRERR_GFAIL);
1✔
2476
  lj_assertJ(ref == DROPFOLD, "bad fold result");
2477
  return REF_DROP;
2478
}
2479

2480
/* -- Common-Subexpression Elimination ------------------------------------ */
2481

2482
/* CSE an IR instruction. This is very fast due to the skip-list chains. */
2483
TRef LJ_FASTCALL lj_opt_cse(jit_State *J)
89,191✔
2484
{
2485
  /* Avoid narrow to wide store-to-load forwarding stall */
2486
  IRRef2 op12 = (IRRef2)fins->op1 + ((IRRef2)fins->op2 << 16);
89,191✔
2487
  IROp op = fins->o;
89,191✔
2488
  if (LJ_LIKELY(J->flags & JIT_F_OPT_CSE)) {
89,191✔
2489
    /* Limited search for same operands in per-opcode chain. */
2490
    IRRef ref = J->chain[op];
88,737✔
2491
    IRRef lim = fins->op1;
88,737✔
2492
    if (fins->op2 > lim) lim = fins->op2;  /* Relies on lit < REF_BIAS. */
88,737✔
2493
    while (ref > lim) {
181,635✔
2494
      if (IR(ref)->op12 == op12)
127,196✔
2495
        return TREF(ref, irt_t(IR(ref)->t));  /* Common subexpression found. */
34,298✔
2496
      ref = IR(ref)->prev;
92,898✔
2497
    }
2498
  }
2499
  /* Otherwise emit IR (inlined for speed). */
2500
  {
2501
    IRRef ref = lj_ir_nextins(J);
54,893✔
2502
    IRIns *ir = IR(ref);
54,893✔
2503
    ir->prev = J->chain[op];
54,893✔
2504
    ir->op12 = op12;
54,893✔
2505
    J->chain[op] = (IRRef1)ref;
54,893✔
2506
    ir->o = fins->o;
54,893✔
2507
    J->guardemit.irt |= fins->t.irt;
54,893✔
2508
    return TREF(ref, irt_t((ir->t = fins->t)));
54,893✔
2509
  }
2510
}
2511
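
/* Sketch of the per-opcode chain walked above (refs are made up):
**   J->chain[IR_ADD] -> 0105 ADD 0100 0101
**                       ir->prev -> 0103 ADD 0100 0102 -> ir->prev -> 0
** J->chain[op] holds the most recent instruction with opcode op and each
** ir->prev links to the previous one. The search stops at lim, since a
** matching instruction cannot precede its own operands.
*/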

/* CSE with explicit search limit. */
TRef LJ_FASTCALL lj_opt_cselim(jit_State *J, IRRef lim)
{
  IRRef ref = J->chain[fins->o];
  IRRef2 op12 = (IRRef2)fins->op1 + ((IRRef2)fins->op2 << 16);
  while (ref > lim) {
    if (IR(ref)->op12 == op12)
      return ref;
    ref = IR(ref)->prev;
  }
  return lj_ir_emit(J);
}

/* ------------------------------------------------------------------------ */

#undef IR
#undef fins
#undef fleft
#undef fright
#undef knumleft
#undef knumright
#undef emitir

#endif